### Exemplar Retrieval Model (ERM) - Sieck & Yates 2001
* Step1: **Encode cues**: profile of the stimulus examined
* Step2: **Exemplar retrieval**: 
    - $$ similarity(j, k) = \prod s^{d_k}$$
        - $k$ ranges from 1 to $j-1$, i.e. over the instances stored before the new stimulus $j$;
        - similarity: a multiplicative combination across feature dimensions;
        - $d_k$: the number of mismatching features between $j$ and $k$;
        - $s$: the similarity of mismatching values on a feature, measuring the degree to which respondents fail to notice a mismatch. If $s=1$, the mismatch is not noticed and that dimension does not affect the similarity; if $s=0$, a single mismatch overrules all other dimensions and sets the similarity to zero, no matter how many other features match.
    - The probability that a previous instance $k$ is retrieved, given that the new stimulus $j$ is observed:
$$p(retrieve\_exemplar\_K\ |\ new\_stimulus\_J) = \frac{similarity(j,k)}{\sum_{k'}{similarity(j,k')}}$$

    - The total probability that any previous TYPICAL instance is retrieved, given that the new stimulus $j$ is observed:
$$p(k \in T\ |\ new\_stimulus\_J) = \frac{\sum_{k\in T}{similarity(j,k)}}{\sum_{k\in T}{similarity(j,k)}+\sum_{k\notin T}{similarity(j,k)}}$$
    
* Step3: **Balance assessment**:
    - $$ S_N = \sum^N_{i=1}{X_i}$$
        - $X_i$: the outcome of the $i$th retrieved case (e.g., $X_i = 1$ if the $i$th exemplar is TYPICAL, and $X_i = -1$ otherwise);
        - $N$: a constant, the number of past cases retrieved on each trial at the time of the choice response.
* Step4: **Choice**: The respondent chooses the category that is favored in Step3.
* Step5: **Probability Judgement**:
    - $$F_{T,N} = \frac{\eta + N +S_N}{\eta+\theta +2N}$$
        - $\frac{\eta}{\eta+\theta}$: the respondent's personal prior probability before any retrieval, i.e. the value of $F_{T,N}$ when $N = 0$ (and therefore $S_N = 0$)
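
Steps 2 to 5 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the published implementation: a single $s$ shared by all dimensions, retrieval by sampling with replacement in proportion to similarity, ties in Step 4 resolved toward "not T", and $\eta = \theta = 1$ are all choices made here for the example, and every function name is hypothetical.

```python
import random

def similarity(j, k, s=0.5):
    """s raised to the number of mismatching features (one s for all dimensions: an assumption)."""
    mismatches = sum(1 for a, b in zip(j, k) if a != b)
    return s ** mismatches

def retrieve_sample(stimulus, memory, n_retrieved, rng, s=0.5):
    """Step 2: draw n_retrieved stored cases, each with probability proportional to its similarity.

    memory is a list of (features, is_typical) pairs. Sampling with replacement is an assumption.
    """
    weights = [similarity(stimulus, features, s) for features, _ in memory]
    return rng.choices(memory, weights=weights, k=n_retrieved)

def balance(sample):
    """Step 3: S_N, the sum of X_i with X_i = +1 for TYPICAL cases and -1 otherwise."""
    return sum(1 if is_typical else -1 for _, is_typical in sample)

def choose(s_n):
    """Step 4: pick the category favored by the balance (tie rule is an assumption)."""
    return "T" if s_n > 0 else "not T"

def probability_judgment(s_n, n, eta=1.0, theta=1.0):
    """Step 5: F_{T,N} = (eta + N + S_N) / (eta + theta + 2N)."""
    return (eta + n + s_n) / (eta + theta + 2 * n)

# Toy memory: two TYPICAL cases and one other case
memory = [((1, 1, 0), True), ((1, 0, 0), True), ((0, 0, 1), False)]
rng = random.Random(0)  # seeded only so that reruns draw the same sample
sample = retrieve_sample((1, 1, 0), memory, n_retrieved=3, rng=rng)
s_n = balance(sample)
print(choose(s_n), probability_judgment(s_n, n=3))

# With no retrieval the judgment equals the prior eta / (eta + theta)
print(probability_judgment(0, 0))  # 0.5
# Three retrieved cases, all TYPICAL: (1 + 3 + 3) / (1 + 1 + 6)
print(probability_judgment(3, 3))  # 0.875
```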

### Exemplar Model (Nilsson 2008)
* The probability that object $t$ belongs to Category $A$:
    - $$p(A) = \frac{\sum_{i=1}^{I}{similarity(t\ |\ x_i)\cdot c(x_i)}}{\sum_{i=1}^{I}{similarity(t\ |\ x_i)}}$$
        - $x_i$: exemplars in memory, $i = 1, 2, ..., I$
        - $c(x_i) = 1$ if $x_i$ belongs to $A$; $c(x_i)= 0$ otherwise.
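
As a rough illustration of this ratio, the sketch below computes $p(A)$ for a probe against a small hand-made memory. The multiplicative similarity with a single $s = 0.5$ is borrowed from the ERM section as an assumption, the toy exemplars are made up for the example, and the function names are hypothetical.

```python
def similarity(t, x, s=0.5):
    """s raised to the number of mismatching features (assumed similarity function)."""
    mismatches = sum(1 for a, b in zip(t, x) if a != b)
    return s ** mismatches

def prob_category_a(t, exemplars, s=0.5):
    """Similarity-weighted share of Category A exemplars.

    exemplars is a list of (features, c) pairs with c = 1 for Category A and 0 otherwise.
    An exemplar presented several times can simply be listed several times.
    """
    sims = [similarity(t, x, s) for x, _ in exemplars]
    weighted = sum(sim * c for sim, (_, c) in zip(sims, exemplars))
    return weighted / sum(sims)

# Toy memory: one B exemplar and two A exemplars
memory = [((0, 0), 0), ((1, 1), 1), ((1, 0), 1)]

# Probe (1, 1): similarities are 0.25, 1 and 0.5, so p(A) = 1.5 / 1.75
print(round(prob_category_a((1, 1), memory), 3))  # 0.857
# Probe (0, 0): similarities are 1, 0.25 and 0.5, so p(A) = 0.75 / 1.75
print(round(prob_category_a((0, 0), memory), 3))  # 0.429
```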
        

#### Simulation data from Nilsson 2008

In [59]:
# Read the title line, the column headers, then one exemplar per non-blank line
with open("data/Sim_Nilsson_2008.txt") as file:
    title = file.readline()
    col_title = file.readline().strip().split(" ")
    exemplar_list = []
    for line in file:
        line = line.strip()
        if line:
            exemplar = line.split(" ")
            exemplar_list.append(exemplar)
print(title)
print(col_title)
for exemplar in exemplar_list:
    print(exemplar)

Category structure with the 12 unique exemplars and their presentation frequency in each category

['E', 'C1', 'C2', 'C3', 'C4', 'FreqA', 'FreqB']
['E1', '0', '0', '0', '0', '0', '14']
['E2', '0', '0', '0', '1', '0', '6']
['E3', '0', '0', '1', '0', '0', '1']
['E4', '0', '1', '0', '0', '0', '1']
['E5', '1', '0', '0', '0', '5', '1']
['E6', '0', '1', '1', '0', '1', '1']
['E7', '1', '0', '0', '1', '1', '1']
['E8', '0', '1', '1', '1', '6', '0']
['E9', '1', '0', '1', '1', '1', '0']
['E10', '1', '1', '0', '1', '1', '0']
['E11', '1', '1', '1', '0', '1', '5']
['E12', '1', '1', '1', '1', '14', '0']


The two most important feature combinations are E5 and E11 (hereafter the critical exemplars). In the table above, E5 is presented five times in Category A and once in Category B, and E11 once in A and five times in B. The critical exemplars share a low number of features with the members of the category to which they more often belong (0.96 features on average) and a high number of features with the members of the category to which they less often belong (2.53 features on average). Remember, a representativeness effect occurs when an object is judged as belonging to a category to which it seldom belongs only because it shares a high number of features with the members of that category. Accordingly, representativeness effects are hypothesized to occur in the probability judgments of the critical exemplars.

In [76]:
import random
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime
import copy

In [77]:
# Step -1: Read files
def read_features(filename):
    # One feature description per line
    with open(filename) as file:
        return [line.strip() for line in file]

def read_distribution(filename):
    # First line: comma-separated category names.
    # Each following line: one feature's probability in each category.
    with open(filename) as file:
        categories = file.readline().strip().split(",")
        distribution = []
        for line in file:
            prob = [float(p) for p in line.strip().split(",")]
            distribution.append(prob)
    # Rows are features, columns are categories
    distribution = np.array(distribution)
    return categories, distribution

In [78]:
features = read_features("data/linda_adjectives.txt")
categories, distribution = read_distribution("data/linda_distribution.txt")
print(features)
print(categories)
print(distribution)

['Around 30 years old', 'Single', 'Outspoken', 'Intelligent', 'Humanities major', 'Concerned with discrimination and social justice', 'Participated in demonstrations', 'Female']
['B', 'F', 'NotB_NotF', 'BF']
[[ 0.7  0.4  0.3  0.7]
 [ 0.4  0.8  0.5  0.8]
 [ 0.3  0.8  0.4  0.8]
 [ 0.5  0.7  0.5  0.7]
 [ 0.2  0.8  0.5  0.8]
 [ 0.4  0.9  0.4  0.9]
 [ 0.2  0.8  0.2  0.8]
 [ 0.8  0.9  0.5  0.9]]


In [79]:
# Step 1: Generate exemplars
def generate_exemplars(features, distribution, category_index, numExemplars):
    # Per-feature probability that the feature is present in this category
    distr_category = distribution[:,category_index]
    exemplars = []
    for i in range(numExemplars):
        exemplar = np.zeros(len(features))
        for j in range(len(distr_category)):
            # random.random() is uniform on [0, 1), so feature j is set
            # with probability distr_category[j]
            n = random.random()
            if n >= (1-distr_category[j]):
                exemplar[j] = 1
        exemplars.append(exemplar)
    exemplars = np.array(exemplars)
    return exemplars
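
The same sampling can also be written without the nested loops. The version below is a sketch rather than the notebook's method: it assumes NumPy's `default_rng` generator, and the function name `generate_exemplars_vectorized` and its `seed` parameter are hypothetical. It draws one uniform number per cell and sets a feature wherever the draw falls below that feature's probability.

```python
import numpy as np

def generate_exemplars_vectorized(distribution, category_index, num_exemplars, seed=None):
    """Draw num_exemplars binary feature vectors for one category.

    distribution has one row per feature and one column per category;
    each entry is the probability that the feature is present.
    """
    rng = np.random.default_rng(seed)  # pass an int seed for reproducible draws
    p = np.asarray(distribution, dtype=float)[:, category_index]
    # One uniform draw in [0, 1) per (exemplar, feature) cell
    draws = rng.random((num_exemplars, p.shape[0]))
    # A feature is present when its draw falls below its probability
    return (draws < p).astype(float)

# Small made-up distribution: 3 features, 2 categories
toy_distribution = np.array([[0.7, 0.4],
                             [0.4, 0.8],
                             [0.3, 0.8]])
print(generate_exemplars_vectorized(toy_distribution, 0, 4, seed=42))
```

A fixed seed makes a run repeatable, which helps when comparing retrieval probabilities across reruns of the notebook.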

In [106]:
bank_teller_exemplars = generate_exemplars(features, distribution, 0, 3)
feminist_exemplars = generate_exemplars(features, distribution, 1, 2)
not_bank_not_feminist_exemplars = generate_exemplars(features, distribution, 2, 5)
feminist_banker_exemplars = generate_exemplars(features, distribution, 3, 1)
print("B:",bank_teller_exemplars)
print("F:",feminist_exemplars)
print("NBNF:",not_bank_not_feminist_exemplars)
print("FB:",feminist_banker_exemplars)

B: [[ 1.  1.  0.  0.  0.  0.  0.  0.]
 [ 1.  0.  1.  1.  0.  1.  1.  1.]
 [ 1.  0.  0.  0.  0.  0.  1.  1.]]
F: [[ 0.  0.  1.  1.  1.  1.  0.  0.]
 [ 0.  1.  1.  1.  1.  1.  1.  0.]]
NBNF: [[ 0.  0.  0.  0.  0.  0.  1.  0.]
 [ 0.  1.  0.  0.  0.  0.  1.  1.]
 [ 0.  0.  0.  0.  1.  1.  0.  0.]
 [ 0.  0.  0.  1.  0.  1.  0.  0.]
 [ 0.  0.  0.  1.  0.  0.  0.  1.]]
FB: [[ 0.  1.  1.  1.  1.  1.  0.  1.]]


In [107]:
all_exemplars = np.concatenate((bank_teller_exemplars, feminist_exemplars, not_bank_not_feminist_exemplars, feminist_banker_exemplars))
print(all_exemplars)

[[ 1.  1.  0.  0.  0.  0.  0.  0.]
 [ 1.  0.  1.  1.  0.  1.  1.  1.]
 [ 1.  0.  0.  0.  0.  0.  1.  1.]
 [ 0.  0.  1.  1.  1.  1.  0.  0.]
 [ 0.  1.  1.  1.  1.  1.  1.  0.]
 [ 0.  0.  0.  0.  0.  0.  1.  0.]
 [ 0.  1.  0.  0.  0.  0.  1.  1.]
 [ 0.  0.  0.  0.  1.  1.  0.  0.]
 [ 0.  0.  0.  1.  0.  1.  0.  0.]
 [ 0.  0.  0.  1.  0.  0.  0.  1.]
 [ 0.  1.  1.  1.  1.  1.  0.  1.]]


In [109]:
# step 2: exemplar retrieval
def similarity(j,k,feat_weights):
    # Multiplicative similarity between the feature vectors j and k.
    # feat_weights[i] is s for dimension i; with s = 0.5 everywhere,
    # every dimension counts equally for now.
    # For now, each dimension holds a single binary feature.
    simil = 1.0
    for i in range(len(j)):
        # a match contributes s**0 = 1, a mismatch contributes s
        if j[i] != k[i]:
            simil = simil * feat_weights[i]
    return simil

$$ similarity(j, k) = \prod s^{d_k}$$

In [110]:
feat_weights = [0.5,0.5,0.5,0.5]
print(similarity(['0','0','0','0'],['0','0','0','1'],feat_weights))
print(similarity(['0','0','0','0'],['0','0','1','1'],feat_weights))

0.5
0.25


In [134]:
def prob_retrieve_k(exemplar_list, j, k, feat_weights):
    # Probability of retrieving the single instance k given the new stimulus j.
    # An instance that is not stored in memory cannot be retrieved.
    if not any(np.array_equal(k, exemplar) for exemplar in exemplar_list):
        return 0
    # Normalize the similarity of k by the summed similarity of all stored exemplars
    sim_jk = similarity(j,k,feat_weights)
    total = 0
    for exemplar in exemplar_list:
        total += similarity(j,exemplar,feat_weights)
    prob = sim_jk/total

    return prob

$$p(retrieve\_exemplar\_K\ |\ new\_stimulus\_J) = \frac{similarity(j,k)}{\sum_{k'}{similarity(j,k')}}$$

In [137]:
feat_weights = [0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5]
test = np.array([ 1.,  1.,  0.,  0.,  0.,  0.,  0.,  0.])
print(prob_retrieve_k(bank_teller_exemplars, test, test, feat_weights))
print(prob_retrieve_k(feminist_exemplars, test, test, feat_weights))
print(prob_retrieve_k(all_exemplars, test, test, feat_weights))


total = 0
for exemplar in all_exemplars:
    total += prob_retrieve_k(all_exemplars, test, exemplar, feat_weights)
print(total)

0.8767123287671232
0
0.6153846153846154
0.9999999999999997
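
The total probability of retrieving any TYPICAL instance, $p(k \in T\ |\ new\_stimulus\_J)$, follows the same pattern: sum the similarities over the TYPICAL exemplars and divide by the sum over all stored exemplars. The sketch below is one possible way to write it. The name `prob_retrieve_typical` is hypothetical, and a compact copy of `similarity` is repeated only so that the block runs on its own.

```python
def similarity(j, k, feat_weights):
    """Product of feat_weights[i] over the dimensions where j and k mismatch."""
    simil = 1.0
    for a, b, s in zip(j, k, feat_weights):
        if a != b:
            simil *= s
    return simil

def prob_retrieve_typical(typical_exemplars, other_exemplars, j, feat_weights):
    """Summed similarity of the TYPICAL exemplars over the summed similarity of all exemplars."""
    sim_typical = sum(similarity(j, k, feat_weights) for k in typical_exemplars)
    sim_other = sum(similarity(j, k, feat_weights) for k in other_exemplars)
    return sim_typical / (sim_typical + sim_other)

# Made-up toy memory with four binary features and s = 0.5 on every dimension
feat_weights = [0.5, 0.5, 0.5, 0.5]
typical = [[1, 1, 0, 0], [1, 0, 0, 0]]   # similarities to the probe: 1 and 0.5
other = [[0, 0, 1, 1]]                   # four mismatches: 0.5**4 = 0.0625
probe = [1, 1, 0, 0]
print(prob_retrieve_typical(typical, other, probe, feat_weights))  # 1.5 / 1.5625 = 0.96
```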
