### Exemplar Retrieval Model (ERM) - Sieve 2001
* Step1: **Encode cues**: profile of the stimulus examined
* Step2: **Exemplar retrieval**: 
    - $$ similarity(j, k) = s^{d_k}$$
        - $k$ from 1 to $j-1$;
        - similarity: a multiplicative combination along various dimensions; 
        - $d_k$ the number of mismatching features between $j$ and $k$;
        - $s$ the similarity of mismatching values for each feature, with $0 \le s \le 1$. It measures the degree to which respondents fail to notice mismatching values: if $s=1$, the mismatch is not noticed and that dimension does not influence the similarity; if $s=0$, a mismatch on that dimension overrules all other dimensions and nullifies the similarity, regardless of how many other features match.
    - The probability that a previous instance $k$ is retrieved given that the new stimulus $j$ is observed:
$$p(retrieve\_exemplar\_K\ |\ new\_stimulus\_J) = \frac{similarity(j,k)}{\sum_{k'=1}^{j-1}{similarity(j,k')}}$$

    - The total probability that any previous TYPICAL instance (the set $T$) is retrieved given that the new stimulus $j$ is observed:
$$p(k \in T\ |\ new\_stimulus\_J) = \frac{\sum_{k\in T}{similarity(j,k)}}{\sum_{k\in T}{similarity(j,k)}+\sum_{k\notin T}{similarity(j,k)}}$$
    
* Step3: **Balance assessment**:
    - $$ S_N = \sum^N_{i=1}{X_i}$$
        - $X_i$: the outcome of the $i$th case in the sample of retrieved cases (e.g., $X_i = 1$ if the $i$th exemplar is TYPICAL and $X_i = -1$ otherwise);
        - $N$: constant representing the number of past cases that were retrieved on each trial at the time of the choice response
* Step4: **Choice**: The respondent chooses the category that is favored in Step3.
* Step5: **Probability Judgement**:
    - $$F_{T,N} = \frac{\eta + N +S_N}{\eta+\theta +2N}$$
        - $\frac{\eta}{\eta+\theta}$: the respondent's personal probability prior to any retrieval (the value of $F_{T,N}$ at $N = 0$, where $S_N = 0$)
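
Steps 3 to 5 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The function names (`balance`, `choose`, `judge`) and the label `"OTHER"` are hypothetical, and retrieved cases are coded as +1 (TYPICAL) and -1 (otherwise) as in Step 3. The steps above do not say how a tie ($S_N = 0$) is resolved, so the sketch leaves it undecided.

```python
def balance(retrieved):
    """Step 3: S_N, the sum of the +1/-1 outcomes of the N retrieved cases."""
    return sum(retrieved)


def choose(s_n):
    """Step 4: pick the category favored by the balance.

    Assumption: a tie (S_N == 0) returns None, because the source
    does not specify a tie-breaking rule.
    """
    if s_n > 0:
        return "TYPICAL"
    if s_n < 0:
        return "OTHER"
    return None


def judge(s_n, n, eta, theta):
    """Step 5: F_{T,N} = (eta + N + S_N) / (eta + theta + 2N)."""
    return (eta + n + s_n) / (eta + theta + 2 * n)


# Five retrieved cases, four of them TYPICAL (illustrative values only).
retrieved = [1, 1, -1, 1, 1]
s_n = balance(retrieved)
n = len(retrieved)
choice = choose(s_n)
f = judge(s_n, n, eta=1.0, theta=1.0)
prior = judge(0, 0, eta=1.0, theta=1.0)
print(s_n, choice, f, prior)
```

With $\eta = \theta = 1$ the prior is 0.5. After retrieving four TYPICAL cases out of five, the judged probability rises to $(1 + 5 + 3)/(1 + 1 + 10) = 0.75$.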

### Exemplar Model (Nilsson 2008)
* The probability that object $t$ belongs to Category $A$:
    - $$p(A) = \frac{\sum_{i=1}^{I}{similarity(t, x_i) \cdot c(x_i)}}{\sum_{i=1}^{I}{similarity(t, x_i)}}$$
        - $x_i$: exemplars from memory, $i = 1, 2, ..., I$
        - $c(x_i) = 1$ if $x_i$ belongs to $A$; $c(x_i) = 0$ otherwise.
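
As a rough sketch, the formula can be evaluated on the Nilsson (2008) category structure used in this notebook. The code below makes several assumptions: similarity follows the multiplicative rule $s^{d}$ from the ERM section with a single $s = 0.5$ for all dimensions, each presentation of an exemplar counts as one stored memory trace, and the names `EXEMPLARS` and `prob_A` are hypothetical.

```python
# (features C1..C4, FreqA, FreqB) for the 12 unique exemplars E1..E12.
EXEMPLARS = [
    ((0, 0, 0, 0), 0, 14), ((0, 0, 0, 1), 0, 6), ((0, 0, 1, 0), 0, 1),
    ((0, 1, 0, 0), 0, 1), ((1, 0, 0, 0), 5, 1), ((0, 1, 1, 0), 1, 1),
    ((1, 0, 0, 1), 1, 1), ((0, 1, 1, 1), 6, 0), ((1, 0, 1, 1), 1, 0),
    ((1, 1, 0, 1), 1, 0), ((1, 1, 1, 0), 1, 5), ((1, 1, 1, 1), 14, 0),
]


def similarity(t, x, s=0.5):
    """s ** d, where d is the number of mismatching features."""
    d = sum(a != b for a, b in zip(t, x))
    return s ** d


def prob_A(t, exemplars=EXEMPLARS, s=0.5):
    """Summed similarity to A traces divided by summed similarity to all traces."""
    numerator = 0.0
    denominator = 0.0
    for features, freq_a, freq_b in exemplars:
        sim = similarity(t, features, s)
        numerator += sim * freq_a               # c(x_i) = 1 for traces stored in A
        denominator += sim * (freq_a + freq_b)  # every trace, whatever its category
    return numerator / denominator


p_e5 = prob_A((1, 0, 0, 0))  # E5
print(round(p_e5, 3))  # 0.417
```

Under these assumptions the model assigns E5 a probability of about 0.417 of belonging to A, although E5 was presented in A on 5 of its 6 occurrences.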
        

#### Simulation data from Nilsson 2008

In [59]:
# Read the title line, the column header, and one row per exemplar.
# The with-statement closes the file once reading is done.
with open("data/Sim_Nilsson_2008.txt") as file:
    title = file.readline()
    col_title = file.readline().strip().split(" ")
    exemplar_list = []
    for line in file:
        line = line.strip()
        if line:  # skip blank lines
            exemplar = line.split(" ")
            exemplar_list.append(exemplar)
print(title)
print(col_title)
for exemplar in exemplar_list:
    print(exemplar)

Category structure with the 12 unique exemplars and their presentation frequency in each category

['E', 'C1', 'C2', 'C3', 'C4', 'FreqA', 'FreqB']
['E1', '0', '0', '0', '0', '0', '14']
['E2', '0', '0', '0', '1', '0', '6']
['E3', '0', '0', '1', '0', '0', '1']
['E4', '0', '1', '0', '0', '0', '1']
['E5', '1', '0', '0', '0', '5', '1']
['E6', '0', '1', '1', '0', '1', '1']
['E7', '1', '0', '0', '1', '1', '1']
['E8', '0', '1', '1', '1', '6', '0']
['E9', '1', '0', '1', '1', '1', '0']
['E10', '1', '1', '0', '1', '1', '0']
['E11', '1', '1', '1', '0', '1', '5']
['E12', '1', '1', '1', '1', '14', '0']


The two most important feature combinations are E5 and E11 (referred to below as the critical exemplars). Each critical exemplar shares few features with the members of the category it belongs to more often (0.96 features on average) and many features with the members of the category it belongs to less often (2.53 features on average). Recall that a representativeness effect occurs when an object is judged to belong to a category to which it seldom belongs, only because it shares many features with the members of that category. Accordingly, representativeness effects are hypothesized to occur in the probability judgments for the critical exemplars.
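
The two averages can be checked against the table with a short script. The counting convention used here is an assumption, chosen because it reproduces the quoted values; the text does not state how they were computed. A "shared feature" is a dimension on which two exemplars have the same value, averages are weighted by presentation frequency, and copies of the critical exemplar itself are left out of its more frequent category but kept in its less frequent one. The names `ROWS` and `mean_shared` are hypothetical.

```python
# name -> (features C1..C4, FreqA, FreqB), copied from the table above.
ROWS = {
    "E1": ((0, 0, 0, 0), 0, 14), "E2": ((0, 0, 0, 1), 0, 6),
    "E3": ((0, 0, 1, 0), 0, 1), "E4": ((0, 1, 0, 0), 0, 1),
    "E5": ((1, 0, 0, 0), 5, 1), "E6": ((0, 1, 1, 0), 1, 1),
    "E7": ((1, 0, 0, 1), 1, 1), "E8": ((0, 1, 1, 1), 6, 0),
    "E9": ((1, 0, 1, 1), 1, 0), "E10": ((1, 1, 0, 1), 1, 0),
    "E11": ((1, 1, 1, 0), 1, 5), "E12": ((1, 1, 1, 1), 14, 0),
}


def mean_shared(name, category, exclude_self):
    """Frequency-weighted mean number of matching features between the
    exemplar `name` and the members of `category` (0 = A, 1 = B)."""
    target = ROWS[name][0]
    total = 0
    count = 0
    for other, (features, freq_a, freq_b) in ROWS.items():
        if exclude_self and other == name:
            continue  # drop the copies of the exemplar itself
        freq = (freq_a, freq_b)[category]
        matches = sum(a == b for a, b in zip(target, features))
        total += matches * freq
        count += freq
    return total / count


# E5 appears mostly in A (5 vs 1), E11 mostly in B (5 vs 1).
e5_own = mean_shared("E5", 0, exclude_self=True)
e5_other = mean_shared("E5", 1, exclude_self=False)
e11_own = mean_shared("E11", 1, exclude_self=True)
e11_other = mean_shared("E11", 0, exclude_self=False)
print(round(e5_own, 2), round(e5_other, 2))    # 0.96 2.53
print(round(e11_own, 2), round(e11_other, 2))  # 0.96 2.53
```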

In [60]:
import random
import numpy as np
import matplotlib.pyplot as plt

In [67]:
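# Step 1: encode exemplars
# Expand each unique exemplar into one stored instance per presentation.
# Returns [instances_in_A, instances_in_B]; each instance is the list of
# feature values C1..C4 (kept as strings, as read from the file).
def encodeExemplar(exemplar_list):
    exemplar_encoded = [[],[]]
    for exemplar in exemplar_list:
        dim = exemplar[1:5]    # feature values C1..C4
        freq = exemplar[-2:]   # [FreqA, FreqB]
        for i in range(int(freq[0])):
            exemplar_encoded[0].append(dim)
        for i in range(int(freq[1])):
            exemplar_encoded[1].append(dim)
    return exemplar_encoded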

In [70]:
exemplar_encoded = encodeExemplar(exemplar_list)
print("Encoded:", exemplar_encoded)

Encoded: [[['1', '0', '0', '0'], ['1', '0', '0', '0'], ['1', '0', '0', '0'], ['1', '0', '0', '0'], ['1', '0', '0', '0'], ['0', '1', '1', '0'], ['1', '0', '0', '1'], ['0', '1', '1', '1'], ['0', '1', '1', '1'], ['0', '1', '1', '1'], ['0', '1', '1', '1'], ['0', '1', '1', '1'], ['0', '1', '1', '1'], ['1', '0', '1', '1'], ['1', '1', '0', '1'], ['1', '1', '1', '0'], ['1', '1', '1', '1'], ['1', '1', '1', '1'], ['1', '1', '1', '1'], ['1', '1', '1', '1'], ['1', '1', '1', '1'], ['1', '1', '1', '1'], ['1', '1', '1', '1'], ['1', '1', '1', '1'], ['1', '1', '1', '1'], ['1', '1', '1', '1'], ['1', '1', '1', '1'], ['1', '1', '1', '1'], ['1', '1', '1', '1'], ['1', '1', '1', '1']], [['0', '0', '0', '0'], ['0', '0', '0', '0'], ['0', '0', '0', '0'], ['0', '0', '0', '0'], ['0', '0', '0', '0'], ['0', '0', '0', '0'], ['0', '0', '0', '0'], ['0', '0', '0', '0'], ['0', '0', '0', '0'], ['0', '0', '0', '0'], ['0', '0', '0', '0'], ['0', '0', '0', '0'], ['0', '0', '0', '0'], ['0', '0', '0', '0'], ['0', '0', '0', '1'

In [72]:
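proportion = 0.2
# Sample the test items from the unique feature patterns (columns C1..C4).
patterns = [exemplar[1:5] for exemplar in exemplar_list]
exemplars_test = random.sample(patterns, int(np.round(len(exemplar_list)*proportion)))
print("Testing:", exemplars_test)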

Testing: [['0', '1', '1', '0'], ['1', '1', '0', '1']]


In [74]:
# step 2: exemplar retrieval
def similarity(j, k, s=0.5):
    # similarity(j, k) = s**d, where d is the number of mismatching features.
    # For now s is 0.5 on every dimension, so all dimensions weigh equally,
    # and each dimension holds a single binary feature.
    simil = 1.0
    for i in range(len(j)):
        if j[i] != k[i]:
            simil = simil * s
    return simil

In [96]:
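def prob_retrieve_k(exemplar_encoded, j, k):
    # Probability of retrieving any stored instance with feature pattern k,
    # given the new stimulus j: (number of stored copies of k) * similarity(j, k),
    # divided by the summed similarity of j to every stored instance.
    count = exemplar_encoded[0].count(k)+exemplar_encoded[1].count(k)
    total = 0
    sim_jk = similarity(j,k)
    for category in exemplar_encoded:
        for i in category:
            total += similarity(j,i)
    prob = (count*sim_jk)/total
    return prob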

In [97]:
print(similarity(['0','0','0','0'],['0','0','0','1']))
print(similarity(['0','0','0','0'],['0','0','1','1']))

0.5
0.25


In [98]:
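# Probability of retrieving an E1 instance when the stimulus is E1 itself
print(prob_retrieve_k(exemplar_encoded, ['0','0','0','0'], ['0','0','0','0']))
# Sanity check: the retrieval probabilities over the 12 unique patterns sum to 1
total = 0
for exemplar in exemplar_list:
    k = exemplar[1:5]
    total += prob_retrieve_k(exemplar_encoded, ['0','1','1','1'], k)
print(total)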

0.5685279187817259
1.0000000000000002
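
The retrieval probabilities can feed the balance assessment of Step 3 by sampling $N$ stored instances in proportion to their similarity to the stimulus. The sketch below is self-contained and rests on several assumptions: it repeats the table and the $s^{d}$ similarity rule, treats Category A as TYPICAL (+1) and Category B as -1, and samples with replacement. `retrieval_weights` and `retrieve_sample` are hypothetical helper names.

```python
import random

# (features C1..C4, FreqA, FreqB) for E1..E12, copied from the table above.
EXEMPLARS = [
    ((0, 0, 0, 0), 0, 14), ((0, 0, 0, 1), 0, 6), ((0, 0, 1, 0), 0, 1),
    ((0, 1, 0, 0), 0, 1), ((1, 0, 0, 0), 5, 1), ((0, 1, 1, 0), 1, 1),
    ((1, 0, 0, 1), 1, 1), ((0, 1, 1, 1), 6, 0), ((1, 0, 1, 1), 1, 0),
    ((1, 1, 0, 1), 1, 0), ((1, 1, 1, 0), 1, 5), ((1, 1, 1, 1), 14, 0),
]


def similarity(j, k, s=0.5):
    """s ** d, where d is the number of mismatching features."""
    return s ** sum(a != b for a, b in zip(j, k))


def retrieval_weights(j, s=0.5):
    """Unnormalized retrieval weight per (pattern, category) pair:
    number of stored copies times similarity to the stimulus j."""
    labels, weights = [], []
    for features, freq_a, freq_b in EXEMPLARS:
        sim = similarity(j, features, s)
        labels.extend([+1, -1])  # +1 = Category A (assumed TYPICAL), -1 = B
        weights.extend([freq_a * sim, freq_b * sim])
    return labels, weights


def retrieve_sample(j, n, rng, s=0.5):
    """Draw n outcomes X_i with probability proportional to the weights
    (sampling with replacement, an assumption of this sketch)."""
    labels, weights = retrieval_weights(j, s)
    return rng.choices(labels, weights=weights, k=n)


rng = random.Random(0)  # fixed seed so the run is repeatable
sample = retrieve_sample((0, 1, 1, 1), n=5, rng=rng)
s_n = sum(sample)       # balance S_N for Step 3
print(sample, s_n)
```

With $s = 0$ only exact matches can be retrieved, so a stimulus such as E12, which was only ever presented in Category A, always yields a sample of +1 outcomes.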
