In [None]:
# default_exp model_fitting

# CMR-DE Generalization Tests
Does adding a differential encoding mechanism improve CMR's capacity to generalize across conditions of the Lohnas (2014) dataset? To find out, we'll fit CMR and CMR-DE at the group and individual level to individual conditions of the dataset, then use the fitted models to predict data from held-out conditions.

The Lohnas (2014) dataset contains data from the following conditions:

1. control lists consisting entirely of once-presented items;
2. pure massed lists consisting entirely of twice-presented items;
3. pure spaced lists consisting of items presented twice at lags 1-8, where lag is defined as the number of intervening items between a repeated item's presentations;
4. mixed lists consisting of once-presented, massed, and spaced items.

Within each session, subjects encountered three lists of each of these four types.

Each list had 40 presentation positions. In the control lists, each position was occupied by a unique list item; in the pure massed and pure spaced lists, 20 unique words were each presented twice to fill the 40 positions. In the mixed lists, 28 once-presented and six twice-presented words occupied the 40 positions. In the pure spaced lists, spacings of repeated items were chosen so that each of the lags 1-8 occurred with equal probability. In the mixed lists, massed repetitions (lag=0) and spaced repetitions (lags 1-8) were chosen such that each of the nine lags 0-8 was used exactly twice within each session. The order of presentation for the different list types was randomized within each session. For the first session, the first four lists were chosen so that each list type was presented exactly once. An experimenter sat in with the subject for these first four lists, though no subject had difficulty understanding the task.
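To make the lag definition concrete, here is a minimal sketch that computes each repeated item's lag from a presentation sequence and checks the mixed-list position count. The function name `repetition_lags` and the toy list are illustrative assumptions, not part of `compmemlearn` or the dataset.

```python
def repetition_lags(presentations):
    """Return {item: lag} for every item presented twice.

    Lag is the number of items intervening between the two presentations,
    so back-to-back (massed) repetitions have lag 0.
    """
    first_seen = {}
    lags = {}
    for position, item in enumerate(presentations):
        if item in first_seen:
            lags[item] = position - first_seen[item] - 1
        else:
            first_seen[item] = position
    return lags


# Hypothetical toy sequence of item indices (not taken from the dataset)
toy_list = [0, 0, 1, 2, 1, 3, 4, 5, 3]
lags = repetition_lags(toy_list)

# Mixed lists: 28 once-presented words plus six words presented twice
mixed_positions = 28 + 2 * 6
```

In the toy sequence, item 0 is a massed repetition, while items 1 and 3 are spaced repetitions separated by one and two intervening items respectively.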

If, as we predict, CMR-DE generalizes better across Lohnas (2014) conditions than the original model, that will be strong evidence that the original model is not a sufficient account of repetition and spacing effects in free recall, provided it is paired with an analysis of precisely how the original model fails and an exploration of the extended model's novel predictions.

## Initial Fitting Between Subjects
For each condition and subject, find the best-fitting parameters for the CMR model.
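The fit-then-generalize design can be sketched as an enumeration of fitting jobs: one per subject and fitted condition, with the remaining conditions held out for prediction. This is only an illustration of the bookkeeping. The subject identifiers are hypothetical, and the mapping of condition codes 1-4 to the four list types above is an assumption.

```python
from itertools import product

# Assumed condition codes: 1 control, 2 pure massed, 3 pure spaced, 4 mixed
conditions = [1, 2, 3, 4]
subjects = [1, 2, 3]  # hypothetical subject identifiers

# One job per (subject, fitted condition); every other condition is held out
fit_jobs = [
    {
        "subject": subject,
        "fit_condition": condition,
        "held_out": [c for c in conditions if c != condition],
    }
    for subject, condition in product(subjects, conditions)
]
```

Each job would then be fitted on trials from `fit_condition` only and evaluated on trials from the `held_out` conditions.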

In [None]:
from compmemlearn.fitting import apply_and_concatenate, cmr_rep_objective_function
from compmemlearn.datasets import prepare_lohnas2014_data, simulate_data
from compmemlearn.analyses import recall_probability_by_lag
import pandas as pd
from psifr import fr
from scipy.optimize import differential_evolution
from numba.typed import List
import numpy as np

trials, events, list_length, presentations, list_types, rep_data, subjects = prepare_lohnas2014_data(
    '../../data/repFR.mat')

events.head()

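| Unnamed: 0 | subject | list | item | input | output | study | recall | repeat | intrusion | condition |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 0 | 1 | 1.0 | True | True | 0 | False | 4 |
| 1 | 1 | 1 | 1 | 2 | 2.0 | True | True | 0 | False | 4 |
| 2 | 1 | 1 | 2 | 3 | 3.0 | True | True | 0 | False | 4 |
| 3 | 1 | 1 | 3 | 4 | 4.0 | True | True | 0 | False | 4 |
| 4 | 1 | 1 | 4 | 5 | 5.0 | True | True | 0 | False | 4 |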


In [None]:
from scipy.optimize import differential_evolution
from numba.typed import List

free_parameters = [
    'encoding_drift_rate',
    'start_drift_rate',
    'recall_drift_rate',
    'shared_support',
    'item_support',
    'learning_rate',
    'primacy_scale',
    'primacy_decay',
    'stop_probability_scale',
    'stop_probability_growth',
    'choice_sensitivity',
    'familiarity_scale']

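# Keep parameters strictly inside their intervals: the lower bound sits just
# above 0 and the upper bound for unit-interval parameters just below 1
lb = np.finfo(float).eps
ub = 1-np.finfo(float).eps

# One (lower, upper) pair per entry of free_parameters, in the same order
bounds = [
    (lb, ub),   # encoding_drift_rate
    (lb, ub),   # start_drift_rate
    (lb, ub),   # recall_drift_rate
    (lb, ub),   # shared_support
    (lb, ub),   # item_support
    (lb, ub),   # learning_rate
    (lb, 100),  # primacy_scale
    (lb, 100),  # primacy_decay
    (lb, ub),   # stop_probability_scale
    (lb, 10),   # stop_probability_growth
    (lb, 10),   # choice_sensitivity
    (lb, 10),   # familiarity_scale
]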
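As a rough illustration of how a bounds list like this feeds into `scipy.optimize.differential_evolution`, the sketch below fits three of the parameters against a stand-in quadratic objective. The objective `toy_objective` and the `target` values are made up for the example. The real fit would minimize a likelihood-based cost built from the recall data, which is not shown here.

```python
import numpy as np
from scipy.optimize import differential_evolution

eps = np.finfo(float).eps

# A subset of the free parameters with the same bounds as in the full list
names = ['encoding_drift_rate', 'primacy_scale', 'choice_sensitivity']
toy_bounds = [(eps, 1 - eps), (eps, 100), (eps, 10)]

# Made-up "true" values, used only so the toy objective has a known minimum
target = np.array([0.7, 8.0, 2.0])


def toy_objective(x):
    """Stand-in cost: squared distance from the made-up target values."""
    return float(np.sum((x - target) ** 2))


# seed fixes the random population so the run is repeatable
result = differential_evolution(toy_objective, toy_bounds, seed=0, tol=1e-8)

# Pair each fitted value with its parameter name
fitted = dict(zip(names, result.x))
```

Zipping the names with `result.x` relies on the bounds being listed in the same order as the parameter names, which is why the two lists are kept aligned.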