# Faster CMR
Can we make an even faster version of our CMR likelihood function, one that can take advantage of parallelization, the GPU, and other features of Numba? To find out, we'll develop a purely functional version of CMR.

## Toy Example
Let's start by exploring whether it's possible to parallelize a function that wraps use of a jit class. We'll define a simple class that initializes an array of zeros with one slot per experiment and supports a method that writes a value to a given index of that array.

In [None]:
import numpy as np
from numba import float64
from numba.experimental import jitclass
from numba import njit, prange

toy_spec = [
    ('A', float64[:]),
]

@jitclass(toy_spec)
class Toy:
    
    def __init__(self, experiment_count):
        
        self.A = np.zeros(experiment_count)
        
    def toy_function(self, index, value):
        self.A[index] = value

In [None]:
@njit(parallel=True)
def test(experiment_count):
    
    toyinstance = Toy(experiment_count)
    for i in prange(experiment_count):
        for j in prange(experiment_count):
            toyinstance.toy_function(i, j)
            
test(1)

With regular njit, 1000 iterations take 99 µs. When I turn on prange, the runtime actually rises to 213 µs. When I don't use jitclass, parallel still doesn't work, but the runtime improves to 133 ns.

For some reason, parallelization is slowing down code I expected it to speed up. But the relationship changes when I avoid break statements and similar control flow. Maybe parallelizing has a cost that is only worth paying when there is enough work to offset it.

14.1 µs with parallelization; 676 without.

So parallelization can optimize code that uses jit classes, even within nested loops.

This means that a parallelized compute_likelihood is probably possible, even if a version of CMR that avoids jitclass is impractical.
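To check on your own machine whether parallelization pays for itself, a minimal sketch along these lines may help. It drops the jitclass and uses hypothetical helpers (`fill_serial`, `fill_parallel`, `time_call`) that are not part of the original notebook. The `try`/`except` shim is an assumption added only so the snippet still runs as plain Python where Numba is missing. Numba parallelizes only the outermost `prange` of a loop nest and runs inner `prange` loops sequentially, so the inner loop here is written as a plain `range`.

```python
import time
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Assumed fallback: run as plain Python when Numba is unavailable.
    prange = range

    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]          # used as bare @njit
        return lambda func: func    # used as @njit(parallel=True)


@njit
def fill_serial(n):
    out = np.zeros(n)
    for i in range(n):
        for j in range(n):
            out[i] = j
    return out


@njit(parallel=True)
def fill_parallel(n):
    out = np.zeros(n)
    for i in prange(n):      # only this outermost loop is parallelized
        for j in range(n):
            out[i] = j       # each i writes its own slot, so no race
    return out


def time_call(func, n, repeats=20):
    func(n)                  # warm-up call so compilation is not timed
    start = time.perf_counter()
    for _ in range(repeats):
        func(n)
    return (time.perf_counter() - start) / repeats


n = 200
serial_result = fill_serial(n)
parallel_result = fill_parallel(n)
print("serial  :", time_call(fill_serial, n))
print("parallel:", time_call(fill_parallel, n))
```

The absolute numbers will depend on the machine and thread count. The point of the comparison is whether the gap between the two closes or reverses as `n` grows.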

In [None]:
from instance_cmr.models import *
import numpy as np
from numba import njit
from numba.typed import List

@njit(parallel=True)
def cmr_likelihood(data_to_fit, item_counts, encoding_drift_rate, start_drift_rate, recall_drift_rate, shared_support,
        item_support, learning_rate, primacy_scale, primacy_decay, stop_probability_scale, stop_probability_growth, 
        choice_sensitivity):
    """
    Generalized cost function for fitting the CMR model, optimized using the numba library.
    
    Output scales inversely with the likelihood that the model and specified parameters would generate the specified
    trials. For model fitting, it is usually wrapped in another function that fixes and frees parameters for optimization.
    
    **Arguments**:
    - data_to_fit: list of int64-arrays, one per entry of item_counts, where rows identify a unique trial of responses
      and columns correspond to a unique recall index.
    - item_counts: list length (number of studied items) for each array in data_to_fit.
    - A configuration for each parameter of `CMR` as delineated in `Formal Specification`.
    
    **Returns** the negative sum of log-likelihoods across specified trials conditional on the specified parameters and
    the mechanisms of CMR.
    """
    
    result = 0.0
    for i in range(len(item_counts)):
        item_count = item_counts[i]
        trials = data_to_fit[i]
        
        true_model = CMR(item_count, item_count, encoding_drift_rate, 
                    start_drift_rate, recall_drift_rate, shared_support,
                    item_support, learning_rate, primacy_scale, 
                    primacy_decay, stop_probability_scale, 
                    stop_probability_growth, choice_sensitivity)

        true_model.experience(np.eye(item_count, item_count))

        likelihood = np.ones((len(trials), item_count))

        for trial_index in range(len(trials)):
            
            model = true_model.copy()
            
            trial = trials[trial_index]

            model.force_recall()
            for recall_index in range(len(trial) + 1):

                # identify index of item recalled; if zero then recall is over
                if recall_index == len(trial) and len(trial) < item_count:
                    recall = 0
                else:
                    recall = trial[recall_index]

                # store probability of and simulate recalling item with this index
                likelihood[trial_index, recall_index] = \
                    model.outcome_probabilities(model.context)[recall]

                if recall == 0:
                    break
                model.force_recall(recall)

            # reset model to its pre-retrieval (but post-encoding) state
            model.force_recall(0)
        
        result -= np.sum(np.log(likelihood))
        
    return result

def cmr_objective_function(data_to_fit, fixed_parameters, free_parameters):
    """
    Configures cmr_likelihood for parameter search over specified free and fixed parameters.
    """
    return lambda x: cmr_likelihood(data_to_fit, **{**fixed_parameters, **{
        free_parameters[i]:x[i] for i in range(len(x))}})
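
The dictionary merge inside `cmr_objective_function` can be seen in isolation with a stand-in cost function. In this sketch, `toy_likelihood`, its `scale` and `offset` parameters, and `toy_objective_function` are hypothetical names chosen for illustration. Only the merging pattern mirrors the code above.

```python
def toy_likelihood(data_to_fit, scale, offset):
    # Hypothetical stand-in for cmr_likelihood: a simple sum of squares.
    return sum((scale * x + offset) ** 2 for x in data_to_fit)


def toy_objective_function(data_to_fit, fixed_parameters, free_parameters):
    # Same pattern as cmr_objective_function: fixed values are merged with
    # free values taken positionally from the vector x.
    return lambda x: toy_likelihood(data_to_fit, **{**fixed_parameters, **{
        free_parameters[i]: x[i] for i in range(len(x))}})


objective = toy_objective_function(
    [1.0, 2.0], fixed_parameters={'offset': 0.0}, free_parameters=['scale'])

# x holds one value per free parameter, in the order of free_parameters
cost = objective([2.0])  # (2*1 + 0)^2 + (2*2 + 0)^2 = 20.0
print(cost)
```

The returned closure takes a single parameter vector, which is the calling convention that numerical optimizers typically expect.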

Using these functions, we'll search for and visualize a parameter fit of the CMR model to a slice of data sampled from the classic Murdock (1962) study demonstrating the serial position curve, a pattern where early- and late-presented items tend to be recalled more often than middle items in a list-learning experiment. The data associated with the study is located at `data/MurdData_clean.mat`.

In [None]:
from instance_cmr.model_analysis import *

murd_trials0, murd_events0, murd_length0 = prepare_murddata(
    '../data/MurdData_clean.mat', 0)
print(murd_length0, np.shape(murd_trials0))

murd_trials1, murd_events1, murd_length1 = prepare_murddata(
    '../data/MurdData_clean.mat', 1)
print(murd_length1, np.shape(murd_trials1))

murd_events1.head()

20 (1200, 15)
30 (1200, 15)


Unnamed: 0,subject,list,item,input,output,study,recall,repeat,intrusion
0,1,1,1,1,,True,False,0,False
1,1,1,2,2,,True,False,0,False
2,1,1,3,3,,True,False,0,False
3,1,1,4,4,,True,False,0,False
4,1,1,5,5,,True,False,0,False


|    |   subject |   list |   item |   input |   output | study   | recall   |   repeat | intrusion   |
|---:|----------:|-------:|-------:|--------:|---------:|:--------|:---------|---------:|:------------|
|  0 |         1 |      1 |      1 |       1 |        5 | True    | True     |        0 | False       |
|  1 |         1 |      1 |      2 |       2 |        7 | True    | True     |        0 | False       |
|  2 |         1 |      1 |      3 |       3 |      nan | True    | False    |        0 | False       |
|  3 |         1 |      1 |      4 |       4 |      nan | True    | False    |        0 | False       |
|  4 |         1 |      1 |      5 |       5 |      nan | True    | False    |        0 | False       |

First, we'll make sure cmr_likelihood returns valid values and has adequate performance.

In [None]:
lb = np.finfo(float).eps
hand_fit_parameters = {
    'item_counts': List([murd_length0, murd_length1]),
    'encoding_drift_rate': .8,
    'start_drift_rate': .7,
    'recall_drift_rate': .8,
    'shared_support': 0.01,
    'item_support': 1.0,
    'learning_rate': .3,
    'primacy_scale': 1,
    'primacy_decay': 1,
    'stop_probability_scale': 0.01,
    'stop_probability_growth': 0.3,
    'choice_sensitivity': 2
}
cmr_likelihood(List([murd_trials0[:80], murd_trials1[:80]]), **hand_fit_parameters)

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Unknown attribute 'copy' of type instance.jitclass.CMR#207ff7fdee0<item_count:int32,encoding_drift_rate:float64,start_drift_rate:float64,recall_drift_rate:float64,shared_support:float64,item_support:float64,learning_rate:float64,primacy_scale:float64,primacy_decay:float64,stop_probability_scale:float64,stop_probability_growth:float64,choice_sensitivity:float64,context:array(float64, 1d, C),preretrieval_context:array(float64, 1d, C),recall:array(float64, 1d, C),retrieving:bool,recall_total:int32,primacy_weighting:array(float64, 1d, C),probabilities:array(float64, 1d, C),mfc:array(float64, 2d, C),mcf:array(float64, 2d, C),encoding_index:int32,items:array(float64, 2d, C)>

File "<ipython-input-3-2c96165513ff>", line 42:
def cmr_likelihood(data_to_fit, item_counts, encoding_drift_rate, start_drift_rate, recall_drift_rate, shared_support,
    <source elided>
            
            model = true_model.copy()
            ^

During: typing of get attribute at <ipython-input-3-2c96165513ff> (42)

File "<ipython-input-3-2c96165513ff>", line 42:
def cmr_likelihood(data_to_fit, item_counts, encoding_drift_rate, start_drift_rate, recall_drift_rate, shared_support,
    <source elided>
            
            model = true_model.copy()
            ^

In [None]:
%%timeit
cmr_likelihood(List([murd_trials0[:80], murd_trials1[:80]]), **hand_fit_parameters)

Baseline is 4.26 ms.

What if I turn off nogil? It's about the same.

What if I turn off fastmath? It's slightly but not meaningfully slower.

And now parallel. It's slower: 6.02 ms.

Let's add some pranges. I'd have to put model initialization within the trial loop for that to make sense. Without prange, that slows down the likelihood function by a factor of 10 (to 41.2 ms). Enabling parallelization gets it back down to 11.9 ms.

If I can find some way to simulate by trial in parallel without initializing many times, that might be great.

Otherwise, I can still wield parallelization to speed up code that _must_ re-initialize the model over and over again.
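
The per-trial parallelization considered above can be sketched with a stand-in model. Everything in this block is an assumption made for illustration: `toy_negative_log_likelihood` is a hypothetical function, and the "model" simply assigns equal probability to each not-yet-recalled item, with no stop probability. What it does mirror is the structure of `cmr_likelihood`: per-trial state is set up inside a `prange` loop, each trial writes only to its own row of a likelihood array initialized to ones (so unused entries contribute `log(1) = 0`), and the `break` sits in the inner sequential loop rather than in the `prange` loop.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Assumed fallback: run as plain Python when Numba is unavailable.
    prange = range

    def njit(*args, **kwargs):
        return lambda func: func


@njit(parallel=True)
def toy_negative_log_likelihood(trials, item_count):
    # trials: int array; rows are trials, entries are 1-indexed recalled
    # items, and 0 marks the end of recall (padding).
    likelihood = np.ones((trials.shape[0], trials.shape[1]))

    for trial_index in prange(trials.shape[0]):
        # Per-trial state, standing in for re-initializing a model.
        remaining = item_count

        for recall_index in range(trials.shape[1]):
            recall = trials[trial_index, recall_index]
            if recall == 0:
                break
            # Stand-in probability: uniform over items not yet recalled.
            likelihood[trial_index, recall_index] = 1.0 / remaining
            remaining -= 1

    return -np.sum(np.log(likelihood))


trials = np.array([[1, 2, 0],
                   [3, 0, 0]])
result = toy_negative_log_likelihood(trials, 3)
print(result)
```

Because each iteration of the outer loop touches a separate row, the trials can run in any order without interfering with one another, which is the property a parallel `cmr_likelihood` would need.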