# N Back Task

Theoretical description of the modelling framework: N-Back task.

------------



## Define Statespace

Define all possible variables in the model, formalising the task as a probabilistic mathematical construct.

----------

## Action Space: Output Space

The action space $\mathcal{A}$ is the set of possible actions an agent can take. Each action in the *N-Back* task can be encoded in binary, $a \in \mathcal{A} = \{0, 1\}$:

$$
\begin{equation}
    a := 
    \begin{cases}
      1, & \text{if the agent signals an n-back match} \\
      0, & \text{otherwise}
    \end{cases}
  \end{equation}
$$

----------

## State Space: Input Space

The state space $\mathcal{S}$ represents the encoding of the environment. It should be rich enough to capture the complexity of the problem, but not so rich that a functional approximation can no longer be learned from the available data.

In our task, there are $15$ unique letters, encoded:


$$\{A, B, C, D, ..., O\}  \; \rightarrow \; \{1,2,3,...,15\}$$
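The letter encoding above can be sketched as a simple lookup. This is a minimal illustration; the names `LETTER_TO_INT` and `encode_sequence` are hypothetical and not taken from the project code.

```python
import string

# The 15 task letters A..O, mapped to the integers 1..15.
LETTERS = string.ascii_uppercase[:15]
LETTER_TO_INT = {letter: code for code, letter in enumerate(LETTERS, start=1)}


def encode_sequence(letters):
    """Encode an iterable of letters as their integer codes."""
    return [LETTER_TO_INT[letter] for letter in letters]


encoded = encode_sequence("ABO")  # first, second and last letter of the set
```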


#### Covariates

With this encoding, there are $5$ covariates:

$$
\begin{eqnarray} 
\phi_0 &:=& \text{current card} \\
\phi_1 &:=& \text{1 card back} \\
\phi_2 &:=& \text{2 cards back} \\
\phi_3 &:=& \text{3 cards back} \\
\tau &:=& \text{number of occurrences of the current card} 
\end{eqnarray}
$$

where

$$\phi_j \in \{1, \ldots, 15\}, \quad j = 0, \ldots, 3$$
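As a rough sketch, these covariates could be read off an integer-encoded card sequence as follows. The function name `full_covariates` is hypothetical, and it assumes that $\tau$ counts occurrences of the current card up to and including the current trial, which the definition above does not state explicitly.

```python
def full_covariates(cards, t):
    """Return (phi_0, phi_1, phi_2, phi_3, tau) for trial index t.

    cards : list of integer-encoded cards (values 1..15)
    t     : zero-based trial index; needs three preceding cards
    """
    if t < 3:
        raise ValueError("need at least three preceding cards")
    phi_0 = cards[t]                              # current card
    phi_1, phi_2, phi_3 = cards[t - 1], cards[t - 2], cards[t - 3]
    # Assumption: occurrences are counted from the start of the
    # sequence up to and including the current trial.
    tau = cards[: t + 1].count(phi_0)
    return phi_0, phi_1, phi_2, phi_3, tau


cards = [3, 7, 3, 1, 3]
row = full_covariates(cards, 4)
```

For the last trial of the toy sequence, the current card is 3, the three preceding cards are 1, 3 and 7, and the card 3 has appeared three times.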


----------


## Reduced Space

The full encoding may introduce more complexity than is desirable while adding little explanatory benefit. Instead we encode the trailing $3$ cards in binary, where:

$$
\begin{equation}
    \phi_j := 
    \begin{cases}
      1, & \text{if the card } j \text{ steps back equals the current card} \\
      0, & \text{otherwise}
    \end{cases}
  \end{equation}
$$



for $j \in \{1,2,3\}$. Note that it is also no longer necessary to keep track of the current card's value. By using this reduced encoding we are assuming that individuals would perform similarly if the same experimental instance were presented with different cards, which is a plausible assumption.
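A minimal sketch of the reduced encoding, assuming an integer-encoded card sequence (the function name `reduced_encoding` is hypothetical). The two toy sequences use different cards but share the same match pattern, so they receive the same encoding.

```python
def reduced_encoding(cards, t):
    """Binary indicators [phi_1, phi_2, phi_3] for trial index t.

    phi_j is 1 when the card j steps back equals the current card.
    """
    if t < 3:
        raise ValueError("need at least three preceding cards")
    current = cards[t]
    return [int(cards[t - j] == current) for j in (1, 2, 3)]


# Same structure, different cards: the current card matches 2 cards back.
pattern_a = reduced_encoding([3, 7, 3, 1, 3], 4)
pattern_b = reduced_encoding([9, 2, 9, 5, 9], 4)
```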


----------


## Final State Space

The reduced encoding may lack some necessary complexity. Suppose:


> The previous cards do not match the current card, but do match each other.

This could easily confuse the candidate; however, under the reduced-form representation there is no way for the model to distinguish this from the case where all cards are different. It is still not necessary to capture the individual letter identities, so to add this capability we simply need $4$ encoding possibilities:

$$
\begin{equation}
    \phi_j := 
    \begin{cases}
      \mathcal{a}, & \text{if card } j \text{ matches the current card} \\
      \mathcal{b}, & \text{if card } j \text{ is the first unique (non-matching) card} \\
      \mathcal{c}, & \text{if card } j \text{ is the second unique (non-matching) card} \\
      \mathcal{d}, & \text{if card } j \text{ is the third unique (non-matching) card} 
    \end{cases}
  \end{equation}
$$
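One possible reading of this four-level encoding is sketched below. It assumes that "first", "second" and "third" unique card refer to the order in which distinct non-matching cards are met when stepping back from the current card; that ordering, and the name `final_encoding`, are assumptions rather than part of the definition above.

```python
def final_encoding(cards, t):
    """Encode the three trailing cards at trial index t as 'a', 'b', 'c' or 'd'.

    'a'           : the card matches the current card
    'b', 'c', 'd' : first, second, third distinct non-matching card,
                    in the order met when stepping back (assumption)
    """
    if t < 3:
        raise ValueError("need at least three preceding cards")
    current = cards[t]
    labels = iter("bcd")
    seen = {}            # non-matching card value -> assigned label
    encoded = []
    for j in (1, 2, 3):
        card = cards[t - j]
        if card == current:
            encoded.append("a")
        else:
            if card not in seen:
                seen[card] = next(labels)
            encoded.append(seen[card])
    return encoded


# Previous cards match each other but not the current card.
repeated = final_encoding([5, 5, 5, 2], 3)
# All previous cards differ from each other and from the current card.
all_different = final_encoding([4, 5, 6, 2], 3)
```

Under the binary encoding both toy cases would collapse to the same all-zero pattern; here they are kept apart.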



----------

## Final Covariates 

Leveraging this reduced form, our final model has $4$ explanatory variables:

$$
\begin{eqnarray} 
\phi_1 &:=& \text{1 card back} \\
\phi_2 &:=& \text{2 cards back} \\
\phi_3 &:=& \text{3 cards back} \\
\tau &:=& \text{number of occurrences of the current card} 
\end{eqnarray}
$$


----------

## Choice Probability

These covariates need to capture the probability of taking an action (signalling an $n$-back match). The data are encoded in the design matrix $X$, with columns:


$$
\begin{eqnarray} 
\{x_1, x_2, x_3\} &  \rightarrow & \{\mathcal{a,b,c,d}\} \\
x_4     &:=& \text{number of occurrences of the current card} 
\end{eqnarray}
$$

#### Interaction Terms

One might expect that interactions between certain pairs of positions in the sequence confuse candidates. The pairwise products $x_1 x_2$, $x_1 x_3$ and $x_2 x_3$ are therefore added as interaction terms, with coefficients $\{\phi_4, \phi_5, \phi_6\}$.


#### Choice Probability

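Our choice is binary, so we denote the probability of signalling a match, $P[a=1]$, as $p \in [0,1]$. In the linear predictor $ZX$ below, $\phi_0$ is an intercept, while $\phi_1, \ldots, \phi_6$ and $\tau$ are the coefficients on the columns of $X$ and their interactions: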


$$
\begin{eqnarray} 
  ZX                &=& \phi_0 + \phi_1 x_1 + \phi_2 x_2 + \phi_3 x_3 + 
                        \phi_4 x_1 x_2 + \phi_5 x_1 x_3 + \phi_6 x_2 x_3 + \tau x_4 \\
  \log\frac{p}{1-p} &=& ZX \\
  \frac{p}{1-p}     &=& \exp\{ ZX \} \\
  p                 &=& \frac{\exp\{ ZX \}}{1 + \exp\{ ZX \}} \\
  p                 &=& \frac{1}{1 + \exp\{ -ZX \}} \\
  p                 &=& \sigma(ZX) 
\end{eqnarray}
$$

This represents the model for a single individual. More flexible function approximators may be tested, but are probably not necessary.
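The chain of equalities above can be checked numerically. The sketch below is only illustrative: it assumes that $x_1, x_2, x_3$ have already been given a numeric coding (here binary match indicators), the coefficient values are made up, and the names `linear_predictor` and `sigmoid` are hypothetical.

```python
import math


def linear_predictor(x, phi, tau):
    """ZX = phi_0 + phi_1 x_1 + phi_2 x_2 + phi_3 x_3
            + phi_4 x_1 x_2 + phi_5 x_1 x_3 + phi_6 x_2 x_3 + tau x_4."""
    x1, x2, x3, x4 = x
    return (phi[0] + phi[1] * x1 + phi[2] * x2 + phi[3] * x3
            + phi[4] * x1 * x2 + phi[5] * x1 * x3 + phi[6] * x2 * x3
            + tau * x4)


def sigmoid(z):
    """Logistic function: maps the log-odds z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients phi_0..phi_6 and tau (not fitted values).
phi = [-1.0, 0.5, 2.0, 0.3, 0.1, 0.0, -0.2]
tau = 0.05

# Assumed numeric coding: only the card 2 back matches; current card seen 3 times.
x = (0, 1, 0, 3)

z = linear_predictor(x, phi, tau)
p = sigmoid(z)

# The two closed forms for p in the derivation agree, and the logit recovers ZX.
p_odds_form = math.exp(z) / (1.0 + math.exp(z))
log_odds = math.log(p / (1.0 - p))
```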

**_Interaction terms should be added to the $ZX$ formulation._**


---------- 

## Individual Models: Bayesian Model

Each participant should have an individual model (unique parameters) that will be regularised in a Bayesian fashion. Thus each parameter needs to be indexed by:

$$
\begin{eqnarray} 
i &:=& \text{participant i} \\
t &:=& \text{time t} 
\end{eqnarray}
$$
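One way to read this indexing is sketched below. It is an assumption rather than a specification from this document: the coefficients vary by participant $i$, the covariates and the match probability vary by trial $t$, and a shared Gaussian prior with hypothetical hyperparameters $\mu_k$ and $s_k$ supplies the regularisation. Letting the coefficients also drift over $t$ would add a further index.

$$
\begin{eqnarray} 
ZX_{i,t}   &=& \phi_{0,i} + \phi_{1,i} x_{1,i,t} + \phi_{2,i} x_{2,i,t} + \phi_{3,i} x_{3,i,t} + 
               \phi_{4,i} x_{1,i,t} x_{2,i,t} + \phi_{5,i} x_{1,i,t} x_{3,i,t} + \phi_{6,i} x_{2,i,t} x_{3,i,t} + \tau_i x_{4,i,t} \\
p_{i,t}    &=& \sigma(ZX_{i,t}) \\
\phi_{k,i} &\sim& \mathcal{N}(\mu_k, s_k^2), \quad k = 0, \ldots, 6
\end{eqnarray}
$$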

----------


## Additions

A number of additions are planned:
- Bayesian hierarchical framework to regulate variation across individuals
- Fitts' law parameterisation
- Corsi parameterisation
- Navon parameterisation
- WCST parameterisation (possibly)


```
author: Zach Wolpe
email:  zachcolinwolpe@gmail.com
date:   22 June 2021
```

# Final Draft: NBack

Modelled as a Q-learning instance:

$$actions: a:=\{\}$$


In [1]:
# !conda activate dynocog
# !conda init
# !conda install pandas -y
# !conda install -c pytorch pytorch -y
# !conda install -c conda-forge numpyro
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import re
import torch
import sys
from tqdm import tqdm
import pickle
import plotly.express as px
import plotly.graph_objects as go

# ---- load data module ----x
# `sys` is already imported above; add the parent directory so `process_data` resolves.
sys.path.append('../')
import process_data.process_raw_data as prd
from process_data.process_raw_data import batch_processing

In [6]:
# ---- reprocess raw data ----x
path  = '../data/data_sample/'
path2 = '../data/data_samples_pandas/'
bp    = prd.batch_processing(path)

bp.create_wcst_data()
bp.create_navon_data()
bp.create_nback_data()
bp.create_corsi_data()
bp.create_fitts_data()
bp.convert_data_to_int()
bp.write_to_pickle(path2)
bp.read_from_pickle(path2)
bp.write_class_to_pickle(path2)



        WCST data created
        Navon data created
        N back data created
        Corsi data created
        Fitts data created

In [7]:
# ---- fetch data object ----x
with open('../data/data_samples_pandas/batch_processing_object.pkl', 'rb') as file2:
    bp = pickle.load(file2)
bp.__dict__.keys()

dict_keys(['path', 'mapping', 'data_times', 'participants', 'parti_code', 'n', 'wcst_paths', 'nback_paths', 'corsi_paths', 'fitts_paths', 'navon_paths', 'wcst_data', 'nback_data', 'corsi_data', 'fitts_data', 'navon_data'])

In [15]:
sub = bp.nback_data.loc[bp.nback_data.participant == 851366.0]

In [17]:
sub.head()

Unnamed: 0,participant,participant_code,block_number,score,status,miss,false_alarm,reaction_time_ms,match,stimuli,stimuli_n_1,stimuli_n_2
0,851366.0,s.32ff642a-efe0-436f-8075-fa703d677fed.txt,1,1,0,1,0,0,0,3000,1,14
1,851366.0,s.32ff642a-efe0-436f-8075-fa703d677fed.txt,1,2,0,1,0,0,0,3000,2,13
2,851366.0,s.32ff642a-efe0-436f-8075-fa703d677fed.txt,1,3,0,1,0,0,0,3000,3,10
3,851366.0,s.32ff642a-efe0-436f-8075-fa703d677fed.txt,1,4,1,0,0,1,0,3000,1,13
4,851366.0,s.32ff642a-efe0-436f-8075-fa703d677fed.txt,1,5,0,0,0,0,1,867,2,12
