# How to fit a model

To fit a model, you need to specify three main things:

**1. Data:**
The `data` argument, which should be a pandas DataFrame.

Different model classes might require different columns in the data. Check the [API Reference](https://rlssm.readthedocs.io/en/latest/models/model_classes.html) of each model class (or run `model.fit?`) to see which data columns are required.

**2. The priors (optional):**
You can either use the default priors (which you can see after initializing the model) or change the mean or SD of the prior or hyper-prior distributions. Either way, the priors in use are always printed when the model starts fitting.

**3. Sampling parameters:**
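The sampling parameters **(number of chains, sampling iterations, warmup iterations, thinning, etc.)** are passed on to the Stan sampler. The fits in this notebook run through CmdStanPy (see the log output of the fit calls), so argument names such as `iter_warmup` and `iter_sampling` follow `cmdstanpy.CmdStanModel.sample()`; see the [CmdStanPy documentation](https://mc-stan.org/cmdstanpy/) for a full overview.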

**Additional learning parameters:**
Sequential sampling models **without a learning component** (DDM and race models) only require a `data` argument. Models with a learning component (RL models, RLDDMs, and RL+race models) also require two more arguments: `K`, the total number of different options in a learning block (note that this can differ from the number of options presented in each trial), and `initial_value_learning`, the initial Q value (before learning).
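
The role of `K` and `initial_value_learning` can be illustrated with a standard delta-rule (Q-learning) update. This is a minimal sketch under stated assumptions, not the Stan code that rlssm actually runs: the helper names `init_q_values` and `update_q` and the learning rate `alpha=0.1` are hypothetical and chosen only for illustration.

```python
def init_q_values(K, initial_value_learning):
    """One Q value per option in the learning block, all starting at the same value."""
    return [float(initial_value_learning)] * K


def update_q(q_values, option, feedback, alpha):
    """Delta-rule update for one option (1-based, like cor_option/inc_option)."""
    updated = list(q_values)
    prediction_error = feedback - updated[option - 1]
    updated[option - 1] += alpha * prediction_error
    return updated


# Four options in the block, even if only two are shown on each trial.
q = init_q_values(K=4, initial_value_learning=27.5)

# Hypothetical trial: option 2 is observed to pay out 43 points.
q = update_q(q, option=2, feedback=43, alpha=0.1)
print(q)
```

Only the updated option moves away from the initial value; options that have not been observed yet keep `initial_value_learning`, which is why that value has to be supplied before fitting.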

In [1]:
from rlssm.model.models_DDM import DDModel
from rlssm.model.models_RL import RLModel_2A
from rlssm.utility.load_data import load_example_dataset
from rlssm.utility.utils import load_model_results

## Non-learning example (non-hierarchical, simulated data)

In [2]:
model_ddm = DDModel(hierarchical_levels=1)

Using cached StanModel


In [3]:
# simulate some DDM data:
from rlssm.random.random_DDM import simulate_ddm
data_ddm = simulate_ddm(
    n_trials=400, 
    gen_drift=.8, 
    gen_threshold=1.3, 
    gen_ndt=.23)

For the simple, non-hierarchical DDM, it is only necessary to have `rt` and `accuracy` columns:

- *rt*, response times in seconds.

- *accuracy*, 0 if the incorrect option was chosen, 1 if the correct option was chosen.
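
To make the meaning of these two columns concrete, the sketch below simulates single DDM trials with a simple Euler discretization. It is an illustration only and not the implementation behind `simulate_ddm`: the function name `simulate_ddm_trial`, the step size `dt`, the unit noise scale, and the convention that the upper boundary is the correct response are all assumptions.

```python
import random


def simulate_ddm_trial(drift, threshold, ndt, rel_sp=0.5, dt=0.001, rng=random):
    """Simulate one trial; returns (rt in seconds, accuracy as 0 or 1)."""
    evidence = rel_sp * threshold      # starting point between the two boundaries
    decision_time = 0.0
    noise_sd = dt ** 0.5               # unit diffusion noise (assumption)
    while 0.0 < evidence < threshold:
        evidence += drift * dt + noise_sd * rng.gauss(0.0, 1.0)
        decision_time += dt
    accuracy = 1 if evidence >= threshold else 0  # upper boundary = correct (assumption)
    return ndt + decision_time, accuracy


rng = random.Random(1)  # fixed seed so the sketch is reproducible
trials = [simulate_ddm_trial(0.8, 1.3, 0.23, rng=rng) for _ in range(200)]
rts = [rt for rt, _ in trials]
accuracies = [acc for _, acc in trials]

print("shortest rt exceeds ndt:", min(rts) > 0.23)
print("mean accuracy:", sum(accuracies) / len(accuracies))
```

Every simulated `rt` is the non-decision time plus a positive decision time, so response times are always larger than `ndt`.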

In [4]:
data_ddm.head()

| participant | trial | drift | rel_sp | threshold | ndt | rt | accuracy |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 0.8 | 0.5 | 1.3 | 0.23 | 0.319 | 1.0 |
| 1 | 2 | 0.8 | 0.5 | 1.3 | 0.23 | 0.858 | 0.0 |
| 1 | 3 | 0.8 | 0.5 | 1.3 | 0.23 | 0.401 | 1.0 |
| 1 | 4 | 0.8 | 0.5 | 1.3 | 0.23 | 0.416 | 1.0 |
| 1 | 5 | 0.8 | 0.5 | 1.3 | 0.23 | 0.522 | 1.0 |


In [5]:
# Run 2 chains with 2000 post-warmup samples each, after 1000 warmup iterations, with custom priors:
model_fit_ddm = model_ddm.fit(
    data_ddm,
    drift_priors={'mu':.5, 'sd':1},
    threshold_priors={'mu':0, 'sd':.5},
    ndt_priors={'mu':0, 'sd':.1},
    chains=2,
    iter_warmup=1000,
    iter_sampling=2000,
    thin=1)

10:22:50 - cmdstanpy - INFO - CmdStan start processing


Fitting the model using the priors:
drift_priors {'mu': 0.5, 'sd': 1}
threshold_priors {'mu': 0, 'sd': 0.5}
ndt_priors {'mu': 0, 'sd': 0.1}


10:23:02 - cmdstanpy - INFO - CmdStan done processing.
Exception: wiener_lpdf: Random variable  = 0.319, but must be greater than nondecision time = 0.578729 (in '/home/andrei/PycharmProjects/rlssm/rlssm/stan_models/DDM/DDM.stan', line 43, column 1 to column 59)
	Exception: wiener_lpdf: Random variable  = 0.319, but must be greater than nondecision time = 0.648211 (in '/home/andrei/PycharmProjects/rlssm/rlssm/stan_models/DDM/DDM.stan', line 43, column 1 to column 59)
	Exception: wiener_lpdf: Random variable  = 0.319, but must be greater than nondecision time = 1.17118 (in '/home/andrei/PycharmProjects/rlssm/rlssm/stan_models/DDM/DDM.stan', line 43, column 1 to column 59)
	Exception: wiener_lpdf: Random variable  = 0.319, but must be greater than nondecision time = 1.18819 (in '/home/andrei/PycharmProjects/rlssm/rlssm/stan_models/DDM/DDM.stan', line 43, column 1 to column 59)
	Exception: wiener_lpdf: Random variable  = 0.28, but must be greater than nondecision time = 0.280395 (in '/hom


Checks MCMC diagnostics:
n_eff / iter looks reasonable for all parameters
0 of 4000 iterations saturated the maximum tree depth of 10 (0.0%)
E-BFMI indicated no pathological behavior
0.0 of 4000 iterations ended with a divergence (0.0%)
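
The 4000 iterations reported by the diagnostics follow from the sampling arguments. The sketch below shows the arithmetic, assuming that `iter_sampling` counts post-warmup iterations per chain, that warmup draws are discarded, and that thinning keeps every `thin`-th draw; `retained_draws` is a hypothetical helper, not part of rlssm.

```python
import math


def retained_draws(chains, iter_sampling, thin=1):
    """Total number of post-warmup draws kept across all chains."""
    return chains * math.ceil(iter_sampling / thin)


# Settings of the DDM fit above: 2 chains, 2000 sampling iterations, no thinning.
print(retained_draws(chains=2, iter_sampling=2000, thin=1))  # 4000

# With thinning, fewer draws are kept: 2 chains, 3000 iterations, every 2nd draw.
print(retained_draws(chains=2, iter_sampling=3000, thin=2))  # 3000
```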


## Learning example (hierarchical, real data)

In [6]:
model_rl = RLModel_2A(hierarchical_levels = 2)

10:23:33 - cmdstanpy - INFO - compiling stan file /home/andrei/PycharmProjects/rlssm/rlssm/stan_models/RL_2A/hierRL_2A.stan to exe file /home/andrei/PycharmProjects/rlssm/rlssm/stan_models/RL_2A/hierRL_2A
10:23:46 - cmdstanpy - INFO - compiled model executable: /home/andrei/PycharmProjects/rlssm/rlssm/stan_models/RL_2A/hierRL_2A
--- Translating Stan model to C++ code ---
bin/stanc  --o=/home/andrei/PycharmProjects/rlssm/rlssm/stan_models/RL_2A/hierRL_2A.hpp /home/andrei/PycharmProjects/rlssm/rlssm/stan_models/RL_2A/hierRL_2A.stan
    of arrays by placing brackets after a variable name is deprecated and
    will be removed in Stan 2.32.0. Instead use the array keyword before the
    type. This can be changed automatically using the auto-format flag to
    stanc

In [7]:
# import some example data:
data_rl = load_example_dataset(hierarchical_levels = 2)

data_rl.head()

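| | participant | block_label | trial_block | f_cor | f_inc | cor_option | inc_option | times_seen | rt | accuracy | feedback_type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 1 | 43 | 39 | 2 | 1 | 1 | 1.244082 | 0 | 0 |
| 1 | 1 | 1 | 2 | 60 | 50 | 4 | 3 | 1 | 1.101821 | 1 | 0 |
| 2 | 1 | 1 | 3 | 44 | 36 | 4 | 2 | 2 | 1.029923 | 0 | 0 |
| 3 | 1 | 1 | 4 | 55 | 55 | 4 | 3 | 2 | 1.368007 | 0 | 0 |
| 4 | 1 | 1 | 5 | 52 | 49 | 4 | 3 | 3 | 1.039329 | 1 | 0 |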


Since this learning model is fit on choices only, the `rt` column is not required.

Other columns/indexes that should be included are:

- *accuracy*, 0 if the incorrect option was chosen, 1 if the correct option was chosen.

- *trial_block*, the trial number within a learning session. Should be integers starting from 1.

- *f_cor*, the outcome of the correct option in the presented pair (the option with higher outcome on average).

- *f_inc*, the outcome of the incorrect option in the presented pair (the option with lower outcome on average).

- *cor_option*, the number identifying the correct option in the presented pair (the option with higher outcome on average).

- *inc_option*, the number identifying the incorrect option in the presented pair (the option with lower outcome on average).

- *block_label*, the number identifying the learning session. Should be integers starting from 1. Set to 1 in case there is only one learning session.

If the model is hierarchical, also include:

- *participant*, the participant number. Should be integers starting from 1.

If `increasing_sensitivity` is `True`, also include:

- *times_seen*, average number of times the presented options have been seen in a learning session.
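
The column requirements listed above can be checked before fitting. The sketch below is a hypothetical helper (the names `BASE_COLUMNS`, `required_rl_columns`, and `missing_columns` are not part of rlssm); it only encodes the list from this section and works on plain column names.

```python
# Columns required for every RL model fit, as listed in this section.
BASE_COLUMNS = [
    "accuracy", "trial_block", "f_cor", "f_inc",
    "cor_option", "inc_option", "block_label",
]


def required_rl_columns(hierarchical_levels=1, increasing_sensitivity=False):
    """Return the column names needed for the given model settings."""
    columns = list(BASE_COLUMNS)
    if hierarchical_levels == 2:
        columns.append("participant")
    if increasing_sensitivity:
        columns.append("times_seen")
    return columns


def missing_columns(available, required):
    """Return the required names that are absent from `available`."""
    available = set(available)
    return [name for name in required if name not in available]


# Column names of the example dataset shown above.
example_columns = [
    "participant", "block_label", "trial_block", "f_cor", "f_inc",
    "cor_option", "inc_option", "times_seen", "rt", "accuracy", "feedback_type",
]
needed = required_rl_columns(hierarchical_levels=2)
print(missing_columns(example_columns, needed))
```

With a pandas DataFrame, the available names could be collected as `list(data.columns) + list(data.index.names)`, since required fields may also live in the index.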

In [8]:
data_rl.head()

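| | participant | block_label | trial_block | f_cor | f_inc | cor_option | inc_option | times_seen | rt | accuracy | feedback_type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 1 | 43 | 39 | 2 | 1 | 1 | 1.244082 | 0 | 0 |
| 1 | 1 | 1 | 2 | 60 | 50 | 4 | 3 | 1 | 1.101821 | 1 | 0 |
| 2 | 1 | 1 | 3 | 44 | 36 | 4 | 2 | 2 | 1.029923 | 0 | 0 |
| 3 | 1 | 1 | 4 | 55 | 55 | 4 | 3 | 2 | 1.368007 | 0 | 0 |
| 4 | 1 | 1 | 5 | 52 | 49 | 4 | 3 | 3 | 1.039329 | 1 | 0 |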


In [9]:
# Run 2 chains with 3000 post-warmup samples each (thinned by 2), after 1000 warmup iterations, with custom priors:
model_fit_rl = model_rl.fit(
    data_rl,
    K=4,
    initial_value_learning=27.5,
    alpha_priors={'mu_mu':-.3, 'sd_mu':.1, 'mu_sd':0, 'sd_sd':.1},
    sensitivity_priors={'mu_mu':-.1, 'sd_mu':.1, 'mu_sd':0, 'sd_sd':.1},
    chains=2,
    iter_sampling=3000,
    iter_warmup=1000,
    print_diagnostics=False, # (not suggested, see below)
    thin=2)

10:23:46 - cmdstanpy - INFO - CmdStan start processing


Fitting the model using the priors:
alpha_priors {'mu_mu': -0.3, 'sd_mu': 0.1, 'mu_sd': 0, 'sd_sd': 0.1}
sensitivity_priors {'mu_mu': -0.1, 'sd_mu': 0.1, 'mu_sd': 0, 'sd_sd': 0.1}


10:33:50 - cmdstanpy - INFO - CmdStan done processing.
Exception: bernoulli_lpmf: Probability parameter[1] is -nan, but must be in the interval [0, 1] (in '/home/andrei/PycharmProjects/rlssm/rlssm/stan_models/RL_2A/hierRL_2A.stan', line 93, column 1 to column 36)
Exception: bernoulli_lpmf: Probability parameter[1] is -nan, but must be in the interval [0, 1] (in '/home/andrei/PycharmProjects/rlssm/rlssm/stan_models/RL_2A/hierRL_2A.stan', line 93, column 1 to column 36)
Consider re-running with show_console=True if the above output is unclear!





## Diagnostics

As shown in the DDM example above, the MCMC diagnostics are printed by default (you can suppress them by setting `print_diagnostics` to `False`, although this is not recommended). For an excellent explanation of what these diagnostics mean and how to assess them, see [Diagnosing Biased Inference with Divergences](https://mc-stan.org/users/documentation/case-studies/divergences_and_bias.html).

On top of these, you can also check the convergence of the chains (R-hat) and the WAIC:

In [10]:
model_fit_ddm.rhat

| name | rhat | variable |
|---|---|---|
| lp__ | 1.000950 | lp__ |
| drift | 0.999803 | drift |
| threshold | 1.000970 | threshold |
| ndt | 1.001540 | ndt |
| drift_ll[1] | 0.999803 | drift_ll[1] |
| ... | ... | ... |
| log_lik[396] | 0.999742 | log_lik[396] |
| log_lik[397] | 0.999549 | log_lik[397] |
| log_lik[398] | 1.000550 | log_lik[398] |
| log_lik[399] | 1.000480 | log_lik[399] |


In [11]:
model_fit_rl.rhat.describe()

| | rhat |
|---|---|
| count | 12814.0 |
| mean | 1.000063 |
| std | 0.000642 |
| min | 0.999334 |
| 25% | 0.999588 |
| 50% | 0.999881 |
| 75% | 1.00033 |
| max | 1.00398 |
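
To give some intuition for what the `rhat` values measure, the sketch below computes the classic (non-split) Gelman-Rubin statistic for one parameter from several chains. It is a simplified illustration: Stan-based tools typically report refined variants such as split R-hat, so this function (`gelman_rubin_rhat`, a hypothetical name) is not expected to reproduce the numbers above exactly.

```python
from statistics import mean, variance


def gelman_rubin_rhat(chains):
    """Classic R-hat for one parameter; `chains` is a list of equal-length draw lists."""
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(chain) != n for chain in chains):
        raise ValueError("need at least 2 chains of equal length (>= 2 draws each)")
    within = mean([variance(chain) for chain in chains])       # W: mean within-chain variance
    between = n * variance([mean(chain) for chain in chains])  # B: between-chain variance
    pooled = (n - 1) / n * within + between / n                # pooled variance estimate
    return (pooled / within) ** 0.5


# Chains that explore the same region give values close to (or slightly below) 1.
agreeing = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
print(gelman_rubin_rhat(agreeing))

# Chains stuck in different regions give values far above 1.
disagreeing = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]
print(gelman_rubin_rhat(disagreeing))
```

Values close to 1, as in the tables above, indicate that the chains agree with each other.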


In [12]:
model_fit_ddm.waic

{'lppd': -227.17666381208397,
 'p_waic': 3.154874729405787,
 'waic': 460.6630770829795,
 'waic_se': 47.04475174479631}

In [13]:
model_fit_rl.waic

{'lppd': -2632.7741983632436,
 'p_waic': 53.07031584968287,
 'waic': 5371.689028425853,
 'waic_se': 94.01724051075693}

If you want to also see the point-wise WAIC, you can set `pointwise_waic` to `True`.
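
The entries of the `waic` dictionaries are related by `waic = -2 * (lppd - p_waic)`, which reproduces the values shown above. The sketch below computes these quantities from a matrix of point-wise log-likelihood draws using the standard WAIC formulas. It is a minimal illustration under stated assumptions, not rlssm's own code: the function names are hypothetical, and the use of the sample variance (n - 1 denominator) is an assumption.

```python
import math
from statistics import variance


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    largest = max(values)
    return largest + math.log(sum(math.exp(v - largest) for v in values) / len(values))


def compute_waic(log_lik):
    """log_lik[s][i] is the log-likelihood of observation i under posterior draw s."""
    n_obs = len(log_lik[0])
    per_obs = [[draw[i] for draw in log_lik] for i in range(n_obs)]
    lppd_i = [log_mean_exp(column) for column in per_obs]  # point-wise predictive density
    p_waic_i = [variance(column) for column in per_obs]    # point-wise effective parameters
    waic_i = [-2 * (lppd - p) for lppd, p in zip(lppd_i, p_waic_i)]
    return {
        "lppd": sum(lppd_i),
        "p_waic": sum(p_waic_i),
        "waic": sum(waic_i),
        "waic_se": math.sqrt(n_obs * variance(waic_i)),
        "pointwise": waic_i,
    }


# Toy input: 2 posterior draws, 2 observations, no posterior uncertainty.
toy = compute_waic([[-1.0, -2.0], [-1.0, -2.0]])
print(toy)

# Consistency check against the DDM values printed above.
print(-2 * (-227.17666381208397 - 3.154874729405787))
```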

## Save the results

By default, the model fit results are saved in the current folder, using the `model_label` as the filename. You can specify a different location using the `filename` argument.

In [14]:
model_fit_ddm.to_pickle()

Saving file as: /home/andrei/PycharmProjects/rlssm/docs/notebooks/DDM.pkl


In [15]:
model_fit_rl.to_pickle()

Saving file as: /home/andrei/PycharmProjects/rlssm/docs/notebooks/hierRL_2A.pkl


## Re-load previously saved results

In [16]:
model_fit_ddm = load_model_results('DDM.pkl')

In [17]:
model_fit_rl = load_model_results('hierRL_2A.pkl')
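
The save and re-load steps amount to a round trip through a file on disk. The sketch below shows such a round trip with Python's standard `pickle` module. It is an assumption-laden illustration: the `.pkl` extension suggests pickle is used, but `save_results` and `load_results` are hypothetical helpers and do not show how `to_pickle` or `load_model_results` are implemented.

```python
import os
import pickle
import tempfile


def save_results(results, filename):
    """Serialize `results` to `filename` and return the absolute path."""
    with open(filename, "wb") as handle:
        pickle.dump(results, handle)
    return os.path.abspath(filename)


def load_results(filename):
    """Read back an object written by `save_results`."""
    with open(filename, "rb") as handle:
        return pickle.load(handle)


# Stand-in for a fitted model: any picklable Python object works.
results = {"model_label": "DDM", "priors": {"drift_priors": {"mu": 0.5, "sd": 1}}}

with tempfile.TemporaryDirectory() as folder:
    path = save_results(results, os.path.join(folder, results["model_label"] + ".pkl"))
    print("Saving file as:", path)
    restored = load_results(path)

print(restored == results)
```

Pickle files can execute arbitrary code when loaded, so only re-load results files that you created yourself or that come from a trusted source.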

The data the model was fit on are stored in `data_info`:

In [18]:
model_fit_rl.data_info['data']

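| | index | participant | block_label | trial_block | f_cor | f_inc | cor_option | inc_option | times_seen | rt | accuracy | feedback_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 1 | 43 | 39 | 2 | 1 | 1 | 1.244082 | 0 | 0 |
| 1 | 1 | 1 | 1 | 2 | 60 | 50 | 4 | 3 | 1 | 1.101821 | 1 | 0 |
| 2 | 2 | 1 | 1 | 3 | 44 | 36 | 4 | 2 | 2 | 1.029923 | 0 | 0 |
| 3 | 3 | 1 | 1 | 4 | 55 | 55 | 4 | 3 | 2 | 1.368007 | 0 | 0 |
| 4 | 4 | 1 | 1 | 5 | 52 | 49 | 4 | 3 | 3 | 1.039329 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 6459 | 6459 | 27 | 3 | 76 | 37 | 36 | 2 | 1 | 39 | 1.875327 | 1 | 0 |
| 6460 | 6460 | 27 | 3 | 77 | 58 | 41 | 4 | 2 | 39 | 1.696957 | 1 | 0 |
| 6461 | 6461 | 27 | 3 | 78 | 64 | 49 | 4 | 3 | 38 | 2.059956 | 1 | 0 |
| 6462 | 6462 | 27 | 3 | 79 | 44 | 37 | 3 | 1 | 39 | 1.623731 | 1 | 0 |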


The priors are stored in `priors`:

In [19]:
model_fit_ddm.priors

{'drift_priors': {'mu': 0.5, 'sd': 1},
 'threshold_priors': {'mu': 0, 'sd': 0.5},
 'ndt_priors': {'mu': 0, 'sd': 0.1}}

In [20]:
model_fit_rl.priors

{'alpha_priors': {'mu_mu': -0.3, 'sd_mu': 0.1, 'mu_sd': 0, 'sd_sd': 0.1},
 'sensitivity_priors': {'mu_mu': -0.1, 'sd_mu': 0.1, 'mu_sd': 0, 'sd_sd': 0.1}}
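
The stored `priors` hold one dictionary per parameter, whether or not you overrode it. How custom priors might replace defaults can be sketched as a simple dictionary merge. This is an illustration only: `DEFAULT_PRIORS` and `resolve_priors` are hypothetical names, and the default values below are placeholders, not rlssm's actual default priors.

```python
# Hypothetical defaults: placeholder values, NOT rlssm's actual default priors.
DEFAULT_PRIORS = {
    "drift_priors": {"mu": 1.0, "sd": 5.0},
    "threshold_priors": {"mu": 0.0, "sd": 5.0},
    "ndt_priors": {"mu": 0.0, "sd": 1.0},
}


def resolve_priors(defaults, **custom):
    """Return a copy of `defaults` in which user-supplied priors replace the defaults."""
    unknown = set(custom) - set(defaults)
    if unknown:
        raise ValueError(f"Unknown prior(s): {sorted(unknown)}")
    resolved = {name: dict(values) for name, values in defaults.items()}
    for name, values in custom.items():
        if values is not None:  # None means "keep the default"
            resolved[name] = dict(values)
    return resolved


# Override only the drift prior; the other two keep their (placeholder) defaults.
priors = resolve_priors(DEFAULT_PRIORS, drift_priors={"mu": 0.5, "sd": 1})
for name, values in priors.items():
    print(name, values)
```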

Further information about the parameters is stored in `parameters_info`:

In [21]:
model_fit_ddm.parameters_info

{'hierarchical_levels': 1,
 'n_parameters_individual': 3,
 'n_parameters_trial': 0,
 'n_posterior_samples': 2000,
 'parameters_names': ['drift', 'threshold', 'ndt'],
 'parameters_names_transf': ['transf_drift', 'transf_threshold', 'transf_ndt'],
 'parameters_names_all': ['drift', 'threshold', 'ndt']}

In [22]:
model_fit_rl.waic

{'lppd': -2632.7741983632436,
 'p_waic': 53.07031584968287,
 'waic': 5371.689028425853,
 'waic_se': 94.01724051075693}

And, of course, you can inspect the model's posteriors. See [How to inspect a model](https://rlssm.readthedocs.io/en/latest/notebooks/inspect_model.html) for more details.