# Tutorial on Parameter Defaults

As of version `0.9.8`, `HDDM` no longer requires you to fit the `v`, `a` and `t` parameters explicitly. You can now fix any of these parameters to a default value of your choice. This tutorial shows how to fit any subset of a model's parameters while supplying user-chosen default values for the remaining ones.

## Install (colab)

In [None]:
# package to help train networks
# !pip install git+https://github.com/AlexanderFengler/LANfactory

# package containing simulators for ssms
# !pip install git+https://github.com/AlexanderFengler/ssm_simulators

# packages related to hddm
# !pip install cython
# !pip install pymc==2.3.8
# !pip install git+https://github.com/hddm-devs/kabuki
# !pip install git+https://github.com/hddm-devs/hddm

## Load Modules

In [1]:
# MODULE IMPORTS ----

# warning settings
import warnings

warnings.simplefilter(action="ignore", category=FutureWarning)

# Data management
import pandas as pd
import numpy as np
import pickle

# Plotting
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns

# Stats functionality
from statsmodels.distributions.empirical_distribution import ECDF

# HDDM
import hddm
from hddm.simulators.hddm_dataset_generators import simulator_h_c

## Example Models

### `HDDM()`
#### Simulate Data

In [37]:
# from hddm.simulators.hddm_dataset_generators import simulator_h_c
from hddm.simulators.basic_simulator import simulator
from hddm.simulators.hddm_dataset_generators import hddm_preprocess

model = "ddm_hddm_base"

data = simulator(theta=[1.0, 1.0, 0.5, 0.5], model=model, n_samples=500)

data = hddm_preprocess(data)

#### Model and Sample

Let's first fit all parameters.

In [38]:
hddm_model = hddm.HDDM(
    data,
    include=["v", "a", "t", "z"],
    informative=False,
    is_group_model=False,
)

No model attribute --> setting up standard HDDM
Set model to ddm


In [39]:
hddm_model.sample(1000, burn=500)

 [-----------------100%-----------------] 1000 of 1000 complete in 8.5 sec

<pymc.MCMC.MCMC at 0x7ff3014fc210>

In [40]:
hddm_model.gen_stats()

| | mean | std | 2.5q | 25q | 50q | 75q | 97.5q | mc err |
|---|---|---|---|---|---|---|---|---|
| a | 0.996413 | 0.022212 | 0.949557 | 0.983287 | 0.995682 | 1.011092 | 1.040408 | 0.001349 |
| v | 1.150145 | 0.122256 | 0.925338 | 1.072607 | 1.14161 | 1.225006 | 1.435445 | 0.007297 |
| t | 0.501954 | 0.003282 | 0.495503 | 0.500019 | 0.502353 | 0.504147 | 0.507753 | 0.000225 |
| z | 0.488272 | 0.015967 | 0.45511 | 0.477788 | 0.489069 | 0.498717 | 0.519293 | 0.001011 |


Now we **fix `a` to its default** as per the `HDDM`-supplied `model_config` dictionary. As shown below, this sets `a = 2.0`, twice the ground-truth value of `1.0` used in the simulation. We expect that, with `a` fixed at too large a value, `v` will be overestimated to compensate (and the overall fit will be worse).

In [41]:
hddm.model_config.model_config["ddm_hddm_base"]

{'doc': 'Model used internally for simulation purposes. Do NOT use with the LAN extension.',
 'params': ['v', 'a', 'z', 't'],
 'params_trans': [0, 0, 1, 0],
 'params_std_upper': [1.5, 1.0, None, 1.0],
 'param_bounds': [[-5.0, 0.1, 0.05, 0], [5.0, 5.0, 0.95, 3.0]],
 'boundary': <function ssms.basic_simulators.boundary_functions.constant(t=0)>,
 'params_default': [0.0, 2.0, 0.5, 0],
 'hddm_include': ['v', 'a', 't', 'z'],
 'choices': [0, 1],
 'slice_widths': {'v': 1.5,
  'v_std': 1,
  'a': 1,
  'a_std': 1,
  'z': 0.1,
  'z_trans': 0.2,
  't': 0.01,
  't_std': 0.15}}
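
The lookup described above can be made explicit. The following is a minimal sketch that pairs the `params` and `params_default` lists from the printed config and reports which values a given `include` list leaves fixed. The helper `fixed_defaults` is hypothetical (it is not part of `HDDM`), and the dictionary is a hand-copied subset of the output above.

```python
# Subset of the `ddm_hddm_base` entry printed above (only the keys needed here).
model_config = {
    "params": ["v", "a", "z", "t"],
    "params_default": [0.0, 2.0, 0.5, 0],
}


def fixed_defaults(config, include):
    """Hypothetical helper: map every parameter NOT in `include` to its default."""
    defaults = dict(zip(config["params"], config["params_default"]))
    return {name: value for name, value in defaults.items() if name not in include}


# Leaving `a` out of the include list means it stays at its default.
fixed = fixed_defaults(model_config, include=["v", "t", "z"])
print(fixed)
```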

In [42]:
hddm_model_no_a = hddm.HDDM(
    data,
    include=["v", "t", "z"],
    informative=False,
    is_group_model=False,
)

No model attribute --> setting up standard HDDM
Set model to ddm


 Your include statement misses either the v, a or t parameters. 
Parameters not explicitly included will be set to the defaults, 
which you can find in the model_config dictionary!
  "Parameters not explicitly included will be set to the defaults, \n" + \


In [43]:
hddm_model_no_a.sample(1000, burn=500)

 [-----------------100%-----------------] 1000 of 1000 complete in 5.6 sec

<pymc.MCMC.MCMC at 0x7ff301546b50>

In [44]:
hddm_model_no_a.gen_stats()

| | mean | std | 2.5q | 25q | 50q | 75q | 97.5q | mc err |
|---|---|---|---|---|---|---|---|---|
| v | 2.077952 | 0.146506 | 1.806508 | 1.977741 | 2.075691 | 2.164686 | 2.385388 | 0.011489 |
| t | 0.335338 | 0.009595 | 0.315019 | 0.330216 | 0.335275 | 0.341575 | 0.353736 | 0.000553 |
| z | 0.36132 | 0.022235 | 0.319808 | 0.346506 | 0.360754 | 0.376082 | 0.406029 | 0.001815 |


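As predicted, `v` is now overestimated as well (about `2.08` against a ground truth of `1.0`), and `t` and `z` are pulled below their ground-truth value of `0.5`.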

Let's now try to set `a` to a default of our liking. We will set it to the ground-truth and again not include it in the parameters to estimate. To do so, we supply our own `model_config` to the `HDDM()` class.

In [47]:
from copy import deepcopy

# copy model_config dictionary so we can change it
my_model_config = deepcopy(hddm.model_config.model_config["ddm_hddm_base"])

# setting 'a' to 1.
my_model_config["params_default"][1] = 1.0

hddm_model_no_a_2 = hddm.HDDM(
    data,
    include=["v", "t", "z"],
    informative=False,
    is_group_model=False,
    model_config=my_model_config,
)

Custom model config supplied as: 

{'doc': 'Model used internally for simulation purposes. Do NOT use with the LAN extension.', 'params': ['v', 'a', 'z', 't'], 'params_trans': [0, 0, 1, 0], 'params_std_upper': [1.5, 1.0, None, 1.0], 'param_bounds': [[-5.0, 0.1, 0.05, 0], [5.0, 5.0, 0.95, 3.0]], 'boundary': <function constant at 0x7ff31c1fab90>, 'params_default': [0.0, 1.0, 0.5, 0], 'hddm_include': ['v', 'a', 't', 'z'], 'choices': [0, 1], 'slice_widths': {'v': 1.5, 'v_std': 1, 'a': 1, 'a_std': 1, 'z': 0.1, 'z_trans': 0.2, 't': 0.01, 't_std': 0.15}}
No model attribute --> setting up standard HDDM
Set model to ddm


 Your include statement misses either the v, a or t parameters. 
Parameters not explicitly included will be set to the defaults, 
which you can find in the model_config dictionary!
  "Parameters not explicitly included will be set to the defaults, \n" + \


In [48]:
hddm_model_no_a_2.sample(1000, burn=500)

 [-----------------100%-----------------] 1000 of 1000 complete in 5.2 sec

<pymc.MCMC.MCMC at 0x7ff30157b150>

In [49]:
hddm_model_no_a_2.gen_stats()

| | mean | std | 2.5q | 25q | 50q | 75q | 97.5q | mc err |
|---|---|---|---|---|---|---|---|---|
| v | 1.171748 | 0.118515 | 0.935492 | 1.094611 | 1.170236 | 1.251087 | 1.425975 | 0.006572 |
| t | 0.501641 | 0.002593 | 0.496291 | 0.499915 | 0.50178 | 0.503453 | 0.506346 | 0.000121 |
| z | 0.486188 | 0.016581 | 0.450828 | 0.476066 | 0.486156 | 0.497147 | 0.518931 | 0.000963 |


As we see, in this case `v` is estimated appropriately again.
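
The cell above changes the default by position (`[1]`), which relies on knowing the order of the `params` list. A slightly more defensive variant looks the position up by name. This is only a sketch: `with_default` is a hypothetical helper, not an `HDDM` function, and the dictionary is a hand-copied subset of the config printed earlier.

```python
from copy import deepcopy

# Subset of the `ddm_hddm_base` config shown earlier in this tutorial.
base_config = {
    "params": ["v", "a", "z", "t"],
    "params_default": [0.0, 2.0, 0.5, 0],
}


def with_default(config, name, value):
    """Hypothetical helper: return a copy of `config` with one default replaced."""
    new_config = deepcopy(config)  # leave the original dictionary untouched
    position = new_config["params"].index(name)  # ValueError for unknown names
    new_config["params_default"][position] = value
    return new_config


my_config = with_default(base_config, "a", 1.0)
print(my_config["params_default"])    # modified copy
print(base_config["params_default"])  # original defaults
```

The returned dictionary could then be passed via the `model_config` argument in the same way as `my_model_config` above.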

##### Let's compare DICs

In [50]:
print("Standard: ", hddm_model.dic)
print("No a with HDDM default: ", hddm_model_no_a.dic)
print("No a with a set to ground truth: ", hddm_model_no_a_2.dic)

Standard:  -7.05123814817064
No a with HDDM default:  562.273161208081
No a with a set to ground truth:  -9.028954442474097
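
Since lower DIC values indicate a better model, it can help to express each DIC relative to the best one. The sketch below does this with the three values printed above, copied by hand; the exact numbers will differ from run to run because sampling is stochastic.

```python
# DIC values copied from the output above.
dics = {
    "Standard": -7.05123814817064,
    "No a with HDDM default": 562.273161208081,
    "No a with a set to ground truth": -9.028954442474097,
}

# The model with the smallest DIC is the preferred one.
best = min(dics, key=dics.get)

# Difference of every model to the best model (0 for the best model itself).
deltas = {name: value - dics[best] for name, value in dics.items()}

for name, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{name:<35} delta DIC = {delta:8.2f}")
```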


### `HDDMnn()`

Let's repeat this with another model via the `HDDMnn()` class.
We will pick the `HDDM`-supplied `angle` model.

#### Simulate Data

In [52]:
model = "angle"
theta = [1.0, 1.5, 0.5, 0.5, 0.2]  # v, a, z, t, theta
data_angle = simulator(theta=theta, model=model, n_samples=500)
data_angle = hddm_preprocess(data_angle, keep_negative_responses=True)

#### Model and Sample

In [53]:
model_angle = hddm.HDDMnn(
    data_angle, model="angle", include=["v", "a", "t", "z", "theta"]
)

Using default priors: Uninformative
Supplied model_config specifies params_std_upper for  z as  None.
Changed to 10


In [54]:
model_angle.sample(1000, burn=500)

 [-----------------100%-----------------] 1000 of 1000 complete in 52.0 sec

<pymc.MCMC.MCMC at 0x7ff301575390>

In [56]:
model_angle.gen_stats()

| | mean | std | 2.5q | 25q | 50q | 75q | 97.5q | mc err |
|---|---|---|---|---|---|---|---|---|
| v | 1.020563 | 0.083431 | 0.852936 | 0.964464 | 1.020123 | 1.074769 | 1.178448 | 0.006605 |
| a | 1.584719 | 0.098083 | 1.419463 | 1.511524 | 1.576579 | 1.651892 | 1.787859 | 0.009077 |
| z | 0.526568 | 0.025441 | 0.475032 | 0.510146 | 0.527315 | 0.543314 | 0.577972 | 0.002245 |
| t | 0.494598 | 0.037365 | 0.420589 | 0.470531 | 0.495294 | 0.521575 | 0.561596 | 0.003361 |
| theta | 0.270865 | 0.050656 | 0.177934 | 0.236454 | 0.269616 | 0.304024 | 0.377811 | 0.004407 |


Again, we now leave out one parameter (this time `theta`). The `model_config` dictionary printed below (output truncated) holds the defaults for every model; for `theta` in the `angle` model, the default is `0`.

In [60]:
hddm.model_config.model_config

{'ddm_hddm_base': {'doc': 'Model used internally for simulation purposes. Do NOT use with the LAN extension.',
  'params': ['v', 'a', 'z', 't'],
  'params_trans': [0, 0, 1, 0],
  'params_std_upper': [1.5, 1.0, None, 1.0],
  'param_bounds': [[-5.0, 0.1, 0.05, 0], [5.0, 5.0, 0.95, 3.0]],
  'boundary': <function ssms.basic_simulators.boundary_functions.constant(t=0)>,
  'params_default': [0.0, 2.0, 0.5, 0],
  'hddm_include': ['v', 'a', 't', 'z'],
  'choices': [0, 1],
  'slice_widths': {'v': 1.5,
   'v_std': 1,
   'a': 1,
   'a_std': 1,
   'z': 0.1,
   'z_trans': 0.2,
   't': 0.01,
   't_std': 0.15}},
 'full_ddm_hddm_base': {'doc': 'Model used internally for simulation purposes. Do NOT use with the LAN extension.',
  'params': ['v', 'a', 'z', 't', 'sz', 'sv', 'st'],
  'params_trans': [0, 0, 1, 0, 0, 0, 0],
  'params_std_upper': [1.5, 1.0, None, 1.0, 0.1, 0.5, 0.1],
  'param_bounds': [[-5.0, 0.1, 0.3, 0.25, 0, 0, 0],
   [5.0, 5.0, 0.7, 2.25, 0.25, 4.0, 0.25]],
  'boundary': <function ssms.b

In [57]:
model_angle_no_theta = hddm.HDDMnn(
    data_angle, model="angle", include=["v", "a", "t", "z"]
)

Using default priors: Uninformative
Supplied model_config specifies params_std_upper for  z as  None.
Changed to 10


In [58]:
model_angle_no_theta.sample(1000, burn=500)

 [-----------------100%-----------------] 1000 of 1000 complete in 47.4 sec

<pymc.MCMC.MCMC at 0x7ff30160bed0>

In [62]:
model_angle_no_theta.gen_stats()

| | mean | std | 2.5q | 25q | 50q | 75q | 97.5q | mc err |
|---|---|---|---|---|---|---|---|---|
| v | 1.124278 | 0.09761 | 0.928012 | 1.052883 | 1.128692 | 1.191886 | 1.310813 | 0.008418 |
| a | 1.363875 | 0.053492 | 1.259736 | 1.327325 | 1.364278 | 1.396537 | 1.468327 | 0.004211 |
| z | 0.49972 | 0.030672 | 0.436555 | 0.477589 | 0.498887 | 0.522161 | 0.558534 | 0.002693 |
| t | 0.536048 | 0.033127 | 0.464361 | 0.515404 | 0.53657 | 0.558055 | 0.595229 | 0.002959 |
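
To see what a `theta` of `0` means for the `angle` model, here is a minimal sketch of a linearly collapsing bound. It assumes the upper bound at time `t` is `a - t * tan(theta)`, floored at zero. This parameterization is an assumption made for illustration and is not taken from the `HDDM` source; `angle_bound` is a hypothetical function.

```python
import math


def angle_bound(t, a, theta):
    """Upper bound height at time t, ASSUMING b(t) = a - t * tan(theta), floored at 0."""
    return max(a - t * math.tan(theta), 0.0)


a = 1.5  # ground-truth `a` used in the simulation above
for theta in (0.0, 0.2):  # default value vs. ground-truth value of theta
    heights = [round(angle_bound(t, a, theta), 3) for t in (0.0, 0.5, 1.0, 2.0)]
    print(f"theta = {theta}: {heights}")
```

Under this assumption, `theta = 0` keeps the bound at `a` for all `t`, while `theta = 0.2` lowers it steadily over time.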


Again, we observe how the parameter estimates are affected by the *wrong* choice of `theta`. The model tries to compensate for the parallel bounds (no collapse) implied by the `theta` default by decreasing `a` and slightly increasing `v`. Let's now try again, but this time with `theta` fixed to the actual *ground truth*.

In [63]:
# copy out the model_config dictionary for the angle model
my_model_config_angle = deepcopy(hddm.model_config.model_config["angle"])
# set theta default to the ground truth defined above
my_model_config_angle["params_default"][4] = 0.2

model_angle_no_theta_2 = hddm.HDDMnn(
    data_angle,
    model="angle",
    include=["v", "a", "t", "z"],
    model_config=my_model_config_angle,
)

Using default priors: Uninformative
Supplied model_config specifies params_std_upper for  z as  None.
Changed to 10


In [64]:
model_angle_no_theta_2.sample(1000, burn=500)

 [-----------------100%-----------------] 1000 of 1000 complete in 53.4 sec

<pymc.MCMC.MCMC at 0x7ff301652c90>

In [65]:
model_angle_no_theta_2.gen_stats()

| | mean | std | 2.5q | 25q | 50q | 75q | 97.5q | mc err |
|---|---|---|---|---|---|---|---|---|
| v | 1.020949 | 0.089617 | 0.838443 | 0.959232 | 1.019338 | 1.087935 | 1.188067 | 0.007229 |
| a | 1.46397 | 0.049143 | 1.363087 | 1.429878 | 1.465694 | 1.49972 | 1.554566 | 0.003623 |
| z | 0.527111 | 0.027281 | 0.471521 | 0.508124 | 0.527366 | 0.547015 | 0.581392 | 0.002391 |
| t | 0.528561 | 0.029786 | 0.470851 | 0.50755 | 0.527586 | 0.547936 | 0.588767 | 0.002522 |


As we see, fixing `theta` to the actual ground truth makes the estimates of the remaining parameters much more accurate again.
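
One rough way to quantify this is to sum the absolute differences between the posterior means and the ground truth for the four shared parameters. The sketch below uses the means from the three `gen_stats()` tables above, copied by hand. The summed absolute error is a crude summary chosen here for illustration (the parameters live on different scales), not a metric provided by `HDDM`.

```python
# Ground truth used in the simulation (theta is excluded: it is not fitted in two of the models).
truth = {"v": 1.0, "a": 1.5, "z": 0.5, "t": 0.5}

# Posterior means copied from the three gen_stats() outputs above.
posterior_means = {
    "all parameters fitted": {"v": 1.020563, "a": 1.584719, "z": 0.526568, "t": 0.494598},
    "theta at default (0)": {"v": 1.124278, "a": 1.363875, "z": 0.49972, "t": 0.536048},
    "theta at ground truth (0.2)": {"v": 1.020949, "a": 1.46397, "z": 0.527111, "t": 0.528561},
}

# Sum of absolute deviations from the ground truth, per model.
total_abs_error = {
    label: sum(abs(means[name] - truth[name]) for name in truth)
    for label, means in posterior_means.items()
}

for label, error in total_abs_error.items():
    print(f"{label:<30} {error:.3f}")
```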

##### Let's compare DICs

In [67]:
print("Standard: ", model_angle.dic)
print("theta set to model_config default: ", model_angle_no_theta.dic)
print("theta set to ground truth: ", model_angle_no_theta_2.dic)

Standard:  1059.453694824219
theta set to model_config default:  1066.945202636719
theta set to ground truth:  1058.248090332031


We observe that, in this case, fixing `theta` to `0` instead of `0.2` didn't do much damage as far as the DICs are concerned. Nevertheless, the *explicitly wrong* model has the highest (worst) DIC of the three.

##### END

Hopefully this was helpful.