### Quick Start

The `ssms` package serves two purposes. 

1. Easy access to *fast simulators of sequential sampling models*
   
2. Support infrastructure to construct training data for various approaches to likelihood / posterior amortization

We provide two minimal examples here to illustrate how to use each of the two capabilities.

Let's start with *installing* the `ssms` package.

You can do so by running the following command in your terminal:

`pip install git+https://github.com/AlexanderFengler/ssm_simulators`


In [1]:
# Import necessary packages
import numpy as np
import pandas as pd
import ssms

#### Using the Simulators

Let's start with the basic simulators.
The main entry point is the `ssms.basic_simulators.simulator` function.

To get an idea about the models included in `ssms`, use the `config` module.
The central dictionary with metadata about included models sits in `ssms.config.model_config`. 

In [12]:
# Check included models
list(ssms.config.model_config.keys())[:10]

['ddm',
 'ddm_legacy',
 'angle',
 'weibull',
 'levy',
 'levy_angle',
 'full_ddm',
 'ornstein',
 'ornstein_angle',
 'ddm_sdv']

In [9]:
# Take an example config for a given model
ssms.config.model_config['ddm']

{'name': 'ddm',
 'params': ['v', 'a', 'z', 't'],
 'param_bounds': [[-3.0, 0.3, 0.1, 0.0], [3.0, 2.5, 0.9, 2.0]],
 'boundary': <function ssms.basic_simulators.boundary_functions.constant(t=0)>,
 'n_params': 4,
 'default_params': [0.0, 1.0, 0.5, 0.001],
 'hddm_include': ['z'],
 'nchoices': 2}

**Note:**
A model config usually includes the following entries:

- Parameter names (`'params'`)
- Bounds on the parameters (`'param_bounds'`)
- A function that defines a boundary for the respective model (`'boundary'`)
- The number of parameters (`'n_params'`)
- Defaults for the parameters (`'default_params'`)
- The number of choices the process can produce (`'nchoices'`)

The `'hddm_include'` key holds information used for integration with the [hddm](https://github.com/hddm-devs/hddm) Python package, which facilitates hierarchical Bayesian inference for sequential sampling models. It is not important for the present tutorial.
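To make this structure concrete, the sketch below pairs each parameter name with its bounds and default value. It works on a hand-copied literal of the `'ddm'` entry printed above (with the `'boundary'` function left out) so that it runs without `ssms` installed; with the package available, `ssms.config.model_config['ddm']` could presumably be passed in instead. The helper name `describe_params` is hypothetical and not part of the package.

```python
# Hand-copied from the 'ddm' config printed above ('boundary' omitted).
ddm_config = {
    "name": "ddm",
    "params": ["v", "a", "z", "t"],
    "param_bounds": [[-3.0, 0.3, 0.1, 0.0], [3.0, 2.5, 0.9, 2.0]],
    "n_params": 4,
    "default_params": [0.0, 1.0, 0.5, 0.001],
    "nchoices": 2,
}


def describe_params(model_config):
    """Map each parameter name to its lower bound, upper bound, and default.

    Hypothetical helper: assumes 'param_bounds' is [lower_bounds, upper_bounds],
    as in the config shown above.
    """
    lower, upper = model_config["param_bounds"]
    return {
        name: {"lower": lo, "upper": hi, "default": default}
        for name, lo, hi, default in zip(
            model_config["params"], lower, upper, model_config["default_params"]
        )
    }


param_table = describe_params(ddm_config)
for name, info in param_table.items():
    print(f"{name}: [{info['lower']}, {info['upper']}], default {info['default']}")
```

Reading the bounds this way makes it easy to check that a hand-written parameter vector stays inside the range the model config declares.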

In [18]:
from ssms.basic_simulators import simulator

# theta follows the order of model_config['ddm']['params']: v, a, z, t
sim_out = simulator(model = 'ddm',
                    theta = [0, 1, 0.5, 0.5],
                    n_samples = 1000)

The simulator returns a dictionary with three elements.

1. `rts` (array)
2. `choices` (array)
3. `metadata` (dictionary)

The `metadata` includes the named parameters, simulator settings, and more.
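As a minimal sketch of how such an output might be summarized, the helper below computes the share and mean reaction time of each choice. To stay runnable without `ssms`, it operates on a stand-in dictionary that only mimics the `rts`, `choices`, and `metadata` keys listed above; the stand-in values are made up, and `summarize_sim_out` is a hypothetical name, not a package function. The arrays are flattened first, so the sketch makes no assumption about their exact shape.

```python
import numpy as np


def summarize_sim_out(sim_out):
    """Return the share and mean RT of each choice in a simulator output dict."""
    rts = np.asarray(sim_out["rts"]).ravel()
    choices = np.asarray(sim_out["choices"]).ravel()
    summary = {}
    for choice in np.unique(choices):
        mask = choices == choice
        summary[choice.item()] = {
            "share": float(mask.mean()),
            "mean_rt": float(rts[mask].mean()),
        }
    return summary


# Stand-in for the dictionary returned by `simulator` (values are made up).
rng = np.random.default_rng(0)
fake_sim_out = {
    "rts": rng.uniform(0.5, 2.0, size=(1000, 1)),
    "choices": rng.choice([-1, 1], size=(1000, 1)),
    "metadata": {"model": "ddm"},
}

summary = summarize_sim_out(fake_sim_out)
print(summary)
```

With a real simulator run, `sim_out` from the cell above would take the place of `fake_sim_out`.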

#### Using the Training Data Generators

In [7]:
# Standard training data generator (LAN, MLP variant)
# Alternative: ssms.config.data_generator_config['lan']['cnn']
from copy import deepcopy

# Copy the config so the edits below do not alter the package-level defaults
my_data_config = deepcopy(ssms.config.data_generator_config['lan']['mlp'])
my_data_config['dgp_list'] = 'ddm'
my_data_config['n_parameter_sets'] = 100
my_data_config['n_samples'] = 1000

In [8]:
my_model_config = ssms.config.model_config['ddm']

In [9]:
my_dataset_generator = ssms.dataset_generators.data_generator(generator_config = my_data_config,
                                                              model_config = my_model_config)

n_cpus:  6
checking:  data/lan_mlp/


In [11]:
x = my_dataset_generator.generate_data_training_uniform(save = False)

simulation round: 1  of 10
simulation round: 2  of 10
simulation round: 3  of 10
simulation round: 4  of 10
simulation round: 5  of 10
simulation round: 6  of 10
simulation round: 7  of 10
simulation round: 8  of 10
simulation round: 9  of 10
simulation round: 10  of 10
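The method name suggests that parameter sets are drawn uniformly within the model's `'param_bounds'`. The following is a sketch of that idea only, under that assumption, and not the package's implementation. The bounds are copied from the `'ddm'` config shown earlier, and `sample_uniform_thetas` is a hypothetical helper.

```python
import numpy as np

# Bounds copied from the 'ddm' config shown earlier: [lower bounds, upper bounds].
param_names = ["v", "a", "z", "t"]
param_bounds = [[-3.0, 0.3, 0.1, 0.0], [3.0, 2.5, 0.9, 2.0]]


def sample_uniform_thetas(param_bounds, n_parameter_sets, seed=None):
    """Draw parameter vectors uniformly at random within the given bounds.

    Hypothetical helper illustrating uniform sampling; not taken from ssms.
    """
    lower, upper = (np.asarray(b, dtype=float) for b in param_bounds)
    rng = np.random.default_rng(seed)
    # Each column is sampled independently between its own lower and upper bound.
    return rng.uniform(lower, upper, size=(n_parameter_sets, lower.size))


thetas = sample_uniform_thetas(param_bounds, n_parameter_sets=100, seed=0)
print(thetas.shape)  # (100, 4)
```

Each row of `thetas` is one candidate parameter vector in the order given by `param_names`.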


In [3]:
# Nested model training data
# Alternative: ssms.config.data_generator_config['lan']['cnn']
from copy import deepcopy

# Copy the config so the edits below do not alter the package-level defaults
my_data_config = deepcopy(ssms.config.data_generator_config['lan']['mlp'])
my_data_config['dgp_list'] = 'glob'
my_data_config['n_parameter_sets'] = 100
my_data_config['n_samples'] = 1000

In [4]:
my_model_config = ssms.config.model_config['glob']

In [5]:
my_dataset_generator = ssms.dataset_generators.data_generator(generator_config = my_data_config,
                                                              model_config = my_model_config)

n_cpus:  6
checking:  data/lan_mlp/


In [6]:
my_output = my_dataset_generator.generate_data_nested(save = False)

simulation round:  1  of 10
simulation round:  2  of 10
simulation round:  3  of 10
simulation round:  4  of 10
simulation round:  5  of 10
simulation round:  6  of 10
simulation round:  7  of 10
simulation round:  8  of 10
simulation round:  9  of 10
simulation round:  10  of 10


In [7]:
my_output

{'data': array([[ 1.0340211 ,  1.        ],
        [ 1.0720211 ,  1.        ],
        [ 1.3030207 ,  1.        ],
        ...,
        [ 0.7438603 , -1.        ],
        [ 0.48186138,  1.        ],
        [ 0.43286133, -1.        ]], dtype=float32),
 'label_components': array([[1., 0., 0.],
        [1., 1., 0.],
        [0., 1., 1.],
        [1., 0., 0.],
        [1., 0., 1.],
        [0., 0., 0.],
        [1., 0., 1.],
        [0., 0., 0.],
        [0., 1., 0.],
        [0., 1., 0.],
        [0., 0., 1.],
        [1., 0., 1.],
        [1., 1., 1.],
        [0., 0., 1.],
        [0., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [0., 0., 0.],
        [1., 1., 1.],
        [1., 0., 1.],
        [1., 0., 0.],
        [1., 1., 1.],
        [0., 1., 0.],
        [0., 0., 1.],
        [0., 0., 0.],
        [0., 0., 0.],
        [1., 1., 1.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [1., 1., 0.],
        [0., 0., 0.],


In [5]:
# NEEDS NEW CONFIG HERE
my_output = my_dataset_generator.generate_data_ratio_estimator(save = False)

Starting simulations
simulations finished!
simulations finished!
simulations finished!
simulations finished!
simulations finished!
simulations finished!
simulations finished!
simulations finished!
simulations finished!
simulations finished!
simulations finished!
simulations finished!


In [8]:
my_output['labels']

array([0., 1., 1., 0., 0., 1., 0., 0., 1., 0., 1., 0., 0., 0., 1., 0., 1.,
       0., 0., 0., 0., 1., 1., 1., 0., 0., 1., 0., 0., 0., 0., 1., 1., 1.,
       1., 0., 0., 1., 1., 0., 0., 1., 0., 0., 1., 1., 0., 1., 0., 1., 0.,
       0., 1., 0., 1., 1., 1., 0., 0., 1., 1., 1., 0., 1., 1., 0., 0., 0.,
       1., 0., 0., 0., 0., 1., 1., 1., 1., 1., 0., 1., 1., 1., 1., 1., 1.,
       0., 1., 0., 1., 1., 0., 1., 1., 1., 0., 1., 0., 0., 0., 0.],
      dtype=float32)

In [None]:
my_dataset_generator.model_config

In [8]:
x['data'].shape

(10, 512, 2)

In [9]:
x['labels'].shape

(10, 4)
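One way these two arrays might be combined into a single flat table is sketched below. It rests on assumptions the output above does not confirm: that the leading axis of both arrays indexes parameter sets, that each row along the second axis of `data` belongs to the parameter set with the same leading index, and that `labels` holds one vector per parameter set. The arrays here are placeholders with the printed shapes, not real generator output.

```python
import numpy as np

# Stand-in arrays with the shapes printed above (contents are placeholders).
data = np.zeros((10, 512, 2), dtype=np.float32)
labels = np.arange(40, dtype=np.float32).reshape(10, 4)

n_sets, n_rows, n_cols = data.shape

# Repeat each label row once per data row, then stack the two side by side.
labels_per_row = np.repeat(labels[:, None, :], n_rows, axis=1)
flat = np.concatenate([labels_per_row, data], axis=-1).reshape(n_sets * n_rows, -1)

print(flat.shape)  # (5120, 6)
```

In the resulting table, the first four columns carry the label vector and the last two carry the corresponding `data` row.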

In [10]:
my_output['label_parameters']

array([[ 1.2226119 ,  0.37123147,  0.5904053 ,  1.6807697 ,  0.        ,
         1.0837663 ,  0.        ],
       [ 2.9925056 ,  1.3653069 ,  0.1604542 ,  1.2743086 ,  0.18776555,
         1.2324505 ,  0.        ],
       [ 2.477468  ,  0.58937293,  0.3375957 ,  1.2296201 ,  0.82703817,
         0.92556745,  1.0006771 ],
       [-0.5683042 ,  0.43178543,  0.5617197 ,  1.1424209 ,  0.01943262,
         1.4299839 ,  0.        ],
       [-1.805324  ,  0.9111644 ,  0.5896453 ,  1.896022  ,  0.        ,
         1.6689351 ,  0.        ],
       [-1.7950133 ,  1.206874  ,  0.64769906,  1.4530326 ,  0.7132122 ,
         0.93900985,  0.        ],
       [ 1.1343992 ,  0.8687558 ,  0.489402  ,  1.9120643 ,  0.        ,
         0.5533016 ,  0.        ],
       [-0.55500275,  1.4295069 ,  0.6387958 ,  1.        ,  0.56790733,
         0.62682825,  0.83169496],
       [-0.0092219 ,  0.86031497,  0.42054382,  1.        ,  0.        ,
         0.01748262,  0.        ],
       [-0.2008512 ,  0.4849