# **FlexDDM Python Package Jupyter Notebook Tutorial**
This tutorial shows how to use FlexDDM to fit models to participant data and to validate the theoretical models that you create. 

### **Before running this Jupyter notebook, make sure to install Python and Anaconda.** <br>

#### <b><u>Python</u></b>
This package is compatible with the Python 3.12.0 distribution that we use, which you can download [here](https://www.python.org/downloads/release/python-3120/). 

#### <b><u>Anaconda</u></b>
Create an environment in Anaconda by following these steps: 
1. Download Anaconda from [here](https://www.anaconda.com/download/success) and follow the installer instructions to install the application. 
2. Go to the Environments tab on the left-hand side. Create a new environment by clicking the '+' sign and give it any name you like. In our case, we use `flexddm`. Once the environment is created, you should see some default packages already installed. 
<br><br> <img src="tutorial_images/environment.png" alt="Environment" style="width:600px;"/> <br>
<br> <img src="tutorial_images/create_environment.png" alt="Environment" style="width:600px;"/> <br>
<br> <img src="tutorial_images/create_environment_settings.png" alt="Environment" style="width:600px;"/> <br> <br>
3. Click the play button next to the environment name and choose the option to open a terminal. 
<br><br> <img src="tutorial_images/open_terminal.png" alt="Environment" style="width:600px;"/> <br> <br>
4. Use the `cd` command to navigate to the directory where the FlexDDM GitHub repository is located on your computer. 
5. Type the following command into the terminal. Once the command completes, your environment is ready! 
```bash
      pip install -r requirements.txt
```

When running the Jupyter notebook, make sure to set the kernel to the Anaconda environment you just created (in our case, `flexddm`).
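To confirm that the notebook is running on the intended kernel, a quick check like the one below can help. It is a minimal sketch that assumes the environment is named `flexddm` and that Python 3.12 is in use, as in this tutorial; the helper `describe_kernel` is a hypothetical name and not part of FlexDDM.

```python
import sys

def describe_kernel(expected_env="flexddm", expected_version=(3, 12)):
    """Return a short report on the interpreter behind the current kernel."""
    # sys.executable is the path of the running Python interpreter; for a
    # conda environment this path normally contains the environment's name.
    in_expected_env = expected_env in sys.executable
    # Compare only the major and minor version numbers.
    version_matches = sys.version_info[:2] == expected_version
    return {
        "executable": sys.executable,
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
        "in_expected_env": in_expected_env,
        "version_matches": version_matches,
    }

report = describe_kernel()
print(report)
```

If `in_expected_env` is `False`, the notebook is probably attached to a different kernel than the environment created above.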

### <b>Import Models and Functionality</b>
The first import statement brings in the models that already exist in the package. The second brings in the functions used to fit models to experimental data and to validate models. 

In [1]:
from flexddm.models import DMC, DMCfs, DSTP, DSTPit, mDMC, mDMCfs, mDSTP, mDSTPit, mSSP, mSSPit, SSP, SSPit, StandardDDM
from flexddm.main import fit, validation

### <b>Fitting Models</b>
#### <b>Fit Function</b>
To fit models, we will utilize the `fit` function from the `main.py` file in the FlexDDM package. This is how to call `fit`: 
```py
fit(models, input_data, startingParticipants=None, endingParticipants=None, input_data_id="PPT", input_data_congruency="Condition", input_data_rt="RT", input_data_accuracy="Correct", output_fileName='output.csv', return_dataframes=False, posterior_predictive_check=True)
```

#### <b>Fit Function Parameters</b>
##### <b> Two Required Parameters: </b> 
- **`models`** (*list*): a list of the model objects that you would like to fit <br><br>
- **`input_data`** (*str / pd.DataFrame*): the participant data that you would like to fit the models to <br><br>
    - *str:* file path of a CSV file <br><br>
    - *pd.DataFrame:* dataframe containing the data

##### <b>Optional Parameters:</b>

To fit the models to a subset of participants, use the `startingParticipants` and `endingParticipants` parameters. When both are `None`, the fitting procedure runs for all participants. Please note that because of this feature, participant IDs are expected to be consecutive (e.g., 1, 2, 3, 4, 5, and so on). 

- **`startingParticipants`** (*int*): the first participant of the subset <br><br>
- **`endingParticipants`** (*int*): the last participant of the subset
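Since the IDs are expected to be consecutive, it can be worth checking them before fitting. The following is a minimal sketch using plain Python lists and assuming integer IDs; `ids_are_consecutive` and `renumber_ids` are hypothetical helpers, not FlexDDM functions, and the IDs shown are toy values.

```python
def ids_are_consecutive(participant_ids):
    """True if the unique integer IDs form an unbroken run such as 1, 2, 3."""
    unique_ids = sorted(set(participant_ids))
    if not unique_ids:
        return False
    return unique_ids == list(range(unique_ids[0], unique_ids[-1] + 1))

def renumber_ids(participant_ids, start=1):
    """Map each original ID to a consecutive integer, preserving sorted order."""
    unique_ids = sorted(set(participant_ids))
    mapping = {old: new for new, old in enumerate(unique_ids, start=start)}
    return [mapping[pid] for pid in participant_ids], mapping

raw_ids = [101, 101, 104, 104, 107]    # toy IDs with gaps
print(ids_are_consecutive(raw_ids))    # False
new_ids, mapping = renumber_ids(raw_ids)
print(new_ids)                         # [1, 1, 2, 2, 3]
print(ids_are_consecutive(new_ids))    # True
```

Keeping the returned `mapping` makes it possible to translate the fitted results back to the original participant IDs afterwards.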

The next set of parameters concerns the format of the data. The standard format contains 4 columns: **PPT** (participant ID), **Condition** (congruency: 0 for incongruent, 1 for congruent), **Correct** (accuracy: 0 for incorrect, 1 for correct), and **RT** (reaction time in milliseconds):

<br>
<img src="tutorial_images/sample_data.png" alt="Sample data" style="width:600px;"/>  
<br>
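A quick sanity check of a data file against the standard format described above might look like the sketch below. It uses only the standard library and assumes that the congruency and accuracy columns are coded as the literal integers 0 and 1; `check_rows` is a hypothetical helper and the sample rows are toy values, not taken from any dataset.

```python
import csv
import io

REQUIRED_COLUMNS = ("PPT", "Condition", "Correct", "RT")

def check_rows(csv_text):
    """Return a list of human-readable problems found in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = [f"missing column: {name}"
                for name in REQUIRED_COLUMNS if name not in header]
    if problems:
        return problems
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        if row["Condition"] not in ("0", "1"):
            problems.append(f"line {line_number}: Condition must be 0 or 1")
        if row["Correct"] not in ("0", "1"):
            problems.append(f"line {line_number}: Correct must be 0 or 1")
        try:
            rt_ok = float(row["RT"]) > 0
        except (TypeError, ValueError):
            rt_ok = False
        if not rt_ok:
            problems.append(f"line {line_number}: RT must be a positive number")
    return problems

sample = "PPT,Condition,Correct,RT\n1,0,1,512\n1,1,1,430\n2,0,2,-5\n"
print(check_rows(sample))
```

An empty list means that no problems were found under these assumptions.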

If your data contains the same information but uses different column names, that works too! The `fit` function has parameters that let you specify the column names used in your data. <br><br>

- **`input_data_id`** (*str*): the column name representing the participant ID <br><br>
- **`input_data_congruency`** (*str*): the column name representing the Flanker task trial congruency <br><br>
- **`input_data_rt`** (*str*): the column name representing the reaction time of the trial <br><br>
- **`input_data_accuracy`** (*str*): the column name representing the accuracy of the trial <br><br>
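As an illustration, the four column-name parameters can be collected in one dictionary and checked against the file header before fitting. This is a sketch only: the column names `subject`, `congruent`, `rt_ms`, and `acc`, the file name `my_data.csv`, and the helper `missing_columns` are all hypothetical.

```python
import csv
import io

# Hypothetical column names from a dataset that does not use the defaults.
column_args = {
    "input_data_id": "subject",
    "input_data_congruency": "congruent",
    "input_data_rt": "rt_ms",
    "input_data_accuracy": "acc",
}

def missing_columns(csv_text, column_args):
    """Return the column names in column_args that the CSV header lacks."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [name for name in column_args.values() if name not in header]

toy_csv = "subject,congruent,acc,rt_ms\n1,0,1,512\n"
print(missing_columns(toy_csv, column_args))  # []

# With FlexDDM installed, the same dictionary can be unpacked into fit:
# fit([SSP()], input_data="my_data.csv", **column_args)
```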

The next parameter concerns the output file. By default, each CSV is named `{model_name}_output.csv`. To change the `output.csv` portion of the name, use the `output_fileName` parameter. 

- **`output_fileName`** (*str*): the second portion of the CSV file path 
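The naming pattern can be sketched as follows. This only mirrors the `{model_name}_output.csv` pattern described above; it assumes that the model name is the class name (for example `SSP`) and says nothing about the directory the files are written to. `expected_output_name` is a hypothetical helper, not a FlexDDM function.

```python
def expected_output_name(model_name, output_fileName="output.csv"):
    """Compose a file name following the {model_name}_{output_fileName} pattern."""
    return f"{model_name}_{output_fileName}"

for name in ["SSP", "DMC", "DSTP"]:
    print(expected_output_name(name, output_fileName="flanker_fits.csv"))
```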

The next parameter controls whether `fit` returns the dataframes that store the parameter values and model metrics. By default, this is set to `False`. 

- **`return_dataframes`** (*bool*): whether or not to return the dataframes 

The final parameter controls whether the posterior predictive check plot is shown for every participant. By default, this is set to `True`. 

- **`posterior_predictive_check`** (*bool*): whether or not to output the posterior predictive check plot 

Here is an example of what this looks like: 
<br><br>
<img src="tutorial_images/posterior_predictive_check.png" alt="Posterior Predictive Check" style="width:600px;"/>  
<br>

### This is how to fit one model on all the participants in this dataset. 

In [None]:
ssp = SSP()
fit([ssp], input_data='flexddm/data/hedge2018.csv')

### This is how to fit multiple models on all the participants in this dataset

In [None]:
ssp = SSP()
dmc = DMC()
dstp = DSTP() 
fit([ssp, dmc, dstp], input_data='flexddm/data/hedge2018.csv')

### This is how to turn off posterior predictive checks. 

In [None]:
ssp = SSP()
dmc = DMC()
dstp = DSTP() 
fit([ssp, dmc, dstp], input_data='flexddm/data/hedge2018.csv', posterior_predictive_check=False)

### This is how to only receive the model fits for 5 participants. 

In [None]:
ssp = SSP()
dmc = DMC()
dstp = DSTP() 
fit([ssp, dmc, dstp], startingParticipants=1, endingParticipants=5, input_data='flexddm/data/hedge2018.csv')

### <b>Validating Models</b>
#### <b>Validation Function</b>
To validate models, we will utilize the `validation` function from the `main.py` file in the FlexDDM package. This is how to call `validation`: 
```py
validation(models, model_recovery=True, model_recovery_simulations=100, parameter_recovery=True, param_recovery_simulations=100)
```

#### <b>Validation Function Parameters</b>
##### <b>Required Parameter:</b>
- **`models`** (*list*): a list of the model objects that you would like to validate <br>

##### <b>Optional Parameters: </b>
To control model recovery, use the `model_recovery` and `model_recovery_simulations` parameters. By default, `model_recovery` is set to `True`, meaning that model recovery will be completed, and `model_recovery_simulations` is set to 100, meaning that 100 simulations will be run for every model. <br>
- **`model_recovery`** (*bool*): whether or not to complete the model recovery functionality<br><br>
- **`model_recovery_simulations`** (*int*): the number of simulations to run during model recovery <br>

To control parameter recovery, use the `parameter_recovery` and `param_recovery_simulations` parameters. By default, `parameter_recovery` is set to `True`, meaning that parameter recovery will be completed, and `param_recovery_simulations` is set to 100, meaning that 100 simulations will be run for every model. <br>
- **`parameter_recovery`** (*bool*): whether or not to complete the parameter recovery functionality<br><br>
- **`param_recovery_simulations`** (*int*): the number of simulations to run during parameter recovery <br><br>
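To give a sense of what parameter recovery measures, the sketch below simulates data from known parameter values, re-estimates them, and correlates the two sets. It is a conceptual illustration only and not FlexDDM's implementation: the "model" is a toy normal distribution with a single parameter, the "fit" is the sample mean, and all function names and numbers are hypothetical.

```python
import random
import statistics

def simulate(mean_rt, n_trials, rng):
    """Toy 'model': draw RTs from a normal distribution around mean_rt."""
    return [rng.gauss(mean_rt, 50.0) for _ in range(n_trials)]

def fit_toy(rts):
    """Toy 'fit': estimate the single parameter as the sample mean."""
    return statistics.fmean(rts)

def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)          # fixed seed so the run is repeatable
n_simulations = 100             # mirrors the default of 100 simulations
true_values = [rng.uniform(400.0, 700.0) for _ in range(n_simulations)]
recovered = [fit_toy(simulate(v, 200, rng)) for v in true_values]
r = pearson(true_values, recovered)
print(f"correlation between generating and recovered values: {r:.3f}")
```

A correlation close to 1 indicates that the fitting procedure recovers the generating values well. Model recovery follows the same simulate-then-fit pattern, except that every candidate model is fit to each simulated dataset and the best-fitting model is compared with the model that generated the data.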

### This is how to validate three models with both parameter and model recovery. 

In [None]:
dmcfs = DMCfs()
mdmc = mDMC()
msspit = mSSPit()
validation(models=[dmcfs, mdmc, msspit])

### This is how to specify the number of simulations for parameter and model recovery (outside of the default 100).

In [None]:
dmcfs = DMCfs()
mdmc = mDMC()
msspit = mSSPit()
validation(models=[dmcfs, mdmc, msspit], model_recovery_simulations=200, param_recovery_simulations=75)

### This is how to specify that you do not want to complete model recovery. 

In [None]:
dmcfs = DMCfs()
mdmc = mDMC()
msspit = mSSPit()
validation(models=[dmcfs, mdmc, msspit], model_recovery=False)

### This is how to specify that you do not want to complete parameter recovery. 

In [None]:
dmcfs = DMCfs()
mdmc = mDMC()
msspit = mSSPit()
validation(models=[dmcfs, mdmc, msspit], parameter_recovery=False)