# Tutorial: model-based MVPA using MBfMRI

This tutorial walks through model-based MVPA using the MBfMRI package.<br>
With this package, users can run the whole analysis in a few lines of code.<br>
You only need to prepare a root path for the task-based fMRI data and place the preprocessed images under that path.

For model-based GLM using MBfMRI, see [here](#MBGLM).

## Workflow of MVPA approach, model-based MVPA

The exact workflow of the model-based MVPA consists of the following steps:

<center><img src="https://raw.githubusercontent.com/CCS-Lab/project_model_based_fmri/main/images/mbmvpa_workflow.png" width="800px"></center>

1. **Generate latent process signals** by fitting computational models to behavioral data, extracting the time series of the latent process, and convolving it with the HRF.

2. **Generate multi-voxel signals** from preprocessed fMRI images by applying ROI masking, rescaling (zooming) the spatial resolution, and improving signal quality with several well-established methods (e.g. detrending, high-pass filtering, regressing out confounds).

3. **Train MVPA models** by feeding multi-voxel signals as input (X) and latent process signals as output (y), or target, within a repeated cross-validation framework.

4. **Interpret the trained MVPA models** to visualize the brain implementation of the target latent process, quantified as the brain activation pattern that contributes to predicting the target signals from the multi-voxel signals.
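
As a rough illustration of step 1, the sketch below places trial-wise latent values (here, prediction errors) on a TR grid and convolves them with a double-gamma HRF. The HRF shape parameters (6 and 16, undershoot ratio 1/6), the TR of 2 s, and the number of scans are assumptions made for this example. The function names are hypothetical, and MBfMRI's own implementation may differ.

```python
import math

def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density at time t (0 for t <= 0)."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)) / (math.gamma(shape) * scale ** shape)

def double_gamma_hrf(tr, duration=32.0):
    """Double-gamma HRF sampled every `tr` seconds (assumed shape parameters)."""
    n = int(duration / tr)
    hrf = [gamma_pdf(i * tr, 6.0) - gamma_pdf(i * tr, 16.0) / 6.0 for i in range(n)]
    total = sum(hrf)
    return [h / total for h in hrf]  # normalize to unit sum

def latent_to_bold(onsets, values, tr, n_scans):
    """Place trial-wise latent values on the TR grid, then convolve with the HRF."""
    stick = [0.0] * n_scans
    for onset, value in zip(onsets, values):
        idx = int(onset // tr)
        if idx < n_scans:
            stick[idx] += value
    hrf = double_gamma_hrf(tr)
    signal = [0.0] * n_scans
    for t in range(n_scans):
        for k, h in enumerate(hrf):
            if t - k >= 0:
                signal[t] += stick[t - k] * h
    return signal

# Onsets and prediction errors copied from the first rows of the example events file;
# tr=2.0 and n_scans=30 are hypothetical values for illustration.
onsets = [8.32, 15.303, 22.286, 29.269, 36.253]
pe = [0.0, 4.897044, 0.0, 4.897044, -0.733677]
signal = latent_to_bold(onsets, pe, tr=2.0, n_scans=30)
```

The resulting `signal` has one value per scan and plays the role of the target (y) in step 3.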


The aforementioned procedures can be run with a single function, `mbfmri.core.engine.run_mbfmri`.<br>
Users can choose which approach to apply and which latent processes to target.<br>
The configuration of the analysis is controlled by various arguments.<br>
Users can change each setting by providing the relevant keyword argument (e.g. `run_mbfmri(..., mvpa_model='mlp', ...)` or `run_mbfmri(..., detrend=True)`).
```python
from mbfmri.core.engine import run_mbfmri

_ = run_mbfmri(bids_layout='mini_bornstein2017',    # data path - the root for entire BIDS layout 
               task_name='multiarmedbandit',        # identifier for task (BIDS)
               dm_model= 'banditNarm_lapse_decay',  # computational model
               process_name='PEchosen'              # identifier for target latent process
               
               ##################################################################################
               #                                                                                #
               #                 ANY KEYWORD ARGUMENT DEFINED IN CONFIGURATION                  #
               #                                                                                #
               ##################################################################################
               )
```

Please refer to the [documentation](https://project-model-based-fmri.readthedocs.io/en/latest/mbfmri.core.html#mbfmri-core-engine-run-mbfmri) for the full list of configuration options.

#### [Example data]

The example data is a reduced version of the dataset released by Bornstein et al., 2017 ([paper](https://www.nature.com/articles/nn.4573#abstract), [openneuro](https://openneuro.org/datasets/ds001607/versions/1.0.1)).<br>
The data includes 8 subjects, each with one run of a reinforcement learning task with 180 trials.<br>
You can download the zip file from the [link]().

In [11]:
from pathlib import Path

bids_layout = "mini_bornstein2017"

## What you should do before running MBfMRI

#### [1. Data - Check input data](#Data)
#### [2. Preprocessing 1 - Calculate latent process](#LatentProcess)
#### [3. Preprocessing 2 - Generate voxel features](#VoxelFeatures)
#### [4. fMRI Analysis - Choose which method and which model to use for fMRI analysis and decide some details](#fMRIAnalysis)
#### [5. Run the code!](#Code)
#### [6. Output](#Output)

### 1. Data - Check input data <a name = "Data"> </a>

#### 1-1. BIDS layout with original data

The original task-based fMRI data should be in `bids_layout`.<br> 
Also, you need to place the preprocessed fMRI images in a folder named `fmriprep` under the *derivatives* folder of `bids_layout`.<br>
(The original layout of the sample data contains only events files, because the analysis uses the preprocessed images.)<br>

In [12]:
# Check bids layout in the root
!ls mini_bornstein2017

dataset_description.json  masks   sub-02  sub-04  sub-06  sub-08
derivatives		  sub-01  sub-03  sub-05  sub-07


The behavioral data should be in `events.tsv`, following the original BIDS layout as well.

In [13]:
# Check behavioral data
import pandas as pd
pd.read_table('mini_bornstein2017/sub-01/func/sub-01_task-multiarmedbandit_events.tsv').head()

Unnamed: 0.1,Unnamed: 0,onset,duration,type,choice,rwdval,RT,time_feedback,gain,loss,PrecalculatedQchosen,PrecalculatedPEchosen
0,0,8.32,3,bandit,1,0,1.2439,11.5639,0,0,0.0,0.0
1,1,15.303,3,bandit,2,10,0.7019,18.0049,10,0,0.0,4.897044
2,2,22.286,3,bandit,3,0,0.9651,25.2511,0,0,0.0,0.0
3,3,29.269,3,bandit,3,10,1.0103,32.2793,10,0,0.0,4.897044
4,4,36.253,3,bandit,2,0,0.4705,38.7235,0,0,0.733677,-0.733677
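
A quick sanity check of an events file can be sketched with the standard library alone. The required column set below is an assumption for illustration: `onset` and `duration` are the BIDS timing columns, while `choice`, `gain`, and `loss` are taken here as the inputs of the bandit model. Please check the hBayesDM documentation for the columns your model actually needs. The helper `missing_columns` is hypothetical.

```python
import csv
import io

# Two rows copied from the example events file (subset of columns), tab-separated.
EVENTS_TSV = (
    "onset\tduration\ttype\tchoice\trwdval\tgain\tloss\n"
    "8.32\t3\tbandit\t1\t0\t0\t0\n"
    "15.303\t3\tbandit\t2\t10\t10\t0\n"
)

def missing_columns(tsv_text, required):
    """Return the required column names that are absent from the TSV header."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    return [name for name in required if name not in header]

# Assumed requirement: BIDS timing columns plus the inputs of the bandit model.
required = ["onset", "duration", "choice", "gain", "loss"]
missing = missing_columns(EVENTS_TSV, required)
print("missing columns:", missing)
```

For a real file, the same function can be applied to the text read from `events.tsv`.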


#### 1-2. Derivative layout from *fMRIPrep*

The package assumes that the images were preprocessed with [fMRIPrep](https://fmriprep.org/en/stable/). <br>
Please refer to [mbfmri/utils/config.py](https://github.com/CCS-Lab/project_model_based_fmri/blob/main/mbfmri/utils/config.py) for its configuration.<br> 

If you preprocessed your fMRI data with a tool other than fMRIPrep, you need to rearrange the directory and file layout before applying the MBfMRI package to your data.

In [None]:
!ls mini_bornstein2017/derivatives/fmriprep

In [None]:
!ls mini_bornstein2017/derivatives/fmriprep/sub-01/func/

#### 1-3. Mask images

By default, the mask images are located under `BIDSroot/masks/include`.<br>
The images there are merged into a single binary image.

In [None]:
!ls mini_bornstein2017/masks/include
# To use other mask images, see the `_build_mask` function in
# mbfmri/utils/bold_utils.py; the mechanism is quite straightforward.
# The masks serve two main purposes: reducing the number of features
# (not applicable to CNN), and restricting the regions that should be
# included in or excluded from MVPA.

### 2. Preprocessing 1 - Calculate latent process: Choose a cognitive model and a target latent process to analyze in hBayesDM (or prepare a precalculated latent process) <a name = "LatentProcess"> </a>


#### 2-1. hBayesDM model - Now you need to choose which cognitive model to use. Please refer to [hBayesDM](https://hbayesdm.readthedocs.io/en/v1.0.1/models.html) and check available models.<br>
- Here, the task used in the example data can be categorized as a **multi-armed bandit task**, which requires subjects to learn the reward probability of three cards (or bandits) from receiving a reward or nothing. So **a reinforcement learning model for the multi-armed bandit task with 5 parameters (including decision noise and decay rate but not choice perseveration)** ([Niv et al., 2015](https://www.jneurosci.org/content/35/21/8145)) is chosen for the example data. The model is named `banditNarm_lapse_decay` in the hBayesDM package. Because hBayesDM is used, the model is fitted with hierarchical Bayesian estimation.
- e.g. `dm_model = 'banditNarm_lapse_decay'`

#### 2-2. Target latent process - Next, you should decide which latent process to target. Please refer to the list of [available latent process]() for other possible processes and their explanations.
- In this example, the **prediction error** of the chosen option, named `PEchosen` in hBayesDM, is the target latent process.
- e.g. `process_name = 'PEchosen'`

#### 2-3. Check data format of behavioral data.
- The data format in `events.tsv` should match the input format of the corresponding model. If it does not, and you don't want to change your original data, you can use user-defined functions to remap the data during preprocessing. Please refer to `adjust_function`, `adjust_function_dfwise`, `filter_function`, and `filter_function_dfwise` in the [mbfmri.preprocessing.events module](https://project-model-based-fmri.readthedocs.io/en/latest/mbfmri.preprocessing.html#module-mbfmri.preprocessing.events) ([source code](https://github.com/CCS-Lab/project_model_based_fmri/blob/main/mbfmri/preprocessing/events.py)).
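
The remapping idea can be illustrated with two small row-wise functions. This is only a sketch: the exact signatures MBfMRI expects for `adjust_function` and `filter_function` are described in the module documentation linked above and are not reproduced here. The function names `adjust_row` and `keep_row` and the `reward` column are hypothetical.

```python
def adjust_row(row):
    """Hypothetical row-wise remap: derive `gain` and `loss` from a signed `reward` column."""
    row = dict(row)  # copy so the original data is not modified
    reward = float(row["reward"])
    row["gain"] = max(reward, 0.0)
    row["loss"] = abs(min(reward, 0.0))
    return row

def keep_row(row):
    """Hypothetical row-wise filter: keep only trials with a recorded choice."""
    return row.get("choice") not in (None, "", "n/a")

raw_rows = [
    {"onset": 8.32, "choice": "1", "reward": "0"},
    {"onset": 15.303, "choice": "2", "reward": "10"},
    {"onset": 22.286, "choice": "n/a", "reward": "0"},  # missed trial
]

# Filter first, then remap the remaining rows.
clean_rows = [adjust_row(r) for r in raw_rows if keep_row(r)]
```

Keeping the remap in functions like these leaves the original `events.tsv` files untouched.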




The latent process signals calculated with hBayesDM will be saved in a new directory named `mbmvpa` under the `derivatives` directory. By default, `mbmvpa` is placed in the same `derivatives` folder as `fmriprep`.
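
The default locations described above can be written out with `pathlib`. This is a minimal sketch that assumes the example root `mini_bornstein2017`; it only builds the paths and does not create anything on disk.

```python
from pathlib import Path

bids_root = Path("mini_bornstein2017")
fmriprep_dir = bids_root / "derivatives" / "fmriprep"  # preprocessed images (input)
mbmvpa_dir = bids_root / "derivatives" / "mbmvpa"      # generated signals (default output)

print(mbmvpa_dir.as_posix())
```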

### 3. Preprocessing 2 - Generate voxel features <a name = "VoxelFeatures"> </a>

#### 3-1. ROI mask
- Here, an ROI mask is applied to restrict the brain regions from which features are retrieved. The following mechanisms are provided for building a mask. 
    1. The default mask is the MNI152 T1 template brain mask ([nilearn.datasets.load_mni152_brain_mask](https://nilearn.github.io/modules/generated/nilearn.datasets.load_mni152_brain_mask.html)).
    2. You can include gray matter only by setting `gm_only`=True ([nilearn.datasets.fetch_icbm152_brain_gm_mask](https://nilearn.github.io/modules/generated/nilearn.datasets.fetch_icbm152_brain_gm_mask.html)).
    3. You can choose ROIs from an atlas with `atlas`=*{atlas_name}* and `rois`=*{list of roi}*. Please check the available atlases and corresponding ROIs at this [link](https://project-model-based-fmri.readthedocs.io/en/latest/roi_info.html#atlas-and-rois).
    4. You can provide nii files. The files in *{BIDS_ROOT}/masks/include* and *{BIDS_ROOT}/masks/exclude* are binarized with the given cut-off threshold, `mask_threshold`. The binarized maps in each directory are then merged (union). The final map can be written with set operations as "(map from 1-2-3) $\cap$ \[(*include* map) - (*exclude* map)\]." You can redirect the mask path (*masks*) with the `mask_path` argument.
    5. The resolution of the resulting mask can be reduced by zooming, e.g. `zoom`=(2,2,2) rescales 2 mm isotropic voxels to 4 mm isotropic voxels. Each element of `zoom` is the rescale factor for the corresponding dimension of the *xyz* coordinates.

- Each fMRI image is then resampled using the generated ROI mask, producing a flattened array with shape = (*time*, *feature #*).
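
The set logic of item 4 and the zooming of item 5 can be sketched with plain Python sets of voxel coordinates. This only illustrates the logic and is not MBfMRI's implementation, which operates on NIfTI images. All function names and the toy grid are hypothetical, and treating an empty *include* folder as "no restriction" is an assumption.

```python
from itertools import product

def union(masks):
    """Union of several binarized masks, each a set of (x, y, z) voxel coordinates."""
    merged = set()
    for mask in masks:
        merged |= mask
    return merged

def combine(base, include_masks, exclude_masks):
    """base AND (include MINUS exclude); no include masks means no restriction (assumption)."""
    include = union(include_masks) if include_masks else set(base)
    exclude = union(exclude_masks)
    return base & (include - exclude)

def zoom(mask, factors):
    """Coarsen a mask: a coarse voxel is kept if any fine voxel inside it is in the mask."""
    fx, fy, fz = factors
    return {(x // fx, y // fy, z // fz) for x, y, z in mask}

# Toy 4x4x4 "brain" mask standing in for the template/atlas mask (map from 1-2-3).
base = set(product(range(4), repeat=3))
include = [{(x, y, z) for x, y, z in base if x < 2}]   # keep one half of the grid
exclude = [{(x, y, z) for x, y, z in base if z == 0}]  # drop the bottom slice

roi = combine(base, include, exclude)
coarse = zoom(roi, (2, 2, 2))
print(len(roi), len(coarse))
```

The number of voxels left in the final mask is the *feature #* dimension of the array fed to the MVPA model.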

#### 3-2. Clean bold signal
- Additional methods for improving signal quality are provided, such as detrending and high-pass filtering. These methods are well established in the fMRI analysis literature; please refer to the [document](https://project-model-based-fmri.readthedocs.io/en/latest/mbfmri.preprocessing.html#mbfmri-preprocessing-bold-module) for details. 
- Unlike in GLM, confounds are controlled in this step by regressing them out. You can designate a list of confound names, each of which should match the corresponding column name in the confounds file (generated by the prior fMRI preprocessing).
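
Regressing out a confound amounts to fitting the confound to each voxel's time series and keeping the residual. The sketch below does this for a single confound with an intercept, using closed-form least squares. It is a simplification (in practice several confounds are fitted jointly), the function name `regress_out` is hypothetical, and the numbers are made up.

```python
def regress_out(signal, confound):
    """Residual of `signal` after removing a linear fit on `confound` (with intercept)."""
    n = len(signal)
    mean_s = sum(signal) / n
    mean_c = sum(confound) / n
    cov = sum((c - mean_c) * (s - mean_s) for c, s in zip(confound, signal))
    var = sum((c - mean_c) ** 2 for c in confound)
    beta = cov / var if var > 0 else 0.0
    # Subtracting the means removes the intercept, so the residual is mean-centered.
    return [(s - mean_s) - beta * (c - mean_c) for s, c in zip(signal, confound)]

# Made-up voxel time series that is partly driven by head motion ("trans_x").
trans_x = [0.0, 0.1, 0.3, 0.2, 0.5, 0.4]
neural = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
voxel = [100.0 + 20.0 * m + s for m, s in zip(trans_x, neural)]

cleaned = regress_out(voxel, trans_x)
```

After this step, `cleaned` is uncorrelated with `trans_x`, so the MVPA model cannot use that motion parameter to predict the target.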

### 4. fMRI Analysis - Choose which method and which model to use for fMRI analysis (GLM, MVPA-ElasticNet, MVPA-MLP, or MVPA-CNN) and decide some details <a name = "fMRIAnalysis"> </a>

#### 4-1. Type of analysis - either GLM or MVPA
- MVPA is used by default, but GLM can be applied as well.
- e.g. `analysis = 'mvpa'` or `analysis = 'glm'`

#### 4-2. If you chose MVPA, you should decide which MVPA model to use and the type of cross-validation.
- For MVPA, **ElasticNet** is used by default, but other models such as MLP and CNN are also provided by the MBfMRI package.
    - For fitting ElasticNet, MBfMRI depends on the [glmnet Python package](https://github.com/civisanalytics/python-glmnet). You will get plots of the lambda search and the coefficient values, as is conventional when using ElasticNet.
    - e.g. `mvpa_model = 'elasticnet'`, `mvpa_model = 'mlp'`, or `mvpa_model = 'cnn'`


- A cross-validation framework is employed in the MBfMRI package to ensure validity.
    - Two options are currently available: "N-fold" for N-fold cross-validation and "N-lnso" for leave-n-subjects-out. 
    - You will get a Pearson correlation plot generated from the cross-validation results. All the reports and results shown are aggregated over the folds (you can also find the raw result of each fold in the report folder).
    - e.g. `method = '5-fold'`
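
The two schemes differ in how the data are partitioned. The sketch below is a minimal illustration with hypothetical function names. It assumes that the units being split are subjects and that leave-n-subjects-out enumerates every combination of n subjects; the package may partition the data differently. A plain Pearson correlation is included as the score.

```python
import math
from itertools import combinations

def n_fold(items, n):
    """Split `items` into n folds; each fold serves once as the test set."""
    folds = [items[i::n] for i in range(n)]
    return [([x for x in items if x not in fold], fold) for fold in folds]

def leave_n_subjects_out(subjects, n):
    """Hold out every combination of n subjects once as the test set (assumption)."""
    return [([s for s in subjects if s not in held], list(held))
            for held in combinations(subjects, n)]

def pearson_r(y_true, y_pred):
    """Pearson correlation between target and predicted signals."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((a - mt) * (b - mp) for a, b in zip(y_true, y_pred))
    st = math.sqrt(sum((a - mt) ** 2 for a in y_true))
    sp = math.sqrt(sum((b - mp) ** 2 for b in y_pred))
    return cov / (st * sp)

subjects = ["01", "02", "03", "04", "05", "06", "07", "08"]
splits_fold = n_fold(subjects, 4)                # 4 splits, 2 test subjects each
splits_lnso = leave_n_subjects_out(subjects, 1)  # 8 splits, 1 test subject each
```

In each split, a model would be trained on the first list and scored on the second with `pearson_r`.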

### 5. Run the code! <a name = "Code"> </a>

Now you are ready to run the code!
We provide sample code for analyses with different target latent processes and different types of fMRI analysis, including GLM, MVPA-ElasticNet, MVPA-MLP, and MVPA-CNN.

### 6. Output <a name = "Output"> </a>

#### Brain activation map

The final output, and the purpose of MB-MVPA, is a brain activation pattern map attributed to the target latent process.<br>
It is obtained by interpreting the MVPA model. For ElasticNet, this means reading the coefficients of the linear model.<br>
You can find the nii image under the "brain_map" folder in the reports.

Now, let's get started and check the outputs!

### 1) MVPA - Elastic Net

#### 1-1) MVPA - Elastic Net / PE as target latent process

In [None]:
from mbfmri.core.engine import run_mbfmri

_ = run_mbfmri(
               ### To identify, load, and save data
               bids_layout='mini_bornstein2017',    # data path - the root for entire BIDS layout 
               task_name='multiarmedbandit',        # identifier for task (BIDS)
               subjects='all',                      # default: 'all' - load all subjects in the layout.
                                                      # could be a list of subject IDs (string) (e.g., ['01', '02'])
               sessions = 'all',                    # default: 'all', could be a list of sessions. (e.g. ['01','02'])
               
               feature_name='zoom2rgrout',          # Name for indicating preprocessed feature (default: "unnamed") - to distinguish voxel feature data generated from different configurations.
                                                      # (e.g. the preprocessed file will be saved as: "sub-01_task-learn_desc-zoom2rgrout_voxelfeature.npy" ) 
                                                      # Redundant preprocessing step could be skipped and avoided when the files with the same feature name already exists.
                                                      # But it could be overridden by setting "overwrite = True"
    

               confounds=["trans_x", "trans_y",     # list of confounds (including motion regressors)
                          "trans_z", "rot_x",
                          "rot_y", "rot_z"],    
    
               ### To run computational modeling (use hBayesDM) and make latent process signal
               dm_model= 'banditNarm_lapse_decay',  # computational model
               process_name='PEchosen',             # identifier for target latent process
               refit_compmodel=True,                # indicate if refitting comp. model is required
               n_core=4,                            # number of core for multi-processing in hBayesDM    

    
               ### For fMRI analysis
               analysis='mvpa',                     # name of analysis ('mvpa' or 'glm', default: 'mvpa')
               mvpa_model='elasticnet',             # (ONLY for MVPA) which kind of MVPA model will be used ('elasticnet', 'mlp', or 'cnn')
               method='5-fold',                     # (ONLY for MVPA) type of cross-validation

               n_thread=4,                          # number of thread for multi-threading in generating voxel features
               
    
               ### others
               overwrite=True,                      # whether to regenerate and overwrite voxel features and latent process signals (not related to refitting hBayesDM)

              )

#### 1-2) MVPA - Elastic Net / Q value as target latent process

In [None]:
from mbfmri.core.engine import run_mbfmri

_ = run_mbfmri(
               # To identify, load, and save data
               bids_layout='mini_bornstein2017',    # data path - the root for entire BIDS layout 
               task_name='multiarmedbandit',        # identifier for task (BIDS)
               subjects='all',                      # default: 'all' - load all subjects in the layout.
                                                      # could be a list of subject IDs (string) (e.g., ['01', '02'])
               sessions = 'all',                    # default: 'all', could be a list of sessions. (e.g. ['01','02'])
               feature_name='zoom2rgrout',          # Name for indicating preprocessed feature (default: "unnamed")

    
               # To run computational modeling (use hBayesDM) and make latent process signal
               dm_model= 'banditNarm_lapse_decay',  # computational model
               process_name='Qchosen',              # identifier for target latent process
               refit_compmodel=True,                # indicate if refitting comp. model is required
               n_core=4,                            # number of core for multi-processing in hBayesDM    

    
               # For fMRI analysis
               analysis='mvpa',                     # name of analysis ('mvpa' or 'glm', default: 'mvpa')
               mvpa_model='elasticnet',             # (ONLY for MVPA) which kind of MVPA model will be used ('elasticnet', 'mlp', or 'cnn')
               method='5-fold',                     # (ONLY for MVPA) type of cross-validation
               confounds=["trans_x", "trans_y",     # list of confounds to regress out (including motion regressors)
                          "trans_z", "rot_x",
                          "rot_y", "rot_z"],    
               n_thread=4,                          # number of thread for multi-threading in generating voxel features
               
    
               # others
               overwrite=True,                      # indicate if re-run and overwriting are required 
              )

#### 1-3) MVPA - Elastic Net / Model selection

In [None]:
from mbfmri.core.engine import run_mbfmri
import hbayesdm

_ = run_mbfmri(
               # To identify, load, and save data
               bids_layout='mini_bornstein2017',    # data path - the root for entire BIDS layout 
               task_name='multiarmedbandit',        # identifier for task (BIDS)
               subjects='all',                      # default: 'all' - load all subjects in the layout.
                                                      # could be a list of subject IDs (string) (e.g., ['01', '02'])
               sessions = 'all',                    # default: 'all', could be a list of sessions. (e.g. ['01','02'])
               feature_name='zoom2rgrout',          # Name for indicating preprocessed feature (default: "unnamed")

    
               # To run computational modeling (use hBayesDM) and make latent process signal
               dm_model= ['banditNarm_lapse_decay', # computational model candidates
                          #'banditNarm_delta',
                          'banditNarm_2par_lapse',
                          'banditNarm_4par',
                          'banditNarm_lapse',
                          'banditNarm_singleA_lapse'],
               process_name='Qchosen',              # identifier for target latent process
               refit_compmodel=True,                # indicate if refitting comp. model is required
               n_core=4,                            # number of core for multi-processing in hBayesDM    

    
               # For fMRI analysis
               analysis='mvpa',                     # name of analysis ('mvpa' or 'glm', default: 'mvpa')
               mvpa_model='elasticnet',             # (ONLY for MVPA) which kind of MVPA model will be used ('elasticnet', 'mlp', or 'cnn')
               method='5-fold',                     # (ONLY for MVPA) type of cross-validation
               confounds=["trans_x", "trans_y",     # list of confounds to regress out (including motion regressors)
                          "trans_z", "rot_x",
                          "rot_y", "rot_z"],    
               n_thread=4,                          # number of thread for multi-threading in generating voxel features
               
    
               # others
               overwrite=True,                      # indicate if re-run and overwriting are required 
              )

#### 1-4) MVPA - Elastic Net / Use precalculated latent process (not using hBayesDM)

In [None]:
from mbfmri.core.engine import run_mbfmri

_ = run_mbfmri(
               # To identify, load, and save data
               bids_layout='mini_bornstein2017',    # data path - the root for entire BIDS layout 
               task_name='multiarmedbandit',        # identifier for task (BIDS)
               subjects='all',                      # default: 'all' - load all subjects in the layout.
                                                      # could be a list of subject IDs (string) (e.g., ['01', '02'])
               sessions = 'all',                    # default: 'all', could be a list of sessions. (e.g. ['01','02'])
               feature_name='zoom2rgrout',          # Name for indicating preprocessed feature (default: "unnamed")

    
               # To run computational modeling (use hBayesDM) and make latent process signal
               skip_compmodel=True,
               process_name='PrecalculatedPEchosen',# identifier for target latent process
               n_core=4,                            # number of core for multi-processing in hBayesDM    

    
               # For fMRI analysis
               analysis='mvpa',                     # name of analysis ('mvpa' or 'glm', default: 'mvpa')
               mvpa_model='elasticnet',             # (ONLY for MVPA) which kind of MVPA model will be used ('elasticnet', 'mlp', or 'cnn')
               method='5-fold',                     # (ONLY for MVPA) type of cross-validation
               confounds=["trans_x", "trans_y",     # list of confounds to regress out (including motion regressors)
                          "trans_z", "rot_x",
                          "rot_y", "rot_z"],    
               n_thread=4,                          # number of thread for multi-threading in generating voxel features
              )


### 2) MVPA - MLP

In [None]:
_ = run_mbfmri(
               # To identify, load, and save data
               bids_layout='mini_bornstein2017',    # data path - the root for entire BIDS layout 
               task_name='multiarmedbandit',        # identifier for task (BIDS)
               subjects='all',                      # default: 'all' - load all subjects in the layout.
                                                      # could be a list of subject IDs (string) (e.g., ['01', '02'])
               sessions = 'all',                    # default: 'all', could be a list of sessions. (e.g. ['01','02'])
               feature_name='zoom2rgrout',          # Name for indicating preprocessed feature (default: "unnamed")

    
               # To run computational modeling (use hBayesDM) and make latent process signal
               dm_model= 'banditNarm_lapse_decay',  # computational model
               process_name='Qchosen',             # identifier for target latent process
               refit_compmodel=True,                # indicate if refitting comp. model is required
               n_core=4,                            # number of core for multi-processing in hBayesDM    

    
               # For fMRI analysis
               analysis='mvpa',                     # name of analysis ('mvpa' or 'glm', default: 'mvpa')
               mvpa_model='mlp',                    # (ONLY for MVPA) which kind of MVPA model will be used ('elasticnet', 'mlp', or 'cnn')
               method='5-fold',                     # (ONLY for MVPA) type of cross-validation
               confounds=["trans_x", "trans_y",     # list of confounds to regress out (including motion regressors)
                          "trans_z", "rot_x",
                          "rot_y", "rot_z"],    
               n_thread=4,                          # number of thread for multi-threading in generating voxel features
              )

### 3) MVPA - CNN

In [None]:
_ = run_mbfmri(
               # To identify, load, and save data
               bids_layout='mini_bornstein2017',    # data path - the root for entire BIDS layout 
               task_name='multiarmedbandit',        # identifier for task (BIDS)
               subjects='all',                      # default: 'all' - load all subjects in the layout.
                                                      # could be a list of subject IDs (string) (e.g., ['01', '02'])
               sessions = 'all',                    # default: 'all', could be a list of sessions. (e.g. ['01','02'])
               feature_name='zoom2rgrout',          # Name for indicating preprocessed feature (default: "unnamed")


    
               # To run computational modeling (use hBayesDM) and make latent process signal
               dm_model= 'banditNarm_lapse_decay',  # computational model
               process_name='Qchosen',             # identifier for target latent process
               refit_compmodel=True,                # indicate if refitting comp. model is required
               n_core=4,                            # number of core for multi-processing in hBayesDM    

    
               # For fMRI analysis
               analysis='mvpa',                     # name of analysis ('mvpa' or 'glm', default: 'mvpa')
               mvpa_model='cnn',                    # (ONLY for MVPA) which kind of MVPA model will be used ('elasticnet', 'mlp', or 'cnn')
               method='5-fold',                     # (ONLY for MVPA) type of cross-validation
               confounds=["trans_x", "trans_y",     # list of confounds to regress out (including motion regressors)
                          "trans_z", "rot_x",
                          "rot_y", "rot_z"],    
               n_thread=4,                          # number of thread for multi-threading in generating voxel features
              )

# Tutorial: model-based GLM using MBfMRI <a name = "MBGLM"> </a>

The package also provides a GLM approach, model-based GLM, which follows the same procedure as the conventional approach.<br>
The following is a tutorial for running model-based GLM with MBfMRI.

In [None]:
_ = run_mbfmri(
               # To identify, load, and save data
               bids_layout='mini_bornstein2017',    # data path - the root for entire BIDS layout 
               task_name='multiarmedbandit',        # identifier for task (BIDS)
               subjects='all',                      # default: 'all' - load all subjects in the layout.
                                                      # could be a list of subject IDs (string) (e.g., ['01', '02'])
               sessions = 'all',                    # default: 'all', could be a list of sessions. (e.g. ['01','02'])
               
               # To run computational modeling (use hBayesDM) and make latent process signal
               dm_model= 'banditNarm_lapse_decay',  # computational model
               process_name='Qchosen',             # identifier for target latent process
               refit_compmodel=True,                # indicate if refitting comp. model is required
               n_core=4,                            # number of core for multi-processing in hBayesDM    

    
               # For fMRI analysis
               analysis='glm',                      # name of analysis ('mvpa' or 'glm', default: 'mvpa')
               confounds=["trans_x", "trans_y",     # list of confounds to include in the design matrix (including motion regressors)
                          "trans_z", "rot_x",
                          "rot_y", "rot_z"],    
               n_thread=4,                          # number of threads for multi-threading in generating voxel features & first-level GLM
              )