# Tutorial

This tutorial walks through MB-MVPA using task-fMRI data from the mixed-gamble task of Tom et al. (2007).



### Import the MB-MVPA library

You do not need to import other libraries (nilearn, Keras, etc.) yourself,<br>
because MB-MVPA wraps them.<br>
You also don't need to be familiar with fMRI libraries such as nilearn or machine learning libraries such as TensorFlow.<br>
<b>MB-MVPA is all you need.</b>

Most of MB-MVPA wraps nilearn, TensorFlow, Keras, and other libraries, so warnings may be raised by those libraries.<br>
This page does not show them because most can be safely ignored.<br>
You do not need to suppress the warnings in your own work.
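
If the warnings from the wrapped libraries clutter your own notebook, one common way to silence them is Python's standard `warnings` module. This is a minimal sketch, and only an assumption about how a page like this one could be kept clean; MB-MVPA does not require it. The `noisy_call` function is hypothetical and merely stands in for a library call that warns.

```python
import warnings

def noisy_call():
    # Hypothetical stand-in for a wrapped library call that emits a warning.
    warnings.warn("this API is deprecated", DeprecationWarning)
    return 42

# Suppress warnings only inside this block; the previous filters are
# restored automatically when the block exits.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    result = noisy_call()

print(result)
```

Scoping the filter with `catch_warnings()` keeps warnings visible elsewhere in the session, which is usually safer than disabling them globally.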

In [1]:
from time import perf_counter

In [2]:
from mbmvpa.preprocessing.preprocess import DataPreprocessor



TODO: add original data download link

The example data are downloaded from AWS S3, about <b>1GB</b> (they will be placed under "Mixed-gamble_task/example_data/").<br>

We provide a small subset (2 subjects) of the original dataset from Tom et al. (16 subjects). The fMRI images in the example were preprocessed with a conventional fMRI preprocessing pipeline, 
[*fmriprep*](https://fmriprep.org/en/stable/) v.20.1.0. Please refer to the [original dataset](https://openneuro.org/datasets/ds000005/versions/00001) for more information.

In [3]:
#root = load_example_data("tom")
root = "/data2/project_modelbasedMVPA/ds000005"

### Preprocessing fMRI images and behavioral data

MB-MVPA requires preprocessed task-fMRI data formatted in the conventional [BIDS format](https://bids-specification.readthedocs.io/en/stable/).

It expects files organized as follows. All naming conventions used here conform to the outputs of *fmriprep* v.20.1.0 by the Poldrack lab.

The fMRI images are usually located here<br>
<i>{BIDS_ROOT}/derivatives/fmriprep/subject/session/run/func/*nii.gz</i><br>
The behavioral data are located here<br>
<i>{BIDS_ROOT}/subject/session/run/func/*.tsv</i>
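
To check that a dataset roughly follows this layout before running MB-MVPA, the two patterns above can be searched with the standard library. The sketch below is only an illustration: the helper name `find_bids_files` and the file names in the fake dataset are hypothetical, and the recursive glob patterns loosely mirror the paths above rather than the exact rules MB-MVPA applies.

```python
from pathlib import Path
import tempfile

def find_bids_files(bids_root):
    """Return (fmri_images, behavior_files) found under a BIDS root.

    Hypothetical helper: the patterns loosely mirror the layout described
    above and are not MB-MVPA's own file-matching rules.
    """
    root = Path(bids_root)
    # Preprocessed images are expected under derivatives/fmriprep/.../func/
    images = sorted((root / "derivatives" / "fmriprep").glob("**/func/*nii.gz"))
    # Behavioral data are expected under the raw subject folders .../func/
    behavior = sorted(
        p for p in root.glob("**/func/*.tsv")
        if "derivatives" not in p.relative_to(root).parts
    )
    return images, behavior

# Build a tiny fake dataset so the example runs without real data.
with tempfile.TemporaryDirectory() as tmp:
    root_demo = Path(tmp)
    img_dir = root_demo / "derivatives" / "fmriprep" / "sub-01" / "func"
    beh_dir = root_demo / "sub-01" / "func"
    img_dir.mkdir(parents=True)
    beh_dir.mkdir(parents=True)
    (img_dir / "sub-01_task-demo_bold.nii.gz").touch()    # made-up file name
    (beh_dir / "sub-01_task-demo_events.tsv").touch()     # made-up file name

    images, behavior = find_bids_files(root_demo)
    n_images, n_behavior = len(images), len(behavior)

print(n_images, n_behavior)
```

If either count is zero for your own dataset, the folder structure probably differs from what is described above.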

In [5]:
s = perf_counter()

dm_model = 'ra_prospect'

def example_adjust(row):
    # Recode the row's fields under the names required by hbayesdm.ra_prospect
    row["gamble"] = 1 if row["respcat"] == 1 else 0
    row["cert"] = 0
    return row

def example_filter(row):
    # include all trial data
    return True

def example_latent(row, param_dict):
    # Calculate the subjective utility of choosing the gamble over the safe option,
    # using prospect theory with loss aversion (lambda) and risk aversion (rho)
    modulation = (row["gain"] ** param_dict["rho"]) - (param_dict["lambda"] * (row["loss"] ** param_dict["rho"]))
    row["modulation"] = modulation
    return row


preprocessor = DataPreprocessor(bids_layout=root,
                               adjust_function=example_adjust,
                               filter_function=example_filter,
                               latent_function=example_latent,
                               dm_model=dm_model,
                               confounds=[],
                               zoom=(2,2,2))
print(f"elapsed time: {(perf_counter()-s) / 60:.2f} minutes")

elapsed time: 0.49 minutes
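
As a quick sanity check of the latent function above, the following sketch evaluates the same formula for a single trial. The parameter values (`rho`, `lam`) and the gain/loss amounts are made up for illustration; in the tutorial they come from the fitted `ra_prospect` model and the behavioral data. The function name `subjective_utility` is likewise hypothetical.

```python
def subjective_utility(gain, loss, rho, lam):
    """Utility of accepting the gamble: gain**rho - lam * loss**rho.

    Same formula as example_latent; `lam` plays the role of
    param_dict["lambda"] (renamed because `lambda` is a Python keyword).
    """
    return gain ** rho - lam * (loss ** rho)

# Hypothetical trial and parameter values, chosen only for illustration.
gain, loss = 20.0, 10.0
rho, lam = 1.0, 2.0  # rho = 1: linear value; lam = 2: losses weigh twice as much

utility = subjective_utility(gain, loss, rho, lam)
print(utility)
```

With these values, a subject who weighs losses twice as heavily as gains is exactly indifferent to a gamble offering 20 against a possible loss of 10.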


In [6]:
preprocessor.summary()

[  fMRIPrep  ] BIDS Layout: .../ds000005/derivatives/fmriprep | Subjects: 16 | Sessions: 0 | Runs: 48
[  MB-MVPA   ] BIDS Layout: ...PA/ds000005/derivatives/mbmvpa | Subjects: 16 | Sessions: 0 | Runs: 48


In [None]:
s = perf_counter()

preprocessor.preprocess(overwrite=True,n_core=16)
#preprocessor.preprocess(overwrite=False,n_core=16)

print(f"elapsed time: {(perf_counter()-s) / 60:.2f} minutes")

12it [04:51, 24.33s/it]


Using cached StanModel: cached-ra_prospect-pystan_2.19.1.1.pkl

Model  = ra_prospect
Data   = <pandas.DataFrame object>

Details:
 # of chains                    = 4
 # of cores used                = 4
 # of MCMC samples (per chain)  = 4000
 # of burn-in samples           = 1000
 # of subjects                  = 16
 # of (max) trials per subject  = 256

Using cached StanModel: cached-ra_prospect-pystan_2.19.1.1.pkl




In [None]:
preprocessor.summary()

### Load data and shape check

In [None]:
from mbmvpa.data.loader import BIDSDataLoader

In [None]:
s = perf_counter()

loader = BIDSDataLoader(layout=root)
X,y = loader.get_total_data()

print(f"elapsed time: {(perf_counter()-s) / 60:.2f} minutes")

In [None]:
print("X", X.shape)
print("y", y.shape)

In [None]:
voxel_mask = loader.get_voxel_mask()
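
The sketch below illustrates the role a voxel mask typically plays in this kind of analysis. It rests on an assumption about the data layout, not on MB-MVPA's documented behavior: it supposes a 3D boolean mask whose `True` voxels correspond, in order, to the columns of a 2D feature matrix. All arrays are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 3D mask: True marks voxels kept as features (assumed layout).
mask = rng.random((4, 5, 6)) > 0.5
n_voxels = int(mask.sum())

# Synthetic per-voxel values, e.g. one row of X or a coefficient vector.
values = rng.normal(size=n_voxels)

# Scatter the flat vector back into the 3D volume; masked-out voxels stay 0.
volume = np.zeros(mask.shape)
volume[mask] = values

print(volume.shape, n_voxels)
```

Under this assumption, the same indexing would let fitted coefficients be placed back into brain space for visualization.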

### Fitting MVPA models & Results

In [None]:
from mbmvpa.models.mvpa_elasticnet import elasticnet

In [None]:
coef,intercept = elasticnet(X=X,
                  y=y,
                  voxel_mask=voxel_mask,
                  save_path='report_elasticnet',
                  sigma=0)

# survival coefs # report

In [None]:
coef.shape

In [None]:
X.shape

In [None]:
coef.shape, intercept.shape

In [None]:
import numpy as np
from scipy import stats

In [None]:
pred = np.matmul(coef,X.T) + intercept

In [None]:
stats.pearsonr(pred.flatten(),y.flatten())
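
The prediction and correlation steps above depend on the array shapes lining up. The sketch below reproduces them on synthetic data, assuming a feature matrix of shape `(n_samples, n_voxels)` and a coefficient vector of shape `(n_voxels,)`; the shapes actually returned by `elasticnet` are not assumed here. Names ending in `_demo` are hypothetical and chosen to avoid overwriting the tutorial's variables.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_voxels = 200, 50

# Synthetic data: the target depends linearly on the features plus noise.
X_demo = rng.normal(size=(n_samples, n_voxels))
true_coef = rng.normal(size=n_voxels)
y_demo = X_demo @ true_coef + 0.1 * rng.normal(size=n_samples)

# Stand-in for fitted weights; here we simply reuse the true ones.
coef_demo, intercept_demo = true_coef, 0.0

# Same expression as in the tutorial: (n_voxels,) @ (n_voxels, n_samples)
pred_demo = np.matmul(coef_demo, X_demo.T) + intercept_demo

# Pearson r; this equals the first value returned by scipy.stats.pearsonr.
r_demo = np.corrcoef(pred_demo.flatten(), y_demo.flatten())[0, 1]
print(pred_demo.shape, round(r_demo, 3))
```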

In [None]:
from sklearn.model_selection import train_test_split

In [None]:
import tqdm

In [None]:
r_test = []
r_train = []
for i in tqdm.tqdm(range(30)):
    ids = np.arange(X.shape[0])

    train_ids, test_ids = train_test_split(
                ids, test_size=0.2, random_state=42+i
            )



    coef,intercept = elasticnet(X=X[train_ids],
                      y=y[train_ids],
                      voxel_mask=voxel_mask,
                      save_path='report_elasticnet',
                      save=False,
                      verbose=0)

    r_train.append(stats.pearsonr((np.matmul(coef,X[train_ids].T) + intercept).flatten(),y[train_ids].flatten())[0])
    r_test.append(stats.pearsonr((np.matmul(coef,X[test_ids].T) + intercept).flatten(),y[test_ids].flatten())[0])
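
The loop above is a repeated random hold-out evaluation. The sketch below shows the same pattern on synthetic data, using scikit-learn's `ElasticNet` as a stand-in for MB-MVPA's `elasticnet` function, whose internals are not reproduced here. The data, `alpha`, `l1_ratio`, and the number of repetitions are arbitrary choices for illustration, and the `_demo` names and the `pearson_r` helper are hypothetical.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic regression problem with only 5 informative features out of 40.
X_demo = rng.normal(size=(300, 40))
w_demo = np.zeros(40)
w_demo[:5] = 1.0
y_demo = X_demo @ w_demo + 0.5 * rng.normal(size=300)

def pearson_r(a, b):
    """Pearson correlation between two 1D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

r_train_demo, r_test_demo = [], []
for i in range(5):
    ids = np.arange(X_demo.shape[0])
    # A different seed per repetition gives a different 80/20 split each time.
    train_ids, test_ids = train_test_split(ids, test_size=0.2, random_state=42 + i)

    model = ElasticNet(alpha=0.1, l1_ratio=0.5)
    model.fit(X_demo[train_ids], y_demo[train_ids])

    r_train_demo.append(pearson_r(model.predict(X_demo[train_ids]), y_demo[train_ids]))
    r_test_demo.append(pearson_r(model.predict(X_demo[test_ids]), y_demo[test_ids]))

print(np.mean(r_train_demo), np.mean(r_test_demo))
```

A large gap between the train and test correlations would suggest overfitting; similar values suggest the model generalizes to held-out samples.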

In [None]:
import matplotlib.pyplot as plt

In [None]:
plt.boxplot([r_train, r_test], labels=['train','test'], widths=0.6)
plt.show()

In [None]:
print(r_train)

In [None]:
print(r_test)