# 2a. Preprocess functional data

#### Note: This notebook is meant for following along and for future reference. Many of you will not have AFNI installed, and the input files are very large and slow to download.

This script creates and executes AFNI scripts to preprocess time series data
for a list of subjects. It takes as input:

1) A list of subject IDs (here, *fmri_subjects.json*)

2) NIfTI or AFNI files for each functional run for each subject

3) Each subject's high-resolution anatomical scan (*subjectid_SurfVol_Alnd_Exp+orig*)

4) Each subject's anatomical segmentation file (*aseg_Alnd_Exp+orig*), which was created with FreeSurfer (*recon-all*), converted to AFNI format (with *@SUMA_Make_Spec_FS*), and aligned to the same high-res scan to which the functional data will be aligned. In preprocessing, the *aseg* file is used to obtain anatomical masks of the ventricles and white matter, from which global and local nuisance regressors are generated.
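
Because a missing input only surfaces once AFNI fails partway through, it can help to check the anatomical files up front. The sketch below assumes the datasets are stored in AFNI's usual `.HEAD`/`.BRIK` pair and tests only for the header file; the helper names `afni_dataset_exists` and `missing_inputs` are hypothetical and not part of AFNI.

```python
import os

def afni_dataset_exists(dset):
    """True if the header file of an AFNI dataset such as 'name+orig' exists."""
    return os.path.isfile(dset + '.HEAD')

def missing_inputs(subj_dir, subject):
    """List the anatomical inputs that cannot be found for one subject."""
    wanted = [
        os.path.join(subj_dir, '{}_SurfVol_Alnd_Exp+orig'.format(subject)),
        os.path.join(subj_dir, 'aseg_Alnd_Exp+orig'),
    ]
    return [d for d in wanted if not afni_dataset_exists(d)]
```

An empty list from `missing_inputs` means both anatomical datasets were found for that subject.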


The AFNI documentation for *afni_restproc.py* describes each step and option in detail. Briefly, the preprocessing steps are:

- Despiking (removal of transient, extreme fluctuations in signal)
- Alignment to the participant's high-res anatomical file (which is already in alignment with the participant's FreeSurfer-derived tissue segmentation and cortical parcellation files)
- Scaling values within voxels across time
- Extracting, resampling and eroding white matter and ventricle masks (from the participant's FreeSurfer segmentation file)
- Smoothing data separately in the grey and non-grey matter masks
- Extracting and detrending (using the polynomial order set with *-polort*) local and global nuisance regressor time series from the WM and ventricle masks, respectively
- Removal of regressors of non-interest (ventricle signal, local WM signal, motion parameters, and their derivatives), as well as a third-order polynomial, from the EPI time series
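
As a rough illustration of the last two steps, the sketch below removes a third-order polynomial trend and one nuisance regressor from a synthetic voxel time series by ordinary least squares. It is not AFNI's implementation: the data are simulated, the function name `regress_out` is hypothetical, and the choice of a Legendre polynomial basis is an assumption made for illustration.

```python
import numpy as np

def regress_out(ts, nuisance, polort=3):
    """Return the residual of `ts` after least-squares removal of a
    polynomial trend (orders 0..polort) and the nuisance regressors."""
    n = ts.shape[0]
    # polynomial basis evaluated on [-1, 1]; columns are orders 0..polort
    x = np.linspace(-1.0, 1.0, n)
    poly = np.polynomial.legendre.legvander(x, polort)
    # design matrix: trend terms followed by the nuisance regressors
    design = np.column_stack([poly, nuisance])
    beta, *_ = np.linalg.lstsq(design, ts, rcond=None)
    return ts - design @ beta

rng = np.random.default_rng(0)
n_vols = 200
ventricle = rng.standard_normal(n_vols)   # simulated nuisance signal
signal = rng.standard_normal(n_vols)      # simulated signal of interest
drift = 0.05 * np.arange(n_vols)          # slow linear drift
voxel = signal + 2.0 * ventricle + drift

cleaned = regress_out(voxel, ventricle[:, None], polort=3)
```

By construction, `cleaned` is orthogonal to every column of the design matrix, so it has zero mean and no remaining linear relationship with the nuisance regressor.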



In [2]:
import os
import subprocess
import json
import glob

In [3]:
def sh(c):
    '''
    run shell commands
    '''
    subprocess.call(c, shell = True)
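
Note that `subprocess.call` returns the command's exit status, which `sh` discards, so a failed AFNI command would go unnoticed. A minimal sketch of a stricter variant, assuming you would rather stop at the first failure (the name `sh_checked` is hypothetical):

```python
import subprocess

def sh_checked(c):
    '''
    run a shell command and raise CalledProcessError if it exits non-zero
    '''
    return subprocess.run(c, shell=True, check=True)
```

Swapping `sh(preproc)` for `sh_checked(preproc)` would halt the loop on the first run that fails instead of continuing to the next one.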

In [4]:
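# use an absolute path: the preprocessing loop calls os.chdir(), which
# would otherwise invalidate every path built relative to "./data"
study_path = os.path.abspath("./data")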

# load in list of fMRI subjects
with open("{}/fmri/fmri_subjects.json".format(study_path)) as data_file:
    subj_list = json.load(data_file)
    
nruns = 6

In [18]:
# preprocess each run for each subject
for subject in subj_list:
    subj_dir = "{}/fmri/example_subj/{}".format(study_path, subject)
    os.chdir(subj_dir)
    
    # create dictionary of raw EPI files, where keys are prefixes to use
    # when naming and storing pre-processed data, and values are the names
    # of the raw data files
    my_epis = {}
    for epi_num in range(1, nruns+1):    
        my_epis['epi{}'.format(epi_num)] = glob.glob('{}/{}_fMRI_{}_*'.format(subj_dir, subject, epi_num))[0]
        
    # get relevant anatomical files (high-res anatomical and 
    # Freesurfer-derived tissue segmentation file to be used 
    # to generate nuisance regressors)
    hires = "{}/{}_SurfVol_Alnd_Exp+orig".format(subj_dir, subject)
    aseg = "{}/aseg_Alnd_Exp+orig".format(subj_dir)

    # create and run preprocessing command for each epi run 
    for epi_prefix, raw_epi in my_epis.items():
        preproc = ("""afni_restproc.py -anat {hires} \
              -epi {raw_epi} \
              -prefix {epi_prefix}_preprocessed \
              -script {epi_prefix}_rest_proc_script.tcsh \
              -epi2anat \
              -uniformize \
              -anat_has_skull yes \
              -aseg {aseg} \
              -dreg \
              -trcut 2  \
              -localnorm \
              -smoothfirst \
              -smoothrad 4 \
              -polort 3 \
              -despike on""").format(hires = hires, raw_epi = raw_epi, epi_prefix = epi_prefix, aseg = aseg)
    
        sh(preproc)  
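
The `glob.glob(...)[0]` lookup above raises a bare `IndexError` when a run's file is missing, and the order in which `glob` returns multiple matches is not guaranteed. A sketch of a more explicit lookup, assuming the same `{subject}_fMRI_{run}_*` naming pattern (the helper name `find_run_file` is hypothetical):

```python
import glob
import os

def find_run_file(subj_dir, subject, epi_num):
    """Return the alphabetically first file matching the run's naming
    pattern, or raise FileNotFoundError if nothing matches."""
    pattern = os.path.join(subj_dir, '{}_fMRI_{}_*'.format(subject, epi_num))
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError('no EPI file matches {}'.format(pattern))
    return matches[0]
```

Inside the loop, `my_epis['epi{}'.format(epi_num)] = find_run_file(subj_dir, subject, epi_num)` would then fail with a message naming the pattern that found nothing.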
        