# FMRI Preprocessing

# Quality Control with MRIQC

We use the Poldrack Lab's <a href="https://mriqc.readthedocs.io/en/0.10.3/">mriqc v0.10.3</a> to quality control all of the scans. MRIQC automatically extracts QC metrics and generates standardized per-subject reports as well as aggregate group reports. We use the group reports to first screen for outlier scans and then use the participant-level reports to decide whether those outliers should be excluded.

## Generate Reports

This cell first generates individual reports and QC measures for each participant. It then generates the aggregate group reports used to identify outlier candidates for exclusion, as documented below.

Before running, change the user ID and group ID to match the desired user and group (-u userid:groupid). Also set the FD threshold (fd_thres) used to generate motion statistics for the BOLD data, and set the --n_procs argument according to your available computational resources.
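The user and group IDs do not have to be looked up by hand. Below is a minimal sketch, assuming the notebook runs on a POSIX system as the user who should own the outputs. `docker_user_args` is a hypothetical helper for illustration and is not part of this notebook's utilities.

```python
import os

def docker_user_args(uid=None, gid=None):
    """Build the '-u uid:gid' arguments for 'docker run'.

    Falls back to the current process's user and primary group
    (POSIX only) when no IDs are given.
    """
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return ['-u', '%s:%s' % (uid, gid)]

# explicit IDs give the same '-u' value that is hard-coded in this notebook
user_args = docker_user_args(3950117, 1047)
print(user_args)
```

The returned list can be spliced into a docker command list in place of the hard-coded `'-u', '%s:%s' % (user_id, group_id)` pair.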

Warning: This cell takes several hours to run with 10 processes allocated.

In [7]:
from subprocess import call
import os

# bids path (absolute path needed for docker) 
data_path = '/autofs/space/cassia_001/users/matt/msit/data'

# mriqc output path (absolute path needed for docker) 
# this must be created ahead of time or it will be created with root permissions
mriqc_path = '%s/derivatives/mriqc' % data_path
if not os.path.exists(mriqc_path):
    os.makedirs(mriqc_path)

# change these if necessary
user_id = '3950117'
group_id = '1047'
fd_thres = '0.9'

docker_command = ['docker',
                  'run',
                  '--rm',
                  '-u', '%s:%s' % (user_id, group_id),
                  '-v', '%s:/data:ro' % data_path,
                  '-v', '%s:/out' % mriqc_path,
                  '-v', '%s:/work' % mriqc_path,
                  'poldracklab/mriqc:0.10.3',
                  '/data', '/out',
                  'participant',
                  '-w', '/work',
                  '--no-sub',
                  '--verbose-reports',
                  '--write-graph',
                  '--ica',
                  '--n_procs=10',
                  '--fft-spikes-detector',
                  '--fd_thres', fd_thres]

print(' '.join(docker_command))
call(docker_command)

# generate group reports
docker_command = ['docker',
                  'run',
                  '--rm',
                  '-u', '%s:%s' % (user_id, group_id),
                  '-v', '%s:/data:ro' % data_path,
                  '-v', '%s:/out' % mriqc_path,
                  '-v', '%s:/work' % mriqc_path,
                  'poldracklab/mriqc:0.10.3',
                  '/data', '/out',
                  'group',
                  '-w', '/work']

print(' '.join(docker_command))
call(docker_command)

docker run --rm -u 3950117:1047 -v /autofs/space/cassia_001/users/matt/msit/data:/data:ro -v /autofs/space/cassia_001/users/matt/msit/data/derivatives/mriqc:/out -v /autofs/space/cassia_001/users/matt/msit/data/derivatives/mriqc:/work poldracklab/mriqc:0.10.3 /data /out participant -w /work --no-sub --verbose-reports --write-graph --ica --n_procs=10 --fft-spikes-detector --fd_thres 0.9
docker run --rm -u 3950117:1047 -v /autofs/space/cassia_001/users/matt/msit/data:/data:ro -v /autofs/space/cassia_001/users/matt/msit/data/derivatives/mriqc:/out -v /autofs/space/cassia_001/users/matt/msit/data/derivatives/mriqc:/work poldracklab/mriqc:0.10.3 /data /out group -w /work


0

## Observations & Subject Exclusions

We use the group reports for the T1 and BOLD scans to determine exclusions:

- <a href="../data/derivatives/mriqc/reports/T1w_group.html"> T1 Group Report</a>
- <a href="../data/derivatives/mriqc/reports/bold_group.html"> BOLD Group Report</a>
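MRIQC's group reports show the distribution of each metric across scans. The screening idea can also be sketched numerically with the usual boxplot rule, which flags values more than 1.5 interquartile ranges outside the quartiles. This is an illustration only: `flag_outliers`, the subject labels, and the metric values are made up, and this code was not used to make the exclusions listed in this notebook.

```python
import statistics

def flag_outliers(metric_by_scan, k=1.5):
    """Return scans whose value lies more than k IQRs outside the quartiles."""
    values = sorted(metric_by_scan.values())
    q1, _, q3 = statistics.quantiles(values, n=4, method='inclusive')
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sorted(s for s, v in metric_by_scan.items() if v < low or v > high)

# made-up average FD values (mm), for illustration only
fd_mean = {'sub-a': 0.10, 'sub-b': 0.12, 'sub-c': 0.11,
           'sub-d': 0.13, 'sub-e': 0.90}
flagged = flag_outliers(fd_mean)
print(flagged)
```

A flagged scan is only a candidate: as in our procedure, the participant-level report still decides whether it is excluded.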

T1 Observations:
* Pre-scan normalize was off for the T1 on the Bay 4 Prisma.
* The newer Prisma has much better noise quality than the older Trio scanners as measured by EFC (a measure of ghosting/blurring) and FBER (energy within the brain relative to the background). This is hard to see by eye, however, because pre-scan normalize was off for the Prisma scans. The difference also appears in any measure that is normalized by background noise (such as SNRD).
* The CJV and CNR measures appeared to be inversely related, with heavy tails. The tails seemed OK and not grounds for exclusion. These measures may be sensitive to the amount of gyral folding and hence the discriminability of grey and white matter.
* A few other outliers, most related to wrap-around, ghosting, etc. that did not affect the brain itself (apart from the exclusions below).

BOLD Observations:
* A few with fairly bad motion. Leaving in to see if scrubbing/correction can help.
* sub-hc045 has weird frontal dropout (not present in T1). Leaving in to see if b0 correction helps. Outlier on FWHM y.

Subject Exclusions:
* sub-hc037: Appears to not have had the anterior head coil in. Caught as an outlier on FWHM y for both T1 and BOLD.
* sub-hc018: Has a 10 mm movement. Outlier on FD and AOR.
* sub-hc020: Extremely bad motion. Caught as outlier on AOR and Average FD.
* sub-hc047: Really bad motion and really bad distortion/dropout in the frontal regions. Caught as outlier on Average FD and FWHM y.
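Several of the exclusions above rest on framewise displacement (FD). As a reminder of what that number measures, here is a minimal sketch assuming the Power et al. (2012) definition with a 50 mm head radius. Whether this matches MRIQC's internal computation exactly is not checked here, and the motion parameters are made up.

```python
def framewise_displacement(motion, radius=50.0):
    """Power-style FD for a list of (tx, ty, tz, rx, ry, rz) rows.

    Translations are in mm and rotations in radians; rotations are
    converted to mm of arc length on a sphere of the given radius.
    The first volume has no predecessor, so its FD is 0.
    """
    fd = [0.0]
    for prev, cur in zip(motion, motion[1:]):
        trans = sum(abs(c - p) for c, p in zip(cur[:3], prev[:3]))
        rot = sum(abs(c - p) for c, p in zip(cur[3:], prev[3:]))
        fd.append(trans + radius * rot)
    return fd

# made-up realignment parameters for four volumes
motion = [(0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
          (0.5, 0.0, 0.0, 0.0, 0.0, 0.0),
          (0.5, 0.0, 0.0, 0.01, 0.0, 0.0),
          (1.5, 0.0, 0.0, 0.01, 0.0, 0.0)]
fd = framewise_displacement(motion)
n_above = sum(v > 0.9 for v in fd)  # same threshold as fd_thres above
print(fd, n_above)
```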

In [1]:
import sys
sys.path.append('../src')
from utils import exclude_subjects

exclude_subjects(['sub-hc018', 'sub-hc020', 'sub-hc037', 'sub-hc047'], 'fmri')

# Preprocessing with FMRIPREP

Here we use the Poldrack Lab's <a href="https://fmriprep.readthedocs.io/en/1.0.8/index.html">fmriprep v1.0.8</a> software package to preprocess the MSIT BOLD data. fmriprep is built on nipype and combines preprocessing steps from multiple packages into a single workflow.

The full workflow is detailed <a href="https://fmriprep.readthedocs.io/en/1.0.8/workflows.html#">here</a>. The primary components of the workflow are:
- Brain mask generation
- FreeSurfer reconstruction
- BOLD motion correction
- BOLD B0 field distortion correction
- Slice time correction
- Spatial normalization
- Generation of confound signals

## Run FMRIPREP 

We used fmriprep's docker image to install and run fmriprep, following the instructions in the fmriprep documentation. This requires docker to be installed. With docker installed and the image downloaded, the commands below should work.

Warning: Each fmriprep run is computationally intensive. A single run takes a few GB of memory and needs to run overnight, so running all of our subjects serially would take > 50 days. To speed this up, we run multiple subjects in parallel. With our computing resources we were able to run 10 subjects at a time, reducing the computation time to ~5 days. The script below keeps track of the fmriprep processes it has started and launches a new one whenever one finishes, so that at most 10 run at once (one thread each, given the --nthreads 1 setting). Lower or raise num_cores as your computing resources allow.

Additionally, before running the cell below, change the user_id and group_id variables to match your own user ID and group ID. To determine your user ID, run 'id -u username' in the terminal, replacing username with your username. To see all of your group IDs, run 'id -G username'.
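The throttling described above (never more than a fixed number of subjects running at once) can also be sketched with a thread pool, where each worker thread blocks on one external process. This is an alternative sketch, not the loop we ran: `run_all` is a hypothetical helper, and the commands are short Python one-liners standing in for the docker calls.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess
import sys

def run_all(commands, max_parallel=10):
    """Run each command list with at most max_parallel running at once.

    Returns the exit codes in the same order as the commands.
    """
    def run_one(cmd):
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, commands))

# stand-in commands: each exits with code 0 or 1
commands = [[sys.executable, '-c', 'import sys; sys.exit(%d)' % (i % 2)]
            for i in range(4)]
exit_codes = run_all(commands, max_parallel=2)
print(exit_codes)
```

Compared with polling a list of `Popen` objects, the pool does not return until every command has finished, and the exit codes make failed subjects easy to spot.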

In [1]:
from subprocess import Popen
import os
import sys
sys.path.append('../src')
from utils import select_subjects

# change these if necessary
num_cores = 10
user_id = '3950117'
group_id = '1047'

# bids path (absolute path needed for docker) 
data_path = '/autofs/space/cassia_001/users/matt/msit/data'

# fmriprep output path (absolute path needed for docker) 
# this must be created ahead of time or it will be created with root permissions
# you must also have placed a valid fs license file in this directory
fmriprep_path = '%s/derivatives/fmriprep' % data_path
if not os.path.exists(fmriprep_path):
    os.makedirs(fmriprep_path)

# get the subjects
subjects = select_subjects('fmri', [])
num_sub = len(subjects)
sub_num = 1
    
# fmriprep docker command template
docker_command = ['docker', 'run', '--rm',
                  '-u', '%s:%s' % (user_id, group_id),
                  '-v', '%s:/data:ro' % data_path,
                  '-v', '%s:/out' % fmriprep_path,
                  '-v', '%s:/work' % fmriprep_path,
                  '-w', '/work',
                  'poldracklab/fmriprep:1.0.8',
                  '/data', '/out',
                  'participant',
                  '--participant_label', 'replaced with subject id',
                  '-t', 'msit',
                  '-w', '/work',
                  '--omp-nthreads', '1',
                  '--nthreads', '1',
                  '--output-space', 'fsaverage', 'fsnative', 
                  'T1w', 'template',
                  '--fs-license-file', '/out/fs_license.txt']

running_procs = []
running_subs = []

while sub_num <= num_sub:
    
    if len(running_procs) < num_cores:
        subject = subjects[sub_num - 1]
        print('Starting Subject # %d: %s' % (sub_num, subject))
        
        # start new fmriprep on available core
        # (index 18 is the '--participant_label' value in the template above)
        docker_command[18] = subject
        running_procs.append(Popen(docker_command))
        running_subs.append((subject, sub_num))
        print(' '.join(docker_command))
        
        sub_num += 1
    else:
        # run through existing fmriprep processes to see if 
        # any have completed, if so remove from running list
        running_procs_copy = []
        running_subs_copy = []
        for r, s in zip(running_procs, running_subs):
            if r.poll() is None:
                running_procs_copy.append(r)
                running_subs_copy.append(s)
            else:
                print('Finished Subject # %d: %s' % (s[1], s[0]))
        running_procs = running_procs_copy
        running_subs = running_subs_copy
        
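# wait for the subjects that are still running before reporting completion
for r, s in zip(running_procs, running_subs):
    r.wait()
    print('Finished Subject # %d: %s' % (s[1], s[0]))

print('Done!')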

Starting Subject # 1: sub-hc001
docker run --rm -u 3950117:1047 -v /autofs/space/cassia_001/users/matt/msit/data:/data:ro -v /autofs/space/cassia_001/users/matt/msit/data/derivatives/fmriprep:/out -v /autofs/space/cassia_001/users/matt/msit/data/derivatives/fmriprep:/work -w /work poldracklab/fmriprep:1.0.8 /data /out participant --participant_label sub-hc001 -t msit -w /work --omp-nthreads 1 --nthreads 1 --output-space fsaverage fsnative T1w template --fs-license-file /out/fs_license.txt
Starting Subject # 2: sub-hc002
docker run --rm -u 3950117:1047 -v /autofs/space/cassia_001/users/matt/msit/data:/data:ro -v /autofs/space/cassia_001/users/matt/msit/data/derivatives/fmriprep:/out -v /autofs/space/cassia_001/users/matt/msit/data/derivatives/fmriprep:/work -w /work poldracklab/fmriprep:1.0.8 /data /out participant --participant_label sub-hc002 -t msit -w /work --omp-nthreads 1 --nthreads 1 --output-space fsaverage fsnative T1w template --fs-license-file /out/fs_license.txt
Starting Sub

After the computations above have finished, run this cell to clean up the directory output by fmriprep. This results in an fmriprep derivatives folder with the fmriprep results inside and a separate FreeSurfer derivatives folder (named freesurfer) with the FreeSurfer reconstructions inside.

In [9]:
%%bash

cd ../data/derivatives/fmriprep
mv fmriprep/* .
rm -r fmriprep
mv freesurfer ..

mv: cannot stat ‘fmriprep/*’: No such file or directory
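The error above is what the cleanup cell prints when it is run again after the folders have already been moved. A sketch of a version that is safe to re-run follows; `tidy_fmriprep_output` is a hypothetical helper, and it is demonstrated on a throwaway directory tree rather than the real data.

```python
import shutil
import tempfile
from pathlib import Path

def tidy_fmriprep_output(out_dir):
    """Flatten out_dir/fmriprep into out_dir and move out_dir/freesurfer
    up one level. Steps whose source folder is missing are skipped, so
    calling the function a second time does nothing."""
    out_dir = Path(out_dir)
    nested = out_dir / 'fmriprep'
    if nested.is_dir():
        for item in nested.iterdir():
            shutil.move(str(item), str(out_dir / item.name))
        nested.rmdir()
    recons = out_dir / 'freesurfer'
    target = out_dir.parent / 'freesurfer'
    # do not move into an existing target, which would nest the folders
    if recons.is_dir() and not target.exists():
        shutil.move(str(recons), str(target))

# build a small fake output tree and tidy it twice
root = Path(tempfile.mkdtemp())
out = root / 'derivatives' / 'fmriprep'
(out / 'fmriprep' / 'sub-x').mkdir(parents=True)
(out / 'freesurfer' / 'sub-x').mkdir(parents=True)
tidy_fmriprep_output(out)
tidy_fmriprep_output(out)  # second call is a no-op
remaining = sorted(p.name for p in out.iterdir())
print(remaining)
```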


## Observations & Subject Exclusions

Here we do a sanity check of the output report for each subject to make sure nothing went badly wrong.

<a href="file:///autofs/space/cassia_001/users/matt/msit/data/derivatives/fmriprep/sub-hc016.html"> Here is a link to an Example Report </a>

Observations:
* All looked good except the additional exclusions denoted below and those in the QC exclusions section.

Exclusions:
* sub-hc045: Serious frontal signal dropout extending quite far back. Would make inference in large parts of frontal cortex intractable.

In [2]:
import sys
sys.path.append('../src')
from utils import exclude_subjects

exclude_subjects(['sub-hc045'], 'fmri')

# FMRIPREP to FSFAST Interface

The final step in our preprocessing pipeline is to arrange the outputs from fmriprep into the fsfast file structure so that we can run first-level analyses using fsfast. We also perform the final preprocessing step, spatially smoothing the data to help increase our SNR.
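To make the file mapping explicit, the sketch below lists which fmriprep output feeds which fsfast file for one subject, using the same names as the cell that follows. It only builds path strings: `fsfast_layout` is a hypothetical helper, and the actual copying, surface smoothing (mri_surf2surf), and TR fix (mri_convert) are done in the bash cell.

```python
def fsfast_layout(subject, fwhm=4, task='msit', fsd='msit', run='001'):
    """Pair each fmriprep output (relative to the fmriprep directory)
    with its fsfast target (relative to the fsfast directory)."""
    stem = '%s/func/%s_task-%s_bold' % (subject, subject, task)
    run_dir = '%s/%s/%s' % (subject, fsd, run)
    volume = stem + '_space-T1w_preproc.nii.gz'
    # the same volume is stored under both names fsfast expects
    layout = [(volume, run_dir + '/f.nii.gz'),
              (volume, run_dir + '/fmcpr.nii.gz')]
    for hemi in ('l', 'r'):
        src = '%s_space-fsaverage.%s.func.gii' % (stem, hemi.upper())
        dst = '%s/fmcpr.sm%d.fsaverage.%sh.nii.gz' % (run_dir, fwhm, hemi)
        layout.append((src, dst))
    return layout

layout = fsfast_layout('sub-hc001')
for src, dst in layout:
    print(src, '->', dst)
```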

In [3]:
%%bash

# source freesurfer
export FREESURFER_HOME=/usr/local/freesurfer/stable6_0_0
source ${FREESURFER_HOME}/SetUpFreeSurfer.sh

# full path to recons folder
export SUBJECTS_DIR=/autofs/space/cassia_001/users/matt/msit/data/derivatives/freesurfer

fwhm=4    # smoothing kernel FWHM in mm
tr=1750   # repetition time in ms, as expected by mri_convert -tr

fmriprep_dir=/autofs/space/cassia_001/users/matt/msit/data/derivatives/fmriprep
fsfast_dir=/autofs/space/cassia_001/users/matt/msit/data/derivatives/fsfast

# subjects=$(find $fmriprep_dir -type d -name "sub-*" -printf "%f\n" -maxdepth 1)
subjects=$(cat $fsfast_dir/subjects)
for subject in $subjects
do
    echo $subject
    
    # set up folder
    run_dir=$fsfast_dir/$subject/msit/001
    mkdir -p $run_dir/masks
    echo $subject > $fsfast_dir/$subject/subjectname
    fmp_dir=$fmriprep_dir/$subject/func
    fmp_stem=${subject}_task-msit_bold
    
    # copy over functional volume and reinforce tr
    # we copy the same twice as f and fmcpr since fsfast requires
    # files named this way at different stages that we're skipping
    in_stem=${fmp_stem}_space-T1w_preproc
    cp $fmp_dir/${in_stem}.nii.gz $run_dir/f.nii.gz
    mri_convert $run_dir/f.nii.gz  \
                $run_dir/f.nii.gz -tr $tr
    cp $run_dir/f.nii.gz $run_dir/fmcpr.nii.gz
                
    # convert surface files
    declare -a hemis=("l" "r")
    for hemi in "${hemis[@]}"
    do
       uphemi=$(echo $hemi| awk '{print toupper($0)}')
       out_stem=fmcpr.sm${fwhm}.fsaverage.${hemi}h
       in_stem=${fmp_stem}_space-fsaverage.${uphemi}.func
       
       mri_surf2surf --srcsubject fsaverage --trgsubject fsaverage \
                     --sval $fmp_dir/${in_stem}.gii \
                     --tval $run_dir/${out_stem}.nii.gz \
                     --fwhm-trg $fwhm --hemi ${hemi}h
       mri_convert $run_dir/${out_stem}.nii.gz  \
                   $run_dir/${out_stem}.nii.gz -tr $tr
    done
    
    # sample volume file to tal space 
    # this will re-create
    cd $fsfast_dir
    preproc-sess -per-run -s $subject -mni305 -fwhm $fwhm -nostc -nomc \
                 -fsd msit
    
done

** DA[0] has coordsys with intent NIFTI_INTENT_TIME_SERIES (should be NIFTI_INTENT_POINTSET)
** DA[1] has coordsys with intent NIFTI_INTENT_TIME_SERIES (should be NIFTI_INTENT_POINTSET)
** DA[2] has coordsys with intent NIFTI_INTENT_TIME_SERIES (should be NIFTI_INTENT_POINTSET)
** DA[3] has coordsys with intent NIFTI_INTENT_TIME_SERIES (should be NIFTI_INTENT_POINTSET)
** DA[4] has coordsys with intent NIFTI_INTENT_TIME_SERIES (should be NIFTI_INTENT_POINTSET)
** DA[5] has coordsys with intent NIFTI_INTENT_TIME_SERIES (should be NIFTI_INTENT_POINTSET)
** DA[6] has coordsys with intent NIFTI_INTENT_TIME_SERIES (should be NIFTI_INTENT_POINTSET)
** DA[7] has coordsys