# MRtrix Preprocessing Pipeline
- via: https://andysbrainbook.readthedocs.io/en/latest/MRtrix/MRtrix_Course/MRtrix_04_Preprocessing.html
- General approach in this notebook: 
    - build an initial dataframe containing the subject ids
    - apply preprocessing steps, where each step creates a new subdirectory and/or file
    - store the resulting filepaths in new columns of the dataframe
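
The pattern described above can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: the helper names `list_subjects` and `add_step_column` and the demo subject ids are made up, and a plain dict of columns stands in for the dataframe.

```python
from pathlib import Path
import tempfile


def list_subjects(parent_dir):
    """Return the sorted names of the subdirectories of parent_dir (one per subject)."""
    return sorted(p.name for p in Path(parent_dir).iterdir() if p.is_dir())


def add_step_column(table, column, step_dir, suffix):
    """Add a column holding, for every subject, the file path a step is expected to write."""
    table[column] = [
        str(Path(step_dir) / subject / f"{subject}{suffix}")
        for subject in table["subject_names"]
    ]
    return table


# Demo on a throwaway directory tree (hypothetical subject ids).
with tempfile.TemporaryDirectory() as root:
    for name in ("abc_001_m00", "def_002_m24"):
        (Path(root) / "1.1_dcm" / name).mkdir(parents=True)

    # Step 0: the initial table only knows the subject ids.
    table = {"subject_names": list_subjects(Path(root) / "1.1_dcm")}
    # Step 1: each processing step contributes one new column of file paths.
    table = add_step_column(table, "1.2_niftis", Path(root) / "1.2_nifti", "_dwi.nii.gz")

print(table["subject_names"])
```

A dict shaped like this can be passed straight to `pd.DataFrame(table)`, which gives the same one-row-per-subject layout used in the rest of the notebook.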

In [22]:
import numpy as np
import pandas as pd

from nipype.interfaces.dcm2nii import Dcm2nii

In [17]:
# specify project directory
main_dir = '/media/forest/wd_1/data_CLMS/diffusion'

## Prep Data for Preprocessing
- each step directory under `main_dir` (e.g. `1.1_dcm`) should contain one subdirectory per subject per timepoint (m00 or m24)
- before MRtrix preprocessing starts, each subject subdirectory should contain files with extension `.mif`
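
The naming convention can be checked before any conversion runs. The sketch below assumes the subject id format noted later in this notebook (`\w{3}_\d{3}_m\d{2}`), narrowed to the two timepoints m00 and m24; the function name `split_valid_ids` and the example names are hypothetical.

```python
import re

# Three word characters, three digits, then timepoint m00 or m24.
SUBJECT_ID = re.compile(r"\w{3}_\d{3}_m(00|24)")


def split_valid_ids(names):
    """Separate directory names that follow the subject id convention from those that do not."""
    valid = [n for n in names if SUBJECT_ID.fullmatch(n)]
    invalid = [n for n in names if not SUBJECT_ID.fullmatch(n)]
    return valid, invalid


valid, invalid = split_valid_ids(["aah_032_m00", "aah_032_m24", "aah_32_m00", "notes.txt"])
print("valid:", valid)
print("invalid:", invalid)
```

Anything reported as invalid (stray files, mistyped ids) is worth fixing by hand before it ends up as a row in the dataframe.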

### dcm to nifti+bval+bvec
-  dcm2nii: https://www.nitrc.org/plugins/mwiki/index.php/dcm2nii:MainPage
    - prereq: `sudo apt install mricron`
    - alternatively, can use the wrapper in nipype
        - https://nipype.readthedocs.io/en/latest/api/generated/nipype.interfaces.dcm2nii.html
- dcm files for each subject are stored in a per-subject subdirectory inside `1.1_dcm`


In [40]:
# specify subdirectory storing dcm files
dcm_dir = main_dir + '/1.1_dcm'

# create new dataframe to store file paths; subj id format: \w{3}_\d{3}_m\d{2}
subject_names = !ls $dcm_dir # dcm data should be stored in subdirectories named by subject
subject_dcm_dir = !ls -d $dcm_dir/* 

df_filepaths = pd.DataFrame({'subject_names': subject_names, 'dcm_path': subject_dcm_dir})
df_filepaths

Unnamed: 0,subject_names,dcm_path
0,aah_032_m00,/media/forest/wd_1/data_CLMS/diffusion/1.1_dcm...
1,aaj_034_m00,/media/forest/wd_1/data_CLMS/diffusion/1.1_dcm...
2,aao_039_m00,/media/forest/wd_1/data_CLMS/diffusion/1.1_dcm...
3,bbe_055_m00,/media/forest/wd_1/data_CLMS/diffusion/1.1_dcm...
4,ghi_007_m00,/media/forest/wd_1/data_CLMS/diffusion/1.1_dcm...
5,ijk_009_m00,/media/forest/wd_1/data_CLMS/diffusion/1.1_dcm...
6,mno_013_m00,/media/forest/wd_1/data_CLMS/diffusion/1.1_dcm...
7,pqr_016_m00,/media/forest/wd_1/data_CLMS/diffusion/1.1_dcm...
8,vwx_022_m00,/media/forest/wd_1/data_CLMS/diffusion/1.1_dcm...
9,xyz_024_m00,/media/forest/wd_1/data_CLMS/diffusion/1.1_dcm...


In [111]:
# Convert each subject's dcm files to nifti (+ bval/bvec)

# iterate over the rows of df_filepaths (subject_names, dcm_path)
# and pass each subject's dcm directory to dcm2nii
for subject, dcm in zip(df_filepaths['subject_names'], df_filepaths['dcm_path']):

    # the shell test echoes 'True' only if the output directory already exists
    nifti_dir_exists = ![ -d "{main_dir}/1.2_nifti/{subject}" ] && echo 'True';
    if nifti_dir_exists == ['True']:
        print('Directory already exists!')
    else:
        !echo "Creating Subdirectory: {main_dir}/1.2_nifti/{subject}"
        # -p also creates 1.2_nifti itself if it is missing
        !mkdir -p "{main_dir}/1.2_nifti/{subject}"
        !echo "Running dcm2nii..."
        # run the conversion; outputs are written to the new subject subdirectory
        !dcm2nii -o "{main_dir}/1.2_nifti/{subject}" "{dcm}"

Creating Subdirectory: /media/forest/wd_1/data_CLMS/diffusion/1.2_nifti/aah_032_m00
Running dcm2nii...
Chris Rorden's dcm2nii :: 4AUGUST2014 (Debian) 64bit BSD License
reading preferences file /home/forest/.dcm2nii/dcm2nii.ini
Data will be exported to /media/forest/wd_1/data_CLMS/diffusion/1.2_nifti/aah_032_m00/
Validating 140 potential DICOM images.
Found 140 DICOM images.
Converting 140/140  volumes: 140
002155.dcm->20160926_112437diffmb3140b2000APs015a001.nii
  MosaicRefAcqTimes (72 values for Slice Time Correction) 	0	1275	2547.5	1042.5	2317.5	810	2085	580	1852.5	347.5	1622.5	115	1390	2665	1157.5	2432.5	927.5	2200	695	1970	462.5	1737.5	232.5	1505	0	1275	2547.5	1042.5	2317.5	810	2085	580	1852.5	347.5	1622.5	115	1390	2665	1157.5	2432.5	927.5	2200	695	1970	462.5	1737.5	232.5	1505	0	1275	2547.5	1042.5	2317.5	810	2085	580	1852.5	347.5	1622.5	115	1390	2665	1157.5	2432.5	927.5	2200	695	1970	462.5	1737.5	232.5	1505
 These values suggest a multiband factor of 3
For slice timing correction: 

In [112]:
# rename the nifti, bval, and bvec files to {subject}_dwi.* (the filepaths are added to the dataframe in the cells that follow)
for subject in df_filepaths['subject_names']:
    print(f'Subject name: {subject}...')
    nifti_file = !ls {main_dir}/1.2_nifti/{subject}/*.nii.gz
    bval_filepath = !ls {main_dir}/1.2_nifti/{subject}/*.bval
    bvec_filepath = !ls {main_dir}/1.2_nifti/{subject}/*.bvec
    !mv {nifti_file[0]} {main_dir}/1.2_nifti/{subject}/{subject}_dwi.nii.gz
    !mv {bval_filepath[0]} {main_dir}/1.2_nifti/{subject}/{subject}_dwi.bval
    !mv {bvec_filepath[0]} {main_dir}/1.2_nifti/{subject}/{subject}_dwi.bvec
    print("Files renamed.")

Subject name: aah_032_m00...
Files renamed.
Subject name: aaj_034_m00...
Files renamed.
Subject name: aao_039_m00...
Files renamed.
Subject name: bbe_055_m00...
Files renamed.
Subject name: ghi_007_m00...
Files renamed.
Subject name: ijk_009_m00...
Files renamed.
Subject name: mno_013_m00...
Files renamed.
Subject name: pqr_016_m00...
Files renamed.
Subject name: vwx_022_m00...
Files renamed.
Subject name: xyz_024_m00...
Files renamed.


In [125]:
# append filepaths to dataframe
niftis, bvals, bvecs = [], [], []

for subject in df_filepaths['subject_names']:
    tmp1 = !ls {main_dir}/1.2_nifti/{subject}/*.nii.gz
    tmp2 = !ls {main_dir}/1.2_nifti/{subject}/*.bval
    tmp3 = !ls {main_dir}/1.2_nifti/{subject}/*.bvec

    niftis.append(tmp1[0])
    bvals.append(tmp2[0])
    bvecs.append(tmp3[0])

In [128]:
df_filepaths['1.2_niftis'] = niftis
df_filepaths['1.2_bvals'] = bvals
df_filepaths['1.2_bvecs'] = bvecs

### Validation & QC
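
One way to confirm that the number of image volumes matches the bval/bvec entries, using only the standard library, is sketched below. It rests on these assumptions: the images are NIfTI-1 files (348-byte header, with `dim[]` stored as int16 at byte offset 40), and the `.bval`/`.bvec` files use the FSL-style layout (one row of b-values, three rows of vector components). The function names are made up for this sketch. In practice `mrinfo` or `nibabel` would report the same image dimensions.

```python
import gzip
import struct
import tempfile
from pathlib import Path


def nifti_volume_count(path):
    """Return the number of volumes (4th dimension) read from a NIfTI-1 header."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as fh:
        header = fh.read(348)  # a NIfTI-1 header is 348 bytes long
    # sizeof_hdr (the first int32) must read 348; this tells us the byte order
    endian = "<" if struct.unpack("<i", header[:4])[0] == 348 else ">"
    # dim[0] is the number of dimensions, dim[4] the size of the 4th one
    dim = struct.unpack(endian + "8h", header[40:56])
    return dim[4] if dim[0] >= 4 else 1


def bval_count(path):
    """Count the whitespace-separated b-values in an FSL-style .bval file."""
    return len(Path(path).read_text().split())


def bvec_count(path):
    """Count the gradient directions in an FSL-style .bvec file (3 rows, one column per volume)."""
    rows = [line.split() for line in Path(path).read_text().splitlines() if line.strip()]
    if len(rows) != 3 or len({len(row) for row in rows}) != 1:
        raise ValueError(f"{path}: expected 3 rows of equal length")
    return len(rows[0])


def volumes_match(nifti, bval, bvec):
    """True when image, b-values, and b-vectors all describe the same number of volumes."""
    return nifti_volume_count(nifti) == bval_count(bval) == bvec_count(bvec)


# Demo on synthetic files: a header-only "image" declaring 5 volumes.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    header = struct.pack("<i", 348) + bytes(36) + struct.pack("<8h", 4, 2, 2, 2, 5, 1, 1, 1)
    header += bytes(348 - len(header))
    with gzip.open(tmp / "demo_dwi.nii.gz", "wb") as fh:
        fh.write(header)
    (tmp / "demo_dwi.bval").write_text("0 1000 1000 2000 2000\n")
    (tmp / "demo_dwi.bvec").write_text("0 1 0 0 1\n0 0 1 0 0\n0 0 0 1 0\n")
    (tmp / "short.bval").write_text("0 1000 1000\n")

    ok = volumes_match(tmp / "demo_dwi.nii.gz", tmp / "demo_dwi.bval", tmp / "demo_dwi.bvec")
    bad = volumes_match(tmp / "demo_dwi.nii.gz", tmp / "short.bval", tmp / "demo_dwi.bvec")

print(ok, bad)
```

Applied to the real data, `volumes_match` could be called once per row with the `1.2_niftis`, `1.2_bvals`, and `1.2_bvecs` columns, and the result stored in a new boolean column.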

In [None]:
# Ensure number of volumes matches with bval/bvec for all subjects



### Convert files to .mif
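
MRtrix3's `mrconvert` can read a NIfTI image together with its FSL-style gradient files through the `-fslgrad` option (bvecs first, then bvals). The sketch below only builds the command so it can be inspected before running. The helper name `mrconvert_cmd`, the root path, and the `1.3_mif` output directory are assumptions, not part of the original pipeline.

```python
from pathlib import Path


def mrconvert_cmd(nifti, bvec, bval, out_mif):
    """Build the mrconvert call that embeds the gradient table in the .mif file."""
    # -fslgrad expects the bvecs file first, then the bvals file
    return ["mrconvert", str(nifti), str(out_mif), "-fslgrad", str(bvec), str(bval)]


# Hypothetical layout: inputs follow the 1.2_nifti naming used above;
# "1.3_mif" is an assumed name for the output step directory.
example_root = Path("/path/to/diffusion")
subject = "aah_032_m00"
src = example_root / "1.2_nifti" / subject

cmd = mrconvert_cmd(
    src / f"{subject}_dwi.nii.gz",
    src / f"{subject}_dwi.bvec",
    src / f"{subject}_dwi.bval",
    example_root / "1.3_mif" / subject / f"{subject}_dwi.mif",
)
print(" ".join(cmd))
```

Once the paths point at real files, the list can be passed to `subprocess.run(cmd, check=True)`, or the printed string can be pasted into a `!` shell line.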

In [None]:
# visual inspection: use ipywidgets & matplotlib

## Preprocessing

### Denoising
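
In MRtrix3 this step is usually done with `dwidenoise`, and the MRtrix documentation recommends running it before any other processing of the image data. The sketch below builds the denoising command and an `mrcalc` command for the residual (original minus denoised), which is a common visual check. The helper names and file names are placeholders.

```python
def dwidenoise_cmd(dwi_mif, denoised_mif, noise_mif):
    """Build the dwidenoise call; -noise additionally writes the estimated noise map."""
    return ["dwidenoise", str(dwi_mif), str(denoised_mif), "-noise", str(noise_mif)]


def residual_cmd(dwi_mif, denoised_mif, residual_mif):
    """Build the mrcalc call that computes dwi - denoised."""
    # mrcalc uses postfix notation: "a b -subtract out" computes a - b
    return ["mrcalc", str(dwi_mif), str(denoised_mif), "-subtract", str(residual_mif)]


# Placeholder file names for one subject.
denoise = dwidenoise_cmd("sub_dwi.mif", "sub_den.mif", "sub_noise.mif")
residual = residual_cmd("sub_dwi.mif", "sub_den.mif", "sub_residual.mif")
print(" ".join(denoise))
print(" ".join(residual))
```

If the residual shows visible anatomy, the denoising may have removed signal along with noise.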

### Remove Gibbs Artifacts
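
MRtrix3 provides `mrdegibbs` for this step, typically applied to the denoised image and before any step that interpolates the data. The sketch below builds the command and includes a small runner that refuses to start if the tool is not on `PATH`. The helper names and file names are placeholders, and the runner is defined but not called here.

```python
import shutil
import subprocess


def mrdegibbs_cmd(in_mif, out_mif):
    """Build the mrdegibbs call: input image first, unringed output second."""
    return ["mrdegibbs", str(in_mif), str(out_mif)]


def run_tool(cmd):
    """Run a command list, failing early with a clear message if the tool is missing."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH; is MRtrix3 installed?")
    subprocess.run(cmd, check=True)


# Placeholder file names for one subject.
degibbs = mrdegibbs_cmd("sub_den.mif", "sub_den_unr.mif")
print(" ".join(degibbs))
```

With real file paths in place, `run_tool(degibbs)` would execute the step.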

In [None]:
### Final 