## Data Extraction
This is a guided tutorial on how to extract data from the DICOM-SERVER at CFMI onto your own machine. This notebook assumes you have the following already set up:

1. An account on the CFMI server
2. Proper permissions to view raw data
3. A working copy of the ssh-tools repository
4. A mounted network volume that points to the raw dicoms (see below for more info)
5. GNU Parallel installed on your machine
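
A quick way to confirm the local parts of this list (the command-line tools and the mounted volume) is a small shell check. This is only a sketch: `have_cmd` is a hypothetical helper name, and the mount point is the default path used later in this tutorial, so adjust it to match your machine. It cannot verify your CFMI account or permissions.

```bash
# Hypothetical helper: succeed if a command is available on PATH
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Default mount point used in this tutorial; change it if yours differs
raw_dicoms='/Volumes/CFMI-CFS/mnt/CFMI-DICOMS'

# Report each required command-line tool
for tool in git parallel; do
    if have_cmd "${tool}"; then
        echo "ok:      ${tool}"
    else
        echo "missing: ${tool}"
    fi
done

# Report whether the network volume appears to be mounted
if [ -d "${raw_dicoms}" ]; then
    echo "ok:      ${raw_dicoms}"
else
    echo "missing: ${raw_dicoms} (mount the network volume first)"
fi
```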

Assuming the prerequisites above are satisfied, you can get started by going to https://github.com/seldamat/dicom-tools and forking the repository to your account.

Next, clone your fork to a place on your computer (you can change the code to use your home directory instead).

```bash
# Define your GitHub user name here
github_un='seldamat'
# Define the place you want to put it
clone_here='/Volumes/CFMI-CFS/opt/dicom-tools'
# If the directory doesn't exist, then clone!
[ ! -d "${clone_here}" ] && git clone "https://github.com/${github_un}/dicom-tools.git" "${clone_here}"

Cloning into '/Users/shad/dicom-tools'...
remote: Counting objects: 27, done.
remote: Total 27 (delta 0), reused 0 (delta 0), pack-reused 27
Unpacking objects: 100% (27/27), done.
Checking connectivity... done.
```


Now check what's inside this new repo

```bash
cd "${clone_here}"
ls -alg ./

total 24
drwxr-xr-x   5 staff   170 Jun 28 10:29 .
drwxr-xr-x+ 92 staff  3128 Jun 28 10:29 ..
drwxr-xr-x  12 staff   408 Jun 28 10:29 .git
-rw-r--r--   1 staff    74 Jun 28 10:29 README.md
-rwxr-xr-x   1 staff 17261 Jun 28 10:29 fetch-dicom
```


Use the --help argument to get some useful help information

```bash
./fetch-dicom --help

Usage :: fetch-dicom [-h|--help] [-r|--rawdatadir PATH] [-o|--outputdir PATH] [-s|--subject] 
[-y|--year YYYY] [-m|--month MM] [-d|--day DD] [-t|--tasks LIST]

Search for Seimens .IMA files in the raw data directory for a subject scanned on YYYY-MM-DD. User can specify tasks 
for extraction, or leave blank to extract everything. Dicoms are placed in the output directory.

Required Arguments
-r  --rawdatadir path to dicoms (subdirectory tree must follow ./YYYY/MM/DD)
-o  --outputdir extract dicoms here	       
-s  --subject subject ID
-y  --year year of scan date
-m  --month month of scan date
-d  --day day of scan date

Optional Arguments 	      
-t  --tasks list of tasks to extract data for (series description)
-prv --private omit identifiers from meta data text file (will still be present in dcm headers)

Other Arguments
-h  --help displays this message
```

By default, `fetch-dicom` sets the raw data directory to:
```bash
/Volumes/CFMI-CFS/mnt/CFMI-DICOMS/
```
This may be different on your computer. I can help you get set up if you don't know what this means.
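
Since `fetch-dicom` expects the raw data tree to follow `./YYYY/MM/DD`, one way to confirm that your mount point is correct is to build the path for a known scan date and see whether it exists. A minimal sketch, assuming the default mount point above and using 2017-06-26 as an example date:

```bash
# Default mount point from this tutorial; change it if yours differs
rawdatadir='/Volumes/CFMI-CFS/mnt/CFMI-DICOMS/'
year=2017; month=06; day=26

# Strip any trailing slash, then append the YYYY/MM/DD subdirectories
scan_dir="${rawdatadir%/}/${year}/${month}/${day}"

if [ -d "${scan_dir}" ]; then
    echo "Found raw data directory: ${scan_dir}"
else
    echo "No directory at ${scan_dir}; check that the network volume is mounted"
fi
```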

We may not know the subject ID, but we do know the days on which we scanned. We can use the scan date to find our data.
```bash
./fetch-dicom -y 2017 -m 06 -d 26 -r '/Volumes/CFMI-CFS/mnt/CFMI-DICOMS'

Parsing Metadata for Studies Performed on 2017-06-26...🗒  3 Studies Found...
Printing List of Subjects Scanned on This Day

 ♔ Study PI: Turkeltaub 
 ✎ SID: QNJ
 ✎ Age: 064Y
 ✎ Sex: M 
 ✎ Wgt: 91.1720779677 
 ✎ Series (In Order Acquired) ‣ Session Start Time: 11:25:33
     1	 	PA_Multi Plane 50Slice Loc
     2	 	Siemens_MPRAGE
     3	 	ep2d_diff 80dir iPAT
     4	 	ep2d_diff 80dir iPAT_FA 
     5	 	ep2d_diff 80dir iPAT_ColFA
     6	 	MoCoSeries
     7	 	ep2d_pcasl_UI_PHC 

 ♔ Study PI: VanMeter 
 ✎ SID: FSSUM-1
 ✎ Age: 022Y
 ✎ Sex: F 
 ✎ Wgt: 63.502939878
 ✎ Series (In Order Acquired) ‣ Session Start Time: 13:11:08
     1	 	Multi Plane 50Slice Loc 
     2	 	Siemens_MPRAGE
     3	 	Siemens_MPRAGE

 ♔ Study PI: <unknown>
 ✎ SID: LM-LOKI
 ✎ Age: 006Y
 ✎ Sex: M 
 ✎ Wgt: 9.2 
 ✎ Series (In Order Acquired) ‣ Session Start Time: 17:09:17
     1	 	PA_Multi Plane 50Slice Loc
     2	 	Siemens_MPRAGE
     3	 	PA_Multi Plane 50Slice Loc
     4	 	Siemens_MPRAGE
     5	 	PA_Multi Plane 50Slice Loc
     6	 	Siemens_MPRAGE
     7	 	<MPR Range_Native (MR)_Anatomy[1]>
```

Now, let's make a directory to stick our data in and extract the data
```bash
# Use ${HOME} rather than a quoted '~', which the shell would not expand
fssw_folder="${HOME}/Documents/GitHub/FreeSurfer-Summer-Workshop"
datafolder="${fssw_folder}/mris/hydration-experiment/dicoms/baseline-day/FSSUM-1"
mkdir -p "${datafolder}"
./fetch-dicom -y 2017 -m 06 -d 26 -s 'FSSUM-1' -o "${datafolder}" -r '/Volumes/CFMI-CFS/mnt/CFMI-DICOMS'
```
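
After an extraction it can be reassuring to count the files that landed in the output directory. The sketch below assumes the extracted files keep the Siemens `.IMA` extension; `count_ima` is a hypothetical helper, not part of `fetch-dicom`.

```bash
# Hypothetical helper: count the .IMA files anywhere under a directory
count_ima() {
    find "$1" -type f -name '*.IMA' | wc -l | tr -d ' '
}

# Output directory used for the baseline session in this tutorial
datafolder="${HOME}/Documents/GitHub/FreeSurfer-Summer-Workshop/mris/hydration-experiment/dicoms/baseline-day/FSSUM-1"

if [ -d "${datafolder}" ]; then
    echo "$(count_ima "${datafolder}") .IMA files in ${datafolder}"
else
    echo "${datafolder} does not exist yet"
fi
```

A count of zero for an existing folder suggests that the subject ID or scan date did not match anything.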

We can repeat the same procedure to check that the data is present on the second day
```bash
./fetch-dicom -y 2017 -m 06 -d 27

Parsing Metadata for Studies Performed on 2017-06-27..🗒  2 Studies Found...
Printing List of Subjects Scanned on This Day

 ♔ Study PI: Mandelblatt
 ✎ SID: 108880474
 ✎ Age: 076Y
 ✎ Sex: F 
 ✎ Wgt: 61.6885701672 
 ✎ Series (In Order Acquired) ‣ Session Start Time: 08:57:52
     1	 	Multi Plane 50Slice Loc 
     2	 	MPRAGE-ADNI 
     3	 	gre_field_mapping_88
     4	 	MoCoSeries
     5	 	ep2d_pace_vv2bk 
     6	 	MoCoSeries
     7	 	ep2d_pace_sc-encode1
     8	 	MoCoSeries
     9	 	ep2d_pace_sc-recognition1 
    10	 	MoCoSeries
    11	 	ep2d_pace_rest
    12	 	t2_tirm_tra_dark-fluid_3mm
    13	 	t2_spc_irprep_SAG_p2_dark-fluid_100pfov 
    14	 	ep2d_diff_55dir 
    15	 	ep2d_diff_55dir_TRACEW
    16	 	ep2d_diff_55dir_FA
    17	 	ep2d_diff_55dir_ColFA 

 ♔ Study PI: VanMeter 
 ✎ SID: FSSUM-1
 ✎ Age: 022Y
 ✎ Sex: F 
 ✎ Wgt: 63.502939878
 ✎ Series (In Order Acquired) ‣ Session Start Time: 11:55:22
     1	 	Multi Plane 50Slice Loc 
     2	 	Siemens_MPRAGE
     3	 	Multi Plane 50Slice Loc 
     4	 	Siemens_MPRAGE
     5	 	t2_spc_sag_p2_iso 
```

Let's extract this data as well

```bash
datafolder="${fssw_folder}/mris/hydration-experiment/dicoms/dehydrated-day/FSSUM-1"

mkdir -p "${datafolder}"

# The second session was scanned on 2017-06-27
./fetch-dicom -y 2017 -m 06 -d 27 -s 'FSSUM-1' -o "${datafolder}" -r '/Volumes/CFMI-CFS/mnt/CFMI-DICOMS'
```
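
Because the two extractions differ only in the scan day and the destination folder, they can also be written as one loop. The following is a dry-run sketch: it only prints the commands so you can inspect them first, and the `day:folder` pairing is simply a convenient way to keep the two values together. Remove the leading `echo` on each command to actually run it.

```bash
fssw_folder="${HOME}/Documents/GitHub/FreeSurfer-Summer-Workshop"
rawdatadir='/Volumes/CFMI-CFS/mnt/CFMI-DICOMS'

# Each entry pairs the day of the scan with the folder name for that session
for session in '26:baseline-day' '27:dehydrated-day'; do
    day="${session%%:*}"        # text before the colon
    condition="${session#*:}"   # text after the colon
    datafolder="${fssw_folder}/mris/hydration-experiment/dicoms/${condition}/FSSUM-1"

    # Dry run: print the commands instead of executing them
    echo mkdir -p "${datafolder}"
    echo ./fetch-dicom -y 2017 -m 06 -d "${day}" -s FSSUM-1 -o "${datafolder}" -r "${rawdatadir}"
done
```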

## Data Conversion

The data that we have extracted is in a raw DICOM format (.IMA) and must be converted to the standard we use for neuroimaging analysis (.nii). We also want to compress our files to minimize the amount of space we are using (.nii.gz). 

First, let's move to the experiment folder inside our workshop `mris` folder (the same place the dicoms were extracted to) and create a few folders to keep everything in.

```bash
cd "${fssw_folder}/mris/hydration-experiment"
mkdir -p nii.gz/baseline-day/FSSUM-1    # type of file/experiment-day/subjectID
mkdir -p nii.gz/dehydrated-day/FSSUM-1
```

To convert the files, we need a converter. Here I have chosen dcm2niix as my preferred converter. You can obtain it by cloning it from GitHub and installing it as described in its README:

```bash
# Use ${HOME} rather than a quoted '~', which the shell would not expand
[ ! -d "${HOME}/bin/dcm2niix" ] && git clone https://github.com/rordenlab/dcm2niix "${HOME}/bin/dcm2niix"
cd "${HOME}/bin/dcm2niix"
```

The conversion is pretty straightforward. Type the following to get usage information

```bash
./dcm2niix

Compression will be faster with /usr/local/bin/pigz
Chris Rorden's dcm2niiX version 7July2016 (64-bit)
usage: dcm2niix [options] <in_folder>
 Options :
  -b : BIDS sidecar (y/n, default n)
  -f : filename (%c=comments %f=folder name %i ID of patient %m=manufacturer %n=name of patient %p=protocol, %q=sequence %s=series, %t=time; default '%q-%p')
  -h : show help
  -m : merge 2D slices from same series regardless of study time, echo, coil, orientation, etc. (y/n, default n)
  -o : output directory (omit to save to input folder)
  -s : single file mode, do not convert other images in folder (y/n, default n)
  -t : text notes includes private patient details (y/n, default n)
  -v : verbose (y/n, default n)
  -x : crop (y/n, default n)
  -z : gz compress images (y/i/n, default y) [y=pigz, i=internal, n=no]
 Defaults file : /Users/shad/.dcm2nii.ini
 Examples :
  dcm2niix /Users/chris/dir
  dcm2niix -o /users/cr/outdir/ -z y ~/dicomdir
  dcm2niix -f mystudy%s ~/dicomdir
  dcm2niix -o "~/dir with spaces/dir" ~/dicomdir
```

To convert our data from the experiments we ran, type this:

```bash
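cd "${fssw_folder}/mris/hydration-experiment"
# Make sure the output directories exist before converting
mkdir -p nii.gz/baseline-day/FSSUM-1 nii.gz/dehydrated-day/FSSUM-1
dcm2niix -o nii.gz/baseline-day/FSSUM-1 dicoms/baseline-day/FSSUM-1
dcm2niix -o nii.gz/dehydrated-day/FSSUM-1 dicoms/dehydrated-day/FSSUM-1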
```
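
Before moving on, it is worth confirming that the converted files exist under the names the next section expects. With the default `%q-%p` naming (sequence, then protocol), the tutorial's images come out as `GR_IR-Siemens_MPRAGE.nii.gz` and so on. The sketch below uses a hypothetical helper, `check_files`, and assumes the folder layout used in this tutorial.

```bash
# Hypothetical helper: print each missing file; return non-zero if any is absent
check_files() {
    local missing=0 f
    for f in "$@"; do
        if [ ! -f "${f}" ]; then
            echo "missing: ${f}"
            missing=1
        fi
    done
    return "${missing}"
}

# Assumed location of the converted images (adjust to your own layout)
nii_dir="${HOME}/Documents/GitHub/FreeSurfer-Summer-Workshop/mris/hydration-experiment/nii.gz"

check_files \
    "${nii_dir}/baseline-day/FSSUM-1/GR_IR-Siemens_MPRAGE.nii.gz" \
    "${nii_dir}/baseline-day/FSSUM-1/GR_IR-Siemens_MPRAGEa.nii.gz" \
    "${nii_dir}/dehydrated-day/FSSUM-1/GR_IR-Siemens_MPRAGE.nii.gz" \
    "${nii_dir}/dehydrated-day/FSSUM-1/GR_IR-Siemens_MPRAGEa.nii.gz" \
    "${nii_dir}/dehydrated-day/FSSUM-1/SE-t2_spc_sag_p2_iso.nii.gz" \
    || echo "Some conversions are missing; re-run dcm2niix before continuing"
```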


## Executing recon-all for Reconstruction

First we need to set our subjects directory and import the data

```bash
# Create fs next to mris so the relative ../mris/... paths below resolve
cd "${fssw_folder}"
mkdir -p ./fs
cd fs
export SUBJECTS_DIR=$(pwd)

# Day 1
## T1s
recon-all -i '../mris/hydration-experiment/nii.gz/baseline-day/FSSUM-1/GR_IR-Siemens_MPRAGE.nii.gz' -s 'sub.01.bl.01'
recon-all -i '../mris/hydration-experiment/nii.gz/baseline-day/FSSUM-1/GR_IR-Siemens_MPRAGEa.nii.gz' -s 'sub.01.bl.02'


# Day 2
## T1-1
recon-all -i '../mris/hydration-experiment/nii.gz/dehydrated-day/FSSUM-1/GR_IR-Siemens_MPRAGE.nii.gz' -s 'sub.01.dh.01'
## T1-2
recon-all -i '../mris/hydration-experiment/nii.gz/dehydrated-day/FSSUM-1/GR_IR-Siemens_MPRAGEa.nii.gz' -s 'sub.01.dh.02'
## T2-1
recon-all -i '../mris/hydration-experiment/nii.gz/dehydrated-day/FSSUM-1/GR_IR-Siemens_MPRAGE.nii.gz' -T2 '../mris/hydration-experiment/nii.gz/dehydrated-day/FSSUM-1/SE-t2_spc_sag_p2_iso.nii.gz' -s 'sub.01.dh.01t2'
## T2-2
recon-all -i '../mris/hydration-experiment/nii.gz/dehydrated-day/FSSUM-1/GR_IR-Siemens_MPRAGEa.nii.gz' -T2 '../mris/hydration-experiment/nii.gz/dehydrated-day/FSSUM-1/SE-t2_spc_sag_p2_iso.nii.gz' -s 'sub.01.dh.02t2'
```


Now we can run the recon-all pipeline

```bash
# Run all the T1s:
#   1. find all T1 subject directories (skip the *t2 subjects and the log files)
#   2. trim the leading "./" from each path
#   3. pass the subject IDs to GNU Parallel; the command was run remotely, so
#      use noh(ang)up, and print all output to a log file
find . -maxdepth 1 -type d -name 'sub*' -not -name '*t2' -not -name '*.log' |
 sed 's_^\./__' |
 nohup parallel \
 recon-all -s {} -all -brainstem-structures -hippocampal-subfields-T1 -3T -noappend -make all -parallel -qcache \
 > pararecon.log &

# Run the special recon pipeline with the T2 weighted images
ls -d *t2 | awk '{print $NF}' | nohup parallel recon-all -s {} -T2pial -all -brainstem-structures -hippocampal-subfields-T1 -3T -make all -qcache -noappend -parallel > pararecon-t2s.log &
```
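
Since these jobs run for a long time, it can help to preview which subjects the T1 pipeline will pick up before launching anything. The sketch below reuses the same `find` filters in a hypothetical helper, `list_t1_subjects`, and only prints the subject IDs; it does not start `recon-all`.

```bash
# Hypothetical helper: list the T1-only subject directories under a folder
list_t1_subjects() {
    find "${1:-.}" -maxdepth 1 -type d -name 'sub*' -not -name '*t2' -not -name '*.log' |
        sed 's_^.*/__' |   # keep only the directory name
        sort
}

# Preview the subjects in the current subjects directory
list_t1_subjects "${SUBJECTS_DIR:-.}"
```

If the list matches what you expect, the same names are what `parallel` will substitute for `{}`.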


## Checking the Status of Our Submitted Jobs

```bash
ps aux | grep 'recon-all'

## or we can open the log file
cd $SUBJECTS_DIR
cat sub.01.dh.01/scripts/recon-all-status.log
```
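
With several subjects running at once, checking each status log by hand gets tedious. As a sketch, the hypothetical helper below prints the last line of every subject's `recon-all-status.log`, assuming subject folders start with `sub` as they do in this tutorial.

```bash
# Hypothetical helper: show the most recent status line for each subject
show_status() {
    local subjects_dir="$1" log subject
    for log in "${subjects_dir}"/sub*/scripts/recon-all-status.log; do
        [ -f "${log}" ] || continue   # skip if no log exists yet
        subject="$(basename "$(dirname "$(dirname "${log}")")")"
        printf '%s: %s\n' "${subject}" "$(tail -n 1 "${log}")"
    done
}

show_status "${SUBJECTS_DIR:-.}"
```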

## Now what? View your results!