# MNC to BIDS conversion - First Brain Hack School Project
## A step-by-step tutorial
________________

The goal of this tutorial was initially to generate functional connectivity matrices from functional magnetic resonance imaging (fMRI) data, starting from .mnc images. The aim was first to convert the .mnc images to BIDS format, then to use fMRIPrep to pre-process the data, and finally to use the Nilearn package to extract connectivity matrices from the pre-processed images using different brain parcellations.

This project was to be executed with data from PREVENT-AD, a dataset of participants at risk of Alzheimer's Disease (AD), available openly on the Canadian Open Neuroscience Platform. We wanted to use fMRI data from 10 participants at two different timepoints, their baseline visit and their 12-month visit, to obtain two functional connectivity matrices per participant.

However, as can be seen below, this project could not be completed. A Jupyter notebook describing how to pre-process the data from a BIDS dataset is included in this repository.
________________


## Background

The current notebook aims to address an issue in using .mnc images from the CONP dataset. The code below first shows how to download the data using DataLad, then how to get the minc-toolkit in order to obtain the mnc2nii tool. Finally, a tentative bash script, mnc2bids, is presented; its intended goal was to run the mnc2nii conversion while organizing the output files in BIDS format. However, as I learned afterwards, mnc2nii doesn't produce BIDS sidecar files (i.e. .json files with parameters), which made it impossible to validate the dataset with the bids-validator.


### 1. - Presentation of the data (Prevent-AD / CONP)

#### 1.1 - Preparation for data download

The data used for this project come from the Prevent-AD cohort, which is part of the Canadian Open Neuroscience Platform (CONP). The data is accessible [here](https://portal.conp.ca/dataset?id=projects/preventad-open). Note that an account is required to access the data. You can gain access by filling out the form [here](https://openpreventad.loris.ca/). If you are using your own dataset with fMRIPrep, you can skip ahead to section 2.0 of the tutorial.

You can refer to the instructions on the CONP/Prevent-AD page for how to download the data. Note that you will require the following to be able to install the dataset on your workstation:
- Datalad (you can install Datalad from this [link](http://handbook.datalad.org/en/latest/intro/installation.html))
- Git-annex (you can install Git-annex from this [link](https://git-annex.branchable.com/install/))
- Homebrew (if you use a Mac and can't install Git-annex with Conda, as was my case, you can install it using Homebrew. You can install Homebrew from this [link](https://brew.sh/))

With Datalad, you gain access to the full dataset through symbolic links. This means that you will have access to the folders and be able to see what you can download before downloading any actual data. You will need to enter your credentials for the Open PREVENT-AD initiative to download and work with anything. 

**Be careful:** Note that the full dataset is quite heavy (170.89GB). Datalad gives an option to download all subjects, but only subjects/sessions of interest should be downloaded.

The commands used in bash to download the data are provided below for illustration purposes:

In [2]:
#Move to the directory where you want to download the data (I created a special folder called data_CONP/)

##%cd /Users/stong3/Desktop/data_CONP/
#!datalad install https://github.com/CONP-PCNO/conp-dataset.git

#Once installed, you can go in the directory and install the Prevent-AD dataset
##%cd /Users/stong3/Desktop/data_CONP/conp-dataset/
##!datalad install projects/preventad-open

#You can now navigate to the project directory
##%cd /Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open/

#You should be able to see the list of subjects within the project
##!ls

/Users/stong3/Desktop/data_CONP
/Users/stong3/Desktop/data_CONP/conp-dataset
/Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open
1004359   2424540   3408795   4396879   5456920   6788676   7917918   9522570
1016072   2448082   3452929   4437799   5458966   6794127   7945015   9539210
1031654   2484374   3455156   4498598   5558904   6795892   8019820   9555827
1072774   2496306   3463254   4532706   5562282   6851811   8036701   9566680
1076159   2623146   3475739   4538817   5692079   6852929   8120729   9584420
1154932   ...

#### 1.2 - Data download

We can now see that we can access all the subjects within the Prevent-AD project. Each subject folder is divided into sessions, and each session contains an image folder.

For this pre-processing tutorial, we will only use 10 subjects from whom we have the Baseline and the 12 months follow-up. In total, we will have 20 fMRI scans to pre-process. For simplicity, I took the first 10 subjects where:
- An anatomical scan (T1w) was available at baseline
- A resting-state functional MRI (rest BOLD) was available at baseline
- An anatomical scan (T1w) was available at 12 months
- A resting-state functional MRI (rest BOLD) was available at 12 months

The way Datalad downloads data is by using the following command: ```datalad get <filepath>```. 

In our case, based on the folders, we will require the following subjects (the path to pass to ```datalad get``` is given under each one) and will only download these to reduce the data to analyze:
- 1004359 (Done) 
```/Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open/1004359/```
- 1016072 (Done) 
```/Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open/1016072/```
- 1072774 (Done) 
```/Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open/1072774/```
- 1076159 (Done) 
```/Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open/1076159/```
- 1154932 (Done)
```/Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open/1154932/```
- 1176949 (Done)
```/Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open/1176949/```
- 1177880 (Done)
```/Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open/1177880/```
- 1284264 (Done)
```/Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open/1284264/```
- 1322140 (Done)
```/Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open/1322140/```
- 1346022 (Done)
```/Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open/1346022/```
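
The ten downloads above can also be scripted. The sketch below only builds the `datalad get <filepath>` commands and prints them. The helper names `build_get_commands` and `download` and the `dry_run` flag are made up for illustration, and the sketch assumes `datalad` is available on your PATH if you switch `dry_run` to `False`.

```python
import shlex
import subprocess
from pathlib import Path

# Root of the installed Prevent-AD dataset (same path as used in this tutorial)
DATASET_ROOT = Path("/Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open")

# The 10 subjects selected above
SUBJECTS = ["1004359", "1016072", "1072774", "1076159", "1154932",
            "1176949", "1177880", "1284264", "1322140", "1346022"]


def build_get_commands(root, subjects):
    """Return one `datalad get <filepath>` command (as an argument list) per subject."""
    return [["datalad", "get", str(Path(root) / subj)] for subj in subjects]


def download(root, subjects, dry_run=True):
    """Print each command; run it with subprocess only when dry_run is False."""
    for cmd in build_get_commands(root, subjects):
        print(shlex.join(cmd))
        if not dry_run:
            # Run from inside the dataset so DataLad works on the right repository
            subprocess.run(cmd, cwd=root, check=True)


if __name__ == "__main__":
    download(DATASET_ROOT, SUBJECTS, dry_run=True)
```

Keeping the dry run as the default makes it easy to double-check the paths before any of the (large) files are actually fetched.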

The first time this is done, there will be a prompt to enter a username and password (twice). These are the same as the ones created to access LORIS.

The data is downloaded directly in the CONP folder. 

#### 1.3 - Next steps

Ok. So our data is downloaded. Are we ready for running fMRIPrep? Not quite.

When looking at the data, we can see that in each visit, we have a folder called "images", in which all the different images are stored together. Furthermore, each image has the .mnc extension... Yikes!

We still have to fix a few things before fMRIPrep can be run. Based on the documentation from fMRIPrep, we need:
1) To convert our .mnc images to a .nii format. 
2) To make our data compliant with the BIDS organization.

Step 1) will be done with the mnc2nii tool available in the minc-toolkit. Step 2) will be done using a custom bash script and the bids-validator.

To ease the process, I moved the 10 participants to a different folder on my Desktop called "bhs_project". In it, I created a subdirectory called "sourcedata" where the folders were transferred. Then, I created a directory called "rawdata" where the transformation of the data will occur.

Note that for all the code below, the Bash code is preceded by ```!```. Sometimes, to work within Jupyter, we need to precede the code with a ```%``` instead. Make sure to change the paths to the appropriate ones on your computer so that the code works. You can also copy the code to your terminal, as it doesn't always work well directly in Jupyter.

In [None]:
"""
#Ok so first, just figuring out where we are
!pwd
#Moving back to the desktop (the comment is on its own line because %cd may treat the rest of the line as part of the path)
%cd /Users/stong3/Desktop

#So now that I am on the Desktop, I create the folders 
!mkdir bhs_project/ #Creating the main folder to put the data in
%cd /Users/stong3/Desktop/bhs_project/
!mkdir sourcedata/ #Creating a "sourcedata" repository where all the untouched data is stored
!mkdir rawdata/ #Creating a "rawdata" repository where we will do the transformations
!mkdir derivatives/ #This will come later in the process, once we pre-process the data

#Then, we go back to the directory where the data is stored (i.e. conp_dataset). So in our case:
%cd /Users/stong3/Desktop/data_CONP/conp-dataset/projects/preventad-open/
#Now we copy the 10 subjects we need to the folder we just created. Careful! This command can take a long time so make sure the path is correctly specified
!cp -r 1004359 1016072 1072774 1076159 1154932 1176949 1177880 1284264 1322140 1346022 /Users/stong3/Desktop/bhs_project/sourcedata/

#Now we copy the 10 subjects we need, but to the rawdata folder WITHOUT the images inside (this will become useful in the .mnc format conversion).
#However, we have .json files that we might wanna keep for later (will be useful for BIDS). We can use the following:
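!cd /Users/stong3/Desktop/bhs_project/sourcedata && find . -name '*.json' | cpio -pdm /Users/stong3/Desktop/bhs_project/rawdata

#This finds all the .json files within sourcedata and pipes the list into cpio, which copies the files and recreates their directories under rawdata.
#Running find from inside sourcedata (with ".") keeps the paths relative, so the subject folders land directly in rawdata instead of under a copy of the full absolute path.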
"""
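
For readers who prefer to stay in Python, here is a minimal sketch of the same preparation step: creating the three folders and copying only the .json files while keeping the sub-folder structure. The function names `prepare_project` and `copy_json_tree` are hypothetical helpers written for this illustration, not part of any package used in the tutorial.

```python
import shutil
from pathlib import Path


def prepare_project(project_dir):
    """Create the sourcedata/, rawdata/ and derivatives/ folders used in this tutorial."""
    project_dir = Path(project_dir)
    for name in ("sourcedata", "rawdata", "derivatives"):
        (project_dir / name).mkdir(parents=True, exist_ok=True)
    return project_dir


def copy_json_tree(source_dir, target_dir):
    """Copy every .json file from source_dir to target_dir, keeping the sub-folder structure."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    copied = []
    for json_file in sorted(source_dir.rglob("*.json")):
        # Rebuild the same relative path under the target directory
        destination = target_dir / json_file.relative_to(source_dir)
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(json_file, destination)  # copy2 also keeps the modification time
        copied.append(destination)
    return copied
```

With the paths of this tutorial, this could be called as `project = prepare_project("/Users/stong3/Desktop/bhs_project")` followed by `copy_json_tree(project / "sourcedata", project / "rawdata")`, once the subjects have been copied into sourcedata.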

#### 1.4 - Converting the images from .mnc to .nii individually

To convert the images, we can use the mnc2nii tool from the minc-toolkit. The first step is to actually get the toolkit. To do so, we use a Docker image using the code below:

In [None]:
!docker pull nistmni/minc-toolkit:latest

Docker works as a "mini-computer environment" separate from your computer. So before running the actual code, we need to guide Docker to the appropriate locations on our computer and set up mounts (i.e. make the computer's paths available inside the container). First, you can run the container by itself without mounting anything inside to see how it is structured:

In [None]:
#The -it option makes the container interactive (i.e. you can access it in the bash terminal directly)
#The --rm option removes the container once you exit it, to ensure that nothing is left behind on the computer

!docker run -it --rm \
nistmni/minc-toolkit

You can simply type ```exit``` to exit the container.

The entry point of the container doesn't show anything when we access it... Well that's ok. You can do ```cd /``` to access the root of the container. You will see a LOT of folders. They are mostly the software needed for the minc-toolkit.

Ok, now we need to "link" our computer to the container. With Docker, we can do this using the option ```-v <path_on_computer>:<path_in_container>```: we first give the path on our computer, then the path that Docker uses IN the container. The best metaphor I found for this is that the mounts act as "entry" and "exit" doors to and from the container.

To ensure that Docker doesn't modify the original files in "sourcedata", we add the ```:ro``` (read-only) option after the paths, which tells Docker that it cannot modify these files.

In [None]:
!docker run -it --rm \
-v /Users/stong3/Desktop/bhs_project/sourcedata:/files_to_convert:ro \
-v /Users/stong3/Desktop/bhs_project/rawdata:/converted_files \
nistmni/minc-toolkit
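
Since the mounts are easy to mistype, it can help to assemble the command programmatically. The sketch below rebuilds the `docker run` command shown above from a list of mounts. The function `docker_run_command` and its `(host_path, container_path, read_only)` tuple layout are assumptions made for this example; only flags already used in this tutorial (`-it`, `--rm`, `-v`, `-w`, `:ro`) appear in it.

```python
import shlex


def docker_run_command(image, mounts, workdir=None):
    """Build a `docker run -it --rm` command as a list of arguments.

    mounts is a list of (host_path, container_path, read_only) tuples.
    """
    cmd = ["docker", "run", "-it", "--rm"]
    if workdir is not None:
        cmd += ["-w", workdir]
    for host_path, container_path, read_only in mounts:
        spec = f"{host_path}:{container_path}"
        if read_only:
            spec += ":ro"  # the container can read these files but not modify them
        cmd += ["-v", spec]
    cmd.append(image)
    return cmd


mounts = [
    ("/Users/stong3/Desktop/bhs_project/sourcedata", "/files_to_convert", True),
    ("/Users/stong3/Desktop/bhs_project/rawdata", "/converted_files", False),
]
# Print the command so it can be pasted into a terminal
print(shlex.join(docker_run_command("nistmni/minc-toolkit", mounts)))
```

Printing the command rather than running it keeps the interactive session in your own terminal, where `-it` works as expected.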

Ok! We are now in the container! If we do ```cd /``` and ```ls```, we see our two folders ("files_to_convert" and "converted_files") in the Docker container along with all the other "software" folders. 

Yay! Now we are ready to convert our files!

The code is quite straightforward. Inside the container, you type the following command (without the leading ```!```, which is only needed in Jupyter):

In [None]:
#mnc2nii -<format_desired> <path/to/file.mnc> <path/to/new_file.nii>
#Note that you can give the name of the new file in the path of the .nii file
!mnc2nii -nii /files_to_convert/1004359/PREBL00/images/preventad_1004359_PREBL00_t1w_001_t1w-defaced_001.mnc /converted_files/preventad_1004359_PREBL00_t1w_001_t1w-defaced_001.nii

Ok! We see an output, and I have no idea what it means... But I guess "reconstruction" is reassuring?

Once the process is done, we can check the images both inside and outside the container... But running mnc2nii 20 times, specifying a specific path EVERY time, is a bit of a long process. Maybe there's a way to simplify it? And maybe there is even a way to convert the data to .nii format AND organize it in BIDS at the same time??

Well why not! We just need to tweak our approach a little bit. 

#### 1.5 - Converting the images in a loop for all subjects from .mnc to .nii in BIDS format

So the plan is to create a custom bash script that we will use in the Docker container to:
1) Move to the correct folder to launch the job
2) Loop over all subjects in the directory
3) Create new directories to prepare for the BIDS formatted data
4) Launch the conversion using mnc2nii for all subjects available

Note that I composed this script in VSCode and stored it in a separate directory. This directory also needs to be mounted in Docker. The script below is available in the GitHub repository of this tutorial:

In [None]:
"""
#!/bin/bash

#First, inside the container, we need to move to the directory where the files to convert are stored. We can force this with a cd command.

cd /files_to_convert

#The loop starts below
for folders in */; do #This starts the loop. For each folder in the files_to_convert directory, do the following:
    subj="${folders%/}" #Take the name of the folder, strip the "/" and assign this value to a variable "subj" (subject)
    echo "--------------------------"
    echo "--------------------------"
    echo "Processing subject ${subj}" #This is for us: it simply echoes which subject we are currently processing in the terminal.
    echo "--------------------------"
    #We need to prepare new directories for the new BIDS formatted files. 
    #The format according to the BIDS specification is: rawdata/sub-<subject>/ses-<session>/<modality>
        #The loop will take care of the "subject" part using the $subj variable
        #We create 2 sub-directories for each subject (session 1 - baseline, and session 2 - 12 months)
        #In each sub-directory, for each session, we create an anat and a func folder
    echo "--------------------------"
    echo "Creating..."
    mkdir -p /converted_files/sub-${subj}/ses-BL00/anat
    echo "...Directory for ${subj} converted anat BL created (session 1)"
    mkdir -p /converted_files/sub-${subj}/ses-FU12/anat
    echo "...Directory for ${subj} converted anat FU12 created (session 2)"
    mkdir -p /converted_files/sub-${subj}/ses-BL00/func #The -p option creates intermediate directories too.
    echo "...Directory for ${subj} converted func BL created (session 1)"
    mkdir -p /converted_files/sub-${subj}/ses-FU12/func
    echo "...Directory for ${subj} converted func FU12 created (session 2)"

    echo " "
    echo "Starting conversion..."
    echo " "
    #We have a little problem with the visit labels. The dataset uses 2 different visit labels, either PRE or NAP. We can use the bash operator "||" to launch
    ## the same task with NAP instead of PRE if mnc2nii cannot find a PRE visit.
    echo "--------------------------------"
    echo "Conversion - Baseline Structural"
    echo "--------------------------------"

    echo "Testing if the subject has a PRE visit"
        #If the line below fails, the || operator executes the next line. Since we use line break operators, the echo and next mnc2nii commands count as a single command.
    mnc2nii -nii ${subj}/PREBL00/images/preventad_${subj}_PREBL00_t1w_001_t1w-defaced_001.mnc /converted_files/sub-${subj}/ses-BL00/anat/sub-${subj}_ses-BL00_T1w.nii || \
    ( echo "There is no PRE visit. Checking for a NAP visit..." && \
    mnc2nii -nii ${subj}/NAPBL00/images/preventad_${subj}_NAPBL00_t1w_001_t1w-defaced_001.mnc /converted_files/sub-${subj}/ses-BL00/anat/sub-${subj}_ses-BL00_T1w.nii )

    echo "--------------------------------"
    echo "Conversion - 12 months Structural"
    echo "--------------------------------"

    echo "Testing if the subject has a PRE visit"
    mnc2nii -nii ${subj}/PREFU12/images/preventad_${subj}_PREFU12_t1w_001_t1w-defaced_001.mnc /converted_files/sub-${subj}/ses-FU12/anat/sub-${subj}_ses-FU12_T1w.nii || \
    ( echo "There is no PRE visit. Checking for a NAP visit..." && \
    mnc2nii -nii ${subj}/NAPFU12/images/preventad_${subj}_NAPFU12_t1w_001_t1w-defaced_001.mnc /converted_files/sub-${subj}/ses-FU12/anat/sub-${subj}_ses-FU12_T1w.nii )

    echo "--------------------------------"
    echo "Conversion - Baseline Functional"
    echo "--------------------------------"

    echo "Testing if the subject has a PRE visit"
    mnc2nii -nii ${subj}/PREBL00/images/preventad_${subj}_PREBL00_bold_001.mnc /converted_files/sub-${subj}/ses-BL00/func/sub-${subj}_ses-BL00_task-rest_bold.nii || \
    ( echo "There is no PRE visit. Checking for a NAP visit..." ; \
    mnc2nii -nii ${subj}/NAPBL00/images/preventad_${subj}_NAPBL00_bold_001.mnc /converted_files/sub-${subj}/ses-BL00/func/sub-${subj}_ses-BL00_task-rest_bold.nii )

    echo "--------------------------------"
    echo "Conversion - 12 months Functional"
    echo "--------------------------------"

    echo "Testing if the subject has a PRE visit"
    mnc2nii -nii ${subj}/PREFU12/images/preventad_${subj}_PREFU12_bold_001.mnc /converted_files/sub-${subj}/ses-FU12/func/sub-${subj}_ses-FU12_task-rest_bold.nii || \
    ( echo "There is no PRE visit. Checking for a NAP visit..." ; \
    mnc2nii -nii ${subj}/NAPFU12/images/preventad_${subj}_NAPFU12_bold_001.mnc /converted_files/sub-${subj}/ses-FU12/func/sub-${subj}_ses-FU12_task-rest_bold.nii )
    
    echo "Conversion done for subject ${subj}! "
    echo " "
done #Don't forget to tell the loop to close!
"""
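
The naming logic of the script can also be written down as a small mapping, which makes it easier to check the source and target paths before launching any conversion. The sketch below is an illustration only: the names `SCANS`, `source_candidates`, `bids_target` and `find_source` are hypothetical, and unlike the `||` in the bash script (which falls back to NAP whenever mnc2nii fails for any reason), `find_source` simply checks which file exists.

```python
from pathlib import Path

# Scan types used in this tutorial: (source file suffix, BIDS folder, BIDS suffix)
SCANS = {
    "T1w": ("t1w_001_t1w-defaced_001", "anat", "T1w"),
    "bold": ("bold_001", "func", "task-rest_bold"),
}
VISIT_PREFIXES = ("PRE", "NAP")  # tried in this order, as in the script above


def source_candidates(subj, session, scan, root="/files_to_convert"):
    """List the possible .mnc paths for one scan, PRE visit first, then NAP."""
    source_suffix = SCANS[scan][0]
    return [
        Path(root) / subj / f"{prefix}{session}" / "images"
        / f"preventad_{subj}_{prefix}{session}_{source_suffix}.mnc"
        for prefix in VISIT_PREFIXES
    ]


def bids_target(subj, session, scan, root="/converted_files"):
    """Return the BIDS-style output path for one scan."""
    _, folder, bids_suffix = SCANS[scan]
    return (Path(root) / f"sub-{subj}" / f"ses-{session}" / folder
            / f"sub-{subj}_ses-{session}_{bids_suffix}.nii")


def find_source(subj, session, scan, root="/files_to_convert"):
    """Return the first candidate that exists on disk, or None if neither label is present."""
    for candidate in source_candidates(subj, session, scan, root):
        if candidate.exists():
            return candidate
    return None


# Show the first candidate and the target path for one subject
print(source_candidates("1004359", "BL00", "T1w")[0])
print(bids_target("1004359", "BL00", "bold"))
```

Keeping the naming rules in one table (`SCANS`) means a new scan type only needs one extra entry instead of another copy of the conversion block.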

For this script, I would recommend launching the Docker image this way:

In [None]:
!docker run -it --rm \
    -w="/script_conversion" \
    -v /Users/stong3/Desktop/bhs_project/sourcedata:/files_to_convert:ro \
    -v /Users/stong3/Desktop/bhs_project/rawdata:/converted_files \
    -v /Users/stong3/Desktop/BHS2020_Project/01_script_conversion_mnc2bids:/script_conversion \
    nistmni/minc-toolkit

For some reason, the script would not launch from outside the container. As such, I ran the Docker command above and then I simply ran ```sh mnc2bids.sh```, after telling Docker to start its run in the 'script_conversion' folder, which contained the mnc2bids script from the mount.

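Ultimately, after the conversion ran, I thought I was ready. The conversion gave us a nice BIDS-like folder in "rawdata", with one ```sub-<subject>``` folder per participant, each containing the ```ses-BL00``` and ```ses-FU12``` sessions with their ```anat``` and ```func``` folders (I created the dataset_description.json manually).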
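
The dataset_description.json file can also be generated with a few lines of Python instead of by hand. This is a minimal sketch: `write_dataset_description` is a hypothetical helper, and it only writes the two fields that BIDS requires in this file (`Name` and `BIDSVersion`). The values passed to it are up to you.

```python
import json
from pathlib import Path


def write_dataset_description(bids_root, name, bids_version):
    """Write a minimal dataset_description.json with the required Name and BIDSVersion fields."""
    description = {"Name": name, "BIDSVersion": bids_version}
    target = Path(bids_root) / "dataset_description.json"
    target.write_text(json.dumps(description, indent=4) + "\n")
    return target
```

For example, `write_dataset_description("/Users/stong3/Desktop/bhs_project/rawdata", "PREVENT-AD subset", "1.4.0")` would write the file at the root of the converted dataset; the dataset name and version string here are placeholders, not the values I used.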

#### 1.6 - BIDS validation

It is in this step that things started to derail a little bit. I first decided to use ```pybids``` to check if everything was in order. Using pybids, I was able to obtain the subjects, sessions and tasks for my dataset. But ```get_collections``` was not working.

In [5]:
from bids import BIDSLayout
from bids.reports import BIDSReport
layout = BIDSLayout('/Users/stong3/Desktop/bhs_project/rawdata')
report = BIDSReport(layout)


subjects = layout.get_subjects()
print(subjects)
sessions = layout.get_sessions()
print(sessions)
tasks = layout.get_tasks()
print(tasks)
collections = layout.get_collections(level=subjects, extension='nii')[0].filename
print(collections)

['1004359', '1016072', '1072774', '1076159', '1154932', '1176949', '1177880', '1284264', '1322140', '1346022']
['BL00', 'FU12']
['rest']


KeyError: '1004359'

So I simply decided to run the bids-validator.