# Preprocessing notebook

Steps include:

1. Downloading OASIS3 data in BIDS format
2. Fixing naming errors in the data
3. Preprocessing data with the Clinica pipeline
4. Tensor extraction of 2D slices
5. Quality check of preprocessed data

### Reference __[notebook](https://colab.research.google.com/github/aramis-lab/Notebooks-AD-DL/blob/master/preprocessing.ipynb#scrollTo=DZ6LPnlGgeb2)__

### 1. Download data in BIDS format
First, we need to download the OASIS data from XNAT using a script. The guide and the script are available __[here](https://github.com/NrgXnat/oasis-scripts#detailed-instructions-on-how-to-run-these-scripts)__.

We need to:
 1. Extract the 'MR ID' column from the MR session table and save it in a CSV file. Keep only the IDs you wish to download.
 2. Save the script 'download_oasis_scans_bids.sh' in the same folder as the ID CSV file.
 3. Create a target folder for the BIDS data.
 4. Finally, run in a terminal: `./download_oasis_scans_bids.sh <id_file.csv> <directory_name_to_store_data> <xnat_username> <scan_type>`.

In [4]:
# CD to dir with download script and run the following:
#!./download_oasis_scans_bids.sh id_.csv ./bids_data Hecter T1w
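
The ID file from step 1 can also be produced in Python instead of by hand. The following is a minimal sketch, assuming the MR session table has been exported as a CSV file with an 'MR ID' header. The function name `extract_mr_ids`, the `keep` filter, the header row in the output and the Unix line endings are assumptions; check the guide for the exact file layout the download script expects.

```python
import csv


def extract_mr_ids(session_table_csv, id_file_csv, keep=None, column="MR ID"):
    """Copy one column of the session table into a one-column CSV file.

    `keep` is an optional function that returns True for IDs to download.
    """
    with open(session_table_csv, newline="") as src:
        ids = [row[column] for row in csv.DictReader(src)]

    if keep is not None:
        ids = [mr_id for mr_id in ids if keep(mr_id)]

    with open(id_file_csv, "w", newline="") as dst:
        # Assumption: the download script wants Unix line endings and a header row
        writer = csv.writer(dst, lineterminator="\n")
        writer.writerow([column])
        writer.writerows([mr_id] for mr_id in ids)
    return ids


# Example call (hypothetical file names):
# extract_mr_ids("mr_sessions.csv", "id_.csv")
```

Passing a `keep` function, for example `lambda mr_id: mr_id.startswith("OAS30001")`, restricts the download list to the sessions of interest.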

### 2. Fix naming errors in directory
The preprocessing pipeline depends on BIDS naming conventions, so naming errors in the data directory must be corrected before it is run. Two errors are corrected:
- 'sess' needs to be replaced with 'ses'.
- Some sessions include two runs, resulting in two images for the same subject in the same session. In that case, one run is deleted, as only one image per session is expected.

In [None]:
import os

data_folder = "/home/hecter/OneDrive/6_semester/Project/Data/script_download/rename_script/data"   # Insert path to dir

# Pass 1: replace the 'sess' label with 'ses' in all file names
for root, dirs, files in os.walk(data_folder):
    for file_name in files:
        if 'sess' in file_name:
            new_file_name = file_name.replace('sess', 'ses')
            os.rename(os.path.join(root, file_name), os.path.join(root, new_file_name))

# Pass 2: where a session holds both run-01 and run-02, delete run-01.
# A run-01 file without a run-02 counterpart is the only image and is kept.
for root, dirs, files in os.walk(data_folder):
    for file_name in files:
        if 'run-01' in file_name and file_name.replace('run-01', 'run-02') in files:
            os.remove(os.path.join(root, file_name))
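
Because renaming and deleting are destructive, it can help to preview the changes first. The sketch below applies the two rules described above to a plain list of file names without touching the disk. It deletes a run-01 file only when a matching run-02 file exists. The function name `plan_fixes` and the example file names are hypothetical.

```python
def plan_fixes(file_names):
    """Return (renames, deletions) for the file names of one directory, as a dry run."""
    # Rule 1: 'sess' becomes 'ses'
    renamed = [name.replace("sess", "ses") for name in file_names]
    renames = {old: new for old, new in zip(file_names, renamed) if old != new}

    # Rule 2: drop run-01 only when the same session also has a run-02 file
    present = set(renamed)
    deletions = sorted(
        name for name in renamed
        if "run-01" in name and name.replace("run-01", "run-02") in present
    )
    return renames, deletions


# Hypothetical file names for illustration
example = [
    "sub-OAS30001_sess-d0129_run-01_T1w.nii.gz",
    "sub-OAS30001_sess-d0129_run-02_T1w.nii.gz",
    "sub-OAS30002_sess-d0371_run-01_T1w.nii.gz",
]
renames, deletions = plan_fixes(example)
print(deletions)
```

Here only the run-01 file of the first subject is marked for deletion, since the second subject has a single run.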

### 3. Preprocessing with Clinica pipeline

Clinica depends on ANTs. An installation guide is available __[here](https://github.com/ANTsX/ANTs/wiki/Compiling-ANTs-on-Linux-and-Mac-OS)__. Once ANTs is installed, run the command:
`clinica run t1-linear <bids_data_dir> <target_dir> --n_procs <no_of_cores>` (__[Clinica](https://aramislab.paris.inria.fr/clinica/docs/public/latest/Pipelines/T1_Linear/)__).

In [3]:
# Set the ANTs path variables in every new shell session before running Clinica

#!export ANTSPATH=/opt/ANTs/bin/
#!export PATH=${ANTSPATH}:$PATH

#!clinica run t1-linear ./bids_data ./output_data --n_procs 7
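
Each `!` line in a notebook runs in its own subshell, so an `export` issued there does not carry over to later cells. One way around this is to set the variables from Python through `os.environ`, which child processes started by the notebook inherit. The sketch below uses the ANTs location from the cell above. The helper name `add_ants_to_path` is hypothetical.

```python
import os


def add_ants_to_path(ants_bin="/opt/ANTs/bin/", env=os.environ):
    """Set ANTSPATH and put the ANTs binaries first on PATH (no duplicates)."""
    env["ANTSPATH"] = ants_bin
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if ants_bin not in parts:
        env["PATH"] = os.pathsep.join([ants_bin] + parts)
    return env["PATH"]


# Run once per kernel session, before any `!clinica ...` cell
add_ants_to_path()
```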

### 4. Tensor extraction of 2D slices

Next, the preprocessed images are converted to tensors suitable as input to PyTorch deep learning models. In addition, each brain volume is split into 2D slices. To do so, run the following command: <br> `clinicadl prepare-data [image|patch|slice|roi] [OPTIONS] <post_t1-processing_data> t1-linear` (__[clinicadl](https://clinicadl.readthedocs.io/en/latest/Preprocessing/Extract/)__). <br> <br>
Relevant `[OPTIONS]`: `--save_features`, `--slice_direction`, `--discarded_slices`, `--slice_mode`.

In [5]:
# Extract tensors along the axial plane (slice_direction 2), keeping 5 slices of each brain MRI volume
#!clinicadl prepare-data slice --save_features --slice_direction 2 --discarded_slices 87 ./data t1-linear
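
The number of slices kept follows from the volume size and the `--discarded_slices` value. The sketch below works through that arithmetic under two assumptions that should be checked against the clinicadl documentation: the t1-linear output volume has shape 169 x 208 x 179, and the given number of slices is discarded from both ends of the chosen axis. The helper name `remaining_slices` is hypothetical.

```python
def remaining_slices(volume_shape, slice_direction, discarded_slices):
    """Slices left after discarding `discarded_slices` from each end of an axis."""
    total = volume_shape[slice_direction]
    kept = total - 2 * discarded_slices
    if kept <= 0:
        raise ValueError("discarded_slices removes every slice along this axis")
    return kept


# Assumed shape of a cropped t1-linear volume (sagittal, coronal, axial)
T1_LINEAR_SHAPE = (169, 208, 179)

print(remaining_slices(T1_LINEAR_SHAPE, slice_direction=2, discarded_slices=87))  # → 5
```

With these assumptions, 179 - 2 * 87 = 5, which matches the five axial slices mentioned in the cell above.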

### 5. Quality check of preprocessed data

As a final measure, all processed images are quality checked. The quality check returns a TSV file with the pass probability of each image and whether the image passed the threshold. Images with a low pass probability should be excluded from further analysis. Use the following command: <br>
`clinicadl quality-check t1-linear [OPTIONS] <post_t1-processing_data> <output.tsv>` __[clinicadl](https://clinicadl.readthedocs.io/en/latest/Preprocessing/QualityCheck/)__ <br> <br>
Relevant `[OPTIONS]`: `--use_tensor`, `--gpu/--no-gpu`

In [6]:
#!clinicadl quality-check t1-linear --no-gpu ./t1_data QC_result.tsv
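
Once the TSV file exists, the images to keep can be selected in Python. This is a minimal sketch, assuming the file is tab-separated and has a column named `pass_probability`. The column name, the 0.5 threshold and the function name `load_passed` are assumptions, so inspect the header of your own `QC_result.tsv` first.

```python
import csv


def load_passed(qc_tsv, threshold=0.5, prob_col="pass_probability"):
    """Return the rows of the QC file whose pass probability meets the threshold."""
    with open(qc_tsv, newline="") as f:
        rows = list(csv.DictReader(f, delimiter="\t"))
    return [row for row in rows if float(row[prob_col]) >= threshold]


# Example call on the file produced above:
# passed = load_passed("QC_result.tsv")
# print(len(passed), "images passed the quality check")
```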