# Preprocessing

For traditional statistical analysis there exist established preprocessing pipelines such as
- [FreeSurfer](https://surfer.nmr.mgh.harvard.edu/)
- [CAT12](https://neuro-jena.github.io/cat/)
- [SPM](https://www.fil.ion.ucl.ac.uk/spm/)

These pipelines align NIfTI images through a series of operations (see image below), so that a simple regression analysis can be run at each voxel position.


<p>
<img src="https://neuro-jena.github.io/cat12-help/images/cat_processing_steps.png" width=600/>
<figcaption>Taken from <a href="https://neuro-jena.github.io/cat12-help/images/cat_processing_steps.png">https://neuro-jena.github.io/cat12-help/images/cat_processing_steps.png</a></figcaption>
</p>

Luckily, neural networks are more flexible than simple regression methods and can make high-precision predictions without rigorous alignment.

Therefore (and for brevity), we apply only minimal preprocessing.

## 1. Brain Extraction
As we are only interested in the brain of the participants, we first have to set all voxels which are not inside the brain to 0.

Luckily, there exists a neural-network-based tool called [deepbet](https://github.com/wwu-mmll/deepbet) which enables brain extraction via:

```python
from deepbet import run_bet

# One entry per subject; all three lists must be in the same order
input_paths = ['path/to/sub_1/t1.nii.gz', 'path/to/sub_2/t1.nii.gz']
brain_paths = ['path/to/sub_1/brain.nii.gz', 'path/to/sub_2/brain.nii.gz']
mask_paths = ['path/to/sub_1/mask.nii.gz', 'path/to/sub_2/mask.nii.gz']

# Write the skull-stripped images and the binary brain masks
run_bet(input_paths, brain_paths, mask_paths)
```
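
Conceptually, brain extraction keeps the intensities inside a binary brain mask and sets everything else to 0. The sketch below illustrates this on a tiny synthetic array; the values and the mask are made up for illustration and are not produced by deepbet.

```python
import numpy as np

# Synthetic 3x3x3 "image" with non-zero intensities everywhere (illustrative only)
image = np.arange(1, 28, dtype=float).reshape(3, 3, 3)

# Hypothetical binary mask: only the central voxel counts as "brain"
mask = np.zeros_like(image, dtype=bool)
mask[1, 1, 1] = True

# Keep intensities inside the mask and set all other voxels to 0
brain = np.where(mask, image, 0.0)
```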


________________
### Enable GPU on Colab 🔥

Like most neural-network-based tools, deepbet can run on a GPU, which speeds it up compared to normal CPU execution 💨

Thankfully, Colab allows you to use a GPU for free 🍀

To enable the GPU, simply click on
- *Runtime*
- *Change runtime type*
- *T4 GPU*
________________
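
After switching the runtime, it can be useful to confirm that a GPU is actually visible. The following is a minimal sketch, assuming that GPU runtimes ship the NVIDIA command-line tool `nvidia-smi`; `gpu_visible` is a hypothetical helper and not part of deepbet.

```python
import shutil
import subprocess

def gpu_visible():
    """Return True if the `nvidia-smi` tool exists and exits successfully."""
    # Assumption: a usable NVIDIA GPU comes with a working `nvidia-smi` binary
    if shutil.which('nvidia-smi') is None:
        return False
    result = subprocess.run(['nvidia-smi'], capture_output=True)
    return result.returncode == 0

print('GPU visible:', gpu_visible())
```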

# Exercise 1

## 🚨 Warning 🚨

This Notebook builds on 1_Introduction and the Exercise of 2_Data_Exploration.

You have to run both of them first (if you haven't already) and mount your Google Drive in this Notebook via
```python
from google.colab import drive
drive.mount('/content/drive')
```
then you are ready to go!

1. Install `deepbet`

2. Load the DataFrame you saved at the end of the exercise in 2_Data_Exploration


3. Create two new columns called `brain_filepath` and `mask_filepath` containing the output filepaths needed for the `run_bet` function

The output filepaths should follow this pattern:
- the `t1w_filepath` "...ds000001/sub-XY/anat/sub-XY_T1w.nii.gz"

results in

- the `brain_filepath`: "...ds000001/derivatives/sub-XY/anat/sub-XY_deepbet-brain_T1w.nii.gz"
- the `mask_filepath`: "...ds000001/derivatives/sub-XY/anat/sub-XY_deepbet-mask_T1w.nii.gz"

4. Save the DataFrame to "drive/MyDrive/openneuro/dataframe_after_deepbet.csv"

5. Create the needed "...ds000001/derivatives/sub-XY/anat/" directories using `Path`

6. Run brain extraction via `run_bet` using the filepath columns from 3.
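
As a hint for steps 3 and 5, one possible way to derive an output path and create its directory with `pathlib` is sketched below for a single file. `derivative_path` is a hypothetical helper, and the temporary directory is only a stand-in for wherever your copy of the dataset lives.

```python
from pathlib import Path
import tempfile

def derivative_path(t1w_filepath, label):
    """Map .../ds000001/sub-XY/anat/sub-XY_T1w.nii.gz to the derivatives pattern."""
    t1w = Path(t1w_filepath)
    subject = t1w.parents[1].name   # e.g. 'sub-01'
    dataset_dir = t1w.parents[2]    # the '.../ds000001' directory
    filename = t1w.name.replace('_T1w', f'_deepbet-{label}_T1w')
    return dataset_dir / 'derivatives' / subject / 'anat' / filename

# Hypothetical location, used here only so the example is runnable
root = Path(tempfile.mkdtemp())
t1w = root / 'ds000001' / 'sub-01' / 'anat' / 'sub-01_T1w.nii.gz'

brain = derivative_path(t1w, 'brain')
mask = derivative_path(t1w, 'mask')

# Create the output directory, including missing parents; no error if it exists
brain.parent.mkdir(parents=True, exist_ok=True)
```

The same function could be applied to a whole DataFrame column, for example with `df.t1w_filepath.apply(...)`.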

## 2. Intensity Normalization

Intensity normalization is a non-trivial preprocessing step, and there is no single established best way to apply it ([here](https://github.com/jcreinhold/intensity-normalization) is a detailed resource).

Again for brevity, we will only discuss two commonly used normalization techniques, which are also used for non-image data.

### 2.1 Min-max Normalization

Min-max normalization scales the values of an array to the range between 0 and 1:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
x_min = x.min()
x_max = x.max()
x_normalized = (x - x_min) / (x_max - x_min)
```
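
The formula divides by `x_max - x_min`, which is 0 when all values are identical. The sketch below wraps the same computation in a hypothetical helper, `min_max_normalize`, that guards against this case; returning all zeros for a constant array is an assumption made here, not an established convention.

```python
import numpy as np

def min_max_normalize(x):
    """Scale x to the range [0, 1]; return zeros if all values are identical."""
    x = np.asarray(x, dtype=float)
    value_range = x.max() - x.min()
    if value_range == 0:
        # Assumption: map a constant array to all zeros instead of dividing by 0
        return np.zeros_like(x)
    return (x - x.min()) / value_range

print(min_max_normalize([1, 2, 3, 4, 5]))  # values 0, 0.25, 0.5, 0.75, 1
print(min_max_normalize([7, 7, 7]))        # all zeros
```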

**Task 2.1:** What are the pros and cons of min-max normalization?

### 2.2 Z-score Normalization

Z-score normalization, also known as standardization, scales the values of an array to have a mean of 0 and a standard deviation of 1:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
mu = x.mean()
std = x.std()
x_normalized = (x - mu) / std
```
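
A quick check confirms that the result has the stated properties. Note that `x.std()` computes the population standard deviation by default (`ddof=0`); passing `ddof=1` gives the sample standard deviation instead, which changes the scaling slightly.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
x_normalized = (x - x.mean()) / x.std()

# The normalized array has mean 0 and (population) standard deviation 1
print(np.isclose(x_normalized.mean(), 0.0))
print(np.isclose(x_normalized.std(), 1.0))

# Population (ddof=0) vs. sample (ddof=1) standard deviation of the input
print(x.std(), x.std(ddof=1))
```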

**Task 2.2:** What are the pros and cons of Z-score normalization?

# Exercise 2
1. Load the DataFrame from Exercise 1

2. Add two columns `brain_minmax_filepath` and `brain_zscore_filepath` containing the output filepaths. These should be in the same folder as `df.brain_filepath` and called "sub-XY_deepbet-brain-minmax_T1w.nii.gz" and "sub-XY_deepbet-brain-zscore_T1w.nii.gz", respectively.

3. Save the DataFrame to "drive/MyDrive/openneuro/dataframe_after_preprocessing.csv"

4. For each NIfTI file in the `brain_filepath` column, apply min-max and Z-score normalization and save the results to the corresponding `brain_minmax_filepath` / `brain_zscore_filepath` output *path*.
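
One design question the exercise leaves open is which voxels should enter the statistics. After brain extraction most voxels are 0, so the background can dominate the mean and standard deviation. The sketch below shows one option, assuming the background is exactly 0: compute the Z-score over brain voxels only and leave the background untouched. `zscore_brain` is a hypothetical helper operating on a synthetic array; loading and saving the actual images is left out.

```python
import numpy as np

def zscore_brain(volume):
    """Z-score the non-zero (brain) voxels of a volume; background stays 0."""
    volume = np.asarray(volume, dtype=float)
    brain = volume != 0  # assumption: background voxels are exactly 0
    out = np.zeros_like(volume)
    values = volume[brain]
    out[brain] = (values - values.mean()) / values.std()
    return out

# Synthetic stand-in for a skull-stripped image: a small "brain" cube in a 4x4x4 volume
volume = np.zeros((4, 4, 4))
volume[1:3, 1:3, 1:3] = np.arange(1, 9).reshape(2, 2, 2)

normalized = zscore_brain(volume)
```

Whether to normalize over the whole image or over brain voxels only is a judgment call; comparing both on a few subjects is a reasonable way to decide.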