<img src="https://www.epfl.ch/about/overview/wp-content/uploads/2020/07/logo-epfl-1024x576.png" style="padding-right:10px;width:140px;float:left">
<h2 style="white-space: nowrap">Neural Signals and Signal Processing (NX-421)</h2>
<hr style="clear:both"></hr>

Welcome to the NX-421 class! This week, you will become more familiar with the preprocessing steps typically conducted in MRI analysis, specifically for anatomical MRI!

In [1]:
%gui wx
import sys
import os

#####################
# Import of utils.py functions
#####################
# Required to get utils.py and access its functions
notebook_dir = os.path.abspath("")
parent_dir = os.path.abspath(os.path.join(notebook_dir, '..'))
sys.path.append(parent_dir)
sys.path.append('.')
from utils import loadFSL, FSLeyesServer, mkdir_no_exist, interactive_MCQ

####################
# DIPY_HOME should be set prior to import of dipy to make sure all downloads point to the right folder
####################
os.environ["DIPY_HOME"] = "/home/jovyan/Data"


#############################
# Loading fsl and freesurfer within Neurodesk
# You can find the list of available other modules by clicking on the "Softwares" tab on the left
#############################
import lmod
await lmod.purge(force=True)
await lmod.load('fsl/6.0.7.4')
await lmod.load('freesurfer/7.4.1')
await lmod.list()

####################
# Setup FSL path
####################
loadFSL()

###################
# Load all relevant libraries for the lab
##################
import fsl.wrappers
from fsl.wrappers import fslmaths

import nilearn
from nilearn.datasets import fetch_development_fmri

import mne
import mne_nirs
import dipy
from dipy.data import fetch_bundles_2_subjects, read_bundles_2_subjects
import xml.etree.ElementTree as ET
import os.path as op
import nibabel as nib
import glob

import ants

import openneuro
from mne.datasets import sample
from mne_bids import BIDSPath, read_raw_bids, print_dir_tree, make_report


# Useful imports to define the direct download function below
import requests
import urllib.request
from tqdm import tqdm


# FSL function wrappers which we will call from python directly
from fsl.wrappers import fast, bet
from fsl.wrappers.misc import fslroi
from fsl.wrappers import flirt

# General purpose imports to handle paths, files etc
import pandas as pd
import numpy as np
import json
import subprocess



In [None]:
################
# Start FSLeyes (very neat tool to visualize MRI data of all sorts) within Python
################
fsleyesDisplay = FSLeyesServer()
fsleyesDisplay.show()




In [None]:
class DownloadProgressBar(tqdm):
    def update_to(self, b=1, bsize=1, tsize=None):
        if tsize is not None:
            self.total = tsize
        self.update(b * bsize - self.n)

def download_url(url, output_path):
    with DownloadProgressBar(unit='B', unit_scale=True,
                             miniters=1, desc=url.split('/')[-1]) as t:
        urllib.request.urlretrieve(url, filename=output_path, reporthook=t.update_to)

def direct_file_download_open_neuro(file_list, file_types, dataset_id, dataset_version, save_dirs):
    # https://openneuro.org/crn/datasets/ds004226/snapshots/1.0.0/files/sub-001:sub-001_scans.tsv
    for i, n in enumerate(file_list):
        subject = n.split('_')[0]
        download_link = 'https://openneuro.org/crn/datasets/{}/snapshots/{}/files/{}:{}:{}'.format(dataset_id, dataset_version, subject, file_types[i],n)
        print('Attempting download from ', download_link)
        download_url(download_link, op.join(save_dirs[i], n))
        print('Ok')
        
def get_json_from_file(fname):
    # Use a context manager so the file is closed even if parsing fails
    with open(fname) as f:
        return json.load(f)

In this lab, we will have you look at some of the essential anatomical preprocessing steps that should be conducted before any analysis. 

These steps are critical: their aim is to eliminate noise and ensure that our data is in a form on which we can conduct reliable analysis. As an analogy, if you were to examine pictures of the universe to analyze distant stars, you would have to remove the Milky Way's own light before you could examine the light of stars outside our galaxy. This is a preprocessing step. In MRI, you will find that we need to apply several corrections to our input signal before we can analyze anything reliably.

<div class="warning" style='background-color:#805AD5; color: #90EE90; border-left: solid #805AD5 4px; border-radius: 4px; padding:0.7em;'>
<span>
<p style='margin-top:1em; text-align:center'>
<b>⚠️ Preprocessing warnings ⚠️</b></p>
<p style='text-indent: 10px;'>
The steps of preprocessing you have seen in class are dependent on the analysis you wish to conduct.
In particular, there is not yet a clear consensus on the order in which some steps should be applied, although said order is known to impact subsequent analysis.
As an example, we will teach you how to put your subjects in a common space, the MNI space through a step of <b>normalization</b> (more on that later!), such that subjects can be compared at the group-level. Assume now that for the purpose of your analysis, you are interested only in a single subject, or that the method of choice should be at an individual level for your application. Clearly the normalization is superfluous in this case.
You should also know that most steps have parameters to set. These parameters are often set at the population level. Such a choice means the preprocessing won't be optimal for each subject.
<br>
</p></span>
</div>
<br>
<div class="warning" style='background-color:#90EE90; color: #805AD5; border-left: solid #805AD5 4px; border-radius: 4px; padding:0.7em;'>
<span>
<p style='margin-top:1em; text-align:center'>
<b>Preprocessing pipelines</b></p>
<p style='text-indent: 10px;'>
We will teach you how to perform each step individually, so that you get a precise understanding of each step's purpose. In real-world applications, there exist dedicated software packages to do so, such as <a href="https://fmriprep.org/en/stable/">fMRIPrep</a>. These packages, however, take a long time to run (fMRIPrep would take you the whole day for one participant).
<br>
</p></span>
</div>
<br>
<div class="warning" style='background-color:#C1ECFA; color: #112A46; border-left: solid darkblue 4px; border-radius: 4px; padding:0.7em;'>
<span>
<p style='margin-top:1em; text-align:center'>
<b>💡 Quality Control (QC): the 1 - 10 - 100 dollar rule 💡</b></p>
<p style='text-indent: 10px;'>
The 1-10-100 rule states that it takes 1 dollar to verify and correct data at the start, 10 dollars to identify and clean data after the fact and 100 dollars to correct a failure due to bad data.

In preprocessing this is especially true. It will take you much less effort to first look at your data, detect what might be wrong from the start and deal with it, rather than apply everything blindly and notice after the fact that something went wrong. Always, ALWAYS look at your data before <u>anything and any analysis</u>. A surgeon always looks at the patient before operating, and you should do the same: you are surgeons to your dataset, so please look at it carefully (it's craving attention, the poor thing 💔).
These intermediary steps of controlling the health of your dataset are called quality control steps. You're doing exactly what you might expect to do: you check the quality of your data before and after a given step, to ensure that nothing went awry for example.
<br>
</p></span>
</div>

# 0. Getting a dataset

Before doing any analysis, we need to have data to analyse. There are many publicly available datasets in the wild, and even more ways to share them. We will show you first how to get data easily.

## 0.1 The BIDS standard and baby's first dataset

To avoid a proliferation of formats and dataset organisations, which would make it a pain to reproduce articles and share results, the MRI community came up with a systematic way to organize MRI datasets: the BIDS standard, a set of guidelines on how files should be named, in which folders they should be placed, and so forth.

The BIDS standard is, at the core, a set of rules helping you to format a dataset such that other researchers in neuroimaging can reuse your data with the smallest overhead possible. It is a way to unify how files and acquisitions are organized in folders.

This unification comes with several advantages including many tools that ease our life.
Indeed, if your dataset is in such a format, it is very easy to conduct any analysis: your scripts will expect a specific structure, so you don't need to play a million times with paths, for example. A similar analysis can then be run on many different datasets, provided they follow the BIDS standard (we say they are <b>BIDS compliant</b> when they do).
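
To make the "no more playing with paths" point concrete, here is a minimal sketch of how a script can rebuild a file's location from a handful of labels. The function name `bids_relative_path` and the `ENTITY_ORDER` tuple are made up for this illustration, and only the few entities that appear in this lab are handled; the real BIDS specification defines many more entities and rules.

```python
from pathlib import PurePosixPath

# Order in which the entities used in this lab appear in a filename.
# Assumption: this is only a small subset of what the BIDS specification defines.
ENTITY_ORDER = ("task", "acq", "dir", "run")


def bids_relative_path(subject, datatype, suffix, extension=".nii.gz", **entities):
    """Build the path of a file relative to the BIDS root (simplified sketch)."""
    unknown = set(entities) - set(ENTITY_ORDER)
    if unknown:
        raise ValueError("Unsupported entities: {}".format(sorted(unknown)))
    # Filenames are key-value pairs joined by underscores, ending with the suffix
    parts = ["sub-{}".format(subject)]
    parts += ["{}-{}".format(key, entities[key]) for key in ENTITY_ORDER if key in entities]
    filename = "_".join(parts + [suffix]) + extension
    return str(PurePosixPath("sub-{}".format(subject), datatype, filename))


print(bids_relative_path("001", "anat", "T1w"))
print(bids_relative_path("001", "func", "bold", task="sitrep", run="01"))
print(bids_relative_path("001", "fmap", "epi", acq="task", dir="AP"))
```

Because every BIDS compliant dataset follows the same naming scheme, the same three calls would point to the corresponding files of any subject in any such dataset, provided those files exist.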

A whole software ecosystem has evolved around this standard, including tools that enable us to load datasets that are BIDS compliant.

This is what we show you in the cell below. On <a href="https://openneuro.org/">OpenNeuro</a>, a platform hosting BIDS datasets, each dataset has an ID (its accession number) to identify it. With a tool such as openneuro-py, it is very easy to download such a dataset.

Run the cell below.

In [None]:
dataset_id = 'ds004226'
subject = '001' 

sample_path = "/home/jovyan/Data/dataset"
mkdir_no_exist(sample_path)
bids_root = op.join(sample_path, dataset_id)
deriv_root = op.join(bids_root, 'derivatives')
preproc_root = op.join(bids_root, 'derivatives','preprocessed_data')

mkdir_no_exist(bids_root)

subject_dir = 'sub-{}'.format(subject)

###################
# Openneuro download.
###################
subprocess.run(["openneuro-py", "download", "--dataset", dataset_id, # Openneuro has for each dataset a unique identifier
                "--target-dir", bids_root,  # The path where we want to save our data. You should save your data under /home/jovyan/Data/[your dataset ID] to be 100% fool-proof
                "--include", op.join(subject_dir, 'anat','*'),# We are asking to get all files within the subject_dir/anat folder by using the wildcard *
               "--include", op.join(subject_dir, 'func','sub-{}_task-sitrep_run-01_bold.*'.format(subject)),
               ], check=True)

###################
# Create folders relevant for preprocessing.
# In BIDS, ANYTHING we modify must go in the derivatives folder, to keep original files clean in case we make a mistake.
###################
mkdir_no_exist(op.join(bids_root, 'derivatives'))
preproc_root = op.join(bids_root, 'derivatives','preprocessed_data')
mkdir_no_exist(preproc_root)
mkdir_no_exist(op.join(preproc_root, 'sub-001'))
mkdir_no_exist(op.join(preproc_root, 'sub-001', 'anat'))
mkdir_no_exist(op.join(preproc_root, 'sub-001', 'func'))
mkdir_no_exist(op.join(preproc_root, 'sub-001', 'fmap'))


👋 Hello! This is openneuro-py 2024.2.0. Great to see you! 🤗

   👉 Please report problems 🤯 and bugs 🪲 at
      https://github.com/hoechenberger/openneuro-py/issues

🌍 Preparing to download ds004226 …


📁 Traversing directories for ds004226 : 1042 entities [00:31, 32.62 entities/s]


📥 Retrieving up to 5 files (5 concurrent downloads). 


                                                                                                                

✅ Finished downloading ds004226.
 
🧠 Please enjoy your brains.
 


                                                                                                                

Be mindful of waiting properly when downloading a dataset: you should avoid opening a file until it is fully downloaded, otherwise you have a high chance of corrupting it!
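
If you are unsure whether a `.nii.gz` file was downloaded completely, one cheap check is to try decompressing it to the end. The sketch below is only an illustration: the helper name `is_complete_gzip` is made up, the demonstration runs on a throwaway file instead of real MRI data, and the check only tells you that the gzip stream is intact, not that the image inside is valid.

```python
import gzip
import os
import tempfile
import zlib


def is_complete_gzip(path, chunk_size=1024 * 1024):
    """Return True if the gzip file at `path` can be decompressed to the end."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read chunk by chunk so large images do not need to fit in memory
            while handle.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        return False
    return True


# Demonstration on a throwaway file: a full copy passes, a truncated copy fails.
with tempfile.TemporaryDirectory() as tmp_dir:
    full_path = os.path.join(tmp_dir, "full.nii.gz")
    with gzip.open(full_path, "wb") as handle:
        handle.write(os.urandom(100_000))

    with open(full_path, "rb") as handle:
        payload = handle.read()
    partial_path = os.path.join(tmp_dir, "partial.nii.gz")
    with open(partial_path, "wb") as handle:
        handle.write(payload[: len(payload) // 2])  # simulate an interrupted download

    full_ok = is_complete_gzip(full_path)
    partial_ok = is_complete_gzip(partial_path)

print("full file readable:", full_ok)
print("truncated file readable:", partial_ok)
```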

As you can see from the command above, what we have done is:
- Asked to download dataset with ID ds004226
- Asked to include only the folders of subject sub-001
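- In these folders, we took all files in the anat/ folder (anatomical files) and the BOLD files of the first run (run-01) in func/ (functional folder). Field maps (fmap/, which you will see in the coming weeks) and the other functional runs need their own `--include` patterns.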
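
The same call can be assembled programmatically, which makes it easy to add or remove `--include` patterns. This is a sketch only: the helper `build_openneuro_command` is hypothetical, it reuses just the `--dataset`, `--target-dir` and `--include` options from the cell above, and the `sub-001/fmap/*` pattern is an illustrative addition that was not part of the original command.

```python
def build_openneuro_command(dataset_id, target_dir, include_patterns):
    """Assemble the openneuro-py call as an argument list for subprocess.run."""
    command = ["openneuro-py", "download", "--dataset", dataset_id, "--target-dir", target_dir]
    # One --include option per pattern, exactly as in the cell above
    for pattern in include_patterns:
        command += ["--include", pattern]
    return command


command = build_openneuro_command(
    dataset_id="ds004226",
    target_dir="/home/jovyan/Data/dataset/ds004226",
    include_patterns=[
        "sub-001/anat/*",
        "sub-001/func/sub-001_task-sitrep_run-01_bold.*",
        "sub-001/fmap/*",  # illustrative extra pattern for the field maps
    ],
)
print(" ".join(command))
# To actually download: import subprocess; subprocess.run(command, check=True)
```

Printing the command before running it is a small quality-control step of its own: you can check what will be fetched before spending bandwidth on it.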


Once the download is done, we can have a look at the resulting folder structure:

In [6]:
print_dir_tree(bids_root, max_depth=4)

|ds004226/
|--- CHANGES
|--- dataset_description.json
|--- participants.tsv
|--- derivatives/
|------ preprocessed_data/
|--------- sub-001/
|------------ anat/
|------------ fmap/
|------------ func/
|--- sub-001/
|------ sub-001_scans.tsv
|------ anat/
|--------- sub-001_T1w.json
|--------- sub-001_T1w.nii.gz
|------ fmap/
|--------- sub-001_acq-task_dir-AP_epi.json
|--------- sub-001_acq-task_dir-AP_epi.nii.gz
|--------- sub-001_acq-task_dir-PA_epi.json
|--------- sub-001_acq-task_dir-PA_epi.nii.gz
|------ func/
|--------- sub-001_task-sitrep_run-01_bold.json
|--------- sub-001_task-sitrep_run-01_bold.nii.gz
|--------- sub-001_task-sitrep_run-01_events.tsv
|--------- sub-001_task-sitrep_run-02_bold.json
|--------- sub-001_task-sitrep_run-02_bold.nii.gz
|--------- sub-001_task-sitrep_run-02_events.tsv
|--------- sub-001_task-sitrep_run-03_bold.json
|--------- sub-001_task-sitrep_run-03_bold.nii.gz
|--------- sub-001_task-sitrep_run-03_events.tsv
|--------- sub-001_task-sitrep_run-04_

This organization is typical of a BIDS dataset. 

Each subject's files are split between anatomical data and functional data. You are already a bit familiar with the .nii.gz file extension, which is typically the extension of MRI files, but what might the .json file be? Well, let's open it to figure it out!

In [7]:
data = get_json_from_file(op.join(bids_root, 'sub-001', 'func', 'sub-001_task-sitrep_run-01_bold.json'))
data

{'Modality': 'MR',
 'MagneticFieldStrength': 3,
 'Manufacturer': 'Siemens',
 'ManufacturersModelName': 'Skyra',
 'InstitutionName': 'Princeton_University_-_Neuroscience_Institute',
 'InstitutionalDepartmentName': 'Department',
 'InstitutionAddress': 'Washington_and_Faculty_Rd._-_Building_25_25_Princeton_NJ_US_085',
 'DeviceSerialNumber': '45031',
 'StationName': 'AWP45031',
 'BodyPartExamined': 'BRAIN',
 'PatientPosition': 'HFS',
 'ProcedureStepDescription': 'TamirL_Mark',
 'SoftwareVersions': 'syngo_MR_E11',
 'MRAcquisitionType': '2D',
 'SeriesDescription': 'EPI_2.5mm_1.5TR_32TE_SMS4_Siemens',
 'ProtocolName': 'EPI_2.5mm_1.5TR_32TE_SMS4_Siemens',
 'ScanningSequence': 'EP',
 'SequenceVariant': 'SK',
 'ScanOptions': 'FS',
 'SequenceName': '_epfid2d1_78',
 'ImageType': ['ORIGINAL', 'PRIMARY', 'M', 'ND', 'NORM', 'MOSAIC'],
 'SeriesNumber': 9,
 'AcquisitionTime': '16:17:36.350000',
 'AcquisitionNumber': 1,
 'SliceThickness': 2.5,
 'SpacingBetweenSlices': 2.5,
 'SAR': 0.0635751,
 'EchoTime'

This JSON is extremely important: it is what we call a JSON **sidecar**, and it holds precious acquisition information! Based **only on the text printed above**, are you able to determine any or all of the following?

In [8]:
interactive_MCQ(2,1)

HTML(value='<h3>Select all which apply</h3>')

VBox(children=(VBox(children=(Checkbox(value=False, description='TR ?', layout=Layout(width='99%')), HTML(valu…

<div class="warning" style='background-color:#C1ECFA; color: #112A46; border-left: solid darkblue 4px; border-radius: 4px; padding:0.7em;'>
<span>
<p style='margin-top:1em; text-align:center'><b>🏍️ JSON sidecars 🏍️</b></p>
<p style='text-indent: 10px;'>
    A JSON sidecar is critical, as you can see in the BIDS standard. It contains a lot of relevant information about your MRI acquisition. As such, whenever you download a dataset, you should always make sure you include not just the NIfTI files (.nii.gz or .nii files), but also the .json files, as otherwise you will be lacking critical information.</p>
</span>
</div>
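
In practice you rarely read a sidecar by eye; you pull out the fields you need. The sketch below shows one way to do so. The helper `read_sidecar_fields` is made up for this example, and the small stand-in sidecar only copies a few of the values printed above so that the snippet runs without the dataset. `FlipAngle` is requested on purpose to show what happens when a field is absent.

```python
import json
import os
import tempfile


def read_sidecar_fields(sidecar_path, fields):
    """Return {field: value} for the requested fields, with None when a field is absent."""
    with open(sidecar_path) as handle:
        sidecar = json.load(handle)
    return {field: sidecar.get(field) for field in fields}


# Stand-in sidecar holding a few of the values printed above, so the sketch runs anywhere.
demo_sidecar = {"Manufacturer": "Siemens", "MagneticFieldStrength": 3, "SliceThickness": 2.5}
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "sub-001_task-sitrep_run-01_bold.json")
    with open(demo_path, "w") as handle:
        json.dump(demo_sidecar, handle)
    summary = read_sidecar_fields(demo_path, ["Manufacturer", "SliceThickness", "FlipAngle"])

print(summary)
```

Returning `None` for a missing field, instead of raising an error, lets you spot incomplete sidecars across many subjects in a single pass.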

## 0.2 Loading more datasets: how to

Great! Note that we've loaded only one subject and one file of each modality for the subject. You can have a look <a href="https://openneuro.org/datasets/ds004226/versions/1.0.0">here</a> for this dataset. As you can see, it is a big dataset; we've restricted our download to the bare minimum to spare your computer's disk space as much as possible.

Notice two things on the web page. The first is the dataset's accession number:
<img src="imgs/openneuro_access_nbr.png">
This number is the one we've put in our code earlier, to specify what dataset we wanted to load from:
```python
dataset_id = 'ds004226'
...
subprocess.run(["openneuro-py", "download", "--dataset", dataset_id, ...])

```

Should you wish to download another openneuro dataset that piqued your interest, you'll simply need to change the above variable with the accession number of the dataset on the corresponding page!

Secondly, you can observe the entire folder structure, the size and much other interesting information about this dataset by simply scrolling through its page!
<img src="imgs/openneuro_full_view.png">
You can really decide whether a dataset is what you need this way, before burdening your connection with any heavy download :)

# 1. Anatomical preprocessing

Let's have a look at the nice anatomical image we downloaded above.

In [9]:
fsleyesDisplay.resetOverlays()
fsleyesDisplay.load(op.join(bids_root, 'sub-001', 'anat', 'sub-001_T1w.nii.gz'))




Take some time to explore the exquisite anatomy of the brain. Notice the skull around the brain: it is full of regions which, **in this contrast**, show up whiter than others. Based on what you know from class about the T1 contrast, can you identify the regions annotated below, taken from another T1 image?

As a hint, think about white matter: it is composed of axons and myelin. Contrast this with grey matter, which is mostly composed of cell bodies (somas). Based on your understanding:
<img src="imgs/annotated_regions.png">

In [11]:
interactive_MCQ(2,2)
interactive_MCQ(2,3)
interactive_MCQ(2,4)
interactive_MCQ(2,5)

HTML(value='<h3>Select all which apply</h3>')

RadioButtons(description='Choose:', layout=Layout(width='100%'), options=('Tissues high in fat are bright in T…

Output()

HTML(value='<h3></h3>')

RadioButtons(description='Choose:', layout=Layout(width='100%'), options=('Region 1 is likely high in fibers a…

Output()

HTML(value='<h3></h3>')

RadioButtons(description='Choose:', layout=Layout(width='100%'), options=('Region 2 contains a mix of fat and …

Output()

HTML(value='<h3></h3>')

RadioButtons(description='Choose:', layout=Layout(width='100%'), options=('Region 3 contains air, which is why…

Output()

## 1.1 Skull stripping

### 1.1.1 Preprocessing and BIDS
An important part of **anatomical** preprocessing is to remove the skull around the brain.
To adhere to the BIDS format, all modified files should be put in a new folder, called derivatives, such that you always have clean data in the source directory. The derivatives folder can be used for different preprocessing and treatments, each needing their own subfolders. In our case, we've created a single folder, preprocessed_data, hence the following structure:

In [12]:
print_dir_tree(bids_root, max_depth=4)

|ds004226/
|--- CHANGES
|--- dataset_description.json
|--- participants.tsv
|--- derivatives/
|------ preprocessed_data/
|--------- sub-001/
|------------ anat/
|------------ fmap/
|------------ func/
|--- sub-001/
|------ sub-001_scans.tsv
|------ anat/
|--------- sub-001_T1w.json
|--------- sub-001_T1w.nii.gz
|------ fmap/
|--------- sub-001_acq-task_dir-AP_epi.json
|--------- sub-001_acq-task_dir-AP_epi.nii.gz
|--------- sub-001_acq-task_dir-PA_epi.json
|--------- sub-001_acq-task_dir-PA_epi.nii.gz
|------ func/
|--------- sub-001_task-sitrep_run-01_bold.json
|--------- sub-001_task-sitrep_run-01_bold.nii.gz
|--------- sub-001_task-sitrep_run-01_events.tsv
|--------- sub-001_task-sitrep_run-02_bold.json
|--------- sub-001_task-sitrep_run-02_bold.nii.gz
|--------- sub-001_task-sitrep_run-02_events.tsv
|--------- sub-001_task-sitrep_run-03_bold.json
|--------- sub-001_task-sitrep_run-03_bold.nii.gz
|--------- sub-001_task-sitrep_run-03_events.tsv
|--------- sub-001_task-sitrep_run-04_
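
The derivatives layout shown in this tree can also be created with the standard library alone. The sketch below is an assumption-laden stand-in for the `mkdir_no_exist` calls used earlier (whose implementation lives in utils.py and is not shown here): the function name `make_derivatives_tree` is made up, and the demonstration writes into a temporary folder, not into your dataset.

```python
import os
import tempfile


def make_derivatives_tree(bids_root, pipeline_name, subject, datatypes=("anat", "func", "fmap")):
    """Create derivatives/<pipeline_name>/sub-<subject>/<datatype> folders and return their paths."""
    created = []
    for datatype in datatypes:
        folder = os.path.join(bids_root, "derivatives", pipeline_name, "sub-{}".format(subject), datatype)
        # exist_ok=True makes the call safe to repeat when the cell is re-run
        os.makedirs(folder, exist_ok=True)
        created.append(folder)
    return created


with tempfile.TemporaryDirectory() as demo_root:
    folders = make_derivatives_tree(demo_root, "preprocessed_data", "001")
    relative = [os.path.relpath(folder, demo_root) for folder in folders]

print(relative)
```

Taking the subject label as a parameter avoids hard-coding `sub-001`, so the same function can prepare the output folders for every subject of a dataset.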

### 1.1.2 Actual skull stripping

Perfect! Let's move on to actually extracting the brain! To make it easier for you to detect what was actually extracted, we will let the brain extraction proceed, using FSL's <a href="https://fsl.fmrib.ox.ac.uk/fsl/docs/#/structural/bet">BET</a> (brain extraction tool) and show you the mask of the region determined by FSL as brain.

In [13]:
def get_skull_stripped_anatomical(bids_root, preproc_root, subject_id, robust=False):
    """
    Function to perform skull-stripping (removing the skull around the brain).
    This is a simple wrapper around the brain extraction tool (BET) in FSL's suite
    It assumes data to be in the BIDS format (see section 0.1).
    The method also saves the brain mask which was used to extract the brain.

    The brain extraction is conducted only on the T1w of the participant.

    Parameters
    ----------
    bids_root: string
        The root of the BIDS directory
    preproc_root: string
        The root of the preprocessed data, where the result of the brain extraction will be saved.
    subject_id: string
        Subject ID, the subject on which brain extraction should be conducted.
    robust: bool
        Whether to conduct robust center estimation with BET or not. Default is False.
    """
    # We perform skull stripping here (you'll learn more about it next week!).
    # For now, all you need to know is that we remove the bones and flesh from the MRI to get the brain!
    subject = 'sub-{}'.format(subject_id)
    anatomical_path = op.join(bids_root, subject, 'anat', 'sub-{}_T1w.nii.gz'.format(subject_id))
    betted_brain_path = op.join(preproc_root, subject, 'anat', 'sub-{}_T1w'.format(subject_id))
    os.system('bet {} {} -m {}'.format(anatomical_path, betted_brain_path, '-R' if robust else ''))
    print("Done with BET.")

resulting_mask_path = op.join(preproc_root, 'sub-001', 'anat', 'sub-001_T1w_mask')
get_skull_stripped_anatomical(bids_root, preproc_root, "001")

In [14]:
fsleyesDisplay.load(resulting_mask_path)

Does the mask fit nicely around the brain? Ideally, the mask should include all parts of the brain and exclude everything else.
To answer this, play with the mask's opacity in FSLeyes.<br>
*Hint: have a look at the frontal regions. Inspect as well the superior parietal regions*<br>
<br><br>
What you are doing here is simply **Quality Control** (QC). It is a crucial step that you should **NEVER** skip when dealing with data preprocessing. As all steps are dependent on the success of previous steps, always make sure that everything is performing properly before moving on!

### 1.1.3 Improving the fit
If you look into BET's documentation, you'll quickly find that there are parameters you can play with: robust brain centre estimation and the fractional intensity threshold. To demonstrate the importance and impact of these parameters, let's use robust brain centre estimation.

In [15]:
get_skull_stripped_anatomical(bids_root, preproc_root, "001", robust=True)

In [17]:
fsleyesDisplay.resetOverlays()
fsleyesDisplay.load(op.join(bids_root, 'sub-001', 'anat', 'sub-001_T1w.nii.gz'))
fsleyesDisplay.load(resulting_mask_path)



How good is the mask now?

Is it perfect? (*Hint: have a look at voxels around 94-209-131*)
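
The other parameter mentioned in this section, the fractional intensity threshold, is set with BET's `-f` option (a value between 0 and 1, default 0.5, where smaller values give a larger brain outline). A minimal sketch of how one might prepare several BET calls to compare is given below. The helper `build_bet_command` and the file names are hypothetical; the sketch only builds the command lines and does not run BET.

```python
def build_bet_command(input_path, output_path, robust=False, frac=None, save_mask=True):
    """Assemble a BET call as an argument list suitable for subprocess.run."""
    command = ["bet", input_path, output_path]
    if save_mask:
        command.append("-m")  # also write the binary brain mask
    if robust:
        command.append("-R")  # robust brain centre estimation
    if frac is not None:
        if not 0.0 <= frac <= 1.0:
            raise ValueError("frac must lie between 0 and 1")
        command += ["-f", str(frac)]  # fractional intensity threshold
    return command


# Hypothetical file names, for illustration only
for frac in (0.3, 0.5, 0.7):
    print(" ".join(build_bet_command("sub-001_T1w.nii.gz", "sub-001_T1w_brain", robust=True, frac=frac)))
```

Writing each result to a different output name would let you load the masks side by side in FSLeyes and pick the threshold that fits this subject best.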

### 1.1.4 Manual corrections
If you really want a good fit, you might want to resort to **manually correcting the mask**. 

FSLeyes readily allows you to do such things! While on FSLeyes, press **Alt + E** to open the editing interface.
<img src="imgs/editing_menu_fsl.png">
<center><i>FSLeyes editing menu</i></center>

We will work here on removing some unwanted voxels. Toggle the 'Select mode' first. This way, FSLeyes will show us which voxels we currently have selected before changing their value.
<img src="imgs/selection_mode_toggle_fsl.png">
<center><i>Make sure to be in Select mode by clicking it</i></center>

Then let's pick the pencil tool, to select the voxels we want.
<img src="imgs/brush_tool.png">

Good, we're set and we can now select voxels. We'll try to select some **unwanted** voxels. Simply paint over them!
<br>
<center><img src="imgs/painted_voxels.png"></center>
<center><i>Selected voxels are shown in purple</i></center>

Now, we are dealing with a mask. We thus want to put the value of our selection to **0**, so as to remove it from the mask. To do so, we must change the fill value to 0, and then click to replace our selection with the provided value:
<br>
<img src="imgs/paint_steps.png">
<center><i>The two steps to set selection to a specific fill value.</i></center>
<br>
<img src="imgs/mask_painted.png">
<center><i>Painting a mask with zero means we remove the painted voxels from the mask.</i></center>

It remains now to apply the mask to our anatomical data. This is fortunately something that you now know how to do from the previous lab! Fill in the next cell with the appropriate code **and make sure to save the masked brain in the proper directory**.

You can then escape back to the non-edit mode by pressing again **Alt + E**

In [18]:
##################################
# Fill me with the code to use your mask
# To help you, we provide you with the skeleton of two potential approaches.
# You can fill either of them. Do not forget to test them by visualizing the result!
# For fslmaths, you can either read the documentation, or execute it without argument by running os.system('fslmaths')
##################################

# Option 1: Pythonic approach.
def apply_python_mask_approach(img_path, mask_path, masked_img_path):
    """
    The first approach, Pythonic way. The goal is, given a mask, to apply it to a T1 where the brain is to be extracted.
    
    YOU SHOULD COMPLETE THE METHOD AS IT DOES NOT WORK RIGHT NOW!

    Parameters
    ----------
    img_path: str
        Path to the image on which we would like to apply the mask (in your case, the T1 with the skull still on). Should be a .nii.gz file
    mask_path: str
        Path to the mask you would like to apply to your image. Should be a .nii.gz file, containing only binary values (0 or 1)
    masked_img_path: str
        Path to which the resulting image will be saved.
    """
    import nibabel as nib

    # Load both the T1 and the mask from disk
    img = nib.load(img_path)
    mask = nib.load(mask_path)
    
    # Load the data from both above images as numpy arrays
    img_data = img.get_fdata()
    mask_data = mask.get_fdata()

    #######################
    # Solution 1
    # Create an empty image and select all which falls in the mask (perhaps the most natural way to think about the mask)
    #######################
    # In all positions within the mask, get the image content
    saved_img_data = np.zeros(img_data.shape)
    saved_img_data[mask_data > 0] = img_data[mask_data > 0]

    # Save the image to disk, by creating a new Nifti image and then writing it out
    img_out = nib.Nifti1Image(saved_img_data,img.affine, img.header)
    nib.save(img_out, masked_img_path)

    #######################
    # Solution 2
    # Another approach is to remove from img_data all that is outside the mask.
    #######################
    
    # In all positions OUTSIDE the mask (where it is equal to 0), throw away the image
    img_data[mask_data == 0] = 0
    
    # Save the image to disk, by creating a new Nifti image and then writing it out
    img_out = nib.Nifti1Image(img_data,img.affine, img.header)
    nib.save(img_out, masked_img_path)
    
def apply_fsl_math_approach(img_path, mask_path, masked_img_path):
    ###########################
    # Solution
    # By reading fslmaths documentation, one can see that the -mas option is exactly what we desire.
    ###########################
    os.system('fslmaths {} -mas {} {}'.format(img_path, mask_path, masked_img_path))
    

anatomical_path = op.join(bids_root, 'sub-001', 'anat', 'sub-001_T1w.nii.gz') # The original brain
betted_brain_path = op.join(preproc_root, 'sub-001', 'anat', 'sub-001_T1w.nii.gz') # The brain without skull is in the derivatives folder
resulting_mask_path = op.join(preproc_root, 'sub-001', 'anat', 'sub-001_T1w_mask.nii.gz') # The mask to use

########################
# CHOOSE ONE OF THE TWO TO IMPLEMENT IT AND LAUNCH IT
########################
apply_fsl_math_approach(anatomical_path, resulting_mask_path, betted_brain_path)
apply_python_mask_approach(anatomical_path, resulting_mask_path, betted_brain_path)

As always, do not forget to visualize what you have done. If you did a proper job, you should now see the brain without any skull!

In [23]:
fsleyesDisplay.resetOverlays()
fsleyesDisplay.load(betted_brain_path)

Note: If you get a RuntimeError: wrapped C/C++ object of type OrthoEditActionToolBar has been deleted, do not worry. It simply means you forgot to escape back to non-edit mode (Alt+E; option+E on macOS) before resetting the overlay, and it is a benign error :)

## 1.2 Tissue segmentation

For the purpose of analysis, it can be useful to separate the tissues into tissue classes; in particular, extracting the white matter, grey matter and cerebrospinal fluid (abbreviated as CSF) is very useful in fMRI analysis. Consider, for example, an analysis that you wish to restrict to the somas of your neurons: would it make sense to conduct it on the CSF?

You'll find that the segmentation is not done on fMRI volumes; it is done on the anatomical and the resulting tissue masks are then used on the functional data. Can you imagine why this is the case?

Let's perform tissue segmentation. To do so, we'll use FSL's <a href="https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FAST">FAST</a> (FMRIB's Automated Segmentation Tool).

The underlying idea of FAST is to try and model each voxel's intensity as being a mixture between the different tissue types.
Pay attention in the documentation to the following line:
<div class="warning" style='background-color:#C1ECFA; color: #112A46; border-left: solid darkblue 4px; border-radius: 4px; padding:0.7em;'>
<span>
<p style='margin-top:1em; text-align:center'><b></b></p>
<p style='text-indent: 10px;'>
    Before running FAST an image of a head should first be brain-extracted, using BET. The resulting brain-only image can then be fed into FAST.</p>
</span>
</div>

Based on this, **in the cell below choose which image should be used as fast_target, between the anatomical_path and the brain_extracted_path images**.


<div class="warning" style='background-color:orange; color: #112A46; border-left: solid red 4px; border-radius: 4px; padding:0.7em;'>
<span>
<p style='margin-top:1em; text-align:center'><b>🐞 Troubleshooting: FSL stopped responding </b></p>
<p style='text-indent: 10px;'>
    It is perfectly possible (even likely) that FSLeyes will stop responding over the course of this lab. This is normal! Simply wait for whichever function is running (such as FAST) to finish and it should start responding again. Be patient :)</p>
</span>
</div>

Note that FAST will take one or two minutes to run, this is expected, do not panic :)

P.S.: Recently, deep learning tools have started to produce good enough tissue segmentations, such as the FreeSurfer-based <a href="https://surfer.nmr.mgh.harvard.edu/fswiki/SynthSeg">SynthSeg</a>.

In [24]:
anatomical_path = op.join(bids_root, 'sub-001', 'anat', 'sub-001_T1w.nii.gz')
bet_path = op.join(preproc_root, 'sub-001', 'anat', 'sub-001_T1w')

#########################
# Solution
# From the documentation quoted above: FAST must be applied to the brain-extracted image, so we use the BET output path.
##########################
fast_target = bet_path # Replace with either anatomical_path or bet_path (note: you can try both and decide which is more reasonable!)

# Clean the directory in between runs of the cell
for f in glob.glob(op.join(preproc_root, 'sub-001', 'anat', '*fast*')):
    os.remove(f)
segmentation_path = op.join(preproc_root, 'sub-001', 'anat', 'sub-001_T1w_fast')
fast(imgs=[fast_target], out=segmentation_path, n_classes=3)

{}

Let's check the quality of the segmentation, shall we?
We want to extract 3 tissue types here: the white matter, the grey matter and the CSF. How well did FAST perform?

If you look at the directories now, we have new files in our hierarchy:

In [25]:
print_dir_tree(bids_root, max_depth=5)

|ds004226/
|--- CHANGES
|--- dataset_description.json
|--- participants.tsv
|--- derivatives/
|------ preprocessed_data/
|--------- sub-001/
|------------ anat/
|--------------- sub-001_T1w.nii.gz
|--------------- sub-001_T1w_fast_mixeltype.nii.gz
|--------------- sub-001_T1w_fast_pve_0.nii.gz
|--------------- sub-001_T1w_fast_pve_1.nii.gz
|--------------- sub-001_T1w_fast_pve_2.nii.gz
|--------------- sub-001_T1w_fast_pveseg.nii.gz
|--------------- sub-001_T1w_fast_seg.nii.gz
|--------------- sub-001_T1w_mask.nii.gz
|------------ fmap/
|------------ func/
|--- sub-001/
|------ sub-001_scans.tsv
|------ anat/
|--------- sub-001_T1w.json
|--------- sub-001_T1w.nii.gz
|------ fmap/
|--------- sub-001_acq-task_dir-AP_epi.json
|--------- sub-001_acq-task_dir-AP_epi.nii.gz
|--------- sub-001_acq-task_dir-PA_epi.json
|--------- sub-001_acq-task_dir-PA_epi.nii.gz
|------ func/
|--------- sub-001_task-sitrep_run-01_bold.json
|--------- sub-001_task-sitrep_run-01_bold.nii.gz
|--------- sub-001_

The pve files correspond to our segmented tissues. We have exactly three files, because we set n_classes to 3 above:
```python
fast(..., n_classes=3)
```
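
Each pve file stores, for every voxel, the estimated fraction of one tissue class. As a minimal sketch of how such maps can be turned into labels or masks, the example below uses a small made-up array instead of real files; with real data, each column would be loaded from one pve file with `nib.load(...).get_fdata()`. The 0.5 threshold is an arbitrary choice for illustration.

```python
import numpy as np

# Made-up partial volume estimates for 5 voxels (rows) and 3 classes (columns)
pve = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.4, 0.6],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],   # background voxel: no tissue at all
])

# Hard label: class with the largest fraction, +1 so that 0 can mean background
labels = np.where(pve.sum(axis=1) > 0, pve.argmax(axis=1) + 1, 0)

# Binary mask for the class stored in column 1: voxels that are mostly that class
class_1_mask = pve[:, 1] > 0.5

print(labels)
print(class_1_mask)
```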

Let's try to identify which segmentation is which tissue type in the brain. To do this, you'll have to visualize the tissues and decide for yourself.

To make it easier on you, we will display:

- pve_0 in <span style="color:red;">red</span>
- pve_1 in <span style="color:green;">green</span>
- pve_2 in <span style="color:blue;">blue</span>

In [26]:
fsleyesDisplay.resetOverlays()
fsleyesDisplay.load(bet_path)
fsleyesDisplay.load(glob.glob(op.join(preproc_root, 'sub-001', 'anat','*pve_0*'))[0])
fsleyesDisplay.load(glob.glob(op.join(preproc_root, 'sub-001', 'anat','*pve_1*'))[0])
fsleyesDisplay.load(glob.glob(op.join(preproc_root, 'sub-001', 'anat','*pve_2*'))[0])
fsleyesDisplay.displayCtx.getOpts(fsleyesDisplay.overlayList[1]).cmap = 'Red'
fsleyesDisplay.displayCtx.getOpts(fsleyesDisplay.overlayList[2]).cmap = 'Green'
fsleyesDisplay.displayCtx.getOpts(fsleyesDisplay.overlayList[3]).cmap = 'Blue'



In [33]:
interactive_MCQ(2,6)

HTML(value='<h3></h3>')

RadioButtons(description='Choose:', layout=Layout(width='100%'), options=('pve_0 is white matter, pve_1 is gre…

Output()


<div class="warning" style='background-color:#90EE90; color: #112A46; border-left: solid #805AD5 4px; border-radius: 4px; padding:0.7em;'>
<span>
<p style='margin-top:1em; text-align:center'><b>🏠 Tissues and contrast: Take home message 🏠</b></p>
<p style='text-indent: 10px;'>
    Tissues in T1-w or T2-w images show up with different brightness, depending on their content. This is because their content affects their relaxation times and, in turn, the intensity captured during the acquisition. Here are different tissue types for both modalities (from <a href="https://www.researchgate.net/publication/324396120_Basic_MRI_for_the_liver_oncologists_and_surgeons">Vu, Lan N., John N. Morelli, and Janio Szklaruk. "Basic MRI for the liver oncologists and surgeons." Journal of hepatocellular carcinoma 5 (2017): 37.</a>):
    <table>
        <tr>
            <th>MR</th>
            <th>High signal (bright)</th>
            <th>Low signal (dark)</th>
        </tr>
        <tr>
            <th>T1-w</th>
            <th>Fat, melanin</th>
            <th>Iron</th>
        </tr>
        <tr>
            <th></th>
            <th>Blood</th>
            <th>Water</th>
        </tr>
        <tr>
            <th></th>
            <th>Proteinaceous fluid</th>
            <th>Air, bone</th>
        </tr>
        <tr>
            <th></th>
            <th>Paramagnetic substances</th>
            <th>Collagen</th>
        </tr>
        <tr>
            <th></th>
            <th>Chelated gadolinium contrast</th>
            <th>Most tumors</th>
        </tr>
        <tr>
            <th>T2-w</th>
            <th>Water</th>
            <th>Air</th>
        </tr>
        <tr>
            <th></th>
            <th>Edema</th>
            <th>Bone</th>
        </tr>
        <tr>
            <th></th>
            <th>Blood</th>
            <th>Hemosiderin, deoxyhemoglobin, methemoglobin</th>
        </tr>
    </table>
    Typical preprocessing of anatomical data starts by extracting the brain, that is, removing the skull and surrounding tissues. This step can be conducted mostly automatically, but it is perfectly possible to manually correct the extracted brain, to include more or fewer voxels, when tweaking the parameters does not yield satisfactory results.
    The extracted brain can then be segmented into different tissues. Using the differences in brightness due to contrast, we can separate the grey matter, the white matter and the CSF, which is useful for later analysis.</p>
</span>
</div>

# Part 2: Coregistration of images, a critical preprocessing step

Now that you're familiar with how fMRI works, we will have you conduct what is called coregistration.
As you will see in class, one of the key issues is to ensure MRI images are well aligned to each other. There are different reasons: comparison between participants during analysis, overlaying of anatomical MRI with functional MRI (to use for example an atlas based on anatomy), correction of motion...


We'll show you the following steps:
- Download a participant's dataset
- Visualize data together
- Remove the skull
- Coregister a brain to another

For our purpose, we will align an anatomical MRI with the MNI anatomical template you displayed before.
This process is typically called **normalization**: we put the MRI into a "standard" space of reference. However, the principle we'll show you can be applied to any pair of images.

<h2>2.1 Aligning images manually</h2>

We will keep using the dataset we've worked with up until now. If you want, you can of course download another dataset and work on it, to make sure you understood everything so far.

<h3>2.1.1 Visualizing the data together</h3>

The volume of interest will be the anatomical file of sub-001, located under /home/jovyan/Data/dataset/ds004226/derivatives/preprocessed_data/sub-001/anat/sub-001_T1w.nii.gz .
Remember: thanks to the BIDS standard, you can tell it is an anatomical by its folder (anat/) and its name (T1w).

We will use freeview to visualise the data, both to show you another tool besides FSLeyes and because moving images manually is handier there :)

In [34]:
import subprocess  # needed by launch_freeview below
import threading

def launch_freeview(img_list):
    """
    Wrapper around freeview to launch it with several images.
    This wrapper is necessary to launch freeview in a separate thread, ensuring the notebook is free to do something else.

    Parameters
    ----------
    img_list: list of string
        List of images (files) to load. Assumed by default to be volume files.
    """
    args = []
    
    for img in img_list:
        # Each image is passed to freeview as a volume (-v)
        args.extend(["-v", img])
    # Run the command
    subprocess.run(["freeview"] + args)

imgList = [op.expandvars('$FSLDIR/data/standard/MNI152_T1_1mm.nii.gz'), 
           op.join(preproc_root, "sub-001", "anat", "sub-001_T1w.nii.gz:colormap=greyscale")] # Modify here this list to add any image you want, in .nii.gz format
freeview_thread = threading.Thread(target=launch_freeview, args=(imgList,)) # Remark the (imgList,) when passing to args. This is very important to make Thread work properly

# Start the thread
freeview_thread.start()

print("Freeview is running in a separate thread.")
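
The remark about `(imgList,)` deserves a small illustration. `threading.Thread` unpacks `args` into the positional arguments of the target function, so a list must be wrapped in a one-element tuple to be passed as a single argument. The sketch below uses a hypothetical `record` function and placeholder file names instead of freeview.

```python
import threading

received = []

def record(img_list):
    # Store what the thread function actually received
    received.append(img_list)

images = ["a.nii.gz", "b.nii.gz"]  # placeholder file names

# (images,) is a one-element tuple: the whole list is passed as ONE argument
t = threading.Thread(target=record, args=(images,))
t.start()
t.join()

print(received)
```

With `args=images` instead, the thread would try to call `record("a.nii.gz", "b.nii.gz")`, which fails because `record` accepts a single argument.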

Wait, we can't see anything apart from the topmost brain.
No worries! Simply set the background to be transparent by ticking the Clear Background option, as in the picture below:

<img src="imgs/freeview_tick.png" style="width: 1000px; height:auto; border: black 6px groove;"/>    

You should then notice that the brains are not well aligned to each other. For example, in the coronal view:

<img src="imgs/bad_alignment.png" style="width: 1000px; height:auto; border: black 6px groove;"/>    

This is because there is no absolute system of reference. We need to align MRIs to one another ourselves, through coregistration.

<h3>2.1.2 Aligning images together</h3>

Let's start with a straightforward approach: you will align the images manually. In Freeview, click on Tools > Transform Volume.
You should get the following panel:

<img src="imgs/freeview_panel.png" style="width: 1000px; height:auto; border: black 6px groove;"/>    

<p style="font-size:25px;"><b>Now, play with the sliders of translation and rotation to align the anatomical to the reference. </b> Try to align the two brains as best you can.</p>

Congratulations, you have done your first brain coregistration! 

What features did you use to visually align the brains to each other?

<div class="warning" style='background-color:#90EE90; color: #112A46; border-left: solid #805AD5 4px; border-radius: 4px; padding:0.7em;'>
<span>
<p style='margin-top:1em; text-align:center'>
<b>💡 Image coregistration 💡</b></p>
<p style='text-indent: 10px;'>
What you have just done manually is a coregistration of a subject MRI (sub-001) to a reference image (the MNI152 template image). This step is critical when handling a group of participants:
it ensures that all brains sit in the same space and can thus be analyzed in the same way.

You should always check how well your images are aligned to one another, even after automated approaches. You might sometimes need to align an image to the reference yourself, which you now know how to do.
You will now learn how to do it automatically.</p></span>
</div>


<h2>2.2 A more principled way</h2>
The MRI and the template are not very well aligned, but we can improve on that. Specifically, we would like to find a transformation that aligns our anatomical to the MNI template. This is the so-called normalization step.
We need two ingredients to do this:
- A way to compute the transformation from the anatomical to the MNI template (this step is called registration)
- A reference image in the space of the MNI template (here, this is simply the MNI template itself)

#### 2.2.1 What transformation?

What should this transformation be?
It is a combination of translation, rotation, scaling and other possible modifications, applied on our anatomical, so that it ends up matched to the target (the MNI image). In essence, the transformation fully describes the process to align the two images!
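
As a minimal sketch of what "a combination of translation, rotation and scaling" looks like numerically, the example below builds a 4×4 matrix in homogeneous coordinates and applies it to one point. The helper `make_affine` and its parameter values are made up for illustration; they are not part of any library used in this lab.

```python
import numpy as np

def make_affine(scale, angle_deg, translation):
    """4x4 affine: uniform scaling, then rotation about the z axis, then translation."""
    theta = np.deg2rad(angle_deg)
    rotation = np.array([
        [np.cos(theta), -np.sin(theta), 0.0],
        [np.sin(theta),  np.cos(theta), 0.0],
        [0.0,            0.0,           1.0],
    ])
    affine = np.eye(4)
    affine[:3, :3] = rotation * scale   # rotation and scaling live in the 3x3 block
    affine[:3, 3] = translation         # translation lives in the last column
    return affine

affine = make_affine(scale=2.0, angle_deg=90.0, translation=[10.0, 0.0, 0.0])
point = np.array([1.0, 0.0, 0.0, 1.0])   # homogeneous coordinates (x, y, z, 1)
moved = affine @ point
print(moved.round(6))
```

The point (1, 0, 0) is first scaled to (2, 0, 0), then rotated to (0, 2, 0), then shifted to (10, 2, 0): a single matrix product carries out all three operations.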


#### 2.2.2 Why the reference ?
<p>
Let's pause for a moment. If the transformation already encapsulates all that there is to know about how to transform the volume, why do we need a reference from the target space?

To answer this, let's think about what happens when moving the anatomical image, shall we?
We rotate it, translate it, maybe shear it. Cool, we now have a well-registered anatomical. But there's still an issue.
<br><b>What is the resolution of our anatomical?</b><br>
If you remember, the anatomical has an exquisite spatial resolution, but it might not be exactly the same as the MNI template: what if you used the MNI with 0.5mm voxel size from last week for example? In this case, we have a mismatch in resolutions. Let's get concrete with an example with Ducky!
</p>

#### 2.2.3 Ducky's sunglasses
<p>    
Imagine the following example: we have an image of a duck (Ducky!), and we want to align sunglasses to it. 
Notice that this is hard, algorithmically, right? Unlike the brain, there are no landmarks, such as white matter, that we could use to optimize some cost. Sunglasses could be worn on the beak, the head, even the neck: there are no rules to fashion!
Fortunately, we are told Ducky should wear the glasses on its beak like the cool kids and are even provided the transformation to do so, a combination of translation, rotation and scaling. Applying this transform to the sunglasses, we get the following:
    
<div>
<center>
<div style="display:inline-block;">
    <img src="imgs/ducky/ducky_alone.png" style="width: 200px; height:auto; border: blue 6px groove;"/>
    <p style="text-align:center;">Ducky (our reference!)</p>
</div>
<div style="display:inline-block;">
    <img src="imgs/ducky/sunglasses_alone.png" style="width: 150px; height:auto;border: green 6px groove;" />
    <p style="text-align:center;">The sunglasses (in their own space)</p>
</div>   
<div style="display:inline-block;">
    <img src="imgs/ducky/ducky_summary.png" style="width: 200px; height:auto; border: blue 6px groove;"/>
    <p style="text-align:center;">What the transformation will do</p>
</div>
    </center>
</div>

Okay, we do all this. We obtain this:
<br>
<center><img src="imgs/ducky/ducky_total_uncropped.png" style="width: 200px; height:auto; border: black 6px groove;"/></center>

Now, why is Ducky's image so big?
The answer is simple: the sunglasses and Ducky did not have the same:
- Resolution (meaning the glasses can get blurry, think of putting a 144px resolution screenshot on a 4K resolution image!)
- Field of view (canvas size if you're thinking of Photoshop layers for example)

Here are the fields of view of the two images, kept as is:<br>
<center><img src="imgs/ducky/ducky_uncropped.png" style="width: 200px; height:auto; border: black 6px groove;"/></center>
 
But we would like to put the sunglasses on Ducky without enlarging the canvas around Ducky. <b>So let's use the resolution and field of view of Ducky's picture to correct the resolution and field of view of our transformed sunglasses</b>.
In other words, we will interpolate the sunglasses to match our duck's resolution and we will crop its field of view (or pad it if necessary) to match perfectly our duck. Final result:
    <center><img src="imgs/ducky/ducky_complete.png" style="width: 200px; height:auto; border: black 6px groove;"/>    <p style="text-align:center;">Much better.</p>
</center>

</p>

#### 2.2.4 Back to neurological data

We use exactly the same idea when applying transformations to brain volumes. This is why, when applying a transformation in FSL, you always need to pass a reference image of the space in which you want to end up. This way, FSL adapts both the field of view and the resolution, the latter by interpolation. The interpolation method can be, for example, nearest neighbour, trilinear or sinc.
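
To illustrate why the reference grid matters, here is a toy 2D sketch of nearest-neighbour resampling, not FSL's implementation. The function `resample_to_reference` is a made-up helper: for every pixel of the reference grid it looks up where that pixel comes from in the moving image, so the output always has the field of view and resolution of the reference.

```python
import numpy as np

def resample_to_reference(moving, ref_shape, moving_to_ref):
    """Nearest-neighbour resampling of a 2D image onto a reference grid.

    moving_to_ref is a 3x3 homogeneous matrix mapping (row, col, 1) in the
    moving image to (row, col, 1) in the reference image.
    """
    ref_to_moving = np.linalg.inv(moving_to_ref)
    rows, cols = np.meshgrid(np.arange(ref_shape[0]), np.arange(ref_shape[1]), indexing="ij")
    ref_coords = np.stack([rows.ravel(), cols.ravel(), np.ones(rows.size)])
    # For each reference pixel, the nearest pixel in the moving image
    src = np.rint(ref_to_moving @ ref_coords).astype(int)
    src_r, src_c = src[0], src[1]
    inside = (src_r >= 0) & (src_r < moving.shape[0]) & (src_c >= 0) & (src_c < moving.shape[1])
    flat = np.zeros(rows.size, dtype=moving.dtype)   # pixels with no source stay at 0 (padding)
    flat[inside] = moving[src_r[inside], src_c[inside]]
    return flat.reshape(ref_shape)

sunglasses = np.array([[1, 2], [3, 4]])          # tiny "moving" image
shift = np.array([[1.0, 0.0, 1.0],               # move down by 1 row
                  [0.0, 1.0, 2.0],               # move right by 2 columns
                  [0.0, 0.0, 1.0]])
on_ducky = resample_to_reference(sunglasses, (4, 5), shift)
print(on_ducky)
```

Whatever the size of the moving image, the result has the shape of the reference grid, here 4 × 5, with zeros wherever the moving image provides no data.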


#### 2.2.5 Types of normalization

So, you now know that you need a transformation and a reference. Great. Now, the transformation you allow can be of two types: linear, meaning that the same transformation is applied across the entire image, or non linear, where each voxel can be moved differently.

(ducky linear and non linear)
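
The difference between the two types can be sketched on four 2D points. All numbers below are made up for illustration: the linear case shares one matrix and one offset across all points, whereas the non linear case stores one displacement per point.

```python
import numpy as np

# Four points on a 2D grid, one row per point (x, y)
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Linear (affine): ONE matrix and ONE offset shared by every point
matrix = np.array([[1.2, 0.0], [0.0, 0.8]])
offset = np.array([0.5, -0.5])
linear_result = points @ matrix.T + offset

# Non linear: each point has its OWN displacement vector
displacement_field = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, -0.2], [0.3, 0.3]])
nonlinear_result = points + displacement_field

print(linear_result)
print(nonlinear_result)
```

After the linear transformation the bottom and top edges of the square are still parallel; after the non linear one they are not, which is what allows local details to be matched.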



## 2.3 Actually doing it: Linear normalization

To perform linear normalization, the idea is simple: the transformation we want should be linear, i.e. affine.
In image processing, such a matching is usually called <a href="https://en.wikipedia.org/wiki/Image_registration">image registration</a>. Here, we're dealing with 3D data, so the problem is a bit more complicated. Fortunately, all of this has been coded by very smart people, and to our rescue comes a tool designed specifically to register volumes to each other: <a href="https://fsl.fmrib.ox.ac.uk/fsl/docs/#/registration/flirt/index">FLIRT</a>!
<br>
This tool supports many kinds of registration and is extremely powerful. In its most basic form, it expects:
- An input volume, the volume you want to register (Ducky's sunglasses)
- A reference volume, to which the input is registered (Ducky's body)
- An output volume, the result of the transformation (Ducky's sunglasses once they are on Ducky's beak)

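Here is how you can call it to register the subject's anatomical to some reference sitting in another space (here the MNI152 template):
```python
# Generic form. The three names are placeholders for your own paths:
# input (moving) volume, reference (fixed) volume, output volume
flirt(input_volume, reference_volume, out=output_volume)
```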

<div class="warning" style='background-color:#C1ECFA; color: #112A46; border-left: solid darkblue 4px; border-radius: 4px; padding:0.7em;'>
<span>
<p style='margin-top:1em; text-align:center'><b>💡 Pay attention! 💡</b></p>
<p style='text-indent: 10px;'>
    FLIRT expects the anatomical to be skull-stripped in order to give the best possible normalization. Luckily, you already did this before with BET.</p>
</span>
</div>

Now we can compute the transformation, using flirt.

<div class="warning" style='background-color:#C1ECFA; color: #112A46; border-left: solid darkblue 4px; border-radius: 4px; padding:0.7em;'>
<span>
<p style='margin-top:1em; text-align:center'>
<b>💡 Remember 💡</b></p>
<p style='text-indent: 10px;'>
To conduct coregistration, we need to remember our two key ingredients (besides the algorithm itself), namely the images:
<ul>
    <li>Reference image: the one which is fixed</li>
    <li>Source image: the one which will be moved</li>
</ul>
<b>Our goal</b>: to align the source (moving image) to the reference (fixed image) by moving the source.
</p></span>
</div>

In the following cell, we provide you with two images. One is the MNI152 template, the other is the anatomical T1 of the subject. On both images, the skull was removed.
Choose which one should be a reference and which one should be a source, and run flirt accordingly below.

In [35]:
from fsl.wrappers import flirt

# The two images
subject_id = '001'
subject_anatomical = op.join(preproc_root, 'sub-{}'.format(subject_id), 'anat', 'sub-{}_T1w'.format(subject_id))
mni_template = op.expandvars(op.join('$FSLDIR', 'data', 'standard', 'MNI152_T1_1mm_brain'))

###################
# Select which image should be source or reference
# ANSWER:
# The subject anatomical will most often be the source, as the template is usually where we want to map our subjects for 
# group comparison.
# There are cases, however, where we may want the subject anatomical to be the reference. This is the case when we want to map an 
# atlas to a subject while preserving the subject as much as possible.
# We showcase here the case where the subject is chosen as source.
# Please note how to read the transformation matrix printed by flirt:
# - Top-left 3×3 block: rotation, scaling, shearing.
# - Last column: translation in x, y, z.
# - Bottom row: always [0 0 0 1] for affine transforms.
##################
source = subject_anatomical # Fill me
reference = mni_template # Fill me
result = op.join(preproc_root, 'sub-{}'.format(subject_id), 'anat', 'sub-{}_T1w_mni'.format(subject_id))
flirt(source, reference, out=result)


Final result: 
0.960608 0.005215 0.007118 3.047266 
-0.028609 1.031014 0.237479 -45.851817 
-0.011549 -0.197684 1.137878 -36.700647 
0.000000 0.000000 0.000000 1.000000 



{}
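
The matrix printed by flirt can be taken apart with a few lines of NumPy. The sketch below copies the values from the output above; the variable names are our own. Note that reading the per-axis scaling from column norms is only an approximation, exact when there is no shear.

```python
import numpy as np

# Matrix printed by the flirt call above
flirt_matrix = np.array([
    [ 0.960608,  0.005215, 0.007118,   3.047266],
    [-0.028609,  1.031014, 0.237479, -45.851817],
    [-0.011549, -0.197684, 1.137878, -36.700647],
    [ 0.0,       0.0,      0.0,        1.0],
])

linear_part = flirt_matrix[:3, :3]     # rotation, scaling and shearing combined
translation = flirt_matrix[:3, 3]      # shift along x, y, z

# The determinant of the 3x3 block is the factor by which volumes change
volume_change = np.linalg.det(linear_part)

# Column norms approximate the per-axis scaling (exact only without shear)
approx_scales = np.linalg.norm(linear_part, axis=0)

print("translation:", translation)
print("volume change factor:", round(float(volume_change), 3))
print("approximate scales:", approx_scales.round(3))
```

A volume change factor above 1 means that, overall, the subject's brain was enlarged to match the template.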




Now, **whenever** we do a preprocessing step, we should inspect the result to assess if we did a proper job. This is called **quality control**!

Visualize the result of flirt on top of the reference. What do you think of the alignment?

In [36]:
fsleyesDisplay.resetOverlays()
fsleyesDisplay.load(reference) 
fsleyesDisplay.load(result)
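
Visual inspection remains the main quality control tool. As a possible complement, and only as a sketch, one could also quantify how much two brain masks overlap with a Dice coefficient. The helper `dice_coefficient` and the toy 1D masks below are made up for illustration; with real data, masks could be obtained by thresholding the skull-stripped images (for instance `image > 0`).

```python
import numpy as np

def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two boolean masks: 1 = identical, 0 = disjoint."""
    mask_a = np.asarray(mask_a, dtype=bool)
    mask_b = np.asarray(mask_b, dtype=bool)
    total = mask_a.sum() + mask_b.sum()
    if total == 0:
        return 1.0  # two empty masks are treated as identical
    return 2.0 * np.logical_and(mask_a, mask_b).sum() / total

# Toy 1D "brain masks"
reference_mask = np.array([0, 1, 1, 1, 1, 0], dtype=bool)
well_aligned = np.array([0, 1, 1, 1, 1, 0], dtype=bool)
shifted = np.array([0, 0, 0, 1, 1, 1], dtype=bool)

print(dice_coefficient(reference_mask, well_aligned))
print(dice_coefficient(reference_mask, shifted))
```

A single number like this never replaces looking at the images, but it can help flag subjects that deserve a closer look.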

#### 2.3.1 Choosing a cost
If you have a look at the options, you will notice that flirt has a cost option. Indeed, registration relies on a function that measures how well the two images match one another: the cost. FLIRT tries out transformations and keeps those that improve the fit, as measured by this cost. Several cost functions are available.
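
To make the notion of cost more tangible, here is a toy sketch of two similarity measures computed on a synthetic 2D image. These are simplified stand-ins written for illustration, not FLIRT's own implementations, and all names are our own.

```python
import numpy as np

def least_squares_cost(a, b):
    """Mean squared intensity difference: 0 when the images are identical."""
    return float(np.mean((a - b) ** 2))

def normalized_correlation(a, b):
    """Pearson correlation of voxel intensities: 1 for a perfect linear match."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

# Toy 2D "image": a bright blob on a dark background
yy, xx = np.mgrid[0:32, 0:32]
reference_img = np.exp(-((yy - 16) ** 2 + (xx - 16) ** 2) / 50.0)

misaligned = np.roll(reference_img, 5, axis=1)   # same blob, shifted by 5 pixels
rescaled = 2.0 * reference_img + 3.0             # aligned, but on another intensity scale

for name, img in [("aligned", reference_img), ("misaligned", misaligned), ("rescaled", rescaled)]:
    print(name, least_squares_cost(reference_img, img), normalized_correlation(reference_img, img))
```

Both measures detect the misalignment, but only the correlation still reports a perfect match for the rescaled image: different costs make different assumptions about how the intensities of the two images relate.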

Which cost should we use? Without any prior knowledge, there would be no right or wrong answer from the get-go: no choice but to experiment and find out!

Fortunately, <a href="https://fsl.fmrib.ox.ac.uk/fsl/docs/#/registration/flirt/user_guide?id=flirt">the documentation</a> gives you some pointers. What you want here is to register a T1 to a T1: this is a <u>within</u>-modality registration, so you should restrict yourself to costs appropriate for this type of registration!

To help you, we've set up a cell that will run the different coregistrations for you. Simply fill in the different costs to consider :)

In [37]:
#######
# Solution
# We consider only within-modality costs, as the two images belong to the same modality: least squares and normalized correlation.
# Remark: correlation ratio also technically works.
# FLIRT offers several cost functions, each making different assumptions about how corresponding image intensities relate:
# - Mutual information: statistical dependence between the image intensity distributions.
# - Correlation ratio: how well one image predicts the other, using conditional variance statistics.
# - Normalized correlation: linear (Pearson) correlation between image intensities.
# - Normalized mutual information: a scale-independent version of mutual information.
# - Least squares: squared intensity differences, assuming corresponding voxels have similar intensities.
# - Label difference: mismatched labels, for discrete segmentation images.
#######
possible_costs = ['mutualinfo', 'corratio', 'normcorr', 'normmi', 'leastsq', 'labeldiff']
costs_to_consider = [ 'leastsq', 'normcorr' ] # fill me with the relevant costs

for c in costs_to_consider:
    flirt(source, reference, out=result + '_' + c, cost=c)


Final result: 
1.086353 0.007973 0.011315 -9.045721 
-0.013414 1.108506 0.296360 -62.791872 
-0.006203 -0.239784 1.354974 -58.293611 
0.000000 0.000000 0.000000 1.000000 


Final result: 
0.961249 0.004764 0.006706 3.135296 
-0.027534 1.033343 0.236322 -46.152920 
-0.010721 -0.197353 1.140527 -37.155305 
0.000000 0.000000 0.000000 1.000000 





Lastly, let's perform the QC step for each of these costs. We'll leave it to you to decide whether you prefer one of them!

In [39]:
for c in costs_to_consider:
    fsleyesDisplay.load(result + '_' + c)

## 2.4 Going beyond: Non linear normalization

So, you know how to do it linearly. 
What if we wanted to do it non-linearly?

With FSL, <i>it's painfully hard</i>. FLIRT only computes linear transformations, so you have to use <a href="https://fsl.fmrib.ox.ac.uk/fsl/docs/#/registration/fnirt/index">FNIRT</a> instead. You can browse through its documentation; the take-home message is that it is complicated and involves many steps.

But, there are other tools available, one of them being <a href="https://github.com/ANTsX/ANTs">ANTs (Advanced Normalization Tools)</a>.
For completeness, we will show you now how to use it (very succinctly) so that you know how to do it.

<div class="warning" style='background-color:#C1ECFA; color: #112A46; border-left: solid darkblue 4px; border-radius: 4px; padding:0.7em;'>
<span>
<p style='margin-top:1em; text-align:center'><b>💡 Pay attention! 💡</b></p>
<p style='text-indent: 10px;'>
    FNIRT does NOT expect the input data to be skull-stripped.</p>
</span>
</div>

In [40]:
moving_image = ants.image_read(source + '.nii.gz')
fixed_image = ants.image_read(reference + '.nii.gz')

# Compute the (non linear) transformation that aligns the moving image to the fixed image
transformation = ants.registration(fixed=fixed_image, moving=moving_image, type_of_transform = 'SyN' )

# After the transformation has been computed, apply it
warpedImage = ants.apply_transforms(fixed=fixed_image, moving=moving_image, transformlist=transformation['fwdtransforms'])

# Save the image to disk
resultAnts = op.join(preproc_root, 'sub-{}'.format(subject_id), 'anat', 'sub-{}_T1w_mni_SyN.nii.gz'.format(subject_id))
ants.image_write(warpedImage, resultAnts)

# Inspect the results with FSLeyes or freeview, as you prefer :)




Look at the results and compare them against the linear coregistration. Which one do you prefer? Why?

In [43]:
fsleyesDisplay.resetOverlays()
fsleyesDisplay.load(reference)
fsleyesDisplay.load(result)
fsleyesDisplay.load(resultAnts)

# Anatomical: conclusions

As a final note, all these steps (<u>including</u> non linear normalization!) can be done automatically for you with a single command: <a href="https://web.mit.edu/fsl_v5.0.10/fsl/doc/wiki/fsl_anat.html">fsl_anat</a>. So you might want to use this command, instead of running all of the above when conducting preprocessing.

We provide it here for convenience, but beware: it takes <b>several minutes</b> to complete, so you will need some patience!

Here are the different steps you've seen today:

<table>
    <tr><th style='text-align:justify;'>Data type</th><th style='text-align:justify;'>Preprocessing step name </th><th style='text-align:justify;'>Details of the step</th><th style='text-align:justify;'>FSL tool </th></tr>
    <tr><th>Anatomical</th><td></td><td></td><td></td></tr>
    <tr><td></td><td style='text-align:justify;'>Skull stripping</td><td style='text-align:justify;'>Removing skull and surrounding tissues to keep only the brain</td><td style='text-align:justify;'>BET</td></tr>
    <tr><td></td><td style='text-align:justify;'>Segmentation</td><td style='text-align:justify;'>Segmenting brain tissues (cerebro-spinal fluid, white matter, grey matter) based on their contrasts</td><td style='text-align:justify;'>FAST</td></tr>
    <tr><td></td><td style='text-align:justify;'>Normalization/Coregistration</td><td style='text-align:justify;'>Mapping the participant's brain to a reference brain, making its orientation and scale match so that comparison across participants becomes feasible.</td><td style='text-align:justify;'>FLIRT (if linear), ANTs/FNIRT (if not linear)</td></tr>
</table>

There are other operations which can be conducted, depending on your analysis, but the ones you've seen are <b>always</b> done!

In [None]:
os.system('fsl_anat')

In [None]:
import shutil

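def fsl_anat_wrapped(anatomical_target, output_path):
    """Run fsl_anat on anatomical_target and move its outputs directly into output_path."""
    # fsl_anat writes its results into a folder named <output_path>.anat
    os.system('fsl_anat -i {} --clobber --nosubcortseg -o {}'.format(anatomical_target,output_path))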
    # Now move all files from the output_path.anat folder created by FSL to 
    # the actual output_path
    fsl_anat_path = output_path+'.anat'
    files_to_move = glob.glob(op.join(fsl_anat_path, '*'))
    for f in files_to_move:
        shutil.move(f, op.join(output_path, op.split(f)[1]))
    
    # Remove the output_path.anat folder
    os.rmdir(fsl_anat_path)

In [None]:
fsl_anat_wrapped(anatomical_path, op.join(preproc_root, 'sub-001', 'anat'))

Let's inspect the resulting files:

In [None]:
print_dir_tree(bids_root, max_depth=5)

That's a lot of files! But let's worry about mostly two of them. <br>
Notice the T1_to_MNI_lin and the T1_to_MNI_nonlin?
<br>For the former, FLIRT was run to obtain a linear normalization, whereas for the latter FNIRT was used to obtain a non linear normalization. But what difference does it make, in practice? Well, let's inspect the results, shall we?

*Hint: consider the brain landmarks, such as the ventricles but also the overall shape of the brain to determine if there was a change and if so which one(s)*

In [None]:
fsleyesDisplay.resetOverlays()
fsleyesDisplay.load(op.expandvars(op.join('$FSLDIR', 'data', 'standard', 'MNI152_T1_1mm')))
fsleyesDisplay.load(op.join(preproc_root, 'sub-001', 'anat', 'T1_to_MNI_lin'))
fsleyesDisplay.load(op.join(preproc_root, 'sub-001', 'anat', 'T1_to_MNI_nonlin'))