# Training (Optional)

This section explains how to train a CNN on the OASIS data processed in the previous sections. If you do not have access to a GPU, training the CNN will take too long; pretrained models are therefore provided so that you can still run the last part of the course (inference).

In [None]:
import torch

# Check if a GPU is available
print('GPU is available:', torch.cuda.is_available())

# You can execute this notebook on Colab to run it on a GPU.

In this notebook the goal is to train a classifier to differentiate the **AD label from the CN label** based on the associated **T1-MR images**. These images were preprocessed according to the <font color='green'>appropriate section</font>.

According to the literature review done in <font color='green'>give here the reference of AD-DL</font>, four main paradigms have been used to train a CNN to classify AD versus CN labels from T1-MRI. These paradigms depend on the input given to the network:
* 2D slices,
* 3D patches,
* ROIs (Regions of Interest),
* 3D images.

<div class="alert alert-block alert-info">
<b>Computational issues:</b><p>
    Depending on your computational resources, you should adapt the parameters in the <b>COMPUTATIONAL ISSUES</b> group:
    <ul>
        <li> The -gpu flag allows the system to use a GPU if one is available. </li>
        <li> --nproc N sets the number of workers (parallel processes) in the DataLoader to N. </li>
        <li> --batch_size B sets the batch size to B. If the batch size is too large, it can cause out-of-memory errors. </li>
    </ul>
</div>
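As a rough illustration of what these options control, the sketch below shows how a batch size and a worker count are typically passed to a PyTorch `DataLoader`, and how a device is selected. The dataset, tensor shapes and variable names are made up for the example; this is not `clinicadl`'s internal code.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 10 single-channel 8x8x8 "images" with binary labels.
images = torch.randn(10, 1, 8, 8, 8)
labels = torch.randint(0, 2, (10,))
dataset = TensorDataset(images, labels)

batch_size = 4   # plays the role of --batch_size B
num_workers = 0  # plays the role of --nproc N (0 = load data in the main process)

loader = DataLoader(dataset, batch_size=batch_size, num_workers=num_workers)

# Use the GPU only if one is available (the role of the -gpu flag).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

batch_shapes = []
for batch_images, batch_labels in loader:
    batch_images = batch_images.to(device)
    batch_shapes.append(tuple(batch_images.shape))

print(batch_shapes)
```

Memory use grows with the size of each batch, which is why lowering the batch size is the usual first response to a memory error.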

## 2D slice-level CNN

The 2D slice-level CNN in `clinicadl` is a slight modification of the ResNet-18 network originally trained to classify natural images from the ImageNet dataset:
* One fully-connected layer was added at the end of the network to reduce the dimension of the output from 1000 to 2 classes <font color='purple'>(purple dotted box)</font>.
* The last five convolutional layers and the last FC layer of ResNet are fine-tuned <font color='green'>(green dotted box)</font>.
* All other layers have their weights and biases fixed.

<img src="./images/2DCNN.png">

During training, gradient updates are based on the loss computed at the slice level. Final performance is computed at the subject level by combining the outputs of the slices of the same subject.
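One simple way to combine slice-level outputs into a subject-level decision is to average the per-slice probabilities of each subject. The sketch below assumes that strategy and uses made-up values; the function name is hypothetical, and the voting scheme actually used by `clinicadl` may differ (for example, by weighting slices).

```python
from collections import defaultdict


def aggregate_by_subject(slice_predictions):
    """Average per-slice AD probabilities for each subject and threshold at 0.5.

    slice_predictions: iterable of (subject_id, probability_of_AD) pairs.
    Returns a dict mapping subject_id -> (mean_probability, predicted_label).
    """
    per_subject = defaultdict(list)
    for subject_id, prob_ad in slice_predictions:
        per_subject[subject_id].append(prob_ad)

    results = {}
    for subject_id, probs in per_subject.items():
        mean_prob = sum(probs) / len(probs)
        results[subject_id] = (mean_prob, "AD" if mean_prob >= 0.5 else "CN")
    return results


# Hypothetical slice-level outputs for two subjects.
slice_outputs = [
    ("sub-01", 0.9), ("sub-01", 0.7), ("sub-01", 0.2),
    ("sub-02", 0.1), ("sub-02", 0.4),
]
subject_level = aggregate_by_subject(slice_outputs)
print(subject_level)
```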

<div class="alert alert-block alert-success">
<b>Transfer learning:</b> 
    <p>The layers of this network (except the last one) are initialized with the weights of the ResNet-18 trained on ImageNet.</p>
</div>

In [None]:
!clinicadl train slice -h
#!clinicadl train slice <caps_dir> data/labels_list results/slice_cnn resnet18 -gpu --batch_size 16 --prepare_dl

## 3D-patch level & ROI-based models

The 3D patch-level and ROI-based CNNs in `clinicadl` have the same architecture, including:
* 4 convolutional layers with kernel 3x3x3,
* 4 max pooling layers with stride and kernel of 2 and a padding value that automatically adapts to the input feature map size,
* 3 fully-connected layers.
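The layout listed above can be sketched in PyTorch as follows. The channel counts, the fully-connected widths, the ReLU activations and the 50x50x50 patch size are assumptions chosen for illustration, and `ceil_mode=True` is used as a stand-in for the adaptive padding; the actual `Conv4_FC3` definition in `clinicadl` may differ.

```python
import torch
from torch import nn


class SmallConv4FC3(nn.Module):
    """Illustrative 4-conv / 3-FC network for 50x50x50 patches (hypothetical sizes)."""

    def __init__(self, n_classes=2):
        super().__init__()
        channels = [1, 8, 16, 32, 64]
        blocks = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            blocks += [
                nn.Conv3d(c_in, c_out, kernel_size=3, padding=1),
                nn.ReLU(),
                # ceil_mode=True rounds odd sizes up instead of dropping the last voxel.
                nn.MaxPool3d(kernel_size=2, stride=2, ceil_mode=True),
            ]
        self.features = nn.Sequential(*blocks)
        # Each axis shrinks as 50 -> 25 -> 13 -> 7 -> 4 over the four poolings.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 4 * 4 * 4, 100),
            nn.ReLU(),
            nn.Linear(100, 50),
            nn.ReLU(),
            nn.Linear(50, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


model = SmallConv4FC3().eval()
patches = torch.randn(2, 1, 50, 50, 50)
with torch.no_grad():
    feature_maps = model.features(patches)
    logits = model(patches)
print(tuple(feature_maps.shape), tuple(logits.shape))
```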

For 3D-patch level networks, it is possible to train one CNN per patch location. In this case, use the flag `--network_type "multi"` and specify the number of patches with `--num_cnn`.

<img src="./images/ROICNN.png">

As for the 2D slice-level model, gradient updates are based on the loss computed at the level of the network input, here the patch or the ROI. Final performance is computed at the subject level by combining the outputs of the patches or of the two hippocampi of the same subject.

<div class="alert alert-block alert-success">
<b>Transfer learning:</b> 
    <p>It is possible for this category to train an autoencoder derived from the CNN architecture. The encoder shares the same architecture as the CNN up to the fully-connected layers.</p>
    <img src="./images/autoencoder.png">
    <p>The weights of the encoder are then transferred to the convolutional layers of the CNN to initialize it before its training. This procedure is called <i>autoencoder pretraining</i>.</p>
    <p>It is also possible to transfer weights between two CNNs with the same architecture.</p>
    <p>For 3D-patch multi-CNNs specifically, it is possible to initialize each CNN of a multi-CNN:
        <ul>
        <li> with the weights of a single-CNN,</li>
        <li> with the weights of the corresponding CNN in a multi-CNN. </li>
    </ul>
    <p>Transferring weights between CNNs can be useful when performing two classification tasks that are similar. This is what has been done in <font color='green'>give here the reference of AD-DL</font>: the sMCI vs pMCI classification network was initialized with the weights of the AD vs CN classification network.</p>
</div>
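Autoencoder pretraining as described above can be sketched in PyTorch by copying the encoder's `state_dict` into the convolutional part of the CNN. The class names, layer sizes and toy decoder below are hypothetical and much smaller than the real models; the autoencoder is also left untrained here to keep the example short.

```python
import torch
from torch import nn


def make_encoder():
    """Convolutional part shared by the autoencoder and the CNN (hypothetical sizes)."""
    return nn.Sequential(
        nn.Conv3d(1, 4, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool3d(2, 2),
        nn.Conv3d(4, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool3d(2, 2),
    )


class AutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = make_encoder()
        # Toy decoder: upsample back to the input size, then map to one channel.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=4),
            nn.Conv3d(8, 1, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = make_encoder()
        self.classifier = nn.Sequential(nn.Flatten(), nn.Linear(8 * 2 * 2 * 2, 2))

    def forward(self, x):
        return self.classifier(self.encoder(x))


autoencoder = AutoEncoder()  # would normally be trained on image reconstruction first
cnn = CNN()

# Copy only the convolutional (encoder) weights; the classifier keeps its own initialization.
cnn.encoder.load_state_dict(autoencoder.encoder.state_dict())

x = torch.randn(1, 1, 8, 8, 8)
with torch.no_grad():
    same_features = torch.equal(cnn.encoder(x), autoencoder.encoder(x))
    cnn_out_shape = tuple(cnn(x).shape)
    ae_out_shape = tuple(autoencoder(x).shape)
print(same_features, cnn_out_shape, ae_out_shape)
```

Transferring weights between two CNNs with the same architecture follows the same pattern, with the whole `state_dict` copied instead of only the encoder's.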

<div class="alert alert-block alert-danger">
<b>Transfer learning:</b> 
    <p>Transferring weights between tasks that are not similar enough can hurt performance!</p>
</div>

### 3D-patch level models

See the definition of patches in the <font color="green">extract</font> section of the course.

In [None]:
# 3D-patch autoencoder pretraining
!clinicadl train patch autoencoder -h
#!clinicadl train patch autoencoder <caps_dir> data/labels_list results/patch_autoencoder Conv4_FC3 -gpu --batch_size 16 --prepare_dl

In [None]:
# 3D-patch single-CNN training
!clinicadl train patch cnn -h

# With autoencoder pretraining
#!clinicadl train patch cnn <caps_dir> data/labels_list results/patch_single-cnn Conv4_FC3 -gpu --batch_size 16 --prepare_dl --transfer_learning_autoencoder --transfer_learning_path results/patch_autoencoder
# Without pretraining
#!clinicadl train patch cnn <caps_dir> data/labels_list results/patch_single-cnn Conv4_FC3 -gpu --batch_size 16 --prepare_dl

In [None]:
# 3D-patch multi-CNN training
# As explained above, multi-CNN training requires --network_type multi and --num_cnn (number of patches).

# With autoencoder pretraining
#!clinicadl train patch cnn <caps_dir> data/labels_list results/patch_multi-cnn Conv4_FC3 -gpu --batch_size 16 --prepare_dl --network_type multi --num_cnn <num_patches> --transfer_learning_autoencoder --transfer_learning_path results/patch_autoencoder
# With single-CNN pretraining
#!clinicadl train patch cnn <caps_dir> data/labels_list results/patch_multi-cnn Conv4_FC3 -gpu --batch_size 16 --prepare_dl --network_type multi --num_cnn <num_patches> --transfer_learning_path results/patch_single-cnn
# Without pretraining
#!clinicadl train patch cnn <caps_dir> data/labels_list results/patch_multi-cnn Conv4_FC3 -gpu --batch_size 16 --prepare_dl --network_type multi --num_cnn <num_patches>

### ROI-based models

See the definition of ROIs in the <font color="green">extract</font> section of the course.

In [None]:
# ROI-based autoencoder pretraining
!clinicadl train roi autoencoder -h
#!clinicadl train roi autoencoder <caps_dir> data/labels_list results/roi_autoencoder Conv4_FC3 -gpu --batch_size 16 --prepare_dl

In [None]:
# ROI-based CNN training
!clinicadl train roi cnn -h

# With autoencoder pretraining
#!clinicadl train roi cnn <caps_dir> data/labels_list results/roi_cnn Conv4_FC3 -gpu --batch_size 16 --prepare_dl --transfer_learning_autoencoder --transfer_learning_path results/roi_autoencoder
# Without pretraining
#!clinicadl train roi cnn <caps_dir> data/labels_list results/roi_cnn Conv4_FC3 -gpu --batch_size 16 --prepare_dl

## 3D-image level model

The 3D image-level CNN in `clinicadl` is designed as follows:
* 5 convolutional layers with kernel 3x3x3,
* 5 max pooling layers with stride and kernel of 2 and a padding value that automatically adapts to the input feature map size,
* 3 fully-connected layers.

<img src="./images/imageCNN.png">

Depending on the preprocessing, the size of the fully connected layers must be adapted. This is why two models exist in `clinicadl`:
* `Conv5_FC3` adapted to `linear` preprocessing, <font color="green">names might change</font>
* `Conv5_FC3_mni` adapted to `mni` preprocessing.
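To see why the fully-connected layers depend on the preprocessing, the sketch below computes the spatial size of the feature maps after five stride-2 poolings, assuming that the adaptive padding rounds odd sizes up and that the convolutions preserve the spatial size. The input sizes (169x208x179 for `linear`, 121x145x121 for `mni`) are assumptions used for illustration, and the helper name is hypothetical.

```python
import math


def pooled_shape(shape, n_pool_layers, stride=2):
    """Spatial shape after repeated stride-2 pooling, rounding odd sizes up."""
    for _ in range(n_pool_layers):
        shape = tuple(math.ceil(size / stride) for size in shape)
    return shape


# Assumed input sizes, for illustration only.
input_shapes = {
    "linear": (169, 208, 179),
    "mni": (121, 145, 121),
}

flattened_sizes = {}
for name, shape in input_shapes.items():
    out = pooled_shape(shape, n_pool_layers=5)
    # Number of spatial positions per channel seen by the first FC layer.
    flattened_sizes[name] = math.prod(out)
    print(name, out, flattened_sizes[name])
```

Under these assumptions the two preprocessings yield different numbers of spatial positions per channel, so the first fully-connected layer cannot have the same input size in both models.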


<div class="alert alert-block alert-success">
<b>Transfer learning:</b> 
    <p>It is possible for this category to train an autoencoder derived from the CNN architecture, or to transfer weights between CNNs. See the section on patches for more details on this topic.</p>
</div>

In [None]:
# 3D-image level autoencoder pretraining
!clinicadl train image autoencoder -h
#!clinicadl train image autoencoder <caps_dir> data/labels_list results/image_autoencoder Conv5_FC3 -gpu

In [None]:
# 3D-image level CNN training
!clinicadl train image cnn -h

# With autoencoder pretraining
#!clinicadl train image cnn <caps_dir> data/labels_list results/image_cnn Conv5_FC3 -n -gpu --transfer_learning_autoencoder --transfer_learning_path results/image_autoencoder
# Without pretraining
#!clinicadl train image cnn <caps_dir> data/labels_list results/image_cnn Conv5_FC3 -n -gpu

## Custom models

Do you want to add your own architecture to `clinicadl` and train it? Fork and clone the [GitHub repository](https://github.com/aramis-lab/AD-DL), implement your model in PyTorch in `clinicadl/tools/deep_learning/models`, and import it in `clinicadl/tools/deep_learning/models/__init__.py`.