# Perform Classification (Inference)

<SCRIPT SRC='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></SCRIPT>
<SCRIPT>MathJax.Hub.Config({ tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}})</SCRIPT> 

This notebook shows how to perform classification on preprocessed data using the pretrained models of [Wen et al., 2020](https://www.sciencedirect.com/science/article/abs/pii/S1361841520300591).

## Structure of the pretrained models

All the pretrained model folders are organized as follows:
<pre>
<b>results</b>
├── commandline.json
├── <b>fold-0</b>
├── ...
└── <b>fold-4</b>
    ├── <b>models</b>
    │      └── <b>best_balanced_accuracy</b>
    │          └── model_best.pth.tar
    └── <b>cnn_classification</b>
           └── <b>best_balanced_accuracy</b>
               └── validation_{patch|roi|slice}_level_prediction.tsv
</pre>
This file system is part of the output of `clinicadl train`. `clinicadl classify` relies on three of its files:
<ul>
    <li> <code>commandline.json</code> contains all the options that were entered for training (type of input, architecture, preprocessing...)</li>
    <li> <code>model_best.pth.tar</code> corresponds to the model selected when the best validation balanced accuracy was obtained.</li>
    <li> <code>validation_{patch|roi|slice}_level_prediction.tsv</code> is specific to the patch, roi and slice frameworks. It is needed to perform <b>soft-voting</b>, which yields the image-level label in an unbiased way: the parts of the image are weighted by their performance on the validation data. Weighting them by their performance on the input (test) data would bias the result, because the classification framework would then exploit knowledge of the test data.</li>
</ul>
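
As a minimal sketch of the layout above, the snippet below builds the three paths for a given fold and reports which ones are absent. The helper names `expected_model_files` and `missing_model_files` are hypothetical (they are not part of `clinicadl`), and the folder name `results` is simply the one used in the tree. For an image-level model the validation TSV is not expected to matter, since that file is specific to the patch, roi and slice frameworks.

```python
from pathlib import Path


def expected_model_files(model_dir, fold=0, mode="patch",
                         selection="best_balanced_accuracy"):
    """Return the three paths described above (hypothetical helper)."""
    root = Path(model_dir)
    fold_dir = root / f"fold-{fold}"
    return [
        root / "commandline.json",
        fold_dir / "models" / selection / "model_best.pth.tar",
        fold_dir / "cnn_classification" / selection
        / f"validation_{mode}_level_prediction.tsv",
    ]


def missing_model_files(model_dir, **kwargs):
    """List the expected files that are not present on disk."""
    return [p for p in expected_model_files(model_dir, **kwargs)
            if not p.exists()]


# Report what is missing for fold 0 of a patch-level model folder.
for path in missing_model_files("results", fold=0, mode="patch"):
    print("missing:", path)
```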

<div class="alert alert-block alert-info">
<b>Soft voting:</b><p>
    For classification tasks which take as input a part of the MRI volume (<i>patch, roi or slice</i>), an ensemble operation is needed to obtain the label at the image level.</p>
    <p>For example, with linear preprocessing, a patch size and stride of 50 voxels lead to the classification of 36 patches, but they are not all equally meaningful. Patches in the corners of the image are mainly composed of background and skull and may be misleading, whereas patches inside the brain may be more useful.</p>
    <img src="./images/patches.png">
    <p>Then the image-level probability of AD <i>p<sup>AD</sup></i> will be:</p>
    $$ p^{AD} = \sum_{i=0}^{35} bacc_i \cdot p_i^{AD}, $$
    where:<ul>
    <li> <i>p<sub>i</sub><sup>AD</sup></i> is the probability of AD for patch <i>i</i></li>
    <li> <i>bacc<sub>i</sub></i> is the validation balanced accuracy for patch <i>i</i></li>
    </ul>

</div>
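
The soft-voting formula can be illustrated with a short, self-contained sketch. This is not the `clinicadl` implementation: the function name `soft_vote` and the toy values are made up for illustration, and the `normalize` option (dividing the weights by their sum so that the result stays between 0 and 1) is an assumption that does not appear in the formula above.

```python
def soft_vote(patch_probas, patch_baccs, normalize=False):
    """Weighted sum of per-patch AD probabilities.

    Each patch probability is weighted by the validation balanced
    accuracy of that patch, as in the formula above. With
    normalize=True (an assumption, not in the formula) the weights
    are first rescaled to sum to 1.
    """
    if len(patch_probas) != len(patch_baccs):
        raise ValueError("one balanced accuracy is needed per patch")
    weights = list(patch_baccs)
    if normalize:
        total = sum(weights)
        if total == 0:
            raise ValueError("balanced accuracies sum to zero")
        weights = [w / total for w in weights]
    return sum(w * p for w, p in zip(weights, patch_probas))


# Toy example with 3 patches instead of 36 (values are invented).
probas = [0.9, 0.2, 0.6]   # p_i^AD for each patch
baccs = [0.8, 0.5, 0.7]    # bacc_i measured on the validation set
print("weighted sum:", soft_vote(probas, baccs))
print("normalized:  ", soft_vote(probas, baccs, normalize=True))
```

In this toy case the first patch, which has the highest validation balanced accuracy, contributes the most to the image-level score.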

## Download the pretrained models


<div class="alert alert-block alert-warning">
    
**Warning:** For the sake of the demonstration, this tutorial uses truncated versions of the models, containing only the first fold.

</div>

In this notebook, we propose to use 4 specific models, all of which were trained on the AD vs CN classification task. The corresponding experiment in eTable 4 of the paper mentioned above is given in parentheses:

1. **3D image-level model**, pretrained with the baseline data and initialized with an autoencoder (_cf._ exp. 3).
2. **3D roi-based model**, pretrained with the baseline data and initialized with an autoencoder (_cf._ exp. 8).
3. **3D patch-level model**, multi-cnn, pretrained with the baseline data and initialized with an autoencoder (_cf._ exp. 14).
4. **2D slice-level model**, pretrained with the baseline data and initialized with an autoencoder (_cf._ exp. 18).

Commands in the next code cell will automatically download these models and uncompress them.

In [None]:
# Download here the pretrained models stored online
# Model 1
!curl -k https://aramislab.paris.inria.fr/files/data/models/dl/models_v002/model_exp3_splits_1.tar.gz  -o model_exp3_splits_1.tar.gz
!tar xf model_exp3_splits_1.tar.gz

# Model 2
!curl -k https://aramislab.paris.inria.fr/files/data/models/dl/models_v002/model_exp8_splits_1.tar.gz  -o model_exp8_splits_1.tar.gz
!tar xf model_exp8_splits_1.tar.gz

# Model 3
!curl -k https://aramislab.paris.inria.fr/files/data/models/dl/models_v002/model_exp14_splits_1.tar.gz  -o model_exp14_splits_1.tar.gz
!tar xf model_exp14_splits_1.tar.gz

# Model 4
!curl -k https://aramislab.paris.inria.fr/files/data/models/dl/models_v002/model_exp18_splits_1.tar.gz  -o model_exp18_splits_1.tar.gz
!tar xf model_exp18_splits_1.tar.gz
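
Once a model is extracted, the training options stored in `commandline.json` can be inspected. The sketch below is only an illustration: the helper `load_training_options` is hypothetical, and it assumes that the file sits at the root of the model folder (as in the tree shown earlier) and that its top level is a JSON object.

```python
import json
from pathlib import Path


def load_training_options(model_dir):
    """Read commandline.json from a model folder (hypothetical helper)."""
    path = Path(model_dir) / "commandline.json"
    with path.open() as f:
        return json.load(f)


# Print the options of Model 1 if it has been downloaded and extracted.
model_dir = Path("./model_exp3_splits_1")
if (model_dir / "commandline.json").exists():
    # Assumes the JSON top level is an object (a Python dict once loaded).
    for key, value in sorted(load_training_options(model_dir).items()):
        print(f"{key}: {value}")
else:
    print("commandline.json not found in", model_dir)
```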

## Run `clinicadl classify`

Running classification on a dataset is straightforward with `clinicadl`. Here we continue with the data preprocessed in the [previous notebook](./preprocessing). Because the models were trained exclusively on the ADNI dataset, all the subjects of OASIS-1 can be used to evaluate them without risking data leakage.

If you ran the previous notebook, you should have a folder called `OasisCaps_example` in the current directory. In the following steps we will classify these images using the pretrained models. The inputs required by `clinicadl classify` are:
* A CAPS directory (`OasisCaps_example`).
* A TSV file with the subjects/sessions to process, containing the diagnosis.
* The path to the pretrained model.

Some optional parameters include:
* A user-defined prefix for the output files (`-pre`).
* The possibility of processing unlabeled data (without diagnosis).
* The option to use previously extracted patches/slices.

For these examples, we will use the simplest case, by adding a prefix for the output files:

<div class="alert alert-block alert-warning">
If your computer is not equipped with a GPU, add the option <code>-cpu</code> to the command.
</div>

In [None]:
# Execute classify on OASIS dataset
# Model 1
!clinicadl classify ./OasisCaps_example ./OasisBids_example/OasisParticipants.tsv ./model_exp3_splits_1 -pre 'test_DB'

Classification results for the subjects of this dataset can be displayed here:

In [None]:
import pandas as pd

df = pd.read_csv("./model_exp3_splits_1/fold-0/cnn_classification/best_balanced_accuracy/test_DB_image_level_prediction.tsv", sep="\t")
df.head()
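
Since the participants file contains the diagnosis, the predictions can be summarized with a balanced accuracy, the same metric that was used to select the models. The sketch below is written in plain Python on toy labels. The column names `true_label` and `predicted_label` mentioned in the final comment are assumptions about the TSV layout and should be checked against the output of `df.head()`.

```python
def balanced_accuracy(true_labels, predicted_labels):
    """Mean of the per-class recalls (for two classes: the mean of
    sensitivity and specificity)."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("both sequences must have the same length")
    if len(true_labels) == 0:
        raise ValueError("at least one sample is required")
    recalls = []
    for label in sorted(set(true_labels)):
        # Predictions made for the samples whose true class is `label`.
        preds = [p for t, p in zip(true_labels, predicted_labels) if t == label]
        recalls.append(sum(p == label for p in preds) / len(preds))
    return sum(recalls) / len(recalls)


# Toy labels (invented): three samples of class 0, one of class 1.
print(balanced_accuracy([0, 0, 0, 1], [0, 0, 1, 1]))

# With the DataFrame loaded above, and assuming these column names exist:
# balanced_accuracy(df["true_label"].tolist(), df["predicted_label"].tolist())
```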

In the same way, we can process the dataset with all the other models:

In [None]:
# Model 2
!clinicadl classify ./OasisCaps_example ./OasisBids_example/OasisParticipants.tsv ./model_exp8_splits_1 -pre 'test_DB'

# Model 3
!clinicadl classify ./OasisCaps_example ./OasisBids_example/OasisParticipants.tsv ./model_exp14_splits_1 -pre 'test_DB'

# Model 4
!clinicadl classify ./OasisCaps_example ./OasisBids_example/OasisParticipants.tsv ./model_exp18_splits_1 -pre 'test_DB'