<div style="text-align: left;">
<table style="width:100%; background-color:transparent;">
  <tr style="background-color:transparent;">
    <td style="background-color:transparent;">[<img src="http://project.inria.fr/saclaycds/files/2017/02/logoUPSayPlusCDS_990.png" width="70%">](http://www.datascience-paris-saclay.fr)</td>
    <td style="background-color:transparent;">[<img src="https://paris-saclay-cds.github.io/autism_challenge/images/institut_pasteur_logo.svg" width="30%">](https://research.pasteur.fr/en/team/group-roberto-toro/)</td>
  </tr>
</table> 
</div>

<center><h1>Imaging-psychiatry challenge: predicting autism</h1></center>

<center><h3>A data challenge on Autism Spectrum Disorder detection</h3></center>
<br/>
<center>_Roberto Toro (Institut Pasteur), Nicolas Traut (Institut Pasteur), Anita Beggiato (Institut Pasteur), Katja Heuer (Institut Pasteur),<br /> Gael Varoquaux (Inria, Parietal), Alex Gramfort (Inria, Parietal), Balazs Kegl (LAL),<br /> Guillaume Lemaitre (CDS), Alexandre Boucaud (CDS), and Joris van den Bossche (CDS)_</center>

This notebook replicates the training and evaluation procedure used for the ten best submissions. For more details on each step, refer to `autism_starting_kit.ipynb`.

### 1. Load the data

In [None]:
from problem import get_train_data
from problem import get_test_data

data_train, labels_train = get_train_data()
data_test, labels_test = get_test_data()

### 2. Define the evaluation

In [None]:
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_validate
from problem import get_cv

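def evaluation(X, y):
    """Cross-validate the loaded submission and return the per-fold scores."""
    # FeatureExtractor and Classifier are defined by the cells of section 3,
    # which must be run before calling this function.
    pipe = make_pipeline(FeatureExtractor(), Classifier())
    cv = get_cv(X, y)
    results = cross_validate(pipe, X, y, scoring=['roc_auc', 'accuracy'], cv=cv,
                             verbose=1, return_train_score=True,
                             n_jobs=1)

    return results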
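
To illustrate what `cross_validate` returns when it is given two scorers and `return_train_score=True`, here is a minimal sketch on synthetic data. The dataset, the `toy_pipe` pipeline, and the `cv=5` setting are assumptions made for illustration only. They stand in for the challenge data, the submission classes, and `get_cv`, respectively.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (a stand-in for the challenge data).
X_toy, y_toy = make_classification(n_samples=200, n_features=10,
                                   random_state=0)

# Stand-in pipeline; a real run would chain FeatureExtractor and Classifier.
toy_pipe = make_pipeline(StandardScaler(), LogisticRegression())

toy_results = cross_validate(
    toy_pipe, X_toy, y_toy,
    scoring=['roc_auc', 'accuracy'],
    cv=5,  # assumption: plain 5-fold split instead of problem.get_cv
    return_train_score=True,
)

# The result is a dict holding one array of per-fold values per key:
# timings plus a train_* and test_* entry for each scorer.
print(sorted(toy_results))
```

The `train_*` and `test_*` keys are the ones read back in section 4.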

### 3. Load the submission

Each submission defines a `FeatureExtractor` and a `Classifier`. It relies on:

* the file `submissions/<submission_name>/feature_extractor.py` corresponding to the feature extractor;
* the file `submissions/<submission_name>/classifier.py` corresponding to the classifier.

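In the cells below, uncomment the `%load` line and replace `starting_kit` with the name of the submission you want to evaluate. Run each cell once to load the source code into it, then a second time to execute the loaded code.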
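
As an alternative to `%load`, the two classes could also be imported programmatically from the files listed above. The sketch below uses only the standard library. `load_submission_class` is a hypothetical helper written for this illustration and is not part of the challenge kit.

```python
import importlib.util
from pathlib import Path


def load_submission_class(submission_name, module_name, class_name,
                          submissions_dir="submissions"):
    """Import `class_name` from <submissions_dir>/<submission_name>/<module_name>.py.

    Hypothetical helper, not provided by the challenge kit.
    """
    path = Path(submissions_dir) / submission_name / f"{module_name}.py"
    spec = importlib.util.spec_from_file_location(
        f"{submission_name}_{module_name}", str(path))
    module = importlib.util.module_from_spec(spec)
    # Execute the file so that the classes it defines become available.
    spec.loader.exec_module(module)
    return getattr(module, class_name)


# Example call, assuming the notebook runs from the kit's root directory:
# FeatureExtractor = load_submission_class(
#     "starting_kit", "feature_extractor", "FeatureExtractor")
# Classifier = load_submission_class(
#     "starting_kit", "classifier", "Classifier")
```

Unlike `%load`, this approach does not show the submission's source in the notebook, so it is better suited to looping over several submissions than to inspecting one.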

#### 3.1 Feature extractor

In [None]:
# %load submissions/starting_kit/feature_extractor.py

#### 3.2 Classifier

In [None]:
# %load submissions/starting_kit/classifier.py
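
Because `evaluation` chains the two classes with `make_pipeline` and scores them with `roc_auc` and `accuracy`, the feature extractor has to behave like a scikit-learn transformer and the classifier has to provide `predict` and `predict_proba` (or `decision_function`). The following is a rough sketch of that interface on random data. `ToyFeatureExtractor` and `ToyClassifier` are illustrative names with deliberately trivial bodies, not the content of any actual submission.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


class ToyFeatureExtractor(TransformerMixin, BaseEstimator):
    """Pass-through transformer; a real one would select or build features."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.asarray(X, dtype=float)


class ToyClassifier(ClassifierMixin, BaseEstimator):
    """Thin wrapper around a logistic regression."""

    def fit(self, X, y):
        self.clf_ = LogisticRegression().fit(X, y)
        self.classes_ = self.clf_.classes_
        return self

    def predict(self, X):
        return self.clf_.predict(X)

    def predict_proba(self, X):
        return self.clf_.predict_proba(X)


# Random data, only to check that the two classes chain correctly.
rng = np.random.RandomState(0)
X_demo = rng.normal(size=(40, 3))
y_demo = (X_demo[:, 0] > 0).astype(int)

demo_pipe = make_pipeline(ToyFeatureExtractor(), ToyClassifier())
demo_pipe.fit(X_demo, y_demo)
proba = demo_pipe.predict_proba(X_demo)
print(proba.shape)  # (40, 2)
```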

### 4. Run the evaluation

In [None]:
import numpy as np

In [None]:
results = evaluation(data_train, labels_train)

print("Training score ROC-AUC: {:.3f} +- {:.3f}".format(
    np.mean(results['train_roc_auc']), np.std(results['train_roc_auc'])))
print("Validation score ROC-AUC: {:.3f} +- {:.3f} \n".format(
    np.mean(results['test_roc_auc']), np.std(results['test_roc_auc'])))

print("Training score accuracy: {:.3f} +- {:.3f}".format(
    np.mean(results['train_accuracy']), np.std(results['train_accuracy'])))
print("Validation score accuracy: {:.3f} +- {:.3f}".format(
    np.mean(results['test_accuracy']), np.std(results['test_accuracy'])))
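
The four `print` calls above follow the same pattern, so they could be folded into a small helper. The sketch below is one possible way to do it. `summarize_scores` is a hypothetical function, and the fold scores in `fake_results` are made-up numbers used only to show the call, not results from the challenge.

```python
import numpy as np


def summarize_scores(results, metrics=("roc_auc", "accuracy")):
    """Return {key: (mean, std)} for the train/test entries of each metric."""
    summary = {}
    for metric in metrics:
        for split in ("train", "test"):
            key = "{}_{}".format(split, metric)
            scores = np.asarray(results[key], dtype=float)
            # np.std defaults to the population standard deviation (ddof=0),
            # matching the print statements above.
            summary[key] = (scores.mean(), scores.std())
    return summary


# Made-up per-fold scores, for illustration only.
fake_results = {
    "train_roc_auc": [0.9, 0.8], "test_roc_auc": [0.7, 0.5],
    "train_accuracy": [0.8, 0.8], "test_accuracy": [0.6, 0.4],
}

for key, (mean, std) in summarize_scores(fake_results).items():
    print("{}: {:.3f} +- {:.3f}".format(key, mean, std))
```

Passing the dictionary returned by `evaluation` instead of `fake_results` would report the same four quantities as the cell above.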

### 5. Alternative evaluation

Alternatively, you can run the `ramp_test_submission --submission <submission_name>` command. It will load the data, define the cross-validation, load the submission, and evaluate it automatically.

In [None]:
!ramp_test_submission --submission starting_kit