# Model Training Tutorial

Welcome to the model training tutorial! In this tutorial, we will train a neural network to classify tiles from our toy data set and visualize its performance. Our model is essentially a wrapper around PyTorch's ResNet-18 deep residual network; the LUNA team modified it to suit their work with tiled slides.


In [None]:
# setup home directory
import os
HOME = os.environ['HOME']

In [None]:
env DATASET_URL=file:///$HOME/vmount/PRO_12-123/

### Model Training

The model will be used to classify tiles into the different tissue types we've annotated (tumor, stroma, and fat). These tissue classifier models can be trained using the `train_tissue_classifier` CLI tool.

In [None]:
!train_tissue_classifier --help

This CLI tool has many command line arguments. The main input is the labeled tile dataset, which is the data used to train and validate the model. For validation, the tiles are stratified by patient id and by slide id, and the split is controlled by the `num_splits` parameter. The `label_set` parameter maps the tissue types to numerical values. Training runs through Ray and can use no GPU, one GPU, or many GPUs/CPUs. The arguments that control these resources are `num_gpus`, `num_cpus`, `num_workers`, `num_cpus_per_worker`, and `num_gpus_per_worker`. If you want to experiment with different hyperparameters, you can supply a list of values to certain arguments, such as `learning_rate` or `batch_size`, and Ray will perform a hyperparameter search or sweep accordingly.
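
To give a feel for what a sweep over several values means, here is a minimal sketch that expands lists of candidate values into individual configurations. It assumes a plain grid over the listed values; the search space and the `expand_grid` helper are hypothetical and only for illustration, and the strategy Ray actually applies inside `train_tissue_classifier` may differ.

```python
from itertools import product

# Hypothetical search space: each argument is given as a list of candidate values.
search_space = {
    "learning_rate": [1e-4, 1e-3],
    "batch_size": [4, 8],
}

def expand_grid(space):
    """Return one config dict per combination of the listed values."""
    names = list(space)
    value_lists = [space[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]

configs = expand_grid(search_space)
for cfg in configs:
    print(cfg)
```

The number of configurations is the product of the list lengths, so adding values to several arguments at once grows the sweep quickly.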

In the following example, we're going to train a ResNet18 model (though any model available from [PyTorch](https://pytorch.org/vision/stable/models.html) can be used) for two epochs. 

In [None]:
%%bash

train_tissue_classifier ~/vmount/PRO_12-123/datasets/PRO_TILES_LABLED/ \
--label_set "{'tumor':0, 'stroma':1, 'fat':2}" \
--label_col regional_label --stratify_col slide_id \
--num_epochs 2 --network 'torchvision.models.resnet18' \
--num_splits 2 \
--batch_size 4 \
-lr 1e-4  \
-cw 4 -gw 0 -nw 1 -ng 0 -nc 4 -ns 1 \
--output_dir ../PRO_12-123/tissue_classifier_results


### Results

Now that we have a trained model, we can inspect the output.

In [None]:
!ls -lat ../PRO_12-123/tissue_classifier_results

Each time the model is trained, Ray creates a set of output directories to manage your runs. You can inspect the results by loading a particular output directory with Ray's `ExperimentAnalysis` and viewing its results dataframe. This dataframe stores performance metrics, the hyperparameters used to configure the model, and other output metadata.

In [None]:
from ray.tune import ExperimentAnalysis
RAY_OUTPUT = "tune_function_2022-05-17_00-11-34" # change this to the output folder you want to inspect
output_dir = "../PRO_12-123/tissue_classifier_results"

ray_output_dir = os.path.join(output_dir, RAY_OUTPUT)
analysis = ExperimentAnalysis(ray_output_dir)
display(analysis.results_df)
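
If you would rather not hard-code the run folder name as in the cell above, a small helper can pick the most recently modified `tune_function_*` folder. This is a sketch under the assumption that every run folder in the output directory matches that pattern, as the example name does; `latest_run_dir` is a hypothetical helper, not part of LUNA or Ray.

```python
import glob
import os

def latest_run_dir(output_dir, pattern="tune_function_*"):
    """Return the most recently modified run folder in output_dir, or None."""
    candidates = [
        path
        for path in glob.glob(os.path.join(output_dir, pattern))
        if os.path.isdir(path)
    ]
    # max() on an empty list raises, so handle the "no runs yet" case explicitly.
    return max(candidates, key=os.path.getmtime) if candidates else None

# Returns None if the directory does not exist or contains no matching runs.
print(latest_run_dir("../PRO_12-123/tissue_classifier_results"))
```

The returned path can be passed to `ExperimentAnalysis` in place of `ray_output_dir`.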


We can use the output to put together a confusion matrix.
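
To make the normalization in the next cell concrete, here is a dependency-free sketch on a made-up 3x3 matrix. It assumes that rows hold the true labels and columns the predicted labels; this orientation is not stated in this notebook, so check it against your own output. The counts are invented for illustration.

```python
# Toy confusion matrix with invented counts.
# Assumption: rows are true labels, columns are predicted labels.
labels = ["tumor", "stroma", "fat"]
cm = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]

def row_normalize(matrix):
    """Divide each row by its sum so entries become fractions of that row's total."""
    normalized = []
    for row in matrix:
        total = sum(row)
        # A class with no validation tiles would divide by zero; use zeros instead.
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized

norm = row_normalize(cm)

# Under the row = true label assumption, the diagonal of the normalized
# matrix is the per-class recall.
recall = {label: norm[i][i] for i, label in enumerate(labels)}

# Overall accuracy is computed from the raw counts, not the normalized matrix.
accuracy = sum(cm[i][i] for i in range(len(cm))) / sum(sum(row) for row in cm)

print(recall)
print(accuracy)
```

With a small toy data set, a class may have no tiles in the validation split; the NumPy division in the next cell would then produce NaN for that row, which is worth keeping in mind when reading the heatmap.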

In [None]:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

label_dict = {'tumor':0, 'stroma':1, 'fat':2}
labels = list(label_dict.keys())
cm = analysis.results_df['val_ConfusionMatrix'].iloc[0]

# normalize each row so that it sums to 1
cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
df_cm = pd.DataFrame(cm, index=labels, columns=labels)
sns.heatmap(df_cm, annot=True)
plt.show()

The Ray output directory also contains a TensorBoard event file (`events.out.tfevents.*`) in the `tune_function_*` subdirectory that can be used to further evaluate the performance of the model.


In [None]:
!tree $output_dir
# %load_ext tensorboard
# !tensorboard --logdir {os.path.join(ray_output_dir, 'tune_function_ff99e_00000_0_batch_size=4,learning_rate=0.0001,num_epochs=2_2022-05-10_13-49-23')} --bind_all

This output directory also contains our model checkpoints (`checkpoint_*.pt`), which we'll need for inference. Now, with our trained model and model checkpoints, we can move on to the next notebook!
