# Luna Training Dashboard

__This notebook spells out every step of setting up and running a training session. In practice, however, these steps would not usually all be gathered in a single notebook.__

1) Building the network

2) Setting up the training environment and parameters

3) Train the model

4) Evaluate

In the following, we demonstrate these steps based on the approach proposed by [Christodoulidis et al. 2018](https://arxiv.org/abs/1809.06226).
______

Load python libraries

In [1]:
import os
import sys
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import keras
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
import keras.losses as klosses
import keras.optimizers as optimizers
import keras.callbacks as kcallbacks
import keras.initializers as initializers

Using TensorFlow backend.


Load and setup local dependencies

In [2]:
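cur_dir = os.getcwd()
base_dir = os.path.dirname(cur_dir)

# Make the parent directory importable so that `utils` and `src` resolve
# (`sys` is already imported in the first cell)
sys.path.append(base_dir)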
from utils.LungsLoader import LungsLoader
from utils.ScanHandler import ScanHandler

from src.networks.networks_utils import blocks
from src.networks.MariaNet import MariaNet
from src.training.config_file import ConfigFile
from src.training.luna_training import LunaTrainer
import src.metrics.metrics as metrics

import warnings
warnings.filterwarnings("ignore")

loader = LungsLoader()
handler = ScanHandler(plt)

### Network hyperparameters

The first hyperparameter is the input size, which is set here to 256x256x256.

In [3]:
input_shape = (256, 256, 256)

To build the network defined by [Christodoulidis et al. 2018](https://arxiv.org/abs/1809.06226) we need to define:

- A sequence of encoding convolutional blocks which are to be concatenated. The original architecture consists of 5 dilated (conv + instanceNorm + LeakyRelu) blocks

    - Conv parameters :
        - kernel size : 3x3x3
        - Number filters : 32 - 64 - 128 - 32 - 32
        - Dilation rate : 1 - 1 - 2 - 3 - 5
    - LeakyRelu : `alpha` = 0.3
    
- A Squeeze Excitation block for the deformable decoder
- A sequence of decoding convolutional blocks for deformable registration. The original architecture consists of 5 non-dilated (conv + instanceNorm + LeakyRelu) blocks 
    - Conv parameters:
        - kernel size : 3x3x3
        - Number filters : 128 - 64 - 32 - 32 - 32
    - LeakyRelu : `alpha` = 0.3
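
To get a feel for what these dilation rates buy, the sketch below computes the per-axis receptive field of the encoder and the channel count of the concatenated feature map. It assumes stride-1 convolutions applied one after another, with every block's output concatenated; the helper and constant names are illustrative and not part of the repository.

```python
# Illustrative helpers (not from the repository): receptive field and
# concatenated width of a stack of stride-1 dilated convolutions.
KERNEL_SIZE = 3
ENC_FILTERS = [32, 64, 128, 32, 32]
ENC_DILATIONS = [1, 1, 2, 3, 5]


def receptive_field(kernel_size, dilations):
    """Per-axis receptive field after stacking stride-1 dilated convolutions."""
    field = 1
    for rate in dilations:
        # Each layer widens the field by (kernel_size - 1) * dilation rate.
        field += (kernel_size - 1) * rate
    return field


rf = receptive_field(KERNEL_SIZE, ENC_DILATIONS)
concat_channels = sum(ENC_FILTERS)
print(rf, concat_channels)
```

Under these assumptions the deepest encoder block sees 25 voxels along each axis, and concatenating the five outputs yields 288 channels.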

Definition of the core blocks

In [4]:
conv_block_kwgs = {
    "activation": "LeakyReLU",
    "activation_kwargs": {"alpha": 0.3},
    "normalize": True,
    "conv_kwargs": {
        "kernel_size": 3,
        "padding": "same",
        "kernel_initializer": initializers.RandomNormal(mean=0.0, stddev=1e-5),
    },
}
squeeze_ratio = 16
conv_block = blocks.ConvBlock(**conv_block_kwgs)
squeeze_block = blocks.SqueezeExciteBlock(ratio=squeeze_ratio)

Encoding and decoding sequences of hyperparameters

In [5]:
enc_filters = [32, 64, 128, 32, 32]
enc_dilation = [(1, 1, 1), (1, 1, 1), (2, 2, 2), (3, 3, 3), (5, 5, 5)]
enc_params = [{"filters": n_filter, "dilation_rate": dil_rate} for (n_filter, dil_rate) in zip(enc_filters, enc_dilation)]

dec_filters = [128, 64, 32, 32, 32]
dec_params = [{"filters": n_filter} for n_filter in dec_filters]
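
Because `zip` stops silently at the shorter list, a mismatch between the filter counts and the dilation rates would drop blocks without warning. A small validating helper, sketched here under the hypothetical name `make_enc_params`, makes that failure explicit.

```python
def make_enc_params(filters, dilation_rates):
    """Pair each filter count with its dilation rate, refusing unequal lengths."""
    if len(filters) != len(dilation_rates):
        raise ValueError(
            "got %d filter counts but %d dilation rates"
            % (len(filters), len(dilation_rates))
        )
    return [
        {"filters": n_filter, "dilation_rate": rate}
        for n_filter, rate in zip(filters, dilation_rates)
    ]


enc_params_checked = make_enc_params(
    [32, 64, 128, 32, 32],
    [(1, 1, 1), (1, 1, 1), (2, 2, 2), (3, 3, 3), (5, 5, 5)],
)
print(enc_params_checked[2])
```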

Finally, the original paper adds a final regularized convolution block at the end of both the linear and the deformable decoders. The deformable one is a (conv + sigmoid) block with 3 filters, and the linear one is a (conv + linear) block with 12 filters.

In [6]:
def_flow_nf = 3
lin_flow_nf = 12

The model can now be generated

In [7]:
marianet = MariaNet(input_shape,
                    enc_params,
                    dec_params,
                    conv_block,
                    squeeze_block,
                    def_flow_nf,
                    lin_flow_nf)
model = marianet.build()
# model.summary()

ValueError: Input 0 is incompatible with layer linear_flow: expected ndim=5, found ndim=2

In [None]:
model.get_layer("linear_flow").output

In [None]:
model.get_layer("diffeomorphic_transformer3d_1").input

# Setup training environment

The training environment is laid out as follows:

```
bin
|__session_name
   |__builder.pickle
   |__config.pickle
   |__training_history.json
   |__checkpoints
   |  |__chkpt_{epoch}.h5
   |__tensorboard
      |__tensorboard log files
```

where:

- `builder.pickle` is a serialized file containing all the information needed about the neural network used in the session. It can be used to instantiate an object of class `MariaNet`
- `config.pickle` is a serialized file containing all the information needed about the training performed in the session, except the model architecture. It can be used to instantiate an object of class `ConfigFile`
- `training_history.json` is a record of losses and metrics evolution for training and validation set, dumped when training is completed
- `checkpoints` is a directory containing model weights checkpoints for different epochs
- `tensorboard` is a directory containing tensorboard log files

The session directory is created as follows:

In [None]:
session_name = "sandbox_session"
config = ConfigFile(session_name)
config.setup_session()
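
As a rough picture of what `setup_session` has to produce, the directory layout shown earlier can be recreated with the standard library. This is a sketch, not the actual implementation of `ConfigFile`; `make_session_dirs` is a hypothetical helper, and a temporary directory stands in for `bin`.

```python
import os
import tempfile


def make_session_dirs(root, session_name):
    """Create <root>/<session_name>/{checkpoints,tensorboard}; return the session path."""
    session_dir = os.path.join(root, session_name)
    for sub in ("checkpoints", "tensorboard"):
        # exist_ok lets the call be repeated without failing.
        os.makedirs(os.path.join(session_dir, sub), exist_ok=True)
    return session_dir


with tempfile.TemporaryDirectory() as root:
    demo_session_dir = make_session_dirs(root, "sandbox_session")
    created = sorted(os.listdir(demo_session_dir))
    print(created)
```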

We can already save the neural network builder there.

In [None]:
builder_path = os.path.join(config.session_dir, "builder.pickle")
marianet.serialize(builder_path)

The main idea here is that all the training configurations except for the model and dataset should be attributes of a `ConfigFile` which allows us to store all those configurations under a single `pickle` serialized file. In the following we will go through the several training configurations and iteratively set them as attributes of `config`.
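
The pattern can be sketched with a plain Python class: setters store values as attributes, and `serialize` writes them to a single pickle file. `MiniConfig` below is hypothetical and far smaller than `ConfigFile`. It pickles the attribute dictionary rather than the object, and it stores only plain values, since whether a given Keras optimizer or callback pickles cleanly depends on the library version.

```python
import os
import pickle
import tempfile


class MiniConfig:
    """Hypothetical stand-in for ConfigFile: attributes plus a pickle round trip."""

    def __init__(self, session_dir):
        self.session_dir = session_dir
        self.epochs = None
        self.steps_per_epoch = None

    def set_epochs(self, epochs):
        self.epochs = epochs

    def set_steps_per_epoch(self, steps_per_epoch):
        self.steps_per_epoch = steps_per_epoch

    def serialize(self, filename="config.pickle"):
        """Pickle the attribute dictionary and return the file path."""
        path = os.path.join(self.session_dir, filename)
        with open(path, "wb") as f:
            pickle.dump(vars(self), f)
        return path

    @classmethod
    def load(cls, path):
        """Rebuild a MiniConfig from a file written by serialize()."""
        with open(path, "rb") as f:
            state = pickle.load(f)
        cfg = cls(state["session_dir"])
        vars(cfg).update(state)
        return cfg


with tempfile.TemporaryDirectory() as tmp_session_dir:
    cfg = MiniConfig(tmp_session_dir)
    cfg.set_epochs(100)
    cfg.set_steps_per_epoch(50)
    restored = MiniConfig.load(cfg.serialize())

print(restored.epochs, restored.steps_per_epoch)
```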

### Training Hyperparameters

__Input shape :__

Strictly speaking this is not a training configuration but, as was done previously for the model, we need to set the shape to which the inputs will be resized.

In [None]:
config.set_input_shape(input_shape)

__Losses :__

We propose to use a loss defined by:

$$
\mathcal{L}(\theta) = \|R-\mathcal{W}(S, G)\|^2_2 - \alpha\, CC(T, R) + \beta \|A-A_I\|_1 + \gamma\|\Phi - \Phi_I\|_1
$$

where $R$ is the predicted moving image, $S$ the source image, $T$ the target image, $G$ the deformation field, $\mathcal{W}$ the neural net, $CC$ the cross-correlation, $A$ the affine transformation, $A_I$ the identity affine transformation, $\Phi$ the spatial gradient and $\Phi_I$ the spatial gradient of the identity transformation.


The network's output has the form `[deformed, deformable_grad_flow, linear_flow]`, so we define the losses and their weights in that order. $\beta$ and $\gamma$ are set to $10^{-6}$. In practice, $A_I$ and $\Phi_I$ are represented by three-dimensional arrays with 3 channels (one per deformation axis); we therefore encode them as null arrays.

In [None]:
def registration_loss(vol1, vol2):
    return klosses.mean_squared_error(vol1, vol2) - 1. * metrics.cross_correlation(vol1, vol2)


losses = [registration_loss, klosses.mean_absolute_error, klosses.mean_absolute_error]
# Image term first, then the two regularization weights (10^-6 each, as stated in the text)
loss_weights = [1., 1e-6, 1e-6]

config.set_losses(losses)
config.set_loss_weights(loss_weights)
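
To make the terms concrete, here is a pure-Python sketch of the image term on flattened volumes. It assumes that `metrics.cross_correlation` behaves like a global normalized cross-correlation (Pearson correlation), which the notebook does not confirm; `mse`, `ncc` and `mae_to_zero` are illustrative names. It also shows why null arrays work as regularization targets: the mean absolute error against zeros is simply the mean magnitude of the predicted flow.

```python
import math


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def ncc(a, b):
    """Global normalized cross-correlation (assumed definition)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                     * sum((y - mean_b) ** 2 for y in b))
    return cov / norm


def mae_to_zero(flow):
    """Mean absolute error of a flow against an all-zero target."""
    return sum(abs(v) for v in flow) / len(flow)


target = [0.0, 1.0, 2.0, 3.0]
deformed = [0.0, 1.0, 2.0, 3.0]
image_term = mse(target, deformed) - 1.0 * ncc(target, deformed)
flow_penalty = mae_to_zero([0.5, -0.5, 0.0, 1.0])
print(image_term, flow_penalty)
```

For a perfect match the squared error vanishes and the correlation reaches 1, so under this definition the image term bottoms out at -1 rather than 0.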

__Optimizer :__

We choose an Adam optimizer with an initial learning rate of $10^{-3}$ and a decay of $10^{-6}$.

In [None]:
# Learning rate and decay as stated in the text
optimizer = optimizers.Adam(lr=1e-3, decay=1e-6)
config.set_optimizer(optimizer)
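
In the Keras 2 optimizers used here, `decay` is a time-based decay applied at every update: the effective rate is `lr / (1 + decay * iterations)`. The sketch below evaluates this for the values quoted in the text over the 100 x 50 updates configured for this session; `effective_lr` is an illustrative helper, not a Keras function.

```python
def effective_lr(initial_lr, decay, iterations):
    """Time-based decay: initial_lr / (1 + decay * iterations)."""
    return initial_lr / (1.0 + decay * iterations)


total_updates = 100 * 50  # epochs * steps_per_epoch
final_lr = effective_lr(1e-3, 1e-6, total_updates)
print(final_lr)
```

With a decay this small the learning rate shrinks by only about 0.5% over the whole run, so the schedule is close to constant.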

__Callbacks :__

In [None]:
checkpoints_dir = os.path.join(config.session_dir, ConfigFile.checkpoints_dirname)
tensorboard_dir = os.path.join(config.session_dir, ConfigFile.tensorboard_dirname)


save_callback = kcallbacks.ModelCheckpoint(os.path.join(checkpoints_dir, ConfigFile.checkpoints_format), 
                                           verbose=1, 
                                           save_best_only=True)
early_stopping = kcallbacks.EarlyStopping(monitor='val_loss',
                                          min_delta=1e-3,
                                          patience=20,
                                          mode='auto')
tbCallBack = kcallbacks.TensorBoard(log_dir=tensorboard_dir, 
                                    histogram_freq=0, 
                                    write_graph=True, 
                                    write_images=True)

callbacks = [save_callback, early_stopping, tbCallBack]
config.set_callbacks(callbacks)
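
The early-stopping rule configured above can be paraphrased in a few lines: an epoch counts as an improvement only if `val_loss` drops more than `min_delta` below the best value so far, and training stops after `patience` consecutive epochs without one. The function below is a simplified re-implementation for illustration, not Keras code, and `stopping_epoch` is a hypothetical name.

```python
def stopping_epoch(val_losses, min_delta=1e-3, patience=20):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Improvement large enough: record it and reset the counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Three real improvements, then a plateau that never beats 0.4 by min_delta.
plateau_losses = [1.0, 0.5, 0.4] + [0.3995] * 30
print(stopping_epoch(plateau_losses))
```

Note that the tiny gain from 0.4 to 0.3995 does not reset the counter, because it is smaller than `min_delta`.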

__Metrics :__ (TO BE ADDED)

In [None]:
# No metrics involved so far

__Training Scope :__

The number of training epochs and the number of steps per epoch also have to be defined.

In [None]:
epochs = 100
steps_per_epoch = 50

config.set_epochs(epochs)
config.set_steps_per_epoch(steps_per_epoch)

__Dumping configuration :__

We can now dump the configuration to its serialized file.

In [None]:
config.serialize()

### Train Model

The model and the training configuration have been defined; we can now combine them to actually perform the training, by instantiating a `LunaTrainer` with the previously defined objects. But first we need to set up the device on which the training will run.

In [None]:
gpu_id = 0
# CUDA_VISIBLE_DEVICES hides every other GPU, so the selected one is always exposed as device 0
gpu = '/gpu:0'
os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
tf_config = tf.ConfigProto()
tf_config.gpu_options.allow_growth = True
tf_config.allow_soft_placement = True
set_session(tf.Session(config=tf_config))

In [None]:
trainer = LunaTrainer(model=model,
                      device=gpu,
                      config_path=os.path.join(config.session_dir, ConfigFile.pickle_filename))

We need to define lists of training and validation scans by splitting the Luna dataset with an 80/20 ratio.

In [None]:
all_ids = loader.get_scan_ids()
train_ids, val_ids = train_test_split(all_ids, test_size=0.2)
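
The split above is reshuffled on every run because no seed is given; `train_test_split` accepts a `random_state` argument when reproducibility matters. The same idea in dependency-free form, with a hypothetical `split_ids` helper and made-up scan IDs, looks like this:

```python
import math
import random


def split_ids(ids, test_size=0.2, seed=0):
    """Shuffle a copy of ids with a fixed seed and hold out a test_size fraction."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_val = math.ceil(test_size * len(shuffled))
    return shuffled[n_val:], shuffled[:n_val]


scan_ids = ["scan_%02d" % i for i in range(10)]  # placeholder IDs
train_ids_demo, val_ids_demo = split_ids(scan_ids)
print(len(train_ids_demo), len(val_ids_demo))
```

Fixing the seed keeps the validation set identical across sessions, which makes the validation losses of different runs comparable.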

Run training.

__Warning__: this is only a demonstration; it is best not to run the training cell.

In [None]:
trainer.fit(train_ids, val_ids)

In [None]:
# model.summary()