# Overview

Congratulations! You have reached the end of the CS 190 curriculum. As a final summative exercise, you will develop a model to predict brain tumor patient survival using any of the approaches and tools you have learned this quarter. This brief tutorial introduces the dataset and provides some strategies to help guide exploration. Once you are familiar with the task, you are welcome to move on to the assignment, which contains more details regarding algorithm design requirements and submission.

This tutorial is part of the class **Introduction to Deep Learning for Medical Imaging** at University of California Irvine (CS190); more information can be found at: https://github.com/peterchang77/dl_tutor/tree/master/cs190.

# Google Colab

The following lines of code will configure your Google Colab environment for this tutorial.

### Enable GPU runtime

Use the following instructions to switch the default Colab instance into a GPU-enabled runtime:

```
Runtime > Change runtime type > Hardware accelerator > GPU
```

# Environment

### Jarvis library

In this notebook we will use Jarvis, a custom Python package to facilitate data science and deep learning for healthcare. Among other things, this library will be used for low-level data management, stratification and visualization of high-dimensional medical data.

In [None]:
# --- Install jarvis (only in Google Colab or local runtime)
%pip install jarvis-md

### Imports

Use the following lines to import any additional needed libraries:

In [None]:
import numpy as np
from jarvis.train import datasets
from jarvis.utils.display import imshow

# Data

The data used in this tutorial will consist of brain tumor MRI exams derived from the MICCAI Brain Tumor Segmentation Challenge (BRaTS). More information about the BRaTS Challenge can be found here: http://braintumorsegmentation.org/. Each single 3D volume will consist of one of four different sequences (T2, FLAIR, T1 pre-contrast and T1 post-contrast). The custom `datasets.download(...)` method can be used to download a local copy of the dataset. By default the dataset will be archived at `/data/raw/mr_brats_2020`; as needed, an alternate location may be specified using `datasets.download(name=..., path=...)`.

In [None]:
# --- Download dataset
datasets.download(name='mr/brats-2020-096')

Once downloaded, the `datasets.prepare(...)` method can be used to generate the required python Generators to iterate through the dataset, as well as a `client` object for any needed advanced functionality.

### Datasets

To specify the correct Generator template file, pass a designated `keyword` string. In this exercise, we will be using brain MRI volumes that have been cropped to the boundaries of the tumor and resampled to a uniform 3D volume of shape (96, 96, 96, 4). Using this input, two separate target labels have been prepared:

* survival scores (use `096*glb-org` keyword)
* tumor segmentation labels (use `096*vox-org` keyword)

To select the correct template and generators for this task, use the keyword string as above.

#### Survival scores

To create generators yielding target survival score labels, use the `096*glb-org` keyword: 

In [None]:
# --- Prepare survival score generators
gen_train, gen_valid, client = datasets.prepare(name='mr/brats-2020-096', keyword='096*glb-org')

Let us examine the generator data:

In [None]:
# --- Yield one example
xs, ys = next(gen_train)

# --- Print dict keys
print('xs keys: {}'.format(xs.keys()))
print('ys keys: {}'.format(ys.keys()))

# --- Print data shape
print('xs shape: {}'.format(xs['dat'].shape))
print('ys shape: {}'.format(ys['survival'].shape))

**Survival scores**: Note that total days of patient survival have been converted to floating point scores between `[0, 1]` using the following formula:

```
score = log ( days ) / 10
```

In [None]:
# --- Print score values
print(ys['survival'])
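To interpret a predicted score clinically, the normalization can be inverted back to days. The following is a minimal sketch that assumes `log` in the formula above denotes the natural logarithm (the notebook does not state the base); the helper names `days_to_score` and `score_to_days` are hypothetical and not part of Jarvis.

```python
import math

def days_to_score(days):
    """Normalize survival in days using score = log(days) / 10 (natural log assumed)."""
    return math.log(days) / 10

def score_to_days(score):
    """Invert the normalization: days = exp(10 * score)."""
    return math.exp(10 * score)

# --- Round trip for a patient surviving one year
score = days_to_score(365)
print('score: {:.3f}'.format(score))
print('days : {:.1f}'.format(score_to_days(score)))
```

If the dataset was instead prepared with a base-10 logarithm, replace `math.log` with `math.log10` and `math.exp(10 * score)` with `10 ** (10 * score)`.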

#### Tumor segmentation

To create generators yielding target tumor segmentation labels, use the `096*vox-org` keyword. In addition, use the following `configs` dictionary to ensure that the 4-class tumor segmentation labels are binarized. As an alternative to the `configs` dictionary you can use the same nested Python generator strategy that was used in the previous week 5 and 6 tutorials.

In [None]:
# --- Prepare tumor segmentation generators
configs = {'specs': {'ys': {
    'tumor': {
        'dtype': 'uint8',
        'loads': 'lbl-crp',
        'norms': {'clip': {'min': 0, 'max': 1}},
        'shape': [96, 96, 96, 1]}}}}

gen_train, gen_valid, client = datasets.prepare(name='mr/brats-2020-096', keyword='096*vox-org', configs=configs)

Let us examine the generator data:

In [None]:
# --- Yield one example
xs, ys = next(gen_train)

# --- Print dict keys
print('xs keys: {}'.format(xs.keys()))
print('ys keys: {}'.format(ys.keys()))

# --- Print data shape
print('xs shape: {}'.format(xs['dat'].shape))
print('ys shape: {}'.format(ys['tumor'].shape))

### Combined

To create a custom generator yielding *both* tumor segmentation and survival scores simultaneously, consider the following code:

In [None]:
# --- Create custom configs dict
configs = {'specs': {'ys': {
    'tumor': {
        'dtype': 'uint8',
        'loads': 'lbl-crp',
        'norms': {'clip': {'min': 0, 'max': 1}},
        'shape': [96, 96, 96, 1]},
    'survival': {
        'dtype': 'float32',
        'loads': 'survival_days_norm',
        'shape': [1]}}}}

# --- Create generators
gen_train, gen_valid, client = datasets.prepare(
    name='mr/brats-2020-096', 
    keyword='096*vox-org',
    configs=configs)

Note that in this code example, the `ys` dictionary is replaced by the custom `configs` specification above, so it does not matter which `keyword` you choose.

# Model

The goal of this project is to perform **global survival prediction** for each patient (i.e., each 3D volume of data). In other words, regardless of algorithm choice, the final objective is to predict a global survival score. This, however, does **not** mean that you are required to use global regression networks only; in fact, a hybrid algorithm may well perform better overall on this task.

The task is designed to be open-ended on purpose. The only requirements are to:

* test at minimum three different network architectures
* one algorithm must use (at least) a global regression type loss function
* one algorithm must use (at least) a pretrained autoencoder strategy

While you can choose to be creative and employ any architecture that you like, the following discussion may help guide your development process.

### Global regression

As a baseline, create a global regression network to predict survival from raw data. A global regression network is identical to a standard classifier, except that instead of a categorical prediction (trained using a cross-entropy loss), a continuous variable prediction is generated (trained using a regression loss).

To build a global regression network, consider a standard CNN backbone:

```python
# --- Define series of blocks
l1 = conv1(16, inputs['dat'])
l2 = conv1(24, conv2(24, l1))
l3 = conv1(32, conv2(32, l2))
l4 = conv1(48, conv2(48, l3))

# --- Continue blocks as needed...
```

After a series of convolutional blocks, the final feature map needs to be transitioned to feature vectors using one of the following approaches:

* serial convolutions (with stride > 1 or VALID type padding)
* global pool operations (mean or max)
* reshape / flatten operation

For example, the flattening operation can be implemented as follows:

```python
f0 = layers.Flatten()(...)
```

See notes from week 3 for further information.
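The effect of each transition on tensor shape can be sketched with plain NumPy. The feature map size below (a 6 x 6 x 6 grid with 48 channels) is an assumption chosen for illustration only; in a Keras model the equivalent operations would be `layers.GlobalAveragePooling3D()`, `layers.GlobalMaxPooling3D()` and `layers.Flatten()`.

```python
import numpy as np

# --- Hypothetical final feature map: (batch, depth, height, width, channels)
fmap = np.random.rand(2, 6, 6, 6, 48).astype('float32')

# --- Global mean pool: average over the three spatial axes
gap = fmap.mean(axis=(1, 2, 3))

# --- Global max pool: maximum over the three spatial axes
gmp = fmap.max(axis=(1, 2, 3))

# --- Flatten: keep the batch axis, unroll everything else
flat = fmap.reshape(fmap.shape[0], -1)

print('mean pool: {}'.format(gap.shape))
print('max pool : {}'.format(gmp.shape))
print('flatten  : {}'.format(flat.shape))
```

Note that pooling yields one value per channel, while flattening yields one value per voxel per channel, so the first dense layer after a flatten operation has far more parameters.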

After a number of (optional) hidden layer operations, the final prediction is generated as a single-element output normalized using a *sigmoid* function (recall that all survival scores lie in the range `[0, 1]`):

```python
logits['survival'] = layers.Dense(1, activation='sigmoid', name='survival')(...)
```

After creating the `model` object, ensure that the model is trained using a regression loss such as *mean absolute error* or *mean squared error*:

```python
model.compile(..., loss={'survival': losses.MeanSquaredError()}, ...)
```
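To build intuition for the choice between the two losses, the sketch below computes both by hand on a few made-up survival scores (the values are illustrative, not taken from the dataset). In Keras the corresponding loss objects are `losses.MeanAbsoluteError()` and `losses.MeanSquaredError()`.

```python
import numpy as np

# --- Made-up targets and predictions (one prediction is far off)
y_true = np.array([0.59, 0.62, 0.45, 0.70], dtype='float32')
y_pred = np.array([0.55, 0.60, 0.65, 0.68], dtype='float32')

# --- Mean absolute error: average magnitude of the residuals
mae = np.mean(np.abs(y_true - y_pred))

# --- Mean squared error: average of the squared residuals
mse = np.mean((y_true - y_pred) ** 2)

print('MAE: {:.4f}'.format(mae))
print('MSE: {:.4f}'.format(mse))
```

Because residuals are squared, the single large error dominates the MSE much more than the MAE, which is one reason the two losses can lead to different behavior on a small cohort with outliers.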

### Autoencoder

The autoencoder architecture is designed to create a compressed latent feature representation of the original input data. By compressing the raw data using an autoencoder, only the most relevant, critical details of the original input are retained in the low-dimensional feature representation. See notes from week 9 for an overview of creating a pretrained autoencoder network.

For your final (third) model, if you choose to modify and/or extend the autoencoder approach, consider any of the following strategies.

#### Modifications to the autoencoder network

The baseline autoencoder network may be modified using:

* increased feature map depth (channel size)
* increased total number of layers
* added complexity through motifs (e.g., ResNet, Inception, DenseNet, etc)
* improved optimization (e.g., more training iterations, learning rate decay, etc)
* alternate loss functions (e.g., MAE)

Recall that given the autoencoder design (i.e., the "compressed" latent representation), the autoencoder network itself is relatively unlikely to overfit, so model complexity may be increased quite a bit by using many of these modifications simultaneously.

#### Modifications to the autoencoder task

The standard autoencoder is trained to complete a reconstruction task; however, any task may be used to guide feature learning. In this dataset, the tumor segmentation task may provide an alternate (possibly more meaningful) feature representation (i.e., the assumption is that features used to perform tumor segmentation are *also* useful for predicting patient survival).

To do so, first create a standard U-Net derived model:

```python
# --- Define contracting layers
l1 = conv1(8, inputs['dat'])
l2 = conv1(16, conv2(16, l1))
l3 = conv1(32, conv2(32, l2))
l4 = conv1(48, conv2(48, l3))
l5 = conv1(64, conv2(64, l4))

# --- Define expanding layers
l6  = tran2(48, l5)
l7  = tran2(32, conv1(48, concat(l4, l6)))
l8  = tran2(16, conv1(32, concat(l3, l7)))
l9  = tran2(8,  conv1(16, concat(l2, l8)))
l10 = conv1(8,  l9)
```

From these defined layers, we create both the standard U-Net and a separate encoder just as before:

```python
# --- Create U-net
logits = {'tumor': layers.Conv3D(filters=2, name='tumor', **kwargs)(l10)}
unet = Model(inputs=inputs, outputs=logits)

# --- Create encoder
encoder = Model(inputs=inputs, outputs=l5)
```

This `unet` model may be compiled and trained using standard strategies. See week 5 and 6 notes for further information. Be sure to use the correct generator to train this model with segmentation mask outputs (e.g., use keyword `096*vox-org`). Afterwards, use the `encoder` to build a fine-tuned prediction network.

#### Modifications to the fine-tuned prediction network

Regardless of the pretrained model of choice (e.g., standard autoencoder or U-Net), the final goal is to create a fine-tuned model for survival score prediction. See week 9 notes for further information.

The baseline fine-tuned prediction network may be modified using:

* modified method for transition from feature maps to feature vectors (e.g., pooling)
* increased *or* decreased size or number of hidden layer nodes
* use of regularization (e.g., dropout, L2 regularization)
* alternate loss function

Note that while the pretrained model (e.g., autoencoder or tumor segmentation) is unlikely to overfit significantly, this fine-tuned prediction network will very easily overfit with the relatively small number of patients in this cohort. Thus, the strategies chosen above should be carefully tuned to optimize model performance.

### Dual Loss

Instead of pretraining a model and fine-tuning as separate (asynchronous) training sessions, it is possible and potentially beneficial to optimize both loss functions simultaneously. 

Consider the following autoencoder and survival prediction model with two separate outputs:

```python
# --- Define contracting layers
l1 = conv1(8, inputs['dat'])
l2 = conv1(16, conv2(16, l1))
l3 = conv1(32, conv2(32, l2))
l4 = conv1(48, conv2(48, l3))
l5 = conv1(64, conv2(64, l4))

# --- Define expanding layers
l6  = tran2(48, l5)
l7  = tran2(32, conv1(48, l6))
l8  = tran2(16, conv1(32, l7))
l9  = tran2(8,  conv1(16, l8))
l10 = conv1(8,  l9)

# --- Define survival prediction
h0 = layers.Flatten()(l5)
h1 = layers.Dense(32, activation='relu')(h0)

# --- Define all logits
logits = {}
logits['survival'] = layers.Dense(1, activation='sigmoid', name='survival')(h1)
logits['recon'] = layers.Conv3D(filters=4, name='recon', **kwargs)(l10)
```

After creating our model, be sure to compile the model with two separate loss functions:

```python
# --- Create model
model = Model(inputs=inputs, outputs=logits)

# --- Compile model
model.compile(
    optimizer=optimizers.Adam(learning_rate=1e-3),
    loss={
        'survival': losses.MeanSquaredError(),
        'recon': losses.MeanSquaredError()},
    experimental_run_tf_function=False)
```

In this example, both the reconstruction and the survival score outputs are trained with a `mean squared error` loss function, but any loss function can be used for either output. Similarly, any number of training `metrics` can be specified per output by passing a Python dictionary keyed by output name.
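When several losses are optimized together, Keras minimizes their sum, optionally weighted through the `loss_weights` argument of `model.compile(...)`. The pure-Python sketch below only illustrates that arithmetic; the helper `total_loss` and the per-batch loss values are hypothetical.

```python
def total_loss(batch_losses, loss_weights=None):
    """Weighted sum of per-output losses; missing weights default to 1.0."""
    loss_weights = loss_weights or {}
    return sum(loss_weights.get(name, 1.0) * value
               for name, value in batch_losses.items())

# --- Hypothetical per-batch loss values for the two outputs
batch_losses = {'survival': 0.02, 'recon': 0.50}

# --- Unweighted: the larger reconstruction loss dominates the total
print(total_loss(batch_losses))

# --- Upweight the survival term to rebalance the two objectives
print(total_loss(batch_losses, loss_weights={'survival': 10.0, 'recon': 1.0}))
```

If one loss is much larger in magnitude than the other, it will dominate the gradient, so the relative weighting is one more hyperparameter worth tuning.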

### Fully-convolutional network

Both the autoencoder and tumor segmentation models (i.e., fully convolutional contracting-expanding networks) are highly *regularized* through the prediction of a dense matrix of outputs (i.e., the output has the same spatial shape as the input). Is there a method to adapt this approach to a regression task?

The simplest approach is to reuse the standard U-Net network and simply replace the categorical loss (e.g., cross-entropy) with a regression loss.

```python
# --- Define contracting layers
l1 = conv1(8, inputs['dat'])
l2 = conv1(16, conv2(16, l1))
l3 = conv1(32, conv2(32, l2))
l4 = conv1(48, conv2(48, l3))
l5 = conv1(64, conv2(64, l4))

# --- Define expanding layers
l6  = tran2(48, l5)
l7  = tran2(32, conv1(48, concat(l4, l6)))
l8  = tran2(16, conv1(32, concat(l3, l7)))
l9  = tran2(8,  conv1(16, concat(l2, l8)))
l10 = conv1(8,  l9)

# --- Create model
logits = {'survival': layers.Conv3D(filters=1, name='survival', **kwargs)(l10)}
model = Model(inputs=inputs, outputs=logits)

# --- Compile model
model.compile(
    optimizer=optimizers.Adam(learning_rate=2e-4),
    loss={'survival': losses.MeanSquaredError()},
    experimental_run_tf_function=False)
```

To train this network, you will need a new custom Python generator that yields a single-element `ys` output: a dense `(96, 96, 96, 1)` tensor filled with identical survival score values. Assuming you are starting with the standard `survival` generator above, consider the following code:

```python
def generator(G):
    """
    Method to convert scalar survival scores into dense (96, 96, 96, 1) targets
    
    """
    for xs, ys in G:

        # --- Reshape survival to 5D tensor
        survival = ys['survival'].reshape(-1, 1, 1, 1, 1)
        
        # --- Replace ys['survival']
        ys['survival'] = np.zeros((survival.shape[0], 96, 96, 96, 1), dtype='float32')
        ys['survival'][:] = survival

        yield xs, ys
```
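As a quick sanity check of the dense target construction, the same expansion can be applied to a standalone batch of scores. This is a minimal sketch using NumPy broadcasting; the helper name `to_dense` and the example scores are hypothetical.

```python
import numpy as np

def to_dense(survival, shape=(96, 96, 96, 1)):
    """Expand (N, 1) survival scores into (N, *shape) tensors of constant value."""
    survival = np.asarray(survival, dtype='float32').reshape(-1, 1, 1, 1, 1)
    # --- Broadcast each scalar across the full volume, then copy to a writable array
    return np.broadcast_to(survival, (survival.shape[0],) + shape).copy()

# --- Two made-up survival scores
scores = np.array([[0.59], [0.42]], dtype='float32')
dense = to_dense(scores)

print(dense.shape)
print(np.unique(dense[0]), np.unique(dense[1]))
```

Each volume should contain exactly one unique value, equal to the original score for that patient.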

Once your algorithm is trained, each single patient 3D volume will result in a 3D matrix of survival score predictions. To generate a single global prediction, consider taking the mean, median or other statistical aggregation method across all voxel values.
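The aggregation step might look like the following sketch. Here `pred` is a random stand-in for the dense model output (a small 8 x 8 x 8 volume is used to keep the example light), and the helper `aggregate` is hypothetical.

```python
import numpy as np

def aggregate(pred, method='mean'):
    """Collapse dense (N, D, H, W, 1) predictions into (N,) global scores."""
    flat = pred.reshape(pred.shape[0], -1)
    if method == 'mean':
        return flat.mean(axis=1)
    if method == 'median':
        return np.median(flat, axis=1)
    raise ValueError('unknown aggregation method: {}'.format(method))

# --- Stand-in for dense survival predictions of two patients
rng = np.random.default_rng(0)
pred = rng.random((2, 8, 8, 8, 1)).astype('float32')

print(aggregate(pred, method='mean'))
print(aggregate(pred, method='median'))
```

The median is less sensitive to a small number of extreme voxel predictions, so comparing both on the validation set is a reasonable first experiment.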

Finally, note that by combining several strategies from this document, one may create a fully-convolutional model with *multiple* loss functions that can be trained simultaneously. For example, a simple modification to the above can be used to create a model that predicts both survival and tumor segmentations:

```python
# --- Create model
logits = {}
logits['survival'] = layers.Conv3D(filters=1, name='survival', activation='sigmoid', **kwargs)(l10)
logits['tumor'] = layers.Conv3D(filters=2, name='tumor', **kwargs)(l10)
model = Model(inputs=inputs, outputs=logits)

# --- Compile model
model.compile(
    optimizer=optimizers.Adam(learning_rate=2e-4),
    loss={
        'survival': losses.MeanSquaredError(),
        'tumor': losses.SparseCategoricalCrossentropy(from_logits=True)},
    experimental_run_tf_function=False)
```

What is the appropriate custom generator code that will yield the correct two target outputs?