# Overview

In this tutorial we will explore how to create a simple convolutional neural network (CNN) for classification of pneumonia (lung infection) from chest radiographs, the most common imaging modality used to screen for pulmonary disease. For any patient with suspected lung infection, including viral pneumonia such as COVID-19, the initial imaging exam of choice is a chest radiograph.

## Workshop Links

Use the following link to access materials from this workshop: https://github.com/peterchang77/dl_tutor/tree/master/workshops

*Tutorials*

* Introduction to Tensorflow 2.0 and Keras: https://bit.ly/2VSYaop
* CNN for pneumonia classification: https://bit.ly/2D9ZBrX (**current tutorial**)
* CNN for pneumonia segmentation: https://bit.ly/2VQMWk9

# Environment

The following lines of code will configure your Google Colab environment for this tutorial.

### Enable GPU runtime

Use the following instructions to switch the default Colab instance into a GPU-enabled runtime:

```
Runtime > Change runtime type > Hardware accelerator > GPU
```

### Jarvis library

In this notebook we will use Jarvis, a custom Python package to facilitate data science and deep learning for healthcare. Among other things, this library will be used for low-level data management, stratification and visualization of high-dimensional medical data.

In [None]:
# --- Install Jarvis library
%pip install jarvis-md

### Imports

Use the following lines to import any needed libraries:

In [None]:
import numpy as np, pandas as pd
from tensorflow import losses, optimizers
from tensorflow.keras import Input, Model, models, layers, metrics
from jarvis.train import datasets
from jarvis.utils.display import imshow

# Data

The data used in this tutorial will consist of (frontal projection) chest radiographs from a subset of the RSNA / Kaggle pneumonia challenge (https://www.kaggle.com/c/rsna-pneumonia-detection-challenge). From the complete cohort, a random subset of 1,000 exams will be used for training and evaluation.

### Download

The custom `datasets.download(...)` method can be used to download a local copy of the dataset. By default the dataset will be archived at `/data/raw/xr_pna`; as needed an alternate location may be specified using `datasets.download(name=..., path=...)`. 

In [None]:
# --- Download dataset
datasets.download(name='xr/pna-512')

### Python generators

Once the dataset is downloaded locally, Python generators to iterate through the dataset can be easily prepared using the `datasets.prepare(...)` method:

In [None]:
# --- Prepare generators
gen_train, gen_valid, client = datasets.prepare(name='xr/pna-512', keyword='cls-512')

The created generators, `gen_train` and `gen_valid`, are designed to yield two variables per iteration: `xs` and `ys`. Each of `xs` and `ys` is a dictionary of NumPy arrays containing the model input(s) and output(s), respectively, for a single *batch* of training. The use of Python generators provides a generic interface for data input for a number of machine learning libraries including TensorFlow 2.0 / Keras.

Note that any standard Python iteration construct (e.g. a `for` loop) can be used to step through the generators, which yield batches indefinitely. For example, the Python built-in `next(...)` function returns the next batch of data:

In [None]:
# --- Yield one example
xs, ys = next(gen_train)
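
To make the generator interface concrete, the following is a minimal sketch of a generator that yields `(xs, ys)` dictionary pairs indefinitely. The `toy_generator` function and its toy data are hypothetical stand-ins, not the Jarvis implementation; plain Python lists are used here in place of NumPy arrays.

```python
import random

def toy_generator(images, labels, batch_size=2, seed=0):
    """Yield (xs, ys) batches forever (hypothetical stand-in for gen_train)."""
    rng = random.Random(seed)
    indices = list(range(len(images)))
    while True:
        # Reshuffle at the start of every pass through the data
        rng.shuffle(indices)
        for start in range(0, len(indices) - batch_size + 1, batch_size):
            batch = indices[start:start + batch_size]
            xs = {'dat': [images[i] for i in batch]}
            ys = {'pna': [labels[i] for i in batch]}
            yield xs, ys

# Toy data: strings stand in for images, integers for labels
images = ['img-a', 'img-b', 'img-c', 'img-d']
labels = [0, 1, 0, 1]

toy_gen = toy_generator(images, labels)
toy_xs, toy_ys = next(toy_gen)
print(sorted(toy_xs), sorted(toy_ys), len(toy_xs['dat']))
```

Because the `while True` loop never exits, `next(...)` can be called any number of times; the number of batches consumed per epoch is controlled by the caller.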

### Data exploration

To help facilitate algorithm design, each original chest radiograph has been resampled to a uniform `(512, 512)` matrix. Overall, the dataset comprises a total of `1,000` 2D images: a total of `500` negative exams and `500` positive exams.

### `xs` dictionary

The `xs` dictionary contains a single batch of model inputs:

1. `dat`: input chest radiograph resampled to `(1, 512, 512, 1)` matrix shape

In [None]:
# --- Print keys 
for key, arr in xs.items():
    print('xs key: {} | shape = {}'.format(key.ljust(8), arr.shape))

### `ys` dictionary

The `ys` dictionary contains a single batch of model outputs:

1. `pna`: binary classification of pneumonia vs. not pneumonia chest radiographs

* 0 = negative
* 1 = positive for pneumonia

In [None]:
# --- Print keys 
for key, arr in ys.items():
    print('ys key: {} | shape = {}'.format(key.ljust(8), arr.shape))

### Visualization

Use the following lines of code to visualize a single input image using the `imshow(...)` method:

In [None]:
# --- Show the first image in the batch
xs, ys = next(gen_train)
imshow(xs['dat'][0])

Use the following lines of code to visualize an N x N mosaic of all images in the current batch using the `imshow(...)` method:

In [None]:
# --- Show "montage" of all images
imshow(xs['dat'], figsize=(12, 12))

### Model inputs

For every input in `xs`, a corresponding `Input(...)` variable can be created and returned in an `inputs` dictionary for ease of model development:

In [None]:
# --- Create model inputs
inputs = client.get_inputs(Input)

In this example, the equivalent Python code to generate `inputs` would be:

```python
inputs = {}
inputs['dat'] = Input(shape=(1, 512, 512, 1))
```

# Convolutional Operations

In this tutorial, a CNN will be created using 3D convolutional operations. A convolutional operation is defined by the following minimum specifications:

* filter / channel depth
* kernel size
* strides
* padding

For a good review of convolutional math, see: http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html

To instantiate a convolutional layer in Keras:

In [None]:
# --- Define regular convolution
l1 = layers.Conv3D(
    filters=16, 
    kernel_size=(1, 3, 3), 
    strides=(1, 1, 1), 
    padding='same')(inputs['dat'])

# --- Define strided convolution
l1 = layers.Conv3D(
    filters=16, 
    kernel_size=(1, 3, 3), 
    strides=(1, 2, 2), 
    padding='same')(inputs['dat'])
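
The effect of `strides` and `padding` on the output size along one axis can be checked with a little arithmetic. The sketch below is illustrative only: the helper `conv_output_size` is a hypothetical name, and it assumes the usual convolution size rules (`'same'` gives `ceil(size / stride)`, `'valid'` gives `floor((size - kernel) / stride) + 1`).

```python
import math

def conv_output_size(size, kernel_size, stride, padding):
    """Output length along one axis of a convolution (hypothetical helper)."""
    if padding == 'same':
        # Input is padded so that only the stride changes the size
        return math.ceil(size / stride)
    if padding == 'valid':
        # No padding: the kernel must fit entirely inside the input
        return (size - kernel_size) // stride + 1
    raise ValueError('padding must be "same" or "valid"')

# Regular convolution: a 512-pixel axis keeps its size
print(conv_output_size(512, 3, 1, 'same'))

# Strided convolution: a 512-pixel axis is halved
print(conv_output_size(512, 3, 2, 'same'))

# First axis of size 1 with a kernel of size 1 is unchanged
print(conv_output_size(1, 1, 1, 'same'))
```

This is why the strided convolution above, with `strides=(1, 2, 2)`, subsamples the two in-plane axes while leaving the first (singleton) axis untouched.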

To reuse identical function arguments, consider maintaining a `kwargs` dictionary and passing it using the `**` operator:

In [None]:
# --- Define kwargs dictionary
kwargs = {
    'kernel_size': (1, 3, 3),
    'padding': 'same'}

# ---- Define stack of convolutions
l1 = layers.Conv3D(filters=16, strides=(1, 1, 1), **kwargs)(inputs['dat'])
l2 = layers.Conv3D(filters=32, strides=(1, 1, 1), **kwargs)(l1)
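
For readers less familiar with `**` unpacking, the following plain-Python sketch shows the behavior without any Keras dependency. The function `describe_conv` is a hypothetical stand-in for a layer constructor.

```python
def describe_conv(filters, kernel_size, strides, padding):
    """Hypothetical stand-in for a layer constructor; returns its arguments."""
    return {'filters': filters, 'kernel_size': kernel_size,
            'strides': strides, 'padding': padding}

shared = {'kernel_size': (1, 3, 3), 'padding': 'same'}

# Unpacking the dictionary is equivalent to writing the keywords out by hand
a = describe_conv(filters=16, strides=(1, 1, 1), **shared)
b = describe_conv(filters=16, strides=(1, 1, 1), kernel_size=(1, 3, 3), padding='same')
print(a == b)

# To override one shared argument, build a new dictionary first;
# passing the same keyword twice in a single call raises a TypeError
valid = describe_conv(filters=16, strides=(1, 1, 1), **{**shared, 'padding': 'valid'})
print(valid['padding'])
```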

## Blocks

In addition to the requisite convolutional operations and activation functions, batch normalization is almost universally used in modern CNN architectures. Thus at minimum, a common baseline *block* pattern of operations can be defined as:

* convolutional operation
* batch normalization
* activation function (e.g. ReLU)

Let us define a block:

In [None]:
# --- Define block
c1 = layers.Conv3D(filters=16, **kwargs)(inputs['dat'])
n1 = layers.BatchNormalization()(c1)
r1 = layers.ReLU()(n1)

During the course of building CNNs, we will be writing **many** blocks over time. Thus for brevity, let us use lambda functions to define modular, reusable components:

In [None]:
# --- Define lambda functions
conv = lambda x, filters, strides : layers.Conv3D(filters=filters, strides=strides, **kwargs)(x)
norm = lambda x : layers.BatchNormalization()(x)
relu = lambda x : layers.ReLU()(x)

Great! Now let us rewrite a block using lambda shorthand:

In [None]:
# --- Define block
b1 = relu(norm(conv(inputs['dat'], 16, (1, 1, 1))))

In practice, the two most common **block patterns** will be a regular convolutional block and a strided convolutional block (for subsampling). Let us then create two more high-level lambda functions for this:

In [None]:
# --- Define stride-1, stride-2 blocks
conv1 = lambda filters, x : relu(norm(conv(x, filters, strides=1)))
conv2 = lambda filters, x : relu(norm(conv(x, filters, strides=(1, 2, 2))))

Now let us see how easy it is to create a series of alternating stride-1 and stride-2 blocks:

In [None]:
# --- Define series of blocks
l1 = conv1(16, inputs['dat'])
l2 = conv1(24, conv2(24, l1))
l3 = conv1(32, conv2(32, l2))
l4 = conv1(48, conv2(48, l3))
l5 = conv1(64, conv2(64, l4))
l6 = conv1(80, conv2(80, l5))
l7 = conv1(96, conv2(96, l6))
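
To see how the feature maps shrink through this stack, the per-sample shapes can be traced by hand. The sketch below assumes a `(1, 512, 512, 1)` input and `'same'` padding throughout (so each stride-2 block halves the two in-plane axes); `trace_shapes` is a hypothetical helper written for illustration, not part of Keras or Jarvis.

```python
import math

# (filters, in-plane stride) for every block in l1..l7, in order of application
blocks = [(16, 1)]
for filters in (24, 32, 48, 64, 80, 96):
    blocks += [(filters, 2), (filters, 1)]  # conv2 followed by conv1

def trace_shapes(size, blocks):
    """Return the per-sample output shape after each block (hypothetical helper)."""
    shapes = []
    for filters, stride in blocks:
        size = math.ceil(size / stride)  # 'same' padding rule
        shapes.append((1, size, size, filters))
    return shapes

shapes = trace_shapes(512, blocks)
print(shapes[0])   # shape of l1: (1, 512, 512, 16)
print(shapes[-1])  # shape of l7: (1, 8, 8, 96)
```

With six stride-2 blocks, the in-plane size falls from 512 to 512 / 2**6 = 8, while the channel depth grows from 16 to 96.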

# Fully Connected Layers

After a series of convolutional block operations, a CNN must transition to MLP type operations (e.g. matrix multiplications). To convert 3D (or 4D) feature maps into vectors, consider one of the following approaches:

* serial convolutions (with stride > 1 or VALID type padding)
* global pool operations (mean or max)
* reshape / flatten operation

Of these, the flatten operation is perhaps the simplest to use, and requires no additional learnable parameters. The following line of code implements the reshape / flatten operation:

In [None]:
f0 = layers.Flatten()(l7)
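
As a rough comparison of the options listed above, the sketch below computes the vector length produced by a flatten versus a global pooling operation, and the number of parameters in a 2-unit dense layer applied to each. It assumes the final feature map has a per-sample shape of `(1, 8, 8, 96)`, i.e. a 512 x 512 input subsampled by six stride-2 blocks; `dense_params` is a hypothetical helper.

```python
from math import prod  # requires Python 3.8+

# Assumed per-sample shape of the last feature map
feature_shape = (1, 8, 8, 96)

# Flatten keeps every element; global pooling keeps one value per channel
flatten_len = prod(feature_shape)
pooled_len = feature_shape[-1]

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer (hypothetical helper)."""
    return n_in * n_out + n_out

print(flatten_len, dense_params(flatten_len, 2))  # 6144 12290
print(pooled_len, dense_params(pooled_len, 2))    # 96 194
```

Neither operation adds parameters itself, but the choice determines the size of the dense layer that follows.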

### MLP

Now that the intermediate layer is defined as a vector, a number of standard MLP type operations may be performed, including creation of an arbitrary (optional) number of hidden layers. Once ready, the final layer should be implemented using a standard matrix multiplication operation (`layers.Dense(...)` object) that yields a vector of logit scores **without** any activation function applied.

In [None]:
# --- Final logit scores
logits = {}
logits['pna'] = layers.Dense(2, name='pna')(f0)

# Model

Putting everything together, use the following cell to create and compile the convolutional neural network:

In [None]:
# --- Create model
model = Model(inputs=inputs, outputs=logits)

# --- Compile model
model.compile(
    optimizer=optimizers.Adam(learning_rate=2e-4), 
    loss={'pna': losses.SparseCategoricalCrossentropy(from_logits=True)}, 
    metrics={'pna': 'sparse_categorical_accuracy'})
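
Because the final layer emits raw logits, the loss is configured with `from_logits=True`, meaning the softmax is folded into the loss computation. The following pure-Python sketch illustrates the underlying arithmetic for a single example; it is an illustration of the standard formulas, not the Keras implementation, and the function names are hypothetical.

```python
import math

def softmax(logits):
    """Convert logits to probabilities (max subtracted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_ce_from_logits(logits, label):
    """Cross-entropy for one example given raw logits and an integer label."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]

# Illustrative logits for the two classes (negative, positive)
example_logits = [0.5, 2.0]

probs = softmax(example_logits)
print(probs)
print(sparse_ce_from_logits(example_logits, label=1))

# The loss equals the negative log of the probability of the true class
print(-math.log(probs[1]))
```

Note that softmax preserves the ordering of the logits, so taking the argmax of the logits directly yields the same predicted class as taking the argmax of the probabilities.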

# Model Training

### In-Memory Data

The following line of code will load all training data into RAM. This strategy can be effective for increasing speed of training for small to medium-sized datasets.

In [None]:
# --- Load data into memory
client.load_data_in_memory()

### Training

Once the model has been compiled and the data prepared (via a generator), training can be invoked using the `model.fit(...)` method. Ensure that both the training and validation data generators are used. In this particular example, we are defining arbitrary epochs of 100 steps each. Training will proceed for 8 epochs in total. Validation statistics will be assessed every fourth epoch. Tune these arguments as needed.

In [None]:
model.fit(
    x=gen_train, 
    steps_per_epoch=100, 
    epochs=8,
    validation_data=gen_valid,
    validation_steps=100,
    validation_freq=4)
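
The schedule implied by these arguments can be worked out with simple arithmetic. The sketch below assumes that an integer `validation_freq` triggers validation whenever the 1-indexed epoch number is a multiple of it; `validation_epochs` is a hypothetical helper for illustration.

```python
def validation_epochs(epochs, validation_freq):
    """1-indexed epochs at which validation runs (hypothetical helper)."""
    return [e for e in range(1, epochs + 1) if e % validation_freq == 0]

steps_per_epoch = 100
epochs = 8

# Total number of training batches drawn from gen_train
total_steps = steps_per_epoch * epochs
print(total_steps)                    # 800

# Epochs that end with a validation pass
print(validation_epochs(epochs, 4))   # [4, 8]
```

The number of images seen during training is `total_steps` multiplied by the batch size of the generator.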

# Evaluation

To test the trained model, the following steps are required:

* load data
* use `model.predict(...)` to obtain logit scores
* use `np.argmax(...)` to obtain prediction
* compare prediction with ground-truth

Recall that the generator used to train the model simply iterates through the dataset randomly. For model evaluation, the cohort must instead be loaded manually in an orderly way. For this tutorial, we will create new **test mode** data generators, which will simply load each example individually once for testing. 

In [None]:
# --- Create test-mode generators
test_train, test_valid = client.create_generators(test=True)

Use the following lines of code to loop through the test set generator and run model prediction on each example:

In [None]:
# --- Test model
pred = []
true = []

for x, y in test_valid:
    
    logits = model.predict(x)
    
    if isinstance(logits, dict):
        logits = logits['pna']
        
    argmax = np.argmax(logits[0], axis=-1)
    
    pred.append(np.squeeze(argmax))
    true.append(np.squeeze(y['pna']))

Use the following lines of code to calculate validation cohort performance:

In [None]:
# --- Calculate accuracy
pred = np.array(pred)
true = np.array(true)

correct = np.sum(pred == true)

print('Accuracy: {:0.5f}'.format(correct / pred.size))
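
Accuracy alone does not show which kind of error the model makes. As an optional extension, the sketch below breaks the comparison of predictions against ground truth into confusion-matrix counts. The labels shown are made-up illustrative values, not results from this model, and `confusion_counts` is a hypothetical helper.

```python
def confusion_counts(true, pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = positive."""
    pairs = list(zip(true, pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

# Made-up labels for illustration only
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 0, 1, 0, 1, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(toy_true, toy_pred)

accuracy = (tp + tn) / len(toy_true)
sensitivity = tp / (tp + fn)  # fraction of positive exams detected
specificity = tn / (tn + fp)  # fraction of negative exams correctly cleared

print(tp, tn, fp, fn)
print(accuracy, sensitivity, specificity)
```

The same function can be applied to the `true` and `pred` arrays collected above by passing `true.tolist()` and `pred.tolist()`.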

## Saving and Loading a Model

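After a model has been successfully trained, it can be saved and reloaded using the `model.save()` and `models.load_model()` methods. Note that a model loaded with `compile=False` can be used for prediction but must be compiled again before any further training.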

In [None]:
# --- Serialize a model
model.save('./cnn.hdf5')

In [None]:
# --- Load a serialized model
del model
model = models.load_model('./cnn.hdf5', compile=False)