# Tutorial 2: Convolutional Neural Networks

In this tutorial we will focus on Convolutional Neural Networks (CNNs).  
We will train a model to label brain MRI scans as either healthy or containing a tumor. 
We will use [PyTorch](https://pytorch.org/tutorials/recipes/recipes_index.html) and [PyTorch Lightning](https://lightning.ai/docs/pytorch/stable/levels/core_skills.html) to build and train our model.  
The notebook will give you a framework with some parts of the code left blank. Fill in the missing code to make it work. 
It will be helpful to look up PyTorch or Lightning commands as you go. Both packages offer easy-to-use methods for most common deep learning tasks. 

Before we start, let's explore how CNNs work in this [CNN visualizer](https://adamharley.com/nn_vis/cnn/3d.html).

## Step 0: Imports

Let's import all the Python modules that we need.

In [None]:
import os
import numpy as np
import pandas as pd

# for plotting
import matplotlib.pyplot as plt
import seaborn as sns 

# dealing with images
from PIL import Image

# Pytorch
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
from torchvision import transforms
from torch.utils.data import DataLoader, Dataset
from torchvision.utils import make_grid
import torchmetrics

# Pytorch Lightning
import lightning as L

# for visualization
import tensorboard

In [None]:
# check if we can use GPUs
torch.cuda.is_available()

## Step 1: Loading the data

Download the dataset from: https://polybox.ethz.ch/index.php/s/cXXOTIJowCJqMbz

The dataset contains images of brain MRI scans. Some of them are from healthy patients, others from patients with brain tumors.  
We will train a model that can classify the images correctly.

Unzip the downloaded folder. 
Then, add the correct path in the cell below, pointing to the data. 

In [None]:
####### TODO #########
# enter the path of the data directory
data_dir = "/path/of/datafolder" 

######################

# read the labels into a dataframe with pandas
labels_df = pd.read_csv(data_dir + "/metadata.csv", index_col=0)
labels_df

How large is our dataset?
Plot the class distribution below:

In [None]:
##### TODO ######
# get the percentage of normal and tumor classes in the dataset

normal_percentage = 
tumor_percentage = 

#################

print("Normal percentage: {}%".format(normal_percentage))
print("Tumor percentage: {}%".format(tumor_percentage))

sns.histplot(labels_df, x="class")


Now that we have loaded the labels, we will load the images.  
As we have seen above, the images can have different file types and dimensions.   
Next, we will load the data into datasets that we can use for training. 
To simplify training, we will resize the pictures to the same size and normalize the data.
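
To make the normalization step concrete, here is a minimal sketch of what `transforms.Normalize` computes per channel, written with plain tensor arithmetic. The tiny 2x2 image and its pixel values are made up for illustration; the mean and standard deviation are the ImageNet values used in this tutorial.

```python
import torch

# made-up 3-channel 2x2 "image" with values in [0, 1], as ToTensor would return
img = torch.zeros(3, 2, 2)
img[:, 0, 0] = 1.0  # one fully bright pixel, the rest black

# reshape so that each channel gets its own mean and std
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

# Normalize subtracts the channel mean and divides by the channel std
normalized = (img - mean) / std

print(normalized[:, 0, 0])  # the bright pixel, one value per channel
print(normalized.min().item(), normalized.max().item())
```

After normalization the values are no longer restricted to [0, 1], which is why the plotting code later in this notebook clips them before calling `plt.imshow`.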

In [None]:
image_dir = data_dir + "/Brain Tumor Data Set/Brain Tumor Data Set/"
# seed everything 
torch.manual_seed(42) # set random seed to have reproducibility between the tutorials

# adding transforms to have same dimensions and some random rotations/flips to get more robust predictions
transform = transforms.Compose(
                [
                transforms.Resize((256,256)),
                transforms.RandomHorizontalFlip(p=0.5),
                transforms.RandomVerticalFlip(p=0.5),
                transforms.RandomRotation(30),
                transforms.ToTensor(),
                transforms.Normalize(mean = [0.485, 0.456, 0.406],std = [0.229, 0.224, 0.225]) # from ImageNet
                ]  
            )

# load the complete dataset
full_dataset = torchvision.datasets.ImageFolder(image_dir, transform=transform) 

########## TODO ###############
# create the train val and test set, e.g. using a 70%, 15%, 15% split
# use a pytorch function to do this
train_set, val_set, test_set = 

# define the dataloaders for train, validation and test (use shuffle for train only)
batch_size_train = 256
batch_size = 128 # for eval and test

# using DataLoader from PyTorch
test_loader = DataLoader(test_set, batch_size = batch_size, shuffle = False, num_workers = 16)
val_loader = 
train_loader = 

###############################

Let's take a look at some of the images we have loaded.

In [None]:
# plots n random brain MRI images from the passed dataset
def plot_images(dataset, n=16):
    CLA_label = {
    0 : 'Brain Tumor',
    1 : 'Healthy'
    } 
    cols, rows = 4, int(np.ceil(n/4))
    figure = plt.figure(figsize=(10, 10))
    for i in range(1, n + 1):
        sample_idx = torch.randint(len(dataset), size=(1,)).item()
        # read out image and label from item
        img, label = dataset[sample_idx]
        figure.add_subplot(rows, cols, i)
        plt.title(CLA_label[label])
        plt.axis("off")
        img_np = img.numpy().transpose((1, 2, 0))
        # Clip pixel values to [0, 1]
        img_valid_range = np.clip(img_np, 0, 1)
        plt.imshow(img_valid_range)
    plt.show()

In [None]:
plot_images(train_set)

Under the hood, though, the images are just tensors (multi-dimensional arrays of numbers).  
In the dataset each item is saved together with its label. 
The images in the dataset all have a size of 256x256 and 3 color channels (RGB). 
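
PyTorch stores an image in channels-first order (channels, height, width), whereas matplotlib expects the channels last. A minimal sketch with a random stand-in tensor (not an image from the dataset) shows the two layouts:

```python
import torch

# random stand-in for one image: 3 color channels, 256 x 256 pixels
fake_image = torch.rand(3, 256, 256)
print(fake_image.shape)

# matplotlib expects (height, width, channels), so the axes are reordered before plotting
channels_last = fake_image.permute(1, 2, 0)
print(channels_last.shape)
```

The plotting code in this notebook does the same reordering on the NumPy side with `.numpy().transpose((1, 2, 0))`.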

In [None]:
# get first item from dataset, the item is just a tuple
first_item = train_set[0]

####### TODO #########
# get image at index 0 and label at index 1 from the item
image_tensor = 
label = 

# print the image tensor and its shape
print(image_tensor)
print("Shape: ", image_tensor.shape)
#####################

The above vector/tensor encodes the picture below.

In [None]:
# showing tensor as image
img_valid_range = np.clip(image_tensor.numpy().transpose((1, 2, 0)), 0, 1)
plt.imshow(img_valid_range)

## Step 2: Creating a CNN architecture

Here is a recap of how convolutions and CNNs work:
https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53 

### Helper function

This function computes the new dimensions of our data after convolution and max pooling.  
It will be helpful later when we build the model.
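
For reference, the helper follows the output-size formula given in the `nn.Conv2d` documentation, applied to height and width separately and followed by a division by the pooling size. With input size $H_{in}$, padding $p$, dilation $d$, kernel size $k$ and stride $s$:

$$
H_{out} = \left\lfloor \frac{H_{in} + 2p - d\,(k - 1) - 1}{s} + 1 \right\rfloor
$$

As an illustration with made-up numbers (not the ones used in this tutorial): an input that is 32 pixels wide with $k = 5$, $p = 0$, $d = 1$ and $s = 1$ gives $\lfloor (32 - 4 - 1)/1 + 1 \rfloor = 28$, and pooling with size 2 halves this to 14.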

In [None]:
# get the output shape of our data after a convolution and pooling of a certain size

def get_conv2d_out_shape(tensor_shape, conv, pool=2):
    # return the new shape of the tensor after a convolution and pooling
    # tensor_shape: (channels, height, width)
    # convolution arguments
    kernel_size = conv.kernel_size # tuple (height, width)
    stride=conv.stride # tuple (height, width)
    padding=conv.padding # tuple (height, width)
    dilation=conv.dilation # tuple (height, width)
    out_channels = conv.out_channels

    height_out = np.floor((tensor_shape[1]+2*padding[0]-dilation[0]*(kernel_size[0]-1)-1)/stride[0]+1)
    width_out = np.floor((tensor_shape[2]+2*padding[1]-dilation[1]*(kernel_size[1]-1)-1)/stride[1]+1)
    
    if pool:
        # adjust dimensions to pooling
        height_out/=pool
        width_out/=pool
        
    return int(out_channels),int(height_out),int(width_out)

In [None]:
# some simple tests
t1 = torch.randn(3, 256, 256)
t2 = torch.randn(5, 256, 256)
conv1 = nn.Conv2d(in_channels=3, out_channels = 256, kernel_size=10)
conv2 = nn.Conv2d(in_channels=3, out_channels = 64, kernel_size=4)
conv3 = nn.Conv2d(in_channels=5, out_channels = 13, kernel_size=7)
print(get_conv2d_out_shape(t1.shape, conv1, pool=2))
print(get_conv2d_out_shape(t1.shape, conv2, pool=1))
print(get_conv2d_out_shape(t2.shape ,conv3, pool=2))

**Can you explain the output above? How does the size change after a convolution and why?**

Your answer:

### CNN Model

We can now build our Convolutional Neural Network. 
It will have two convolutional layers and two fully connected layers.  
You will be able to mostly use Pytorch methods to fill in the blanks. 

In [None]:
class MRIModel(nn.Module):
    
    # Network Initialisation
    def __init__(self, params):
        
        super(MRIModel, self).__init__() #initialize parent pytorch module

        # read parameters
        shape_in = params["shape_in"]
        channels_out = params["initial_depth"] 
        fc1_size = params["fc1_size"]
        
        #### Convolution Layers

        # Max pooling layer
        self.pool = nn.MaxPool2d(2, 2)

        ##conv layer 1
        # convolution with kernel size 8, goes from three channels to 
        # number defined by initial_depth in params
        self.conv1 = nn.Conv2d(shape_in[0], channels_out, kernel_size=8)

        ############### TODO ################
        # get current shape after conv1, use helper function get_conv2d_out_shape, use pool=2
        current_data_shape = 

        ##conv layer 2
        # convolution with kernel size 4, double the amount of channels
        self.conv2 = nn.Conv2d(current_data_shape[0], current_data_shape[0]*2, kernel_size=4)
        # get current shape after conv2, use pool=2 again
        current_data_shape = 

        #### Fully connected layers
        # compute the flattened size as input for fully connected layer
        flat_size = current_data_shape[0] * current_data_shape[1] * current_data_shape[2]
        
        # linear layer reduces data from flat_size to fc1_size
        self.fc1 = 
        # last linear layer reduces data to output size 2
        self.fc2 = nn.Linear(fc1_size, 2)
        
        #####################################
        

    def forward(self,X):
        # our network's forward pass
        
        # Convolution & Pool Layers
        ############# TODO ###############
        # convolution (conv1), then relu, then max pool 
        X = F.relu(self.conv1(X))
        X = self.pool(X)
        # convolution (conv2), then relu, then max pool 
        X =

        X = torch.flatten(X, 1) # flatten all dimensions except batch

        # fully connected layer and ReLU
        X = 
        # second fully connected layer, no relu needed
        X = 

        #####################################
        # return log softmax to fit classification problem, no relu needed
        return F.log_softmax(X, dim=1)

The code above is defining a class. This will allow us to create objects of that class. 

**What does ``self.`` do in the code? What is it good for in a class?**

Your answer:

Let's try an example and see how the model works.

In [None]:
# take first batch from the train loader
batch = next(iter(train_loader))[0]
# create the model
cnn_model = MRIModel(params={"shape_in":batch[0].shape,"initial_depth":4,"fc1_size":128})
# forward pass
out = cnn_model(batch)
# print shape of the input batch
print("Shape of the input batch: ", batch.shape)
# print the output shape
print("Shape of the output: ", out.shape)
# prediction output for first image, exp to get from log back to probabilities
print(torch.exp(out[0].detach()))
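
As a side note on the output format: the model returns log-probabilities, so `torch.exp` recovers values that sum to one for each image. The sketch below uses made-up logits for two images (not outputs of the model above). It also shows that `log_softmax` combined with `nn.NLLLoss`, the loss used later in this tutorial, gives the same value as `nn.CrossEntropyLoss` applied to the raw logits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# made-up raw scores (logits) for a batch of 2 images and 2 classes
logits = torch.tensor([[2.0, 0.5], [0.1, 1.2]])
targets = torch.tensor([0, 1])

# log-probabilities, and the probabilities recovered with exp
log_probs = F.log_softmax(logits, dim=1)
probs = torch.exp(log_probs)
print(probs.sum(dim=1))  # each row sums to 1

# NLLLoss expects log-probabilities, CrossEntropyLoss expects raw logits
nll = nn.NLLLoss(reduction="mean")(log_probs, targets)
ce = nn.CrossEntropyLoss(reduction="mean")(logits, targets)
print(nll.item(), ce.item())  # the two losses agree
```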

**Explain the shapes of the batch and the output shown above.**

Your answer:

**How do you interpret the prediction for the first image?**

Your answer:

### Train and validation with Lightning

Now that we have created our model architecture, we will wrap a Lightning module around it. 
This will make the training procedure much easier.  
Instead of programming the whole training loops ourselves, we will define how one step should be handled at training, validation and testing.  
We only need to define how to retrieve data from the batch, how to pass it through our model and how/when to compute the loss. 
The rest will all be handled by Lightning.  
Make sure to use the Lightning docs and Google to find the right methods.  

In [None]:
class LitMRIModel(L.LightningModule):
    def __init__(self, model_parameters, learning_rate=1e-2):
        super().__init__()
        ######## TODO ##########
        # Instantiate our model like above
        self.model = MRIModel(model_parameters)
        #pass the learning rate
        self.lr = 
        # define loss function
        self.loss_function = nn.NLLLoss(reduction="mean")
        # define accuracy metric (torchmetrics)
        self.accuracy = torchmetrics.classification.Accuracy(task="multiclass", num_classes=2)
        ########################

    def training_step(self, batch, batch_idx):
        # training_step defines the train loop.
        ######### TODO #############
        
        # read from batch
        x, y = 

        # run data through model
        predictions = 
        
        # compute loss
        loss = 
        # compute accuracy
        acc = 
        ##############################

        # logging the values (will appear in progress bar and on dashboard)
        self.log("train_loss", loss, on_epoch=True, prog_bar=True)
        self.log("train_acc", acc, on_epoch=True, prog_bar=True)

        return loss

    def configure_optimizers(self):
        ############## TODO ################
        # define the optimizer, let's use Adam
        optimizer = 
        ####################################
        return optimizer

    def test_step(self, batch, batch_idx):
        # this is the test loop

        ############### TODO #############
        # read from batch
        x, y = 

        # run data through model
        predictions = 
        
        # compute loss
        loss = 
        # compute accuracy
        acc = 
        ##############################

        # logging
        self.log("test_loss", loss, prog_bar=True)
        self.log("test_acc", acc, prog_bar=True)
        return loss, acc


    def validation_step(self, batch, batch_idx):
        # this is the validation loop
        ############### TODO #############
        # read from batch
        x, y = 

        # run data through model
        predictions = 
        
        # compute loss
        loss = 
        # compute accuracy
        acc = 
        ##############################

        # logging
        self.log("val_loss", loss, on_epoch=True, prog_bar=True)
        self.log("val_acc", acc, on_epoch=True, prog_bar=True)
        return loss 


### Visualization
To visualize the training procedure, we will use Tensorboard.  
In the code above we can see some values being logged. Tensorboard will display these values in nice graphs for us to follow our learning curves.

In [None]:
# create a tensorboard session
# new tab should open in your browser
%reload_ext tensorboard
%tensorboard --logdir=lightning_logs/

## Step 3. Training and evaluating 

Now that everything is ready, we can start training the model.  
Make sure to follow its performance on Tensorboard. 

In [None]:
# define parameters
model_parameters={
        "shape_in": (3,256,256), # size of our images
        "initial_depth": 4,    
        "fc1_size": 128}

In [None]:
# train model
########## TODO #############
# instantiate lightning model with model_parameters and learning_rate=1e-3
model = 
############################

# instantiate the lightning trainer 
trainer = L.Trainer(max_epochs=20, log_every_n_steps=1)
# train
trainer.fit(model, train_loader, val_loader)

Look at the learning curves in tensorboard. (You might need to click the refresh button on the website). 
Answer the following questions below. Write the answers in this cell.

1. **How many steps are there in one epoch? How do you compute it?**

    Your answer:

2. **What is the difference between the metrics per step and per epoch?**

    Your answer:

3. **Which metrics/graphs can help you understand whether your model is learning something useful from the data?**

    Your answer:

4. **How well did your model train? Would you improve something? Explain your answer.**

    Your answer:

5. **How could you see from the graphs if your model is overfitting?**

    Your answer:

### Validate and visualize

Let us evaluate the model now. As we might still make changes and tune parameters, we should not use the test set yet. 
The test set is reserved for the final evaluation. Looking at it earlier would bias our choice of model and make the final result overly optimistic.

In [None]:
# Evaluate the model on the validation set
trainer.validate(model, val_loader)

We will visualize our predictions in a confusion matrix to get a feeling of how well the model performs in specific cases. 
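
To recall what the cells of a confusion matrix mean, here is a small hand-made example in plain Python. The labels are invented for illustration. Following scikit-learn's convention, rows are true classes and columns are predicted classes.

```python
# invented true labels and predictions for 8 samples (classes 0 and 1)
toy_true = [0, 0, 0, 1, 1, 1, 1, 0]
toy_pred = [0, 1, 0, 1, 1, 0, 1, 0]

# toy_cm[i][j] counts samples of true class i that were predicted as class j
toy_cm = [[0, 0], [0, 0]]
for t, p in zip(toy_true, toy_pred):
    toy_cm[t][p] += 1

print(toy_cm)

# per-class recall: correctly predicted samples of a class / all samples of that class
toy_recall = [toy_cm[c][c] / sum(toy_cm[c]) for c in range(2)]
print(toy_recall)
```

The diagonal holds the correct predictions and the off-diagonal cells show which class is confused with which.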

In [None]:
from sklearn.metrics import classification_report, confusion_matrix

# get the predictions and plot a confusion matrix

# function to retrieve the predictions of the model and return them with the true labels
def get_predictions(val_loader, model):
    y_true = []
    y_pred = []

    model.eval()  # switch to evaluation mode
    with torch.no_grad():  # no gradients are needed for inference
        for images, labels in val_loader:
            images = images#.to(device)
            labels = labels.numpy()
            outputs = model.model(images)
            # the index of the largest log-probability is the predicted class
            _, pred = torch.max(outputs, 1)
            pred = pred.cpu().numpy()

            y_true = np.append(y_true, labels)
            y_pred = np.append(y_pred, pred)

    return y_true, y_pred

########## TODO #############
# get predictions of the trained Lightning model on the val_loader
y_true, y_pred = 
############################

# print summary
print(classification_report(y_true, y_pred), '\n\n')
cm = confusion_matrix(y_true, y_pred)

sns.heatmap(cm, annot=True)

**How does the information we get from the confusion matrix compare to what we can learn from the training curves?**

Your answer:

In [None]:
# Plot a ROC curve
from sklearn.metrics import roc_curve, auc

# get predictions (as probabilities)
def get_prediction_probs(val_loader, model):
    y_true = []
    y_pred = []

    model.eval()  # switch to evaluation mode
    with torch.no_grad():  # no gradients are needed for inference
        for images, labels in val_loader:
            images = images#.to(device)
            labels = labels.numpy()
            outputs = model.model(images)
            # exp() because we use log softmax as last layer
            # column 1 holds the probability of class index 1, which the CLA_label
            # mapping in plot_images calls 'Healthy' (check full_dataset.class_to_idx)
            prediction_probabilities = torch.exp(outputs)[:,1]
            pred = prediction_probabilities.cpu().numpy()

            y_true = np.append(y_true, labels)
            y_pred = np.append(y_pred, pred)

    return y_true, y_pred

y_true, y_pred_probabilities = get_prediction_probs(val_loader, model)

########## TODO #############
# compute the ROC curve and the area under it (AUC)
# use sklearn roc_curve and auc functions
fpr, tpr, _ = 
roc_auc = 
##############################

# Plot ROC curve
plt.figure()
plt.plot(fpr, tpr, lw=2, label='AUC = %0.2f' % roc_auc)
plt.plot([0, 1], [0, 1], lw=2, linestyle='--', color="grey")
plt.legend()

In the U.S., 21.97 out of every 100,000 people are diagnosed with brain tumors.  
Assume a doctor uses our model to screen 100,000 people from the U.S. 

**Based on the computed values above, how many healthy people do we expect to be wrongly diagnosed with a brain tumor?**

Your answer:

## Step 4: Improving

### Finetuning training parameters

The training procedure could use some improvements.  
Adjust the number of epochs, batch size and learning rate and rerun the model.  
Analyze how performance changes.  
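
Before running the experiments, a toy example can help build intuition for the learning rate. The sketch below runs plain gradient descent on the one-dimensional function f(w) = w², which has nothing to do with the MRI model (and this tutorial uses Adam rather than plain gradient descent). The function, the starting point and the learning rates are chosen purely for illustration.

```python
def gradient_descent(lr, steps=20, w=1.0):
    # minimize f(w) = w**2, whose gradient is 2*w
    for _ in range(steps):
        w = w - lr * 2 * w
    return w

# compare a tiny, a moderate and a too-large learning rate
for lr in (1e-5, 1e-1, 1.1):
    print(lr, gradient_descent(lr))
```

With the tiny learning rate the parameter barely moves in 20 steps, with the moderate one it approaches the minimum at 0, and with the too-large one every step overshoots the minimum so the value grows in magnitude.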

In [None]:
# train a model with a large learning rate (e.g. 1e-1)
# make sure to name your lightning model variable in a way to not overwrite the previously trained model

########## TODO #############
# instantiate lightning model
model_large_lr = 
# define trainer, 20 epochs
trainer =
# train

##############################


In [None]:
# train a model with a small learning rate (e.g. 1e-5)

########## TODO #############
# instantiate lightning model
model_small_lr = 
# define trainer, 20 epochs
trainer =
# train

##############################

**How does the learning rate influence training performance?**

Your answer:




Let's use the original learning rate of 1e-3 again. 
Now we will change the batch size in the dataloaders.

In [None]:
# lightning model
model_small_batches = LitMRIModel(model_parameters, learning_rate=1e-3)

#### TODO ####
## create train dataloader with a small batch size
train_loader_small = 

# train model
trainer =
# train with smaller batch size


In [None]:
# lightning model
model_big_batches = LitMRIModel(model_parameters, learning_rate=1e-3)

#### TODO ####
# train with a large batch size, what's the largest batch size you can use?
train_loader_big = 

# train model
trainer = 
# train


**How does the batch size influence the model performance?**

Your answer:

Now let's train for more epochs.

In [None]:
# lightning model
model_long_training = LitMRIModel(model_parameters, learning_rate=1e-3)

# train the model for 100 epochs
trainer = 
# train on train_loader again


**How does the model perform? How do more epochs influence performance and how many epochs are enough?**

Your answer:

### Model improvements (optional)

Simply finding the best training parameters improves the performance to some degree.  
Especially for more complex problems and larger datasets, the architecture and the number and size of the layers also matter. 
Now you can experiment with the CNN architecture. 
Use the code from above and create a deeper or larger model (e.g. 4 conv layers and 2 fully connected layers).  
Or simply experiment with the model parameters, for example different kernel sizes. Try to see if you can further improve the performance.

In [None]:
################# TODO ##################
# create a more complex model 


In [None]:
################# TODO ##################
# train 

In [None]:
################# TODO ##################
# validate

## Step 5: Final testing

Now that we have a well-performing model, we can run it on the test set and see how it performs on unseen data.

In [None]:
############### TODO #################
# pass the best performing model here
best_model =

# test your best performing model on the test set
trainer.test(best_model, test_loader)

# print the confusion matrix and classification report
y_true, y_pred = 

print(classification_report(y_true, y_pred), '\n\n')
# confusion matrix
cm = 

sns.heatmap(cm, annot=True)