The contents of this notebook are copied/redistributed and modified from [RSNA 2022 tutorials](https://github.com/RSNA/AI-Deep-Learning-Lab-2022/blob/main/sessions/custom-dl-pytorch/RSNA_PyTorch_Model_Architecture.ipynb) as per the [MIT License](https://github.com/RSNA/AI-Deep-Learning-Lab-2022/blob/main/LICENSE).

<a href="https://colab.research.google.com/github/AFRICAI-MICCAI/model_development_1_data/blob/main/Notebooks/custom-DL-PyTorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Hands-on: Creating Custom Neural Network Architectures in PyTorch


### Task
Show how to create custom neural network architectures in PyTorch using the intracranial hemorrhage dataset from the RSNA ICH challenge.

### Requirements

1. Basic understanding of machine learning and deep learning
2. Programming in Python
3. Prior experience building simple classification models


### Learning Objectives

At the end of this activity, you will be able to:

1. Explain the syntax of PyTorch model architecture
2. Load pretrained architectures and modify them

### Acknowledgements

The dataset used in this notebook is derived from the RSNA Intracranial Hemorrhage Detection challenge hosted on Kaggle in 2019 (https://www.kaggle.com/c/rsna-intracranial-hemorrhage-detection). 

This notebook was authored by Ian Pan (ianpan358@gmail.com) and Felipe Kitamura (kitamura.felipe@gmail.com). 

# Introduction
In this notebook, we will train a deep learning model to classify intracranial hemorrhage in head CT on a slice-wise basis. We will start by installing and importing the libraries we will use.

### Conda environment
We suggest creating a conda environment for the summer school's notebooks. Conda installation instructions are available [here](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html) (Miniconda is sufficient).  
If you have not yet created the africai conda environment, run the following in a terminal from the parent directory *model_development_1_data*:  
> conda env create -f africai.yml  
> conda activate africai

*Other useful commands*:  
To deactivate a conda environment 
> conda deactivate

To delete a conda environment (e.g. africai conda environment, replace *ENV_NAME* with *africai*)
> conda remove --name *ENV_NAME* --all 

### Install PyTorch
It is recommended to use light-the-torch.  
In a terminal, run:
> pip install light-the-torch  
> ltt install torch

## Install and Import Packages

In [None]:
!pip install lightning --quiet
!ltt install torchvision --quiet
!pip install albumentations --quiet
!pip install opencv-python --quiet
!pip install timm --quiet
!pip install pandas --quiet
!pip install matplotlib --quiet
!pip install seaborn --quiet

In [None]:
import warnings

warnings.filterwarnings("ignore")
import os
import albumentations as A
import pandas as pd
import seaborn as sn
import torch
from IPython.display import display
from pytorch_lightning import LightningModule, Trainer
from pytorch_lightning.callbacks.progress import TQDMProgressBar
from pytorch_lightning.loggers import CSVLogger
from pytorch_lightning.callbacks import Callback
from torch import nn
from torch.nn import functional as F
from torch.utils.data import Dataset, DataLoader, random_split
from torchmetrics import Accuracy
from torchvision import transforms
from torchvision.datasets import MNIST
import glob
import matplotlib.pyplot as plt
import numpy as np
import cv2
import timm
from tqdm import tqdm
from pathlib import Path
from zipfile import ZipFile


BATCH_SIZE = 32 if torch.cuda.is_available() else 16
print("GPU Backend" if torch.cuda.is_available() else "CPU Backend")

## Download Data

The dataset for this notebook is derived from the 2019 RSNA Intracranial Hemorrhage Detection challenge hosted on Kaggle (https://www.kaggle.com/c/rsna-intracranial-hemorrhage-detection/data). 

To streamline this notebook and reduce the size of the dataset, 1,600 head CT slices from the original dataset were randomly sampled (800 positive, 800 negative), converted from DICOM to PNG, and downsampled to a resolution of 256 x 256 pixels. During the conversion, a standard brain window (WL=40, WW=80) was used to convert the image into 8-bit. The size of this reduced dataset is around 24 MB.
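
To make the windowing step concrete, here is a minimal sketch of how a brain window (WL=40, WW=80) can map Hounsfield units to 8-bit values. The exact conversion used to build this dataset is not given in the notebook, so the linear clip-and-rescale below is an assumption, and `apply_window` is a hypothetical helper name.

```python
import numpy as np


def apply_window(hu, level=40.0, width=80.0):
    """Clip Hounsfield units to a window and rescale linearly to 0-255 (assumed scheme)."""
    lower = level - width / 2.0  # 0 HU for the brain window
    upper = level + width / 2.0  # 80 HU for the brain window
    clipped = np.clip(hu, lower, upper)
    scaled = (clipped - lower) / (upper - lower)  # now in [0, 1]
    return np.round(scaled * 255.0).astype(np.uint8)


# Air, water, mid-window, window top, dense bone
hu_values = np.array([-1000.0, 0.0, 40.0, 80.0, 1000.0])
windowed = apply_window(hu_values)
print(windowed)
```

Everything at or below 0 HU becomes black and everything at or above 80 HU becomes white, which is why bone and air carry no contrast in the PNGs.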

In [None]:
data_dir = os.path.join(os.path.dirname(os.getcwd()), "data")
Path(data_dir).mkdir(parents=True, exist_ok=True)

!wget -P {data_dir} https://github.com/kitamura-felipe/RSNA2022-ICH-CustomPyTorchModel/raw/main/bin2.zip

file_name = os.path.join(data_dir, "bin2.zip")
# Use a name other than "zip" to avoid shadowing the Python builtin
with ZipFile(file_name, "r") as zf:
    zf.extractall(data_dir)
    print("Input data file unzipped!")

## Explore Data

In the process of converting the DICOM files to PNGs, the filenames were changed to "IM_{XXXX}.png". There is no table that maps each PNG filename to its label, so we derive the label from the parent folder name ("Normal" or "Hemorrhage") and store it in a DataFrame.

In [None]:
all_images = glob.glob("../data/bin/*/*.png")
all_images_df = pd.DataFrame(dict(imgfile=all_images))
all_images_df["SOPInstanceUID"] = all_images_df.imgfile.apply(lambda x: Path(x).stem)
all_images_df["ICH"] = all_images_df.imgfile.apply(
    lambda x: 0 if "Normal" in str(Path(x).parent) else 1
)
label_cols = "ICH"
df = all_images_df
df.head()

We can look at some example images.

Re-run the code block to see different images.

In [None]:
positives = df[df["ICH"] == 1]
negatives = df[df["ICH"] == 0]

plt.subplot(1, 2, 1)
plt.imshow(cv2.imread(np.random.choice(positives.imgfile)), cmap="gray")
plt.subplot(1, 2, 2)
plt.imshow(cv2.imread(np.random.choice(negatives.imgfile)), cmap="gray")
plt.show()

## Split the Data

We will now split the data into a single **train-validation-test split** using a **70%-10%-20%** partition. 

When the dataset was processed, only 1 study per patient was obtained. 

In [None]:
train_frac, val_frac, test_frac = 0.7, 0.1, 0.2

all_studies = df.SOPInstanceUID
n_train = int(train_frac * len(all_studies))
n_test = int(test_frac * len(all_studies))
train_studies = np.random.choice(all_studies, n_train, replace=False)
# Remove train studies from available studies and sample test studies
not_train_studies = list(set(all_studies) - set(train_studies))
test_studies = np.random.choice(not_train_studies, n_test, replace=False)
# Validation studies are just the leftover ones
val_studies = list(set(not_train_studies) - set(test_studies))

print(f"TRAIN: N={len(train_studies)} studies")
print(f"VAL:   N={len(val_studies)} studies")
print(f"TEST:  N={len(test_studies)} studies")

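# Check every pair of sets: a three-way intersection would miss pairwise overlap
train_set, val_set, test_set = set(train_studies), set(val_studies), set(test_studies)
overlap = (train_set & val_set) | (train_set & test_set) | (val_set & test_set)
if len(overlap) == 0:
    print("\nThere is no overlap across the 3 sets.")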


train_df = df[df.SOPInstanceUID.isin(train_studies)]
val_df = df[df.SOPInstanceUID.isin(val_studies)]
test_df = df[df.SOPInstanceUID.isin(test_studies)]

print(f"\nTRAIN: N={len(train_df)} images")
print(f"VAL:   N={len(val_df)} images")
print(f"TEST:  N={len(test_df)} images")
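
The split above draws different samples on every run. If a reproducible partition is preferred, a seeded variant can be sketched as follows. This is an illustrative alternative, not the notebook's method: `split_ids` and the synthetic `example_ids` are hypothetical, and the seed value is arbitrary.

```python
import numpy as np


def split_ids(ids, train_frac=0.7, test_frac=0.2, seed=0):
    """Shuffle once with a seeded generator, then slice into train/val/test."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.asarray(ids))
    n_train = int(train_frac * len(shuffled))
    n_test = int(test_frac * len(shuffled))
    train = shuffled[:n_train]
    test = shuffled[n_train : n_train + n_test]
    val = shuffled[n_train + n_test :]  # validation gets the leftovers
    return train, val, test


# Synthetic identifiers standing in for df.SOPInstanceUID
example_ids = [f"IM_{i:04d}" for i in range(1600)]
train_ids, val_ids, test_ids = split_ids(example_ids)
print(len(train_ids), len(val_ids), len(test_ids))
```

Because the three sets are slices of one permutation, they are disjoint by construction.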

## Creating a PyTorch Dataset

We are using PyTorch to train our model. The first step is to create a Dataset class that loads the data into memory, performs any necessary transforms, and returns tensors for the model inputs.

We will also be using the Albumentations library to create a list of transforms, including data augmentation during training.

In [None]:
IMG_SIZE = (256, 256)


class ICHDataset(Dataset):
    def __init__(self, imgfiles, labels, transforms):
        self.imgfiles = imgfiles
        self.labels = labels
        self.transforms = transforms

    def __len__(self):
        return len(self.imgfiles)

    def __getitem__(self, i):
        # Load image as grayscale
        img = cv2.imread(self.imgfiles[i], 0)
        # Currently image shape is (H,W)- add extra dimension so it is (H,W,1)
        img = np.expand_dims(img, axis=-1)
        # Apply any necessary transforms, using albumentations (see below)
        img = self.transforms(image=img)["image"]
        # Convert to channels-FIRST torch tensor
        img = torch.from_numpy(img.transpose(2, 0, 1)).float()
        # Get labels and convert to torch tensor
        labels = torch.from_numpy(np.array([self.labels[i]])).float()
        return img, labels


train_transforms = A.Compose(
    [
        A.Resize(*IMG_SIZE),
        A.Normalize(mean=0.5, std=0.5),
        A.HorizontalFlip(p=0.5),
        A.VerticalFlip(p=0.5),
        A.RandomBrightnessContrast(p=0.2),
        A.ShiftScaleRotate(
            shift_limit=0.0625, scale_limit=0.1, rotate_limit=30, border_mode=0, p=0.2
        ),
    ],
    p=1,
)


infer_transforms = A.Compose([A.Resize(*IMG_SIZE), A.Normalize(mean=0.5, std=0.5)], p=1)
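
To see what `A.Normalize(mean=0.5, std=0.5)` does to an 8-bit image, the arithmetic can be reproduced in NumPy. This sketch assumes the default `max_pixel_value=255` and the documented formula `(img - mean * max_pixel_value) / (std * max_pixel_value)`. `normalize_like_albumentations` is a hypothetical stand-in, not a library function.

```python
import numpy as np


def normalize_like_albumentations(img, mean=0.5, std=0.5, max_pixel_value=255.0):
    """Re-implementation of the assumed Normalize arithmetic for illustration."""
    img = img.astype(np.float32)
    return (img - mean * max_pixel_value) / (std * max_pixel_value)


pixels = np.array([0, 128, 255], dtype=np.uint8)
normalized = normalize_like_albumentations(pixels)
print(normalized)
```

With these settings, pixel values in [0, 255] end up in roughly [-1, 1], centered near zero.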

Now, we can instantiate our training, validation, and test datasets.

In [None]:
train_dataset = ICHDataset(
    imgfiles=train_df.imgfile.values,
    labels=train_df[label_cols].values,
    transforms=train_transforms,
)
val_dataset = ICHDataset(
    imgfiles=val_df.imgfile.values,
    labels=val_df[label_cols].values,
    transforms=infer_transforms,
)
test_dataset = ICHDataset(
    imgfiles=test_df.imgfile.values,
    labels=test_df[label_cols].values,
    transforms=infer_transforms,
)

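## Creating a PyTorch Model

By default, the pretrained model outputs 1,000 classes (the ImageNet label set). Rather than replacing its final fully-connected layer, we will keep the backbone as it is and append a new linear layer that maps those 1,000 outputs to a single output.


We will also add a dropout layer for regularization.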

### Listing all the pretrained models available in the timm library

In [None]:
# pretrained=True restricts the list to architectures that have pretrained weights
timm.list_models(pretrained=True)

In [None]:
# We can load any of the models above by:
cnn_model = timm.create_model("efficientnet_b0", pretrained=True)

# The code below prints the model structure. It is important to notice the name of the final linear layer of the model so we can modify it.
cnn_model

In the case above, the last linear layer is called "classifier". This may be different for other models.
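
Since the name of the final linear layer varies between architectures, it can help to look it up programmatically instead of reading the printed model. The sketch below uses only `nn.Module.named_modules()`. `find_last_linear` and `ToyBackbone` are hypothetical names introduced for illustration, and the helper assumes that modules are registered in forward order, which is usual but not guaranteed.

```python
from torch import nn


def find_last_linear(model):
    """Return (name, module) of the last nn.Linear in registration order, or None."""
    last = None
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            last = (name, module)
    return last


class ToyBackbone(nn.Module):
    """Small stand-in model whose final layer is called 'head'."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(8, 1000)

    def forward(self, x):
        return self.head(self.pool(self.features(x)).flatten(1))


name, layer = find_last_linear(ToyBackbone())
print(name, layer.out_features)
```

The same helper can be called on a timm model to confirm which attribute to read before writing code that depends on it.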

In [None]:
class CustomCNN(LightningModule):
    def __init__(self, model, lr, dropout=0.1, num_classes=1):
        # Get the final output with 1000 classes and add a dropout layer
        # Add a final linear layer with 1 class
        # You can print the model to the console to see the different layers
        super().__init__()
        # Get the number of output features from the last layer
        n_feat = model.classifier.out_features
        # Define "model" as a backbone
        self.backbone = model
        # Create a dropout layer
        self.dropout = nn.Dropout(p=dropout)
        # Create a linear layer
        self.fc = nn.Linear(n_feat, num_classes)
        # Set the initial learning rate
        self.lr = lr

    def forward(self, x):
        x = self.backbone(x)
        x = self.dropout(x)
        x = self.fc(x)
        return x

    def training_step(self, batch, batch_nb):
        x, y = batch
        loss = F.binary_cross_entropy_with_logits(self(x), y)
        return loss

    def validation_step(self, batch, batch_nb):
        x, y = batch
        loss = F.binary_cross_entropy_with_logits(self(x), y)
        self.log("val_loss", loss, prog_bar=True)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)


cnn_model = CustomCNN(cnn_model, lr=0.001)
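
Note that `forward` returns raw logits without a sigmoid, because `F.binary_cross_entropy_with_logits` applies the sigmoid internally. As a rough illustration of what that loss computes (with the default mean reduction), here is a NumPy sketch of the numerically stable form compared against the naive sigmoid-then-log version. Both function names are hypothetical.

```python
import numpy as np


def bce_with_logits(logits, targets):
    """Stable binary cross-entropy on raw logits, averaged over samples."""
    x = np.asarray(logits, dtype=np.float64)
    y = np.asarray(targets, dtype=np.float64)
    per_sample = np.maximum(x, 0) - x * y + np.log1p(np.exp(-np.abs(x)))
    return per_sample.mean()


def bce_naive(logits, targets):
    """Same quantity via an explicit sigmoid; less stable for large |logits|."""
    x = np.asarray(logits, dtype=np.float64)
    y = np.asarray(targets, dtype=np.float64)
    p = 1.0 / (1.0 + np.exp(-x))
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)).mean()


example_logits = [-2.0, 0.3, 1.5, -0.1]
example_targets = [0.0, 1.0, 0.0, 0.0]
stable = bce_with_logits(example_logits, example_targets)
naive = bce_naive(example_logits, example_targets)
print(stable, naive)
```

The two values agree for moderate logits; the stable form simply avoids overflow when logits are large in magnitude.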

Another thing to consider is that our input is a grayscale (1-channel) image rather than the RGB (3-channel) image that the default model expects. While we could simply repeat our grayscale image 3 times along the channel axis to form a pseudo-RGB image, we can also modify the first layer in our network to accept 1-channel images. The function below does this by summing the pretrained first-layer weights across the input-channel axis, so the pretrained filters are reused rather than reinitialized.

In [None]:
def change_num_input_channels(model, in_channels=1):
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.Conv3d)) and m.in_channels == 3:
            m.in_channels = in_channels
            # First, sum across channels
            W = m.weight.sum(1, keepdim=True)
            # Then, divide by number of channels
            W = W / in_channels
            # Then, repeat by number of channels
            size = [1] * W.ndim
            size[1] = in_channels
            W = W.repeat(size)
            m.weight = nn.Parameter(W)
            break
    return model


cnn_model = change_num_input_channels(cnn_model, 1)

# Test model
test_input = torch.from_numpy(np.ones((2, 1, *IMG_SIZE))).float()
test_output = cnn_model(test_input)
test_output.shape
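
For the single-channel case, summing the first-layer weights over the input channels should give the same output as feeding a pseudo-RGB image (the grayscale image repeated 3 times) to the original 3-channel layer. The sketch below checks this on a small, randomly initialized convolution rather than on the pretrained model. The layer sizes are arbitrary assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)

# A 3-channel convolution standing in for the pretrained first layer
conv_rgb = nn.Conv2d(3, 8, kernel_size=3, bias=False)

# A 1-channel convolution whose weights are the channel-wise sum of the above
conv_gray = nn.Conv2d(1, 8, kernel_size=3, bias=False)
with torch.no_grad():
    conv_gray.weight.copy_(conv_rgb.weight.sum(dim=1, keepdim=True))

gray = torch.rand(2, 1, 16, 16)
pseudo_rgb = gray.repeat(1, 3, 1, 1)  # repeat along the channel axis

with torch.no_grad():
    out_rgb = conv_rgb(pseudo_rgb)
    out_gray = conv_gray(gray)

print(out_rgb.shape, out_gray.shape)
print(torch.allclose(out_rgb, out_gray, atol=1e-5))
```

The two approaches are therefore interchangeable at initialization; the 1-channel layer just avoids carrying two redundant input channels through the data pipeline.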

In [None]:
train_loader = DataLoader(
    train_dataset, batch_size=BATCH_SIZE, shuffle=True, drop_last=True
)
val_loader = DataLoader(
    val_dataset, batch_size=BATCH_SIZE * 2, shuffle=False, drop_last=False
)
cnn_test_loader = DataLoader(
    test_dataset, batch_size=BATCH_SIZE * 2, shuffle=False, drop_last=False
)

## Logging the validation loss and accuracy

Below we create a callback that will be used during training to calculate the loss and the accuracy in the validation set and log it for each epoch.

In [None]:
acc_lst = []
loss_lst = []


class ImagePredictionLogger(Callback):
    def __init__(self, val_samples, num_samples=32):
        super().__init__()
        self.num_samples = num_samples
        self.val_imgs, self.val_labels = val_samples

    def on_validation_epoch_end(self, trainer, pl_module):
        # Move the tensors to the same device as the model
        val_imgs = self.val_imgs.to(device=pl_module.device)
        val_labels = self.val_labels.to(device=pl_module.device)
        # Get model prediction
        logits = pl_module(val_imgs)
        loss = F.binary_cross_entropy_with_logits(logits, val_labels)

        preds = (logits > 0.0) * 1.0
        accuracy = (preds == val_labels).sum().item() / len(preds)

        # print("val_loss: ", loss.item())
        # print("val_acc: ", accuracy)
        acc_lst.append(accuracy)
        loss_lst.append(loss.item())
        return loss
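
The callback above thresholds the logits at 0.0 rather than converting them to probabilities first. This works because the sigmoid of 0 is 0.5, so `logits > 0` selects the same samples as `sigmoid(logits) > 0.5`. A small sketch with made-up logits and labels:

```python
import torch

# Made-up values with the same (batch, 1) shape the model produces
logits = torch.tensor([[-2.0], [0.3], [1.5], [-0.1]])
labels = torch.tensor([[0.0], [1.0], [0.0], [0.0]])

preds_from_logits = (logits > 0.0).float()
preds_from_probs = (torch.sigmoid(logits) > 0.5).float()

# Fraction of predictions that match the labels
accuracy = (preds_from_logits == labels).float().mean().item()
print(torch.equal(preds_from_logits, preds_from_probs), accuracy)
```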

## Train the CNN


Now we can start training! Using the default settings, this should take around 10 seconds per epoch (including validation). Feel free to play around with the hyperparameters above (number of epochs, batch size, learning rate).

In [None]:
val_samples = next(iter(val_loader))

# Initialize a trainer
trainer = Trainer(
    accelerator="auto",
    devices=1,  # one device (GPU if available, otherwise CPU); limiting for IPython runs
    max_epochs=5,
    callbacks=[
        TQDMProgressBar(refresh_rate=1),
        ImagePredictionLogger(val_samples, num_samples=len(val_samples[0])),
    ],
    check_val_every_n_epoch=1,
)

# Train the model ⚡

trainer.fit(model=cnn_model, train_dataloaders=train_loader, val_dataloaders=val_loader)

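# The first logged entry likely comes from Lightning's pre-training sanity check, so skip it
plt.figure()
plt.plot(acc_lst[1:])
plt.xlabel("Epoch")
plt.ylabel("Validation accuracy")
plt.show()

plt.figure()
plt.plot(loss_lst[1:])
plt.xlabel("Epoch")
plt.ylabel("Validation loss")
plt.show()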

### Create your own Neural Network

Now, let's go back to the section "Creating a PyTorch Model" so we can create a different architecture.

We suggest trying the following experiments, one at a time:

1. Change the backbone (the catch is that the final linear layer may have a different name, so `model.classifier` in `CustomCNN` may need to change)
2. Change the size of the linear layer
3. Create a completely custom NN.
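
As a starting point for the third experiment, a completely custom network can be written as a plain `nn.Module`. The sketch below is one possible small architecture, not a recommended design: `TinyCNN`, its layer widths, and the 64 x 64 test input are all illustrative assumptions.

```python
import torch
from torch import nn


class TinyCNN(nn.Module):
    """Two conv blocks, global average pooling, dropout, and a linear head."""

    def __init__(self, in_channels=1, num_classes=1, dropout=0.1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),  # makes the head independent of image size
        )
        self.dropout = nn.Dropout(p=dropout)
        self.fc = nn.Linear(32, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)  # (batch, 32)
        x = self.dropout(x)
        return self.fc(x)  # raw logits, shape (batch, num_classes)


tiny = TinyCNN()
tiny.eval()
with torch.no_grad():
    out = tiny(torch.ones(2, 1, 64, 64))
print(out.shape)
```

To train it with the code above, note that `CustomCNN` reads `model.classifier.out_features`, so that line would need to be adapted (or the training, validation, and optimizer methods copied into a `LightningModule` that wraps `TinyCNN` directly).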