## AIMI High School Internship 2023
### Notebook 2: Training a Vision Model to Predict ET Distances

**The Problem**: Given a chest X-ray, our goal in this project is to predict the distance from an endotracheal tube to the carina. This is an important clinical task - endotracheal tubes that are positioned too far (>5cm) above the carina will not work effectively.

**Your Second Task**: You should now have a training dataset consisting of (a) chest X-rays and (b) annotations indicating the distance of the endotracheal tube from the carina. Now, your goal is to train a computer vision model to predict endotracheal tube distance from the image. You have **two options** for this task, and you may attempt one or both of these:
- *Distance Categorization* : Train a model to determine whether the position of a tube is abnormal (>5.0 cm) or normal (≤ 5.0 cm).
- *Distance Prediction*: Train a model that predicts the distance of the endotracheal tube from the carina in centimeters.

In this notebook, we provide simple starter code for training a computer vision model. You are not required to use this template; feel free to modify it as you see fit.

**Submitting Your Model**: We have created a leaderboard where you can submit your model and view results on the held-out test set. We provide instructions below for submitting your model to the leaderboard. **Please follow these directions carefully**.

We will evaluate your results on the held-out test set with the following evaluation metrics:
- *Distance Categorization* : We will measure AUROC, which is a metric commonly used in healthcare tasks. See this blog for a good explanation of AUROC: https://glassboxmedicine.com/2019/02/23/measuring-performance-auc-auroc/
- *Distance Prediction*: We will measure the mean absolute error (MAE, also known as L1 distance) between the predicted distances and the true distances.
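
To make the two metrics concrete, here is a minimal pure-Python sketch of both. It is not the leaderboard's scoring code: the function names `auroc` and `mean_absolute_error` and all the toy values are made up for illustration. AUROC is computed from its ranking definition, the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one.

```python
def auroc(y_true, y_score):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over all examples."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values, invented for illustration only
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc_value = auroc(labels, scores)                   # 3 of 4 positive/negative pairs ranked correctly -> 0.75

true_cm = [3.0, 5.5, 4.0]
pred_cm = [2.5, 6.5, 4.0]
mae_value = mean_absolute_error(true_cm, pred_cm)   # (0.5 + 1.0 + 0.0) / 3 -> 0.5
```

Because AUROC only depends on how the scores rank the images, it does not require choosing a threshold. For real experiments, `sklearn.metrics.roc_auc_score` computes the same quantity far more efficiently.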


## Load Data
Before you begin, go to `Runtime` > `Change Runtime Type` and select a T4 GPU. Then, upload `data.zip`; the upload should take about 10 minutes. Finally, run the following cells to unzip the dataset, which should take less than 10 seconds.

## Import Libraries
We are leveraging the PyTorch framework to train our models. For more information and tutorials on PyTorch, see this link: https://pytorch.org/tutorials/beginner/basics/intro.html

In [136]:
# Some libraries that you may find useful are included here.
# To import a library that isn't provided with Colab, use the following command: !pip install torchmetrics
import torch
import pandas as pd
from PIL import Image
import numpy as np
from tqdm import tqdm
from torchvision import transforms
from torch.utils.data import DataLoader, random_split


## Create Dataloaders
We will implement a custom Dataset class to load in data. A custom Dataset class must have three methods: `__init__`, which sets up any class variables, `__len__`, which defines the total number of images, and `__getitem__`, which returns a single image and its paired label.

In [137]:
from torch.utils.data import Dataset
from PIL import Image

device = "cuda:0" if torch.cuda.is_available() else "cpu"

class ChestXRayDataset(Dataset):
    def __init__(self, img_paths, labels, distances):
        super().__init__()
        self.img_paths = img_paths    # image paths relative to data/
        self.labels = labels          # binary positioning labels
        self.distances = distances    # tube-to-carina distances
        # Build the transform once instead of on every __getitem__ call.
        # Resize(224) scales the shorter edge to 224 and keeps the aspect ratio;
        # use Resize((224, 224)) if your images do not all share the same shape.
        self.transform = transforms.Compose([
            transforms.Resize(size=224),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.img_paths)

    def __getitem__(self, idx):
        out_dict = {"idx": torch.tensor(idx)}

        # Replicate the grayscale X-ray across three channels,
        # since the pretrained ResNet expects RGB input.
        im = Image.open(f"data/{self.img_paths[idx]}").convert("RGB")

        # Input images do not need gradients, so no requires_grad flag is set here.
        out_dict["img"] = self.transform(im)
        out_dict["labels"] = self.labels[idx]
        out_dict["distance"] = self.distances[idx]

        return out_dict
print(device)

cpu
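
A note on `transforms.Resize`: when given a single integer, torchvision matches the *shorter* edge to that value and keeps the aspect ratio, so non-square images come out with different shapes and cannot be stacked into one batch. The sketch below mirrors that documented rule in plain Python. The helper name `resized_shape` and the example image sizes are hypothetical, chosen only to illustrate the arithmetic.

```python
def resized_shape(height, width, size):
    """Output (height, width) of Resize(size) when size is a single integer."""
    if height <= width:
        # Height is the shorter edge: it becomes `size`, width scales proportionally.
        return size, int(size * width / height)
    # Width is the shorter edge.
    return int(size * height / width), size


# Hypothetical image sizes, for illustration only
square = resized_shape(2048, 2048, 224)      # (224, 224)
portrait = resized_shape(3056, 2544, 224)    # (269, 224)
```

If your images are not all the same shape, passing a pair such as `transforms.Resize((224, 224))` forces every output to 224 x 224, at the cost of some distortion.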


## Define Training Components
Here, define any necessary components that you need to train your model, such as the model architecture, the loss function, and the optimizer.

In [138]:
import torch.nn as nn

data = pd.read_csv("mimic_train_labels_pruned.csv")
img_paths = data["image_path"].to_numpy()
labels = data["positioning"].to_numpy()
distances = data["measures"].to_numpy()

dataset = ChestXRayDataset(img_paths=img_paths, labels=labels, distances=distances)

def get_train_val_split(dataset, batch_size=10, train_prop=0.9):
    dataset_length = len(dataset)
    train_length = int(dataset_length * train_prop)
    val_length = dataset_length - train_length
    train_dataset, val_dataset = random_split(
            dataset, [train_length, val_length]
        )

    train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True, drop_last=True)
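    # Validation needs no shuffling, and keeping the last partial batch means every validation image is scored.
    val_loader = DataLoader(dataset=val_dataset, batch_size=batch_size, shuffle=False, drop_last=False)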
    return train_loader, val_loader

train_loader, val_loader = get_train_val_split(dataset, batch_size=32)

print(len(train_loader.dataset))
print(len(val_loader.dataset))


9252
2313
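
The split sizes and batch counts can be checked by hand. The sketch below repeats the arithmetic from `get_train_val_split` in plain Python; the helper names `split_sizes` and `num_batches` are hypothetical. One observation: the sizes printed above sum to 11565 and match an 80/20 split, so the cell was presumably last run with `train_prop=0.8` rather than the default of 0.9.

```python
def split_sizes(n_total, train_prop):
    """Mirror the length arithmetic in get_train_val_split."""
    n_train = int(n_total * train_prop)
    return n_train, n_total - n_train


def num_batches(n_samples, batch_size, drop_last):
    """Number of batches a DataLoader yields per epoch."""
    full, remainder = divmod(n_samples, batch_size)
    return full if drop_last or remainder == 0 else full + 1


n_total = 9252 + 2313                     # sizes printed above
sizes_80 = split_sizes(n_total, 0.8)      # (9252, 2313)
sizes_90 = split_sizes(n_total, 0.9)      # (10408, 1157)

train_batches = num_batches(9252, 32, drop_last=True)      # 289 batches, 4 images skipped each epoch
val_dropped = num_batches(2313, 32, drop_last=True)        # 72 batches, 9 images left out
val_kept = num_batches(2313, 32, drop_last=False)          # 73 batches, the last one holds 9 images
```

The 289 training batches match the progress-bar total in the training output further down. `random_split` also accepts a `generator` argument (for example `torch.Generator().manual_seed(0)`) if you want the same split on every run.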


## Training Code
We provide starter code below that implements a simple training loop in PyTorch. Feel free to modify as you see fit.

In [139]:
from sklearn.metrics import f1_score, roc_auc_score, accuracy_score

def calculate_scores(y_true, y_pred):
    # Threshold the predicted probabilities at 0.5 to get hard 0/1 predictions.
    y_pred = (y_pred.flatten() >= 0.5).astype(float)
    return f1_score(y_true, y_pred), accuracy_score(y_true, y_pred)

def validate(model, loss_fn, val_loader):

    f1_scores, acc_scores = [], []
    total_loss = 0
    for data in tqdm(val_loader):
        model.eval()
        with torch.no_grad():
            inputs = data["img"].type(torch.FloatTensor).to(device)
            labels = data["labels"].type(torch.FloatTensor).to(device)
            outputs = model(inputs)
            loss_val = loss_fn(torch.flatten(outputs), labels)
            total_loss += loss_val.item()

        f1, acc = calculate_scores(labels.detach().cpu().numpy(), torch.sigmoid(outputs).detach().cpu().numpy())
        f1_scores.append(f1)
        acc_scores.append(acc)
    return np.mean(f1_scores), np.mean(acc_scores), total_loss

def train(model, loss_fn, train_loader, opt):
    f1_scores, acc_scores = [], []
    total_loss = 0
    for data in tqdm(train_loader):
        model.train()
        inputs = data["img"].type(torch.FloatTensor).to(device)
        labels = data["labels"].type(torch.FloatTensor).to(device)
        opt.zero_grad()
        outputs = model(inputs)
        loss_val = loss_fn(torch.flatten(outputs), labels)
        total_loss += loss_val.item()
        loss_val.backward()
        opt.step()

        f1, acc = calculate_scores(labels.detach().cpu().numpy(), torch.sigmoid(outputs).detach().cpu().numpy())
        f1_scores.append(f1)
        acc_scores.append(acc)
    return np.mean(f1_scores), np.mean(acc_scores), total_loss

def batch_progress(epoch, tr_f1, tr_acc, tr_loss, val_f1, val_acc, val_loss):
    # Batch train data
    print(f"Epoch {epoch} Training Statistics")
    print(f"F1 Score: {tr_f1}\n Accuracy: {tr_acc}\n Loss: {tr_loss}\n")
    # Batch validation data
    print(f"Epoch {epoch} Validation Statistics")
    print(f"F1 Score: {val_f1}\n Accuracy: {val_acc}\n Loss: {val_loss}\n")
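
One caveat about the statistics above: `calculate_scores` is called once per batch, and `train` and `validate` return the mean of the per-batch values. For accuracy with equal-sized batches this equals the overall accuracy, but for F1 it generally does not. The pure-Python sketch below shows the gap on two toy batches; the `f1` helper and the data are made up for illustration.

```python
def f1(y_true, y_pred):
    """Binary F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Two toy batches of (labels, thresholded predictions), invented for illustration
batches = [
    ([1, 1, 1, 0], [1, 1, 1, 0]),   # every prediction correct
    ([0, 0, 0, 1], [1, 0, 0, 0]),   # the only positive is missed
]

# What the training loop reports: the mean of per-batch F1 scores
per_batch_mean = sum(f1(t, p) for t, p in batches) / len(batches)   # (1.0 + 0.0) / 2 -> 0.5

# F1 over all predictions pooled together
all_true = [t for ts, _ in batches for t in ts]
all_pred = [p for _, ps in batches for p in ps]
pooled = f1(all_true, all_pred)                                     # 6 / 8 -> 0.75
```

If you want epoch-level F1 (or AUROC, which the leaderboard uses), one option is to collect all labels and predictions during the loop and compute the metric once at the end.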



In [140]:
# Model definition
import torch.nn as nn
# Load resnet-50 here

# FineTuning Architecture
# From https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8933872/#b50

from torchvision import datasets, transforms, models

# The pretrained ResNet-50 outputs 1000 class scores, but we only need ONE output.
#
# Why? We want the PROBABILITY that a particular image shows an ETT device with good positioning.
# The model produces a raw score (a logit); applying the sigmoid function maps it to a
# probability between 0 and 1, which we can threshold to decide TRUE or FALSE.
# A good starting threshold is 0.5: probabilities of at least 0.5 are TRUE,
# and anything below is FALSE.


def get_model(num_classes):
    # Start from ImageNet-pretrained weights.
    resnet50_aimi = models.resnet50(pretrained=True)
    n_features = resnet50_aimi.fc.in_features
    # Replace the 1000-way classification layer with a small head that outputs num_classes logits.
    resnet50_aimi.fc = nn.Sequential(
        nn.Linear(n_features, n_features),
        nn.Dropout(p=0.4),
        nn.Linear(n_features, n_features),
        nn.ReLU(),
        nn.Linear(n_features, num_classes),
    )
    # Fine-tune every layer, not just the new head.
    for param in resnet50_aimi.parameters():
        param.requires_grad = True
    return resnet50_aimi

def save_model(model, epoch, optimizer, val_f1, val_acc, val_loss,
               tr_f1, tr_acc, tr_loss, file_name):
  print("Saving model checkpoint at epoch", epoch)
  torch.save({
            'epoch': epoch,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'val_f1': val_f1,
            'val_acc': val_acc,
            'val_loss': val_loss,
            'train_f1': tr_f1,
            'train_acc': tr_acc,
            'train_loss': tr_loss,
            }, f"{file_name}.pth")

model = get_model(num_classes=1).to(device)

print(model)




ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

In [141]:
import torch.nn as nn
import numpy as np

opt = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))
loss_fn = nn.BCEWithLogitsLoss()
CHECKPOINTS = [25]

EXPERIMENT_NAME = "resnet50_tl_exp_2"
NUM_EPOCHS = 50

train_f1, train_acc, train_loss = [], [], []
val_f1, val_acc, val_loss = [], [], []
best_model_metrics = {"best_val_acc": 0.0,
                      "best_val_f1": 0.0,
                      "best_val_loss":0.0}
for epoch in range(NUM_EPOCHS):
    batch_tr_f1, batch_tr_acc, batch_tr_loss = train(model, loss_fn, train_loader, opt)
    batch_val_f1, batch_val_acc, batch_val_loss = validate(model, loss_fn, val_loader)

    batch_progress(
        epoch, batch_tr_f1, batch_tr_acc, batch_tr_loss,
        batch_val_f1, batch_val_acc, batch_val_loss
        )

    train_f1.append(batch_tr_f1)
    train_acc.append(batch_tr_acc)
    train_loss.append(batch_tr_loss)

    val_f1.append(batch_val_f1)
    val_acc.append(batch_val_acc)
    val_loss.append(batch_val_loss)

    if best_model_metrics["best_val_acc"] < batch_val_acc :
      best_model_metrics["best_val_acc"] = batch_val_acc
      best_model_metrics["best_val_f1"] = batch_val_f1
      best_model_metrics["best_val_loss"] = batch_val_loss
      save_model(
           model, epoch, opt, val_f1, val_acc, val_loss,
           train_f1, train_acc, train_loss,
           f"{EXPERIMENT_NAME}_epoch_{epoch}"
          )
    print(f'Your best model has \n \
        Val Acc: {best_model_metrics["best_val_acc"]} \n \
        Val F1: {best_model_metrics["best_val_f1"]}')

## TRAINING COMPLETE ##
save_model(
           model, epoch, opt, val_f1, val_acc, val_loss,
           train_f1, train_acc, train_loss,
           f"{EXPERIMENT_NAME}_epoch_{NUM_EPOCHS}"
          )

train_f1 = np.array(train_f1)
train_acc = np.array(train_acc)
train_loss = np.array(train_loss)

val_f1 = np.array(val_f1)
val_acc = np.array(val_acc)
val_loss = np.array(val_loss)

  0%|          | 0/289 [00:03<?, ?it/s]


KeyboardInterrupt: 

## Submitting Your Results
Once you have successfully trained your model, generate predictions on the test set and save your results as a `.csv` file. This file can then be uploaded to the leaderboard.

Your final `.csv` file **must** have the following format:
- There must be a column titled `image_path` with the paths to the test set images. This column should be identical to the one provided in `mimic_test_student.csv`.
- There must be a column titled `pred` with your model outputs.
  - If you are running the `distance categorization` task, this column must have floating point numbers ranging between 0 and 1. Higher numbers should indicate a greater likelihood that the tube distance is abnormal. Hint: You can convert model outputs to the 0 to 1 range by applying the sigmoid activation function (`torch.sigmoid()`).
  - If you are running the `distance prediction` task, this column must have numbers representing the tube distance in centimeters.
- Double check that there are 500 rows in your output file
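
The format rules above can be checked before uploading. The sketch below uses only the Python standard library: it converts raw model outputs (logits) to probabilities with the sigmoid function, writes the two required columns, and reads the result back to verify it. The image paths and logit values are hypothetical placeholders, and an in-memory buffer stands in for the real `.csv` file.

```python
import csv
import io
import math


def sigmoid(logit):
    """Map a raw model output to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical image paths and raw model outputs, for illustration only
image_paths = ["test/img_0001.jpg", "test/img_0002.jpg", "test/img_0003.jpg"]
logits = [-2.0, 0.0, 3.0]

rows = [{"image_path": p, "pred": sigmoid(z)} for p, z in zip(image_paths, logits)]

buffer = io.StringIO()   # stands in for the real output file
writer = csv.DictWriter(buffer, fieldnames=["image_path", "pred"])
writer.writeheader()
writer.writerows(rows)

# Read the file back and check the format before uploading
buffer.seek(0)
loaded = list(csv.DictReader(buffer))
columns_ok = set(loaded[0].keys()) == {"image_path", "pred"}
range_ok = all(0.0 <= float(r["pred"]) <= 1.0 for r in loaded)
n_rows = len(loaded)     # should be 500 for the real test set
```

With real model outputs held in a tensor, `torch.sigmoid(outputs)` performs the same conversion on the whole batch at once.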

In [None]:
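model = get_model(num_classes=1).to(device)  # Must match the architecture used for training
# Replace this path with your own checkpoint, e.g. a "<name>.pth" file written by save_model().
ckpt = torch.load("/content/best.pkl", map_location=device)
model.load_state_dict(ckpt["model_state_dict"])  # save_model() stores the weights under this key
model.eval()  # Disable dropout and use running batch-norm statistics for inference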

test_dataset = ChestXRayDataset("""Fill in args here""")
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=4, shuffle=False, drop_last=False)

test_results = {"image_path": [], "pred": []}
# Write method to load in data from test_loader, compute model predictions, and append results to test_results dict


In [None]:
test_results = pd.DataFrame(test_results)
test_results.to_csv("/content/test.csv", index=False)  # index=False keeps only the image_path and pred columns