## AIMI High School Internship 2023
### Notebook 2: Training a Vision Model to Predict ET Distances

**The Problem**: Given a chest X-ray, our goal in this project is to predict the distance from an endotracheal tube to the carina. This is an important clinical task: endotracheal tubes positioned too far (>5 cm) above the carina will not work effectively.

**Your Second Task**: You should now have a training dataset consisting of (a) chest X-rays and (b) annotations indicating the distance of the endotracheal tube from the carina. Now, your goal is to train a computer vision model to predict endotracheal tube distance from the image. You have **two options** for this task, and you may attempt one or both of these:
- *Distance Categorization* : Train a model to determine whether the position of a tube is abnormal (>5.0 cm) or normal (≤ 5.0 cm).
- *Distance Prediction*: Train a model that predicts the distance of the endotracheal tube from the carina in centimeters.

In this notebook, we provide some simple starter code to get you started on training a computer vision model. You are not required to use this template - feel free to modify as you see fit.

**Submitting Your Model**: We have created a leaderboard where you can submit your model and view results on the held-out test set. We provide instructions below for submitting your model to the leaderboard. **Please follow these directions carefully**.

We will evaluate your results on the held-out test set with the following evaluation metrics:
- *Distance Categorization* : We will measure AUROC, which is a metric commonly used in healthcare tasks. See this blog for a good explanation of AUROC: https://glassboxmedicine.com/2019/02/23/measuring-performance-auc-auroc/
- *Distance Prediction*: We will measure the mean absolute error (MAE, also known as L1 distance) between the predicted distances and the true distances.
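To make the two metrics concrete, here is a minimal pure-Python sketch of what each one computes. The function names `auroc` and `mean_absolute_error` and the toy numbers are illustrative assumptions, not part of the leaderboard code; in practice a library routine such as `sklearn.metrics.roc_auc_score` would normally be used.

```python
def auroc(y_true, y_score):
    """Probability that a random positive example is scored above a random negative one."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    # Count correctly ordered (positive, negative) pairs; ties count as half
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted distances (cm)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values for illustration only
auc_value = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
mae_value = mean_absolute_error([6.0, 4.6, 1.8], [5.0, 5.0, 2.0])
print(auc_value, mae_value)
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, while a lower MAE is better.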


## Load Data
Before you begin, go to `Runtime` > `Change Runtime Type` and select a T4 GPU. Then, upload `data.zip`; the upload should take about 10 minutes. Finally, run the following cells to unzip the dataset (which should take < 10 seconds).

In [None]:
!wget https://storage.googleapis.com/misc_jb/drive-download-20230620T044532Z-001.zip

--2023-06-30 15:41:36--  https://storage.googleapis.com/misc_jb/drive-download-20230620T044532Z-001.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 108.177.119.128, 108.177.126.128, 108.177.127.128, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|108.177.119.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 320879178 (306M) [application/zip]
Saving to: ‘drive-download-20230620T044532Z-001.zip’


2023-06-30 15:41:50 (23.4 MB/s) - ‘drive-download-20230620T044532Z-001.zip’ saved [320879178/320879178]



In [None]:
!unzip -qq drive-download-20230620T044532Z-001.zip

In [None]:
!unzip -qq /content/data.zip

unzip:  cannot find or open /content/data.zip, /content/data.zip.zip or /content/data.zip.ZIP.


In [None]:
!unzip -qq /content/mimic-train.zip

In [None]:
!unzip -qq /content/mimic-test.zip

## Import Libraries
We are leveraging the PyTorch framework to train our models. For more information and tutorials on PyTorch, see this link: https://pytorch.org/tutorials/beginner/basics/intro.html

In [None]:
# Some libraries that you may find useful are included here.
# To import a library that isn't provided with Colab, use the following command: !pip install torchmetrics
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from PIL import Image
from tqdm import tqdm
%matplotlib inline

# PyTorch dataset
import torch
from torchvision import datasets, transforms, models
from torch.utils.data.sampler import SubsetRandomSampler

# PyTorch model
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim


In [None]:
data = pd.read_csv('/content/data.csv')

In [None]:
data

Unnamed: 0.2,Unnamed: 0.1,Unnamed: 0,patient_id,image_path,report_path,measures,positioning
0,0,0,13282,mimic-train/13282/56112/91263.jpg,mimic-train/13282/56112.txt,6.0,0
1,2,2,13360,mimic-train/13360/54397/84764.jpg,mimic-train/13360/54397.txt,5.6,0
2,3,3,13360,mimic-train/13360/57560/92873.jpg,mimic-train/13360/57560.txt,4.6,1
3,4,4,13360,mimic-train/13360/62326/88457.jpg,mimic-train/13360/62326.txt,5.0,1
4,5,5,13360,mimic-train/13360/59248/87908.jpg,mimic-train/13360/59248.txt,1.8,1
...,...,...,...,...,...,...,...
11560,12240,12240,13795,mimic-train/13795/60202/87633.jpg,mimic-train/13795/60202.txt,3.7,1
11561,12241,12241,13795,mimic-train/13795/60202/82617.jpg,mimic-train/13795/60202.txt,3.7,1
11562,12242,12242,13818,mimic-train/13818/59053/93743.jpg,mimic-train/13818/59053.txt,4.7,1
11563,12243,12243,13906,mimic-train/13906/62812/85124.jpg,mimic-train/13906/62812.txt,3.5,1


In [None]:
'''
data['Labels'] = data['Distance']<=5
data
'''

Unnamed: 0.3,Unnamed: 0.2,Unnamed: 0.1,Unnamed: 0,patient_id,study_id,image_id,image_path,report_path,Distance,Labels
0,0,0,0,13282,56112,91263,mimic-train/13282/56112/91263.jpg,mimic-train/13282/56112.txt,6.0,False
1,2,2,2,13360,54397,84764,mimic-train/13360/54397/84764.jpg,mimic-train/13360/54397.txt,5.6,False
2,3,3,3,13360,57560,92873,mimic-train/13360/57560/92873.jpg,mimic-train/13360/57560.txt,4.6,True
3,4,4,4,13360,62326,88457,mimic-train/13360/62326/88457.jpg,mimic-train/13360/62326.txt,5.0,True
4,5,5,5,13360,59248,87908,mimic-train/13360/59248/87908.jpg,mimic-train/13360/59248.txt,1.8,True
...,...,...,...,...,...,...,...,...,...,...
10948,12240,12240,12240,13795,60202,87633,mimic-train/13795/60202/87633.jpg,mimic-train/13795/60202.txt,3.7,True
10949,12241,12241,12241,13795,60202,82617,mimic-train/13795/60202/82617.jpg,mimic-train/13795/60202.txt,3.7,True
10950,12242,12242,12242,13818,59053,93743,mimic-train/13818/59053/93743.jpg,mimic-train/13818/59053.txt,4.7,True
10951,12243,12243,12243,13906,62812,85124,mimic-train/13906/62812/85124.jpg,mimic-train/13906/62812.txt,3.5,True


In [None]:
data['measures'].shape[0]

11565

## Create Dataloaders
We will implement a custom Dataset class to load in data. A custom Dataset class must have three methods: `__init__`, which sets up any class variables, `__len__`, which defines the total number of images, and `__getitem__`, which returns a single image and its paired label.
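As a minimal sketch of the three methods described above, the toy dataset below keeps random tensors in memory instead of reading X-ray files. `ToyDataset`, the tensor shapes, and the synthetic distances are assumptions made purely for illustration; the label follows the convention seen in `data.csv`, where `positioning` is 1 for distances of 5.0 cm or less.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDataset(Dataset):
    """Hypothetical in-memory dataset: random 'images' paired with distances."""

    def __init__(self, num_items=8):
        super().__init__()
        generator = torch.Generator().manual_seed(0)
        # Stand-ins for 3-channel images and tube distances in centimeters
        self.images = torch.rand(num_items, 3, 16, 16, generator=generator)
        self.distances = torch.linspace(1.0, 8.0, num_items)

    def __len__(self):
        # Total number of samples in the dataset
        return self.images.shape[0]

    def __getitem__(self, idx):
        # Return one sample: the image, its distance, and the derived binary label
        distance = self.distances[idx]
        return {
            "img": self.images[idx],
            "distance": distance,
            "labels": (distance <= 5.0).float(),
        }


toy = ToyDataset()
loader = DataLoader(toy, batch_size=4, shuffle=False)
batch = next(iter(loader))
print(len(toy), batch["img"].shape, batch["labels"])
```

The `DataLoader` calls `__len__` to know how many samples exist and `__getitem__` to fetch each one, then stacks the returned dictionaries into batches.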

In [None]:
from torch.utils.data import Dataset
device = "cuda:0" if torch.cuda.is_available() else "cpu"

class ChestXRayDataset(Dataset):
    def __init__(self, imgs, labels, distances):
        super().__init__()
        self.img_paths = imgs  # data['image_path']
        self.labels = labels  # data['positioning']
        self.distances = distances  # data['measures']
        # Build the transform once instead of on every __getitem__ call
        self.transformations = transforms.Compose([transforms.Resize(224),
                                                   transforms.ToTensor()])

    def __len__(self):
        # Total number of samples
        return self.distances.shape[0]

    def __getitem__(self, idx):
        out_dict = {"idx": torch.tensor(idx)}
        # Replicate the grayscale X-ray across three channels, as the pretrained model expects RGB input
        img = Image.open(f"/content/{self.img_paths[idx]}").convert("RGB")
        # Inputs do not need gradients, so requires_grad is left at its default (False)
        out_dict["img"] = self.transformations(img)
        out_dict["distance"] = self.distances[idx]
        out_dict["labels"] = self.labels[idx]

        return out_dict
print(device)

cuda:0


## Define Training Components
Here, define any necessary components that you need to train your model, such as the model architecture, the loss function, and the optimizer.
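The choice of loss function depends on which of the two tasks is attempted. The sketch below, with made-up logits and distances, shows two standard PyTorch losses that fit them: `nn.BCEWithLogitsLoss` for categorization (it applies the sigmoid internally, so the model should output raw logits) and `nn.L1Loss` for distance prediction (it is the mean absolute error). This is one reasonable pairing, not the only option.

```python
import torch
import torch.nn as nn

# Distance Categorization: raw logits in, sigmoid applied inside the loss
logits = torch.tensor([2.0, -1.0, 0.5])
targets = torch.tensor([1.0, 0.0, 1.0])
bce_with_logits = nn.BCEWithLogitsLoss()(logits, targets)
# Same value as applying the sigmoid first and using plain BCELoss
bce_on_probs = nn.BCELoss()(torch.sigmoid(logits), targets)

# Distance Prediction: L1Loss averages the absolute errors in centimeters
pred_cm = torch.tensor([5.0, 5.0, 2.0])
true_cm = torch.tensor([6.0, 4.6, 1.8])
l1 = nn.L1Loss()(pred_cm, true_cm)

print(bce_with_logits.item(), bce_on_probs.item(), l1.item())
```

For distance prediction, the final layer would still have a single output, but it would be interpreted as centimeters rather than as a logit.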

In [None]:
dataset = ChestXRayDataset(imgs = data['image_path'], labels = data['positioning'], distances = data['measures'])

In [None]:
import torch.utils.data
def get_train_val_split(dataset, batch_size=10, train_prop=0.8):
    dataset_length = len(dataset)
    train_length = int(dataset_length * train_prop)
    val_length = dataset_length - train_length
    train_dataset, val_dataset = torch.utils.data.random_split(
            dataset, [train_length, val_length]
        )
    train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True, drop_last=True)
    # Validation data is neither shuffled nor truncated, so every sample is evaluated once
    val_loader = torch.utils.data.DataLoader(dataset=val_dataset, batch_size=batch_size, shuffle=False, drop_last=False)
    return train_loader, val_loader

train_loader, val_loader = get_train_val_split(dataset)

print(len(train_loader.dataset))
print(len(val_loader.dataset))

9252
2313


In [None]:
import torch.nn.functional as F

In [None]:
model = models.resnet101(pretrained=True)# Model Architecture (make sure to load the model on GPU, not CPU!)
model

Downloading: "https://download.pytorch.org/models/resnet101-63fe2227.pth" to /root/.cache/torch/hub/checkpoints/resnet101-63fe2227.pth
100%|██████████| 171M/171M [00:00<00:00, 244MB/s]


ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

In [None]:
from collections import OrderedDict
import torch.nn as nn

# BCEWithLogitsLoss applies the sigmoid internally, so the model outputs raw logits
loss_fn = nn.BCEWithLogitsLoss()
model = models.resnet101(pretrained=True)
# Freeze the pretrained backbone; only the new classifier head below is trained
for param in model.parameters():
    param.requires_grad = False
classifier = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(2048, 1024)),
    ('relu', nn.ReLU()),
    ('d1', nn.Dropout(0.2)),
    ('fc2', nn.Linear(1024, 500)),
    ('rel2', nn.ReLU()),
    ('d2', nn.Dropout(0.2)),
    ('fc3', nn.Linear(500, 1))  # single logit; no Sigmoid layer is needed with BCEWithLogitsLoss
]))
# Replacing model.fc adds new parameters with requires_grad=True
model.fc = classifier
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

#dataset = ChestXRayDataset(""" Fill in args here """)
#dataloader = torch.utils.data.DataLoader(dataset=dataset, batch_size=10, shuffle=True, drop_last=True)
model.to(device)

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

## Training Code
We provide starter code below that implements a simple training loop in PyTorch. Feel free to modify as you see fit.
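The overall shape of such a loop (zero the gradients, compute the prediction, compute the loss, backpropagate, step the optimizer) can be sketched on a toy problem that runs on CPU in well under a second. The synthetic data, the linear `toy_model`, and the learning rate are assumptions chosen only to keep the example small; they are not the settings used for the chest X-ray model.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data: 64 samples with 4 features and a binary label
features = torch.randn(64, 4)
labels = (features[:, 0] > 0).float()
toy_loader = DataLoader(TensorDataset(features, labels), batch_size=16, shuffle=False)

toy_model = nn.Linear(4, 1)
toy_loss_fn = nn.BCEWithLogitsLoss()
toy_opt = torch.optim.AdamW(toy_model.parameters(), lr=0.1)

epoch_losses = []
for epoch in range(5):
    toy_model.train()
    running_loss, seen = 0.0, 0
    for batch_features, batch_labels in toy_loader:
        toy_opt.zero_grad()                           # clear gradients from the previous step
        pred = toy_model(batch_features).flatten()    # compute prediction (raw logits)
        loss = toy_loss_fn(pred, batch_labels)        # compute loss
        loss.backward()                               # backpropagate
        toy_opt.step()                                # update the weights
        # Weight the batch-mean loss by the batch size to get a per-sample average
        running_loss += loss.item() * batch_labels.size(0)
        seen += batch_labels.size(0)
    epoch_losses.append(running_loss / seen)

print(epoch_losses)
```

The per-epoch average loss should fall as training proceeds; the same bookkeeping applies to the validation pass, with `model.eval()` and `torch.no_grad()` wrapped around it.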

In [None]:
'''
def train(model, loss_fn, train_loader, opt, max_epoch):

    best_val_loss = np.inf
    best_val_metrics = []
    for epoch in range(0, max_epoch):
        model.train()

        for step, sample in tqdm(enumerate(train_loader)):
            opt.zero_grad()
            pred = # Compute prediction

            loss = # Compute Loss

            loss.backward()
            opt.step()
'''


In [None]:
def train(model, loss_fn, train_loader, opt, max_epoch):
    # Note: validation uses the global `val_loader` created above
    best_val_loss = np.inf
    for epoch in range(0, max_epoch):
        train_loss, valid_loss = 0.0, 0.0
        n_train, n_val = 0, 0
        print("epoch", epoch)

        # Switch back to training mode each epoch (model.eval() is called below)
        model.train()
        for steps, data in tqdm(enumerate(train_loader)):
            if steps % 100 == 0:
                print("processing ", steps)
            inputs = data["img"].float().to(device)
            labels = data["labels"].float().to(device)

            opt.zero_grad()
            output = model(inputs)
            loss = loss_fn(torch.flatten(output), labels)
            loss.backward()
            opt.step()

            # loss is a batch mean, so weight it by the batch size
            train_loss += loss.item() * labels.size(0)
            n_train += labels.size(0)

        model.eval()
        with torch.no_grad():
            for steps, ndata in enumerate(val_loader):
                inputs = ndata["img"].float().to(device)
                labels = ndata["labels"].float().to(device)
                output = model(inputs)
                loss = loss_fn(torch.flatten(output), labels)
                # labels.size is a method; labels.size(0) gives the batch size
                valid_loss += loss.item() * labels.size(0)
                n_val += labels.size(0)

        # Average over the number of samples actually processed
        train_loss = train_loss / n_train
        valid_loss = valid_loss / n_val
        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
            epoch, train_loss, valid_loss))

        # Save the model if the validation loss has decreased
        if valid_loss <= best_val_loss:
            print('Validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...'.format(
                best_val_loss, valid_loss))
            # Saved in the format that the submission cell below loads
            torch.save({"state_dict": model.state_dict()}, "/content/best.pkl")
            best_val_loss = valid_loss



In [None]:
train(model, loss_fn, train_loader, opt, max_epoch=5)

epoch 0


1it [00:00,  1.41it/s]

processing  0


101it [01:13,  1.30it/s]

processing  100


201it [02:24,  1.50it/s]

processing  200


301it [03:39,  1.50it/s]

processing  300


401it [04:52,  1.31it/s]

processing  400


501it [06:04,  1.50it/s]

processing  500


601it [07:15,  1.50it/s]

processing  600


701it [08:28,  1.50it/s]

processing  700


801it [09:39,  1.49it/s]

processing  800


901it [10:51,  1.49it/s]

processing  900


925it [11:09,  1.38it/s]


processing validation  0


TypeError: ignored

## Submitting Your Results
Once you have successfully trained your model, generate predictions on the test set and save your results as a `.csv` file. This file can then be uploaded to the leaderboard.

Your final `.csv` file **must** have the following format:
- There must be a column titled `image_path` with the paths to the test set images. This column should be identical to the one provided in `mimic_test_student.csv`.
- There must be a column titled `pred` with your model outputs.
  - If you are running the `distance categorization` task, this column must have floating point numbers ranging between 0 and 1. Higher numbers should indicate a greater likelihood that the tube distance is abnormal. Hint: You can convert model outputs to the 0 to 1 range by applying the sigmoid activation function (`torch.sigmoid()`).
  - If you are running the `distance prediction` task, this column must have numbers representing the tube distance in centimeters.
- Double check that there are 500 rows in your output file
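The format rules above can be checked before uploading. The sketch below is a pure-Python illustration with hypothetical image paths and logits; the helpers `build_rows` and `check_rows` are assumptions for this example, not leaderboard utilities. It also handles one detail visible in the data preview: `positioning` is 1 for distances of 5.0 cm or less, so a model trained on that column outputs the probability of a *normal* position, and the leaderboard score (likelihood of *abnormal*) is one minus that probability.

```python
import csv
import io
import math


def sigmoid(logit):
    """Map a raw model output (logit) to the 0 to 1 range."""
    return 1.0 / (1.0 + math.exp(-logit))


def build_rows(image_paths, logits, positive_class_is_normal=True):
    """Convert logits to leaderboard scores, where higher means more likely abnormal."""
    rows = []
    for path, logit in zip(image_paths, logits):
        prob = sigmoid(logit)
        if positive_class_is_normal:
            # Flip so that the score is the probability of an abnormal position
            prob = 1.0 - prob
        rows.append({"image_path": path, "pred": prob})
    return rows


def check_rows(rows, expected_count):
    """Raise AssertionError if the rows break the submission format."""
    assert len(rows) == expected_count, "unexpected number of rows"
    assert all(set(row) == {"image_path", "pred"} for row in rows), "wrong columns"
    assert all(0.0 <= row["pred"] <= 1.0 for row in rows), "pred outside [0, 1]"


# Hypothetical paths and logits, for illustration only
paths = ["mimic-test/a.jpg", "mimic-test/b.jpg"]
rows = build_rows(paths, [2.0, -1.0])
check_rows(rows, expected_count=2)

# Write to an in-memory buffer; a real submission would write to a file instead
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["image_path", "pred"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

For the real test set, `expected_count` would be 500, and the flip is only needed if the training labels use 1 for normal.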

In [None]:
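# Rebuild the architecture used in training (reusing the `classifier` head defined above), then load the saved weights
model = models.resnet101()
model.fc = classifier
ckpt = torch.load("/content/best.pkl", map_location=device)
model.load_state_dict(ckpt["state_dict"])
model.to(device)
model.eval()

# Assumption: mimic_test_student.csv (named above) has been uploaded to /content
test_files = pd.read_csv('/content/mimic_test_student.csv')

# The test set has no labels or distances, so zeros are passed as placeholders
placeholder = pd.Series(np.zeros(len(test_files)))
test_dataset = ChestXRayDataset(imgs=test_files["image_path"], labels=placeholder, distances=placeholder)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=4, shuffle=False, drop_last=False)

test_results = {"image_path": [], "pred": []}
# Write method to load in data from test_loader, compute model predictions, and append results to test_results dict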


In [None]:
test_results = pd.DataFrame(test_results)
# index=False avoids writing an extra unnamed index column to the file
test_results.to_csv("/content/test.csv", index=False)