# 50.039 Theory and Practice of Deep Learning Project 2024

Group 10
- Issac Jose Ignatius (1004999)
- Mahima Sharma (1006106)
- Dian Maisara (1006377)


## Motivation

Chest radiography is an essential diagnostic tool in medical imaging, used to visualise structures and organs within the chest cavity. It is crucial for diagnosing various respiratory and heart-related conditions. However, as demand grows for radiological reports within shorter timeframes, there are too few radiologists available to perform this work at scale. Automated chest radiograph interpretation could therefore provide substantial benefits in supporting large-scale screening and population health initiatives. Deep-learning algorithms can help bridge this gap: they have already been used for image classification, anomaly detection, organ segmentation, and disease progression prediction.

*In this project, we aim to train a deep neural network to perform multi-label image classification on a wide array of chest radiograph images that exhibit various pathologies.*<br><br>

---




## Import all relevant libraries

In [None]:
!pip install --upgrade pip 
!pip install --upgrade torchmetrics 

# Visualisation libraries
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import seaborn as sns
import PIL
from IPython.display import Image 

# Numpy
import numpy as np
# Pandas
import pandas as pd

# Torch
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
import torchvision
print(torchvision.__version__)
from torchvision.transforms import ToTensor
import torchvision.transforms.functional as fn
from torchvision.io import read_image, ImageReadMode
from torchmetrics.classification import BinaryAccuracy

# File Operations
from glob import glob

# Import Helpers
import sys
sys.path.insert(1, '/work')
import importlib
import helpers

%load_ext autoreload
%autoreload 2

0.17.1+cu121


## Setup environmental variables

In [None]:
LOCAL_PATH = "/datasets/chexphoto-v1"

## Data Loading

The training and validation datasets are from the **CheXphoto dataset** (Phillips et al., 2020). <br><br> CheXphoto comprises a training set of natural photos and synthetic transformations of 10,507 X-ray images from 3,000 unique patients (32,521 data points), sampled at random from the CheXpert training dataset. The accompanying validation set consists of natural and synthetic transformations applied to all 234 X-ray images from 200 patients, plus 200 cell phone photos of X-ray films from another 200 unique patients (952 data points).

### Loading Preprocessed Data

In [None]:
train_data = helpers.load_df('/work/preprocessed_train.csv')
valid_data = helpers.load_df('/work/preprocessed_valid.csv')

### Visualize dataframes

In [None]:
train_data

Unnamed: 0,Path,Sex,Age,Frontal/Lateral,AP/PA,No Finding,Enlarged Cardiomediastinum,Cardiomegaly,Lung Opacity,Lung Lesion,Edema,Consolidation,Pneumonia,Atelectasis,Pneumothorax,Pleural Effusion,Pleural Other,Fracture,Support Devices
0,CheXphoto-v1.0/train/synthetic/digital/patient...,Female,20.0,Frontal,PA,[0. 0. 1.],[0. 1. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 1. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 1. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.]
1,CheXphoto-v1.0/train/synthetic/digital/patient...,Female,20.0,Lateral,,[0. 0. 1.],[0. 1. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 1. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 1. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.]
2,CheXphoto-v1.0/train/synthetic/digital/patient...,Female,46.0,Frontal,PA,[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 0. 1.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 1. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.]
3,CheXphoto-v1.0/train/synthetic/digital/patient...,Female,46.0,Lateral,,[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 0. 1.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 1. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.]
4,CheXphoto-v1.0/train/synthetic/digital/patient...,Female,50.0,Frontal,AP,[1. 0. 0.],[1. 0. 0.],[0. 0. 1.],[0. 0. 1.],[0. 0. 1.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 0. 1.],[0. 0. 1.],[0. 0. 1.],[1. 0. 0.],[1. 0. 0.],[0. 1. 0.]
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
32516,CheXphoto-v1.0/train/natural/nokia/patient6446...,Male,77.0,Frontal,AP,[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 0. 1.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 0. 1.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.]
32517,CheXphoto-v1.0/train/natural/nokia/patient6446...,Male,77.0,Frontal,AP,[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 0. 1.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 0. 1.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.]
32518,CheXphoto-v1.0/train/natural/nokia/patient6446...,Male,77.0,Frontal,AP,[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 0. 1.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 0. 1.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.]
32519,CheXphoto-v1.0/train/natural/nokia/patient6446...,Male,77.0,Frontal,AP,[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 0. 1.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.],[0. 0. 1.],[1. 0. 0.],[1. 0. 0.],[1. 0. 0.]


In [None]:
valid_data

Unnamed: 0,Path,Sex,Age,Frontal/Lateral,AP/PA,No Finding,Enlarged Cardiomediastinum,Cardiomegaly,Lung Opacity,Lung Lesion,Edema,Consolidation,Pneumonia,Atelectasis,Pneumothorax,Pleural Effusion,Pleural Other,Fracture,Support Devices
0,CheXphoto-v1.0/valid/synthetic/digital/patient...,Male,73.0,Frontal,AP,[0. 1. 0.],[0. 0. 1.],[0. 0. 1.],[0. 0. 1.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.]
1,CheXphoto-v1.0/valid/synthetic/digital/patient...,Male,70.0,Frontal,PA,[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 0. 1.]
2,CheXphoto-v1.0/valid/synthetic/digital/patient...,Male,70.0,Lateral,,[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 0. 1.]
3,CheXphoto-v1.0/valid/synthetic/digital/patient...,Male,85.0,Frontal,AP,[0. 1. 0.],[0. 0. 1.],[0. 1. 0.],[0. 0. 1.],[0. 1. 0.],[0. 0. 1.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.]
4,CheXphoto-v1.0/valid/synthetic/digital/patient...,Female,42.0,Frontal,AP,[0. 0. 1.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.]
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
697,CheXphoto-v1.0/valid/natural/oneplus/patient64...,Female,57.0,Frontal,AP,[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 0. 1.]
698,CheXphoto-v1.0/valid/natural/oneplus/patient64...,Male,65.0,Frontal,AP,[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 0. 1.]
699,CheXphoto-v1.0/valid/natural/oneplus/patient64...,Male,71.0,Frontal,AP,[0. 1. 0.],[0. 0. 1.],[0. 0. 1.],[0. 0. 1.],[0. 1. 0.],[0. 0. 1.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 0. 1.]
700,CheXphoto-v1.0/valid/natural/oneplus/patient64...,Female,45.0,Frontal,AP,[0. 1. 0.],[0. 0. 1.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.],[0. 1. 0.]


### Custom Dataset

In [None]:
from torchvision.transforms import v2, InterpolationMode

# Custom Dataset class for the CheXphoto dataset
class CheXDataset(torch.utils.data.Dataset):
    # Accepts a preprocessed DataFrame and the name of the split
    def __init__(self, df: pd.DataFrame, split: str):
        self.dataframe = df
        # Note: self.split identifies the test set in __getitem__, where the label columns start at a different offset
        self.split = split

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, idx):
        # Rebuild the local image path from the part of 'Path' that follows "CheXphoto-v1.0"
        x_path = LOCAL_PATH + "/" + self.dataframe.iloc[idx, 0].split("CheXphoto-v1.0", 1)[-1]

        # Read as RGB to handle both synthetic (grayscale) and natural (RGB) images, then scale to [0, 1]
        x_tensor = read_image(x_path, mode=ImageReadMode.RGB) / 255
        # Resize every image tensor to the same size, set at (70, 70)
        resized_x_tensor = v2.Resize(size=(70,70), interpolation=InterpolationMode.BICUBIC)(x_tensor)

        # Slice the label columns based on the split - the test split is handled separately
        # Additional input features e.g. sex, age, FoL, AoP are removed for now
        if self.split.lower() == "test":
            labels = self.dataframe.iloc[idx, 1:]
        else:
            labels = self.dataframe.iloc[idx, 5:]
        # Convert through NumPy so torch receives a plain float array rather than a pandas Series
        y = torch.tensor(labels.to_numpy(dtype=np.float64))

        return resized_x_tensor, y

## Preliminary Model Training

For our initial model, we will be using "Edema" as our main target label and reduce this task to a binary classification. From this, we will build up the model to handle multi-head classification.

### Replace One-Hot Encoding in Training Dataset 

We would need to preprocess our training datasets to only contain either 0 or 1 for the Edema labels. The uncertain labels would be grouped along with the negative ones.

In [None]:
# Drop all other labels (working on a copy so the original train data is not mutated)
subset_data = train_data.copy()
subset_data.drop(columns=['No Finding', 'Enlarged Cardiomediastinum', 'Cardiomegaly', 'Lung Opacity', 'Lung Lesion', 'Consolidation', 'Pneumonia', 'Atelectasis', 'Pneumothorax', 'Pleural Effusion', 'Pleural Other', 'Fracture', 'Support Devices'], inplace=True)

# Collapse the one-hot encodings to 0 or 1: only "[0. 0. 1.]" is kept as positive (1)
# Assigning the result back avoids calling replace(inplace=True) on a column selection
subset_data['Edema'] = subset_data['Edema'].replace({"[1. 0. 0.]": 0, "[0. 1. 0.]": 0, "[0. 0. 1.]": 1})

subset_data

Unnamed: 0,Path,Sex,Age,Frontal/Lateral,AP/PA,Edema
0,CheXphoto-v1.0/train/synthetic/digital/patient...,Female,20.0,Frontal,PA,0
1,CheXphoto-v1.0/train/synthetic/digital/patient...,Female,20.0,Lateral,,0
2,CheXphoto-v1.0/train/synthetic/digital/patient...,Female,46.0,Frontal,PA,0
3,CheXphoto-v1.0/train/synthetic/digital/patient...,Female,46.0,Lateral,,0
4,CheXphoto-v1.0/train/synthetic/digital/patient...,Female,50.0,Frontal,AP,0
...,...,...,...,...,...,...
32516,CheXphoto-v1.0/train/natural/nokia/patient6446...,Male,77.0,Frontal,AP,0
32517,CheXphoto-v1.0/train/natural/nokia/patient6446...,Male,77.0,Frontal,AP,0
32518,CheXphoto-v1.0/train/natural/nokia/patient6446...,Male,77.0,Frontal,AP,0
32519,CheXphoto-v1.0/train/natural/nokia/patient6446...,Male,77.0,Frontal,AP,0


### Sub-setting Training Data (TODO)

In [None]:
# Load into custom Dataset
subset = CheXDataset(subset_data, "train")

In [None]:
# Keep only the odd-numbered rows (indices 1, 3, 5, ...) to halve the training set
odds = list(range(1, len(subset_data), 2))
subset_odds = torch.utils.data.Subset(subset, odds)

In [None]:
# Load into DataLoader
batch_size = 4
subset_loader = DataLoader(subset_odds, batch_size, shuffle=True)
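
The subsetting and batching above can be checked on stand-in data before touching the real images. The sketch below is only an illustration under assumptions: it replaces `CheXDataset` with a `TensorDataset` of random 3x70x70 tensors and random binary labels, and the names `toy_dataset`, `toy_subset` and `toy_loader` are ours.

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

# Stand-in data: 10 random "images" of shape 3x70x70 with binary labels
# (hypothetical values, not real CheXphoto samples)
images = torch.rand(10, 3, 70, 70)
labels = torch.randint(0, 2, (10,)).to(torch.float64)
toy_dataset = TensorDataset(images, labels)

# Same selection rule as above: keep indices 1, 3, 5, ...
odd_indices = list(range(1, len(toy_dataset), 2))
toy_subset = Subset(toy_dataset, odd_indices)

toy_loader = DataLoader(toy_subset, batch_size=4, shuffle=True)
x_batch, y_batch = next(iter(toy_loader))

print(len(toy_subset))   # 5
print(x_batch.shape)     # torch.Size([4, 3, 70, 70])
print(y_batch.shape)     # torch.Size([4])
```

With 5 samples and a batch size of 4, the second batch holds a single sample, so code that consumes the loader should not assume a fixed batch size.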

### Simple Feed-Forward Neural Network

In [None]:
# Binary classifier: outputs the probability (between 0 and 1) that Edema is present
class BinaryClassifier(nn.Module):
    def __init__(self, n_x, n_h, n_y, n_z):
        super().__init__()

        # Model Layers
        self.fc1 = nn.Linear(n_x, n_h, dtype=torch.float64)
        self.fc2 = nn.Linear(n_h, n_y, dtype=torch.float64)
        self.fc3 = nn.Linear(n_y, n_z, dtype=torch.float64)
        self.sigmoid = nn.Sigmoid()

        # Loss and Accuracy metrics
        self.loss = nn.BCELoss()
        self.accuracy = BinaryAccuracy()

    def forward(self, x):
        x = x.to(torch.float64)

        # ReLU between the linear layers; without a non-linearity the three
        # layers would collapse into a single linear map
        out1 = torch.relu(self.fc1(x))
        out2 = torch.relu(self.fc2(out1))
        out3 = self.fc3(out2)
        # Sigmoid turns the final logit into a probability, as BCELoss expects
        out4 = self.sigmoid(out3)

        return out4

In [None]:
# Create Model
n_x = 3 * 70 * 70 # 14700 = channels * height * width of the flattened image, as in an MLP input layer
n_h = 7350
n_y = 3675
n_z = 1

model = BinaryClassifier(n_x, n_h, n_y, n_z)
print(model)

BinaryClassifier(
  (fc1): Linear(in_features=14700, out_features=7350, bias=True)
  (fc2): Linear(in_features=7350, out_features=3675, bias=True)
  (fc3): Linear(in_features=3675, out_features=1, bias=True)
  (sigmoid): Sigmoid()
  (loss): BCELoss()
  (accuracy): BinaryAccuracy()
)
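
As a rough size check, the number of trainable parameters follows directly from the layer widths printed above. The sketch below is plain arithmetic with no PyTorch involved; the helper name `linear_params` is ours, and the memory estimate assumes 8 bytes per `float64` parameter and ignores gradients and optimizer state.

```python
def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


n_x, n_h, n_y, n_z = 3 * 70 * 70, 7350, 3675, 1

total = (
    linear_params(n_x, n_h)
    + linear_params(n_h, n_y)
    + linear_params(n_y, n_z)
)
size_gb = total * 8 / 1e9  # float64 stores 8 bytes per parameter

print(f"{total:,} parameters, about {size_gb:.2f} GB in float64")
# 135,070,951 parameters, about 1.08 GB in float64
```

Almost all of these parameters sit in the first layer (14700 x 7350 weights), which is one reason a convolutional architecture is attractive for image inputs.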


### Training Feed-Forward Model (TODO)

In [None]:
# Mini-batch gradient descent with the Adam optimizer
Epochs = 1
optimizer = torch.optim.Adam(model.parameters(),
                             lr = 1e-3,
                             betas = (0.9, 0.999),
                             eps = 1e-08)
optimizer.zero_grad()

for epoch in range(Epochs):
    for batch in subset_loader:
        # Unpack the mini-batch data
        inputs_batch, outputs_batch = batch
        outputs_re = outputs_batch.reshape(-1, 1) # [batch_size, 1]
        inputs_re = inputs_batch.reshape(inputs_batch.size(0), -1) # [batch_size, 14700]
        
        # Forward pass
        pred = model(inputs_re)
        loss_value = model.loss(pred, outputs_re)
    
        # Compute binary accuracy
        binary_accuracy_value = model.accuracy(pred, outputs_re)
    
        # Backward pass and optimization
        loss_value.backward()
        optimizer.step()
        optimizer.zero_grad()
        
    # Print loss and accuracy of the last mini-batch in this epoch
    print(f'Epoch [{epoch+1}/{Epochs}], Training Loss: {loss_value.item():.4f}, Training Accuracy: {binary_accuracy_value.item():.4f}')

  y = torch.tensor(self.dataframe.iloc[idx, 5:], dtype=torch.float64)


KernelInterrupted: Execution interrupted by the Jupyter kernel.
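
The training loop above reports the loss and accuracy of the final mini-batch only. One possible way to report epoch-level values instead is to average the per-batch means weighted by batch size. The sketch below is a minimal illustration in plain Python; `EpochMeter` is a hypothetical helper (not part of PyTorch or torchmetrics) and the batch losses are made-up numbers.

```python
class EpochMeter:
    """Accumulates per-batch mean values into a sample-weighted epoch mean."""

    def __init__(self) -> None:
        self.weighted_sum = 0.0
        self.count = 0

    def update(self, batch_mean: float, batch_size: int) -> None:
        # Weight each batch mean by the number of samples it covers
        self.weighted_sum += batch_mean * batch_size
        self.count += batch_size

    def mean(self) -> float:
        return self.weighted_sum / self.count if self.count else 0.0


# Made-up per-batch losses: two full batches of 4 and a final partial batch of 2
loss_meter = EpochMeter()
for batch_loss, batch_size in [(0.70, 4), (0.50, 4), (0.30, 2)]:
    loss_meter.update(batch_loss, batch_size)

print(f"{loss_meter.mean():.4f}")  # 0.5400
```

Inside the loop this would amount to calling `loss_meter.update(loss_value.item(), inputs_batch.size(0))` after each forward pass and printing `loss_meter.mean()` once per epoch.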

## Model Tuning

Our final model is a CNN with multiple heads (14 heads) capable of classifying each observation for the various pathologies. We utilise the Cross-entropy loss function to optimise the model during training.

**This is a TODO since it can change**
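
To make the multi-head idea concrete, here is a minimal sketch of a CNN with a shared convolutional trunk and 14 heads trained with cross-entropy loss. It is not our final architecture: the layer sizes, the name `MultiHeadCNN`, and the choice of three classes per head (mirroring the three-way one-hot labels, converted to class indices) are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class MultiHeadCNN(nn.Module):
    """Shared convolutional trunk with one 3-way classification head per pathology."""

    def __init__(self, n_heads: int = 14, n_classes: int = 3):
        super().__init__()
        # Illustrative trunk; channel counts are arbitrary assumptions
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                  # 70x70 -> 35x35
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # -> 32 x 1 x 1
            nn.Flatten(),                     # -> 32 features
        )
        self.heads = nn.ModuleList([nn.Linear(32, n_classes) for _ in range(n_heads)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = self.trunk(x)
        # Stack per-head logits into shape (batch, n_heads, n_classes)
        return torch.stack([head(features) for head in self.heads], dim=1)


cnn = MultiHeadCNN()
criterion = nn.CrossEntropyLoss()

x = torch.rand(4, 3, 70, 70)          # fake image batch
y = torch.randint(0, 3, (4, 14))      # fake class index for each of the 14 labels

logits = cnn(x)                        # shape (4, 14, 3)
# CrossEntropyLoss expects the class dimension second: (batch, classes, heads)
loss = criterion(logits.permute(0, 2, 1), y)
loss.backward()

print(logits.shape)  # torch.Size([4, 14, 3])
```

Sharing the trunk means the image is encoded once and only the small linear heads are specific to each pathology; the loss here is averaged over all samples and heads.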


### First iteration - Convolutional neural network (CNN)

#### Model

#### Training

#### Evaluation

Gradually, we moved the model to a traditional CNN-based architecture to see whether it could surpass the performance of the feed-forward model above. **TODO** Briefly discuss what we needed to add to the model (convolutional filters, pooling, and so on).

## Observations

**TODO** Decide whether to gather all training and evaluation results and discuss them together here, or to keep the code cells separate without descriptions.

