# 50.039 Theory and Practice of Deep Learning Project 2024

Group 10
- Issac Jose Ignatius (1004999)
- Mahima Sharma (1006106)
- Dian Maisara (1006377)


## Motivation

Chest radiography is an essential diagnostic tool in medical imaging, used to visualise structures and organs within the chest cavity. It is crucial for diagnosing various respiratory and heart-related conditions. However, as demand grows for radiological reports within shorter timeframes, there are not enough radiologists to perform these tasks at scale. Automated chest radiograph interpretation could therefore provide substantial benefits in supporting large-scale screening and population health initiatives. Deep-learning algorithms can help bridge this gap: they have already been applied to image classification, anomaly detection, organ segmentation, and disease progression prediction.
<br><br>

*In this project, we aim to train a deep neural network to perform multi-label image classification on a wide array of chest radiograph images that exhibit various pathologies.*<br><br>



---




## Import all relevant libraries

In [1]:
# Numpy
import numpy as np
# Pandas
import pandas as pd

# Torch
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader, WeightedRandomSampler
import torchvision
import torchvision.transforms as T
from torchvision.transforms import v2
from torchvision.io import read_image, ImageReadMode
from torchmetrics.classification import BinaryAccuracy


# File Operations
import os

# Helper scripts
from tqdm.notebook import tqdm
# import sys
# sys.path.insert(0, '../src')
# from saver_loader import *
# %reload_ext autoreload
# %autoreload 2

print(torchvision.__version__)

0.17.2+cu121


In [2]:
# Use GPU if available, else use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# device = torch.device("cpu") 
print(device)

cuda


## Data Loading

The training and validation datasets are from the **CheXphoto dataset** (Phillips et al., 2020). <br><br> CheXphoto comprises a training set of natural photos and synthetic transformations of 10,507 X-ray images from 3,000 unique patients (32,521 data points), sampled at random from the CheXpert training dataset. The accompanying validation set consists of natural and synthetic transformations applied to all 234 X-ray images from 200 patients, plus an additional 200 cell phone photos of X-ray films from another 200 unique patients (952 data points).

### Retrieving dataset from Google Cloud Storage (GCS)




In [3]:
# # Connect to GCS to access data
# from google.colab import auth
# auth.authenticate_user() # TODO: everyone to send me gmail so I can have you authed for bucket access

# project_id = 'tpdl-414711'
# bucket_name = 'chexphoto-v1'
# !gcloud config set project {project_id}

# # Install Cloud Storage FUSE.
# !echo "deb https://packages.cloud.google.com/apt gcsfuse-`lsb_release -c -s` main" | sudo tee /etc/apt/sources.list.d/gcsfuse.list
# !curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
# !apt -qq update && apt -qq install gcsfuse

# # Mount a Cloud Storage bucket or location, without the gs:// prefix.
# mount_path = "chexphoto-v1"  # or a location like "my-bucket/path/to/mount"
# local_path = f"/mnt/gs/{mount_path}"

# !mkdir -p {local_path}
# !gcsfuse --implicit-dirs {mount_path} {local_path}

### Setup environment variables


In [4]:
data_path = os.path.join(os.path.abspath(''), "../ChexPhoto/chexphoto-v1")
print(data_path)

c:\Users\User\Desktop\50.039 TPDL\2024_TPDL\notebooks\../ChexPhoto/chexphoto-v1


### Loading dataset (image and labels)

In [5]:
# Bug in Path present in training dataset
def fix_error_paths(row):
    row = row.replace("//", "/")
    return row

def str_to_array(row):
    ndarray = np.fromstring(
                row.replace('\n','')
                    .replace('[','')
                    .replace(']','')
                    .replace('  ',' '), 
                    sep=' ')
    return ndarray
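
As a quick check of the two helpers above, the sketch below applies them to hand-written inputs. The sample path and label string are assumptions about how the CSV stores its values (NumPy's default string form for an array); they are not taken from the dataset.

```python
import numpy as np

def fix_error_paths(row):
    # Collapse accidental double slashes in a path string
    return row.replace("//", "/")

def str_to_array(row):
    # Strip newlines and brackets, then parse the space-separated numbers
    cleaned = row.replace('\n', '').replace('[', '').replace(']', '').replace('  ', ' ')
    return np.fromstring(cleaned, sep=' ')

# Hypothetical inputs for illustration only
example_path = fix_error_paths("CheXphoto-v1.0/train//synthetic/example.jpg")
example_label = str_to_array("[0. 1. 0.]")

print(example_path)
print(example_label, example_label.shape)
```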

In [6]:
train_df = pd.read_csv("../data/processed/train_one_hot_encoded.csv", index_col=False)
labels = train_df.columns[1:]
print(len(labels), labels)

13 Index(['Enlarged Cardiomediastinum', 'Cardiomegaly', 'Lung Opacity',
       'Lung Lesion', 'Edema', 'Consolidation', 'Pneumonia', 'Atelectasis',
       'Pneumothorax', 'Pleural Effusion', 'Pleural Other', 'Fracture',
       'Support Devices'],
      dtype='object')


In [7]:
train_df["Path"] = train_df["Path"].apply(fix_error_paths)

valid_df = pd.read_csv("../data/processed/valid_one_hot_encoded.csv", index_col=False)
#test_df = pd.read_csv("../data/processed/test_one_hot_encoded.csv", index_col=False)

for label in labels:
    train_df[label] = train_df[label].apply(str_to_array)
    valid_df[label] = valid_df[label].apply(str_to_array)
    #test_df[label] = test_df[label].apply(str_to_array)

display(train_df)
display(valid_df)
#display(test_df)

Unnamed: 0,Path,Enlarged Cardiomediastinum,Cardiomegaly,Lung Opacity,Lung Lesion,Edema,Consolidation,Pneumonia,Atelectasis,Pneumothorax,Pleural Effusion,Pleural Other,Fracture,Support Devices
0,CheXphoto-v1.0/train/synthetic/digital/patient...,"[0.0, 1.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]"
1,CheXphoto-v1.0/train/synthetic/digital/patient...,"[0.0, 1.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]"
2,CheXphoto-v1.0/train/synthetic/digital/patient...,"[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]"
3,CheXphoto-v1.0/train/synthetic/digital/patient...,"[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]"
4,CheXphoto-v1.0/train/synthetic/digital/patient...,"[0.0, 0.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 1.0, 0.0]"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
32516,CheXphoto-v1.0/train/natural/nokia/patient6446...,"[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[1.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]"
32517,CheXphoto-v1.0/train/natural/nokia/patient6446...,"[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[1.0, 0.0, 0.0]","[1.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]"
32518,CheXphoto-v1.0/train/natural/nokia/patient6446...,"[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[1.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[1.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]"
32519,CheXphoto-v1.0/train/natural/nokia/patient6446...,"[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]"


Unnamed: 0,Path,Enlarged Cardiomediastinum,Cardiomegaly,Lung Opacity,Lung Lesion,Edema,Consolidation,Pneumonia,Atelectasis,Pneumothorax,Pleural Effusion,Pleural Other,Fracture,Support Devices
0,CheXphoto-v1.0/valid/synthetic/digital/patient...,"[0.0, 0.0, 1.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 1.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]"
1,CheXphoto-v1.0/valid/synthetic/digital/patient...,"[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 1.0]"
2,CheXphoto-v1.0/valid/synthetic/digital/patient...,"[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 1.0]"
3,CheXphoto-v1.0/valid/synthetic/digital/patient...,"[0.0, 0.0, 1.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]"
4,CheXphoto-v1.0/valid/synthetic/digital/patient...,"[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
697,CheXphoto-v1.0/valid/natural/oneplus/patient64...,"[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 1.0]"
698,CheXphoto-v1.0/valid/natural/oneplus/patient64...,"[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 1.0]"
699,CheXphoto-v1.0/valid/natural/oneplus/patient64...,"[0.0, 0.0, 1.0]","[0.0, 0.0, 1.0]","[0.0, 0.0, 1.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 1.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 0.0, 1.0]"
700,CheXphoto-v1.0/valid/natural/oneplus/patient64...,"[0.0, 0.0, 1.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]"


### Subsetting dataset (Pleural Effusion and Cardiomegaly labels)


In [8]:
def keep_observations(df, cols):
    return df[cols].copy()

In [9]:
# Drop all other labels
cols = ["Path", "Pleural Effusion", "Cardiomegaly"]
pEff_train = keep_observations(train_df, cols)
pEff_valid = keep_observations(valid_df, cols)
# pEff_test = keep_observations(test_df, cols)

In [10]:
display(pEff_train)
display(pEff_valid)
# display(pEff_test)

Unnamed: 0,Path,Pleural Effusion,Cardiomegaly
0,CheXphoto-v1.0/train/synthetic/digital/patient...,"[0.0, 1.0, 0.0]","[0.0, 0.0, 0.0]"
1,CheXphoto-v1.0/train/synthetic/digital/patient...,"[0.0, 1.0, 0.0]","[0.0, 0.0, 0.0]"
2,CheXphoto-v1.0/train/synthetic/digital/patient...,"[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]"
3,CheXphoto-v1.0/train/synthetic/digital/patient...,"[0.0, 0.0, 0.0]","[0.0, 0.0, 0.0]"
4,CheXphoto-v1.0/train/synthetic/digital/patient...,"[0.0, 0.0, 1.0]","[0.0, 0.0, 1.0]"
...,...,...,...
32516,CheXphoto-v1.0/train/natural/nokia/patient6446...,"[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]"
32517,CheXphoto-v1.0/train/natural/nokia/patient6446...,"[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]"
32518,CheXphoto-v1.0/train/natural/nokia/patient6446...,"[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]"
32519,CheXphoto-v1.0/train/natural/nokia/patient6446...,"[0.0, 0.0, 1.0]","[0.0, 0.0, 0.0]"


Unnamed: 0,Path,Pleural Effusion,Cardiomegaly
0,CheXphoto-v1.0/valid/synthetic/digital/patient...,"[0.0, 1.0, 0.0]","[0.0, 0.0, 1.0]"
1,CheXphoto-v1.0/valid/synthetic/digital/patient...,"[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]"
2,CheXphoto-v1.0/valid/synthetic/digital/patient...,"[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]"
3,CheXphoto-v1.0/valid/synthetic/digital/patient...,"[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]"
4,CheXphoto-v1.0/valid/synthetic/digital/patient...,"[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]"
...,...,...,...
697,CheXphoto-v1.0/valid/natural/oneplus/patient64...,"[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]"
698,CheXphoto-v1.0/valid/natural/oneplus/patient64...,"[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]"
699,CheXphoto-v1.0/valid/natural/oneplus/patient64...,"[0.0, 1.0, 0.0]","[0.0, 0.0, 1.0]"
700,CheXphoto-v1.0/valid/natural/oneplus/patient64...,"[0.0, 1.0, 0.0]","[0.0, 1.0, 0.0]"


### Custom Dataset implementation

In [11]:
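# Implementation of Custom Dataset Class for CheXPhoto Dataset
class CheXDataset(Dataset):
    # Accepts a dataframe whose first column is the image path and whose
    # remaining columns hold one-hot encoded label arrays
    def __init__(self, df: pd.DataFrame):
        self.dataframe = df.copy()
        # Build the resize transform once instead of on every __getitem__ call
        self.transform = v2.Resize((512, 512), interpolation=T.InterpolationMode.BICUBIC)

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, idx):
        # Path relative to the dataset root (the text after "CheXphoto-v1.0")
        rel_path = self.dataframe.iloc[idx, 0].split("CheXphoto-v1.0", 1)[-1].lstrip("/")
        x_path = os.path.join(data_path, rel_path)

        # Read as RGB, resize, and scale pixel values to [0, 1]
        resized_x_tensor = self.transform(read_image(x_path, mode=ImageReadMode.RGB)) / 255

        # Stack the per-label arrays into a single ndarray before converting;
        # passing the object-dtype Series directly to torch.tensor emits a warning
        y = torch.tensor(np.stack(self.dataframe.iloc[idx, 1:].to_list()), dtype=torch.float64)
        return resized_x_tensor, y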

### Custom Dataloader

In [12]:
# Load into custom Dataset
pEff_train_data = CheXDataset(pEff_train)
pEff_valid_data = CheXDataset(pEff_valid)
#pEff_test_data = CheXDataset(pEff_test)

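# Prepare random sampler for training subset.
# WeightedRandomSampler draws indices from range(len(weights)), so it needs one
# weight per sample. Each sample gets the inverse-frequency weight of its
# Pleural Effusion class (assuming the counts 1873, 4976, 12733 are in class-index order 0, 1, 2).
class_weights = np.array([19582/1873, 19582/4976, 19582/12733])
train_classes = pEff_train["Pleural Effusion"].apply(np.argmax).to_numpy()
train_sampler = WeightedRandomSampler(class_weights[train_classes].tolist(), len(pEff_train_data))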

# Load into DataLoader
batch_size = 16 # changed from 32 to 16
pEff_train_loader = DataLoader(pEff_train_data, batch_size, sampler=train_sampler)
pEff_valid_loader = DataLoader(pEff_valid_data, batch_size)
#pEff_test_loader = DataLoader(pEff_test_data, batch_size)
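
`WeightedRandomSampler` draws dataset indices from `range(len(weights))`, so the weight sequence is expected to hold one entry per sample rather than one per class. The sketch below shows this on toy data; the class assignments and the inverse-frequency weighting are illustrative assumptions, not the project's final sampling scheme.

```python
import torch
from torch.utils.data import WeightedRandomSampler

# Toy class index per sample (hypothetical data): class 0 is rare, class 2 is common
sample_classes = torch.tensor([0, 1, 1, 2, 2, 2, 2, 2])

# Inverse-frequency weight per class, then look up one weight per sample
class_counts = torch.bincount(sample_classes, minlength=3).double()
class_weights = class_counts.sum() / class_counts
sample_weights = class_weights[sample_classes]

# Draw as many indices as there are samples, with replacement
sampler = WeightedRandomSampler(sample_weights, num_samples=len(sample_classes), replacement=True)
drawn = list(sampler)

print(class_weights)
print(drawn)
```

With these weights each class contributes the same total probability mass, so rare classes are drawn more often per sample than common ones.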

In [13]:
x, y = pEff_train_data[0]
print(x, x.shape, x.dtype)
print(y, y.shape, y.dtype)

tensor([[[0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588],
         [0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588],
         [0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588],
         ...,
         [0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588],
         [0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588],
         [0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588]],

        [[0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588],
         [0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588],
         [0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588],
         ...,
         [0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588],
         [0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588],
         [0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588]],

        [[0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588],
         [0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.0588],
         [0.0588, 0.0588, 0.0588,  ..., 0.0588, 0.0588, 0.



## Model Tuning

Our initial model is a simple feedforward neural network with two heads, one classifying Cardiomegaly and the other Pleural Effusion. We use the cross-entropy loss function to optimise the model during training.

**This is a TODO since it can change**
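
To make the two-head idea concrete, here is a minimal sketch of a shared feedforward backbone with one output head per pathology, trained with the sum of two cross-entropy losses. The class name `TwoHeadMLP`, the layer sizes, and the toy inputs are all assumptions made for illustration; the design in this notebook may differ.

```python
import torch
import torch.nn as nn

class TwoHeadMLP(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, n_in, n_hidden, n_classes=3):
        super().__init__()
        # Shared feature extractor
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(n_in, n_hidden), nn.ReLU())
        # One classification head per pathology
        self.head_pleural_effusion = nn.Linear(n_hidden, n_classes)
        self.head_cardiomegaly = nn.Linear(n_hidden, n_classes)

    def forward(self, x):
        features = self.backbone(x)
        return self.head_pleural_effusion(features), self.head_cardiomegaly(features)

# Toy batch: 4 tiny RGB images and integer class labels for both pathologies
model_sketch = TwoHeadMLP(n_in=3 * 8 * 8, n_hidden=16)
images = torch.rand(4, 3, 8, 8)
labels = torch.randint(0, 3, (4, 2))  # column 0: Pleural Effusion, column 1: Cardiomegaly

logits_pe, logits_cm = model_sketch(images)
criterion = nn.CrossEntropyLoss()
# Sum the per-head losses so both heads contribute gradients to the shared backbone
total_loss = criterion(logits_pe, labels[:, 0]) + criterion(logits_cm, labels[:, 1])
print(logits_pe.shape, logits_cm.shape, total_loss.item())
```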


### First iteration - Simple feedforward neural network

#### Model

In [14]:
# # We will be using Binary Classification, with output of 0 or 1
# class SimpleNN(nn.Module):
#     def __init__(self, n_x, n_h, n_y, n_z):
#         super(SimpleNN, self).__init__()

#         # Model Layers
#         self.fc1 = nn.Linear(n_x,n_h, dtype = torch.float32)
#         self.fc2 = nn.Linear(n_h,n_y, dtype = torch.float32)
#         self.fc3 = nn.Linear(n_y,n_z, dtype = torch.float32)
#         self.sigmoid = nn.Sigmoid()

#         # Loss and Accuracy metrics
#         self.loss = nn.BCELoss()
#         self.accuracy = BinaryAccuracy()


#     def forward(self,x):
#         x = x.to(torch.float32)

#         #All  forward operations
#         out1 = self.fc1(x)
#         out2 = self.fc2(out1)
#         out3 = self.fc3(out2)
#         out4 = self.sigmoid(out3)

#         return out4

In [15]:
# # Create Model
# n_x = 14700 # 3*70*70 channels*height*width like in a MLP input layer for images
# n_h = 8000
# n_y = 2000
# n_z = 1

# model = SimpleNN(n_x, n_h, n_y, n_z).to(device)
# print(model)

#### Training

In [16]:
def train_loop(model, train_loader, optimizer, loss):
    model.train()

    train_loss = 0.0
    train_total = 0
    train_correct = 0

    for inputs, outputs in tqdm(train_loader):
        outputs_pEff = outputs[:, 0, :]
        inputs_re, outputs_re = inputs.to(device), outputs_pEff.to(device)
        
        optimizer.zero_grad()
        preds = model(inputs_re)

        # Feed class labels for each sample within the batch - Cross Entropy accepts class labels so we need to convert the OHE labels to class labels
        loss_value = loss(preds, torch.argmax(outputs_re, dim=1))
        loss_value.backward()
        optimizer.step()

        # Compute metric
        train_loss += loss_value.item() * outputs_re.size(0)
        train_total += outputs_re.size(0)
        train_correct += (torch.argmax(preds, dim=1) == torch.argmax(outputs_re, dim=1)).sum().item() # Convert both to class labels

    train_loss /= train_total
    train_accuracy = train_correct / train_total

    return train_loss, train_accuracy
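
The training loop converts one-hot labels to class indices with `argmax` before calling the loss. The sketch below shows that conversion on toy values and checks it against a hand-computed cross-entropy. It also shows one caveat: an all-zero label vector such as `[0.0, 0.0, 0.0]`, which appears in the training table, is mapped to class 0 by `argmax`.

```python
import torch
import torch.nn as nn

# Toy one-hot labels and logits for two samples (illustrative values only)
one_hot = torch.tensor([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
logits = torch.tensor([[0.1, 2.0, -1.0], [0.5, 0.2, 0.3]])

# CrossEntropyLoss takes raw logits and integer class indices
targets = torch.argmax(one_hot, dim=1)
loss_value = nn.CrossEntropyLoss()(logits, targets)

# Same quantity computed by hand: mean negative log-softmax at the target index
manual = -torch.log_softmax(logits, dim=1)[torch.arange(2), targets].mean()

# An all-zero label vector has no unique maximum; argmax returns the first index
all_zero_class = torch.argmax(torch.zeros(3)).item()

print(targets, loss_value.item(), manual.item(), all_zero_class)
```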

In [17]:
def test_loop(model, valid_loader):
    model.eval()
        
    val_total = 0
    val_correct = 0

    with torch.no_grad():
        for inputs, outputs in tqdm(valid_loader):
            # Retrieve predictions
            outputs_pEff = outputs[:, 0, :]
            inputs_re, outputs_re = inputs.to(device), outputs_pEff.to(device)
            preds = model(inputs_re)

            # Compute metrics
            val_total += outputs_re.size(0)
            val_correct += (torch.argmax(preds, dim=1) == torch.argmax(outputs_re, dim=1)).sum().item() # Convert both to class labels

    ## TODO: Implement Validation Loss as well
    
    val_accuracy = val_correct / val_total

    return val_accuracy   
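
One possible way to address the validation-loss TODO is sketched below. The helper name `evaluate`, its signature, and the toy model and data are assumptions for illustration; it mirrors the label handling of `test_loop` (label index 0, one-hot to class index) and accumulates a sample-weighted loss alongside accuracy.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader, criterion, device):  # hypothetical helper
    model.eval()
    total_loss, total, correct = 0.0, 0, 0
    with torch.no_grad():
        for inputs, targets_ohe in loader:
            inputs = inputs.to(device)
            # Take the first label and convert one-hot vectors to class indices
            targets = torch.argmax(targets_ohe[:, 0, :], dim=1).to(device)
            preds = model(inputs)
            # Weight the batch-mean loss by batch size so the final value is a per-sample mean
            total_loss += criterion(preds, targets).item() * targets.size(0)
            total += targets.size(0)
            correct += (torch.argmax(preds, dim=1) == targets).sum().item()
    return total_loss / total, correct / total

# Toy stand-ins for the real model and validation loader
device = torch.device("cpu")
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 4 * 4, 3))
toy_x = torch.rand(10, 3, 4, 4)
toy_y = torch.zeros(10, 2, 3, dtype=torch.float64)
toy_y[:, :, 1] = 1.0  # every toy sample belongs to class 1 for both labels
toy_loader = DataLoader(TensorDataset(toy_x, toy_y), batch_size=4)

val_loss, val_accuracy = evaluate(toy_model, toy_loader, nn.CrossEntropyLoss(), device)
print(val_loss, val_accuracy)
```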

In [18]:
def train(model, train_loader, valid_loader, epochs=10, lr=1e-3):
    # Adam
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss = nn.CrossEntropyLoss().to(device)

    #train_loss_values = []
    #train_accuracy_values = []
    for epoch in tqdm(range(epochs)):
        # Train loop
        train_loss, train_accuracy = train_loop(model, train_loader, optimizer, loss)

        # Test loop
        val_accuracy = test_loop(model, valid_loader)

        ## TODO: Implement Validation Loss as well

        print(f'--- Epoch {epoch+1}/{epochs}: Train loss: {train_loss:.4f}, Train accuracy: {train_accuracy:.4f}\n Validation accuracy: {val_accuracy:.4f}')


### Second iteration - Basic Convolutional neural network (CNN)

#### Model

In [19]:
class ResidualBlock(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride, padding):
        super(ResidualBlock, self).__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size, stride, padding)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU()
    
    def forward(self, x):
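        # Identity shortcut: the addition below requires in_channels == out_channels,
        # stride == 1 and size-preserving padding so that the shapes match
        residual = x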
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        out += residual
        out = self.relu(out)
        return out

In [20]:
class ConvolutionBlock(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride, padding):
        super(ConvolutionBlock, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU()
    
    def forward(self, x):
        out = self.conv(x)
        out = self.bn(out)
        out = self.relu(out)
        return out

In [21]:
# We will be using Convolutional Neural Network
class FullCNN(nn.Module):
    def __init__(self):
        super(FullCNN, self).__init__()

        # Conv Layers
        self.conv1 = nn.Conv2d(3, 16, kernel_size = 3, stride = 1, padding = 1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size = 3, stride = 1, padding = 1)
        self.conv3 = nn.Conv2d(32, 64, kernel_size = 3, stride = 1, padding = 1)
        self.conv4 = nn.Conv2d(64, 128, kernel_size = 3, stride = 1, padding = 1)

        # BN Layers
        self.batch_norm1 = nn.BatchNorm2d(16)
        self.batch_norm2 = nn.BatchNorm2d(32)
        self.batch_norm3 = nn.BatchNorm2d(64)
        self.batch_norm4 = nn.BatchNorm2d(128)

        # MP Layer
        self.maxpool2d = F.max_pool2d

        # Skip connection implementation
        self.skip_connection1 = nn.Conv2d(16, 32, kernel_size=1, stride=2)
        self.skip_connection2 = nn.Conv2d(32, 64, kernel_size=1, stride=2)

        # Dropout Layer
        self.dropout = nn.Dropout(0.2)

        # FC Layers
        self.fc1 = nn.Linear(128*32*32, 8192, dtype = torch.float32)
        self.fc2 = nn.Linear(8192, 512, dtype = torch.float32) #512
        self.fc3 = nn.Linear(512, 32, dtype = torch.float32)
        self.fc4 = nn.Linear(32, 3, dtype = torch.float32)

        # Softmax Layer - (Not needed when using CrossEntropyLoss)
        # self.softmax = F.softmax


    def forward(self,x):
        # All forward operations
        x1 = x.to(torch.float32)
        # First convolutional layer
        x1 = self.conv1(x1)
        x1 = self.batch_norm1(x1)
        x1 = F.relu(x1)
        x1 = self.maxpool2d(x1, 2)
        
        # Second convolutional layer -> pooling
        x2 = self.conv2(x1)
        x2 = self.batch_norm2(x2)
        x2 = F.relu(x2)
        x2 = self.maxpool2d(x2, 2)

        # Apply skip connection after the second convolutional layer
        skip_connection_output1 = self.skip_connection1(x1)
        x2 += skip_connection_output1

        # Third convolutional layer -> pooling
        x3 = self.conv3(x2)
        x3 = self.batch_norm3(x3)
        x3 = F.relu(x3)
        x3 = self.maxpool2d(x3, 2)

        # Apply skip connection after the third convolutional layer
        skip_connection_output2 = self.skip_connection2(x2)
        x3 += skip_connection_output2

        # Fourth convolutional layer -> pooling
        x4 = self.conv4(x3)
        x4 = self.batch_norm4(x4)
        x4 = F.relu(x4)
        x4 = self.maxpool2d(x4, 2)

        # Flatten output of convolutions
        x4 = x4.view(-1, 128*32*32)
        # print(x.shape)
        x4 = self.dropout(x4)

        # First FC layer
        x5 = self.fc1(x4)
        x5 = F.relu(x5)
        x5 = self.dropout(x5)

        # Second FC layer
        x6 = self.fc2(x5)
        x6 = F.relu(x6)
        x6 = self.dropout(x6)

        # Third FC layer
        x7 = self.fc3(x6)
        x7 = F.relu(x7)
        x7 = self.dropout(x7)
        
        # Fourth FC layer
        x8 = self.fc4(x7)
        # x9 = self.softmax(x8) # No need, Refer to https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss

        return x8
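
The flatten size `128*32*32` follows from the layer arithmetic: each 3x3 convolution with padding 1 keeps the spatial size, and each 2x2 max pool halves it. The sketch below reproduces that arithmetic for a 512x512 input and counts the parameters of the first fully connected layer. The helper `conv2d_out` is a hypothetical name introduced here; it implements the standard output-size formula.

```python
def conv2d_out(size, kernel_size, stride=1, padding=0):
    # Standard output-size formula for a convolution or pooling window
    return (size + 2 * padding - kernel_size) // stride + 1

size = 512
sizes = []
for _ in range(4):
    size = conv2d_out(size, kernel_size=3, stride=1, padding=1)  # 3x3 conv, padding 1: size unchanged
    size = conv2d_out(size, kernel_size=2, stride=2)             # 2x2 max pool: size halved
    sizes.append(size)

# The 1x1, stride-2 skip convolutions must match the pooled sizes they are added to
skip1 = conv2d_out(256, kernel_size=1, stride=2)
skip2 = conv2d_out(128, kernel_size=1, stride=2)

flat_features = 128 * size * size
fc1_params = flat_features * 8192 + 8192  # weights plus biases of fc1

print(sizes, skip1, skip2, flat_features, fc1_params)
```

The first fully connected layer alone holds over a billion parameters, which dominates the memory footprint of this model.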

In [22]:
# Create Model
# model = torchvision.models.resnet18(weights=None).to(device) # to check if trainer works
model = FullCNN().to(device)
print(model)

FullCNN(
  (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv3): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv4): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (batch_norm1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (batch_norm2): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (batch_norm3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (batch_norm4): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (skip_connection1): Conv2d(16, 32, kernel_size=(1, 1), stride=(2, 2))
  (skip_connection2): Conv2d(32, 64, kernel_size=(1, 1), stride=(2, 2))
  (dropout): Dropout(p=0.2, inplace=False)
  (fc1): Linear(in_features=131072, out_features=8192, bias=True)
  (fc2): Linear(in_features=8192, out_fea

In [23]:
for inputs, outputs in pEff_train_loader:
    print(inputs.shape)
    print(outputs.shape)
    break

torch.Size([16, 3, 512, 512])
torch.Size([16, 2, 3])
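
The label batch has shape `(batch, labels, classes)`. Because the kept columns are ordered `Pleural Effusion` then `Cardiomegaly`, index 0 on the second axis selects Pleural Effusion, which is what `outputs[:, 0, :]` does in the training loop. The sketch below illustrates the indexing with NumPy on a toy batch; the label values are made up.

```python
import numpy as np

# Toy batch of 4 samples shaped like the loader output: (batch, labels, classes)
batch_labels = np.zeros((4, 2, 3))
batch_labels[:, 0, 2] = 1.0  # Pleural Effusion one-hot at position 2 (toy values)
batch_labels[:, 1, 1] = 1.0  # Cardiomegaly one-hot at position 1 (toy values)

pleural_effusion = batch_labels[:, 0, :]  # shape (4, 3)
cardiomegaly = batch_labels[:, 1, :]      # shape (4, 3)

# Convert one-hot rows to class indices
pe_classes = pleural_effusion.argmax(axis=1)
cm_classes = cardiomegaly.argmax(axis=1)

print(pleural_effusion.shape, pe_classes, cm_classes)
```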




In [24]:
train(model, pEff_train_loader, pEff_valid_loader, epochs=5)
#torch.save(model.state_dict(), 'weights.pt')



### Testing

## Observations

**TODO** Discuss whether we should group all training and evaluation code together and discuss the results here, or split the code into separate sections without descriptions.
