# Introduction

---
This notebook walks through the complete workflow for building and evaluating a multi-label image classification model based on the ResNet-18 architecture. The goal is to predict several labels simultaneously for each image in the dataset. PyTorch is used for model building and training, and scikit-learn for data splitting and some evaluation metrics. The process is divided into the key steps outlined below.


# Import Libraries

---
This section imports the Python libraries required for the project. Key libraries include PyTorch (for deep learning), scikit-learn (for machine learning utilities), pandas (for data manipulation), and several others for image processing, visualization, and progress tracking.


In [None]:
# Standard library
import itertools
import json
import os
import random

# Third-party
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
from PIL import Image
from sklearn.metrics import (auc, classification_report, confusion_matrix,
                             multilabel_confusion_matrix, roc_auc_score, roc_curve)
from sklearn.model_selection import train_test_split
from torch.utils.data import Dataset, DataLoader
from torch.utils.tensorboard import SummaryWriter
from torchvision import datasets, models, transforms
from tqdm.auto import tqdm

# Project-local helper modules
from plot_utils import *
from helper import *
from metrics import *

In [None]:
# Set device to GPU if available, otherwise default to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

Using device: cuda


In [None]:
# Seed every random number generator for reproducibility
seed = 42
np.random.seed(seed)
random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed_all(seed)

# Load Files

---
This step loads the image data and the corresponding labels. The labels are read from a CSV file ('annotations.csv') that lists each image filename along with a presence indicator for each feature. The dataset used for this project is a collection of images of beehive frames with varying features. The images themselves are loaded from the 'images' directory.
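
To make the expected file layout concrete, here is a minimal standard-library sketch of parsing such an annotations file. The three label column names and the image filenames are hypothetical (the real file has 15 label columns); only the 'filename' column name comes from this notebook. Rows with a blank or non-numeric label are skipped, mirroring the cleaning applied later with pandas.

```python
import csv
import io

# Hypothetical excerpt: column names other than 'filename' and all file names
# are made up for illustration.
SAMPLE_CSV = """filename,label_a,label_b,label_c
frame_001.jpg,1,0,1
frame_002.jpg,0,,1
frame_003.jpg,0,1,0
"""

def parse_annotations(text):
    """Return (label_names, filenames, labels), skipping rows with bad labels."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    filenames, labels = [], []
    for row in reader:
        name, raw_values = row[0], row[1:]
        try:
            values = [float(v) for v in raw_values]
        except ValueError:
            continue  # blank or malformed label: drop the whole row
        filenames.append(name)
        labels.append(values)
    return header[1:], filenames, labels

label_names, filenames, labels = parse_annotations(SAMPLE_CSV)
print(label_names)
print(filenames)
print(labels)
```

Each surviving row yields one multi-hot label vector, which is the shape the dataset class below expects.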


In [None]:
data_dir = 'data'
csv_path = os.path.join(data_dir, 'annotations.csv')
img_path = os.path.join(data_dir, 'images')
df = pd.read_csv(csv_path)

# Model Definition and Training

---

A custom dataset class ('CustomImageDataset') is defined to handle the image and label data during training. The pre-trained ResNet-18 model is loaded, its pre-trained layers are frozen, and its final classification layer is replaced to match the number of output labels (15 in this case). The model is then trained using binary cross-entropy loss ('BCEWithLogitsLoss') and an Adam optimizer. The training process is monitored and logged, with metrics recorded at each epoch.
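
For intuition about the loss, the following plain-Python sketch computes binary cross-entropy directly from raw logits for one image with three labels. It is an illustration of the arithmetic, not the PyTorch implementation: it assumes the default mean reduction and skips the numerical-stability tricks the library uses. The logits and targets are made-up values.

```python
import math

def sigmoid(z):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over independent labels, computed from logits."""
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(logits)

# Made-up example: three labels for a single image
logits = [2.0, -1.0, 0.0]
targets = [1.0, 0.0, 1.0]
loss = bce_with_logits(logits, targets)
print(round(loss, 4))
```

Because each label contributes its own term, the model can predict any combination of labels, which is what distinguishes this setup from single-label softmax classification.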

In [None]:
# *******************************************
class CustomImageDataset(Dataset):
    def __init__(self, images, labels, data_transform=None):
        self.images = images
        self.labels = labels
        self.transform = data_transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.images[idx]
        label = self.labels[idx]

        # Apply transforms if provided
        if self.transform:
            image = self.transform(image)

        # Convert label from a list to a tensor
        label = torch.tensor(label, dtype=torch.float32)

        sample = {'Image': image, 'Label': label}

        return sample


In [28]:
# Read in label file (annotations.csv)
df = pd.read_csv(csv_path)
#df = df[:20]


# Convert all labels to floats, NaN if blank
for col in df.columns[1:16]:
    df[col] = pd.to_numeric(df[col], errors='coerce')

# Drop any images with NaNs in label columns
df.dropna(subset=df.columns[:16], inplace=True)
df.reset_index(drop=True, inplace=True)

# Extract ground truth labels for each image in label file (annotations.csv)
labels = []
for index, row in df.iterrows():
    labels.append(list(row.iloc[1:16]))

# Create path for and load in each image in directory
image_names = df['filename'].tolist()
images = []

for img_name in tqdm(image_names):
    image_path = os.path.join(img_path, img_name)
    image = Image.open(image_path)
    if image.mode != 'RGB':
        image = image.convert('RGB')
    images.append(image)

# Split data into train (80%), validation (10%), and test (10%) sets
X_train_val, X_test, y_train_val, y_test = train_test_split(images, labels, test_size=1/10, random_state=seed)
X_train, X_val, y_train, y_val = train_test_split(X_train_val, y_train_val, test_size=1/9, random_state=seed)
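
The second call uses `test_size=1/9` rather than `1/10` because it operates on the 90% that remains after the test set is removed. A quick check of that arithmetic with exact fractions:

```python
from fractions import Fraction

test_frac = Fraction(1, 10)               # first split: 10% held out for testing
remaining = 1 - test_frac                 # 90% left for train + validation
val_frac = remaining * Fraction(1, 9)     # second split: 1/9 of the remainder
train_frac = remaining - val_frac

print(train_frac, val_frac, test_frac)    # 4/5 1/10 1/10
```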


### Feature Distribution on the split sets

In [29]:
feature_distribution_train = pd.DataFrame(np.array(y_train).sum(axis=0), columns=['count'], index=df.columns[1:16])
# at_the_bar(feature_distribution_train['count'], 'Feature Distribution - Training Set', 'feature_distribution_train.png', 'res18', 'train', save_dir)

In [30]:
feature_distribution_val = pd.DataFrame(np.array(y_val).sum(axis=0), columns=['count'], index=df.columns[1:16])
at_the_bar(feature_distribution_val['count'], 'Feature Distribution - Validation Set', 'feature_distribution_val.png', 'res18', 'val', save_dir)

In [31]:
feature_distribution_test = pd.DataFrame(np.array(y_test).sum(axis=0), columns=['count'], index=df.columns[1:16])
at_the_bar(feature_distribution_test['count'], 'Feature Distribution - Test Set', 'feature_distribution_test.png', 'res18', 'test', save_dir)

In [None]:
res18_train_datatransform = transforms.Compose([
    transforms.Resize(size=(224, 224)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1), scale=(0.9, 1.1)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

res18_test_datatransform = transforms.Compose([
    transforms.Resize(size=(224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

res18_triv_aug_trans = transforms.Compose([
    transforms.Resize(size=(224, 224)),
    transforms.TrivialAugmentWide(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

In [32]:
# *********************************
# Create datasets for each split
val_data = CustomImageDataset(X_val, y_val, data_transform=res18_test_datatransform)
test_data = CustomImageDataset(X_test, y_test, data_transform=res18_test_datatransform)

BATCH_SIZE = 32
val_dataloader = DataLoader(val_data, batch_size=BATCH_SIZE, shuffle=False)
test_dataloader = DataLoader(test_data, batch_size=BATCH_SIZE, shuffle=False)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Inherit the pre-trained weights from the ResNet18 model
weights = models.ResNet18_Weights.DEFAULT
model_res18 = models.resnet18(weights=weights).to(device)

# Freeze the pre-trained layers so that only the new head is updated
for param in model_res18.parameters():
    param.requires_grad = False

# Set output_shape variable to the number of labels in dataset
input_shape = model_res18.fc.in_features
output_shape = 15


# Replace the `fc` layer of the pre-trained model with one sized for our labels.
# No sigmoid is added here because BCEWithLogitsLoss applies it internally.
model_res18.fc = torch.nn.Sequential(
    torch.nn.Linear(input_shape, output_shape),
).to(device)

# Loss function and optimizer
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model_res18.parameters(), lr=0.001)

writer = SummaryWriter(log_dir="runs/my_resnet18_run")
params = {'learning_rate': 0.001, 'batch_size': 32, 'epochs':10}



# Hyperparameter Tuning

---

To optimize performance, a hyperparameter search is conducted. Different combinations of batch size, number of epochs, and learning rate are tested, using both the original data transformations and a second pipeline built on 'transforms.TrivialAugmentWide'. Each run is scored on the validation set (the metrics named 'test' in the results come from the validation dataloader). The results of this search are saved to 'hyperparameter_testing_Res18.csv' for analysis. The combination with the highest Hamming accuracy is then saved to 'best_params_res18.json'.
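
As a rough sense of the search size, this sketch enumerates the full grid with `itertools.product`, using the value lists and transform names that appear in the cells of this section. Folding the transform into the same product is a choice made here for illustration; the notebook itself uses an outer loop over transforms.

```python
import itertools

batch_sizes = [16, 32, 64]
epoch_counts = [5, 10]
learning_rates = [0.001, 0.0001, 0.01]
transform_names = ["triv_aug", "res18_transform"]

# Every combination of transform, batch size, epoch count and learning rate
grid = list(itertools.product(transform_names, batch_sizes, epoch_counts, learning_rates))

print(len(grid))  # 36
print(grid[0])    # ('triv_aug', 16, 5, 0.001)
```

With 36 training runs in total, keeping the epoch counts small is what makes an exhaustive grid practical.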

In [None]:
num_epochs = [5, 10]
lrs = [.001, .0001, .01]
batch_size = [16, 32, 64]

hyperparameter_combos = list(itertools.product(batch_size, num_epochs, lrs))
results = []

In [33]:
# Create two datasets, one for each transform
train_data_triv_aug = CustomImageDataset(X_train, y_train, data_transform=res18_triv_aug_trans)
train_data_res18_transform = CustomImageDataset(X_train, y_train, data_transform=res18_train_datatransform)

for transform_name, train_data in [("triv_aug", train_data_triv_aug), ("res18_transform", train_data_res18_transform)]:
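    for batch_size, epoch, lr in hyperparameter_combos:
        # Note: model_res18 is not re-created in this loop, so each run starts
        # from the weights left by the previous run rather than from a fresh head.
        # Create DataLoader based on the current transform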
        train_dataloader = DataLoader(dataset=train_data, batch_size=batch_size, shuffle=True)
        save_dir = f"runs/{transform_name}/lr_{lr}_batch_{batch_size}_epochs_{epoch}"
        os.makedirs(save_dir, exist_ok=True)
        writer = SummaryWriter(log_dir=save_dir)

        params = {'learning_rate': lr, 'batch_size': batch_size, 'epochs': epoch, 'transform': transform_name}
        writer.add_hparams(params, {})

        print(f"Starting training with hyperparameters: lr={lr}, batch_size={batch_size}, epochs={epoch}, transform={transform_name}")

        try:
            train_results = train(model=model_res18,
                                  train_dataloader=train_dataloader,
                                  test_dataloader=val_dataloader,
                                  optimizer=torch.optim.Adam(model_res18.parameters(), lr=lr),
                                  loss_fn=criterion,
                                  epochs=epoch,
                                  device=device,
                                  writer=writer,
                                  params=params)

            final_test_loss = train_results['test_loss'][-1]
            final_test_ham_acc = train_results['test_ham_acc'][-1]
            final_test_zero_one_acc = train_results['test_zero_one_acc'][-1]
            writer.add_hparams(params, {'Final Test Loss': final_test_loss, 'Final Test Hamming Acc': final_test_ham_acc, 'Final Test Zero-One Acc': final_test_zero_one_acc})

            results.append({'transform': transform_name, 'batch_size': batch_size, 'epochs': epoch, 'lr': lr, 'final_test_loss': final_test_loss, 'final_test_ham_acc': final_test_ham_acc, 'final_test_zero_one_acc': final_test_zero_one_acc})

        except Exception as e:
            print(f"An error occurred during training: {e}")
            results.append({'transform': transform_name, 'batch_size': batch_size, 'epochs': epoch, 'lr': lr, 'final_test_loss': float('nan'), 'final_test_ham_acc': float('nan'), 'final_test_zero_one_acc': float('nan')})

        finally:
            writer.close()

results_df = pd.DataFrame(results)
print(results_df)

In [None]:
results_df = pd.DataFrame(results)
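# Convert tensor metrics to plain numbers; failed runs already hold a float NaN
results_df['final_test_zero_one_acc'] = results_df['final_test_zero_one_acc'].apply(lambda x: x.item() if hasattr(x, 'item') else x)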
file_path = os.path.join(save_dir, "hyperparameter_testing_Res18.csv")
results_df.to_csv(file_path, index=False)

In [None]:
best_row = results_df.loc[results_df['final_test_ham_acc'].idxmax()]

best_params = {
    'transform': best_row['transform'],
    'batch_size': best_row['batch_size'].item(),
    'epochs': best_row['epochs'].item(),
    'lr': best_row['lr'].item(),
    'final_test_loss': best_row['final_test_loss'].item(),
    'final_test_ham_acc': best_row['final_test_ham_acc'].item(),
    'final_test_zero_one_acc': best_row['final_test_zero_one_acc'].item()
}

filepath = 'best_params_res18.json'
with open(filepath, 'w') as f:
    json.dump(best_params, f, indent=4)
print(f"Best parameters saved to {filepath}")
print(best_params)

Best parameters saved to best_params_res18.json
{'transform': 'triv_aug', 'batch_size': 16, 'epochs': 5, 'lr': 0.0001, 'final_test_loss': 0.2857619822025299, 'final_test_ham_acc': 0.87458336353302, 'final_test_zero_one_acc': 0.109375}
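
The selection step can also be sketched without pandas. The example below assumes failed runs are recorded with a NaN metric, as in the search loop above, and filters them out before taking the maximum. The result rows and the `val_ham_acc` key are hypothetical.

```python
import json
import math

# Hypothetical results; the metric values are made up for illustration.
results = [
    {"transform": "triv_aug", "batch_size": 16, "epochs": 5, "lr": 0.001, "val_ham_acc": 0.81},
    {"transform": "triv_aug", "batch_size": 32, "epochs": 5, "lr": 0.01, "val_ham_acc": float("nan")},
    {"transform": "res18_transform", "batch_size": 16, "epochs": 10, "lr": 0.0001, "val_ham_acc": 0.86},
]

# Drop failed runs first: NaN compares as False with everything, so leaving
# it in would make the outcome of max() depend on the order of the rows.
completed = [r for r in results if not math.isnan(r["val_ham_acc"])]
best = max(completed, key=lambda r: r["val_ham_acc"])

print(json.dumps(best, indent=4))
```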


In [None]:
# Load the best hyperparameters
filepath = 'best_params_res18.json'
with open(filepath, 'r') as f:
    best_params = json.load(f)

best_transform = best_params['transform']
if best_transform == 'triv_aug':
    train_data = CustomImageDataset(X_train, y_train, data_transform=res18_triv_aug_trans)
elif best_transform == 'res18_transform':
    train_data = CustomImageDataset(X_train, y_train, data_transform=res18_train_datatransform)
else:
    raise ValueError(f"Unknown transform: {best_transform}")

train_dataloader = DataLoader(train_data, batch_size=best_params['batch_size'], shuffle=True)
val_dataloader = DataLoader(val_data, batch_size=best_params['batch_size'], shuffle=False)

In [34]:
# Train with the best hyperparameters
optimizer = torch.optim.Adam(model_res18.parameters(), lr=best_params['lr'])
train_results = train(model=model_res18,
                      train_dataloader=train_dataloader,
                      test_dataloader=val_dataloader,
                      optimizer=optimizer,
                      loss_fn=criterion,
                      epochs=best_params['epochs'],
                      device=device)

### Final Testing on the held-out test set

In [35]:
test_loss, test_ham_acc, test_zero_one_acc = test_step(model=model_res18,
                                                        dataloader=test_dataloader,
                                                        loss_fn=criterion,
                                                        device=device)

test_loss = test_loss.item() if isinstance(test_loss, torch.Tensor) else test_loss
test_ham_acc = test_ham_acc.item() if isinstance(test_ham_acc, torch.Tensor) else test_ham_acc
test_zero_one_acc = test_zero_one_acc.item() if isinstance(test_zero_one_acc, torch.Tensor) else test_zero_one_acc

test_results = {
    'Test Loss': test_loss,
    'Test Hamming Accuracy': test_ham_acc,
    'Test Zero-One Accuracy': test_zero_one_acc
}

json_filepath = os.path.join(save_dir, 'test_results_res18.json')
with open(json_filepath, 'w') as jsonfile:
    json.dump(test_results, jsonfile, indent=4)

print(f"Test results saved to {json_filepath}")
print(f"Test Loss: {test_loss:.4f}, Test Hamming Accuracy: {test_ham_acc:.4f}, Test Zero-One Accuracy: {test_zero_one_acc:.4f}")
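
The two accuracy figures can differ sharply: the best search result above reports a Hamming accuracy near 0.87 against a zero-one accuracy near 0.11. The sketch below shows why, assuming the helper functions follow the usual definitions (Hamming accuracy counts individual label decisions, zero-one accuracy requires the whole label vector to match). The function names and data are illustrative, not taken from the `helper` module.

```python
def hamming_accuracy(y_true, y_pred):
    """Fraction of individual label decisions that match the ground truth."""
    total = sum(len(row) for row in y_true)
    correct = sum(
        t == p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return correct / total

def zero_one_accuracy(y_true, y_pred):
    """Fraction of samples whose entire label vector is predicted exactly."""
    exact = sum(true_row == pred_row for true_row, pred_row in zip(y_true, y_pred))
    return exact / len(y_true)

# Made-up data: four samples with three labels each
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 1]]

print(hamming_accuracy(y_true, y_pred))   # 10 of 12 labels correct
print(zero_one_accuracy(y_true, y_pred))  # 2 of 4 samples fully correct
```

A single wrong label out of 15 costs a sample its zero-one credit while barely moving the Hamming score, so the gap widens as the number of labels grows.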

# Evaluations

---
After training, the model is evaluated on a held-out test set using several metrics. These include:
*   **Accuracy:** Overall correctness of predictions.
*   **Precision:** Proportion of correctly predicted positive labels among all predicted positive labels.
*   **Recall:** Proportion of correctly predicted positive labels among all actual positive labels.
*   **F1-Score:** Harmonic mean of precision and recall.
*   **Confusion Matrix:** Visual representation of model predictions versus ground truth.
*   **ROC AUC:** Area under the ROC curve, measuring the model's ability to distinguish between classes.
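
The evaluation cell below requests micro-averaged precision, recall, and F1. As a sketch of what micro-averaging means in the multi-label setting, the following plain-Python function pools true positives, false positives, and false negatives across every label before computing the ratios. It is an illustration under that standard definition, not the code inside the `metrics` module, and the data is made up.

```python
def micro_prf(y_true, y_pred):
    """Micro-averaged precision, recall and F1: pool counts over all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Made-up data: four samples with three labels each
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 1]]

precision, recall, f1 = micro_prf(y_true, y_pred)
print(precision, recall, f1)
```

Micro-averaging weights every label decision equally, so frequent labels dominate the score; macro-averaging would instead average the per-label scores.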


In [36]:
def how_did_i_do(model, test_dataloader, device):
    """Evaluates the model and computes metrics."""
    model.eval()
    y_true = []
    y_prob = []

    with torch.no_grad():
        for batch in test_dataloader:
            inputs = batch['Image'].to(device)
            labels = batch['Label'].to(device)
            outputs = model(inputs)

            y_true.extend(labels.cpu().numpy())
            probabilities = torch.sigmoid(outputs)
            y_prob.extend(probabilities.cpu().numpy())

    y_true = np.array(y_true)
    y_prob = np.array(y_prob)

    y_pred = (y_prob >= 0.5).astype(int)

    accuracy = compute_accuracy(y_true, y_pred)
    precision = compute_precision(y_true, y_pred, average='micro')
    recall = compute_recall(y_true, y_pred, average='micro')
    f1 = compute_f1_score(y_true, y_pred, average='micro')
    confusion_mat = compute_confusion_matrix(y_true.flatten(), y_pred.flatten())
    roc_auc = compute_roc_auc(y_true.flatten(), y_prob.flatten())
    fpr, tpr, roc_auc_value = compute_roc_curve_data(y_true.flatten(), y_prob.flatten())


    # Print the metrics
    print(f"Accuracy: {accuracy}")
    print(f"Precision: {precision}")
    print(f"Recall: {recall}")
    print(f"F1-Score: {f1}")
    print(f"Confusion Matrix:\n{confusion_mat}")
    print(f"ROC AUC: {roc_auc}")
    print(f"FPR: {fpr}")
    print(f"TPR: {tpr}")
    print(f"ROC AUC Value: {roc_auc_value}")

    return accuracy, precision, recall, f1, confusion_mat, roc_auc, fpr, tpr, roc_auc_value, y_true, y_pred

In [37]:
accuracy, precision, recall, f1, confusion_mat, roc_auc, fpr, tpr, roc_auc_value, y_true, y_pred = how_did_i_do(model_res18, test_dataloader, device)

# Visualizations

---
Finally, visualizations are generated to help interpret the model's performance. A confusion matrix is plotted to show the distribution of correct and incorrect predictions. Plots comparing training and validation loss and accuracy over epochs are also generated, along with ROC curves and their AUC values.


### Confusion Matrix

In [38]:
plot_confusion_matrix(y_true, y_pred, df, 'res18', save_dir)

In [39]:
plot_single_confusion_matrix(model_res18, test_dataloader, device, 'res18', save_dir)

### Training and Validation Comparisons per Metric

In [40]:
plot_training_validation_metrics(save_dir, train_results, 'res18')

### Area Under the Curve (AUC)

In [41]:
plot_roc_curves(save_dir, model_res18, test_dataloader, device, 'res18')

# TensorBoard

In [None]:
# View the full TensorBoard instance inline in Jupyter or Google Colab notebooks
%load_ext tensorboard
%tensorboard --logdir runs

# References
---
1.   **Gallery of transformations:**<br>
      PyTorch.org. "Plot Transforms Illustrations." PyTorch, https://pytorch.org/vision/stable/auto_examples/transforms/plot_transforms_illustrations.html#sphx-glr-auto-examples-transforms-plot-transforms-illustrations-py. Accessed 01 November 2024.
2.   **DataLoaders:**<br>
      PyTorch.org. "torch.utils.data.DataLoader — PyTorch 2.1.0+cu118 documentation." PyTorch, https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader. Accessed 03 November 2024.
3.   **PyTorch Deep Learning (Train and Test Loops):**<br>
      Bourke, Daniel. "engine.py." pytorch-deep-learning, https://github.com/mrdbourke/pytorch-deep-learning/blob/main/going_modular/going_modular/engine.py. Accessed 03 November 2024.
4.   **PyTorch Torchvision Blog Post:**<br>
      PyTorch Blog. "How to Train State-of-the-Art Models Using Torchvision’s Latest Primitives." PyTorch, https://pytorch.org/blog/how-to-train-state-of-the-art-models-using-torchvision-latest-primitives/. Accessed 03 November 2024.
5.   **ResNet-18 Video Tutorial:**<br>
      Indomitable Tech. (2022, June 27). "Implement a PreTrained (ResNet18) CNN Model using PyTorch from scratch on a Kaggle Image Dataset". [Online Video]. YouTube. https://www.youtube.com/watch?v=5rD8f1oiuWM. Accessed 18 November 2024.
6.   **ResNet-18 Tutorial:**<br>
      Srivastava, Gaurav. "Implementing ResNet18 for Image Classification". Kaggle. https://www.kaggle.com/code/ggsri123/implementing-resnet18-for-image-classification. Accessed 18 November 2024.
7.   **TensorBoard Tutorial:**<br>
      TensorFlow. (n.d.). Get started with TensorBoard. https://www.tensorflow.org/tensorboard/get_started. Accessed 22 November 2024.