<img src = "https://github.com/exponentialR/DL4CV/blob/main/media/BMC_Summer_Course_Deep_Learning_for_Computer_Vision.jpg?raw=true" alt='BMC Summer Course' width='300'/>

# BMC Summer Course: Deep Learning for Computer Vision
## Walkthrough Exercise on ResNet-18 with a Custom Dataset

Author: Samuel A.


## **1. Introduction**

In this exercise, you will use the ResNet-18 model we built from scratch in `2.1` to classify images into three categories: cats, dogs, and pandas. You will see how to take a predefined model architecture, prepare a dataset for it, train the model, and evaluate its performance.


### **What You Will Learn:**
- Importing and instantiating the ResNet-18 model.
- Preparing a custom dataset.
- Splitting the dataset into training and validation sets.
- Training the ResNet-18 model.
- Evaluating model performance and visualizing results.
- Using TensorBoard to track metrics during training.

## **2. Setup and Preparation**

### **2.1 Import Libraries**

First, we will download the ResNet-18 code from `2.1` and import it, along with the other libraries we need: PyTorch for building and training the model, torchvision for handling image datasets, and TensorBoard for visualizing the training process.

In [24]:
!pip3 install tensorboard gdown

In [8]:
import requests

In [28]:
# Download the ResNet-18 model code
resnet_url = 'https://github.com/exponentialR/DL4CV/raw/main/resnet18.py'  # Make sure to use the raw link for files
response = requests.get(resnet_url)
response.raise_for_status()  # Fail early if the download did not succeed

with open('resnet18.py', 'wb') as file:
    file.write(response.content)


In [30]:
# Download the utils code
utils_url = 'https://github.com/exponentialR/DL4CV/raw/main/utils.py'
response = requests.get(utils_url)
with open('utils.py', 'wb') as file:
    file.write(response.content)

In [44]:
# Import the model architecture
from resnet18 import ResNet18
from utils import list_files
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torchvision import datasets, transforms
from torch.utils.data import DataLoader, random_split
from torch.utils.tensorboard import SummaryWriter
import matplotlib.pyplot as plt
import numpy as np
import os
import gdown
import zipfile

### **2.2 Check Device**
We will check whether a GPU is available. If not, we will use the CPU.


In [14]:
# Set device to GPU if available, otherwise use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f'Using device: {device}')


Using device: cpu


## 3. Data Preparation 
### 3.1 Download the Dataset
Let's download the dataset and unzip it.

In [22]:
url = 'https://drive.google.com/uc?id=1qw-Vj5IvLIK3Uqzr2if-xSzuzIqEgEuM'
output = 'animals.zip'

gdown.download(url, output, quiet=False)


Downloading...
From (original): https://drive.google.com/uc?id=1qw-Vj5IvLIK3Uqzr2if-xSzuzIqEgEuM
From (redirected): https://drive.google.com/uc?id=1qw-Vj5IvLIK3Uqzr2if-xSzuzIqEgEuM&confirm=t&uuid=f1b7fbbd-1908-49cb-9bc1-d7dedac05af1
To: C:\Users\sadebayo\OneDrive - Belfast Metropolitan College\Castlereagh\BMC-Summer Course\Deep Learning for CV\Day-2\animals.zip
100%|██████████| 197M/197M [00:05<00:00, 35.7MB/s] 


'animals.zip'

In [23]:
with zipfile.ZipFile('animals.zip', 'r') as zip_ref:
    zip_ref.extractall('animals')

### 3.2 Dataset Structure
The dataset is organized into three folders: `cats`, `dogs`, and `panda`. Each folder contains the images for that category.



In [36]:

list_files('animals', limit=5)

├── animals/
│   ├── cats/
│   │   ├── cats_00001.jpg
│   │   ├── cats_00002.jpg
│   │   ├── cats_00003.jpg
│   │   ├── cats_00004.jpg
│   │   ├── cats_00005.jpg
│   ├── dogs/
│   │   ├── dogs_00001.jpg
│   │   ├── dogs_00002.jpg
│   │   ├── dogs_00003.jpg
│   │   ├── dogs_00004.jpg
│   │   ├── dogs_00005.jpg
│   ├── panda/
│   │   ├── panda_00001.jpg
│   │   ├── panda_00002.jpg
│   │   ├── panda_00003.jpg
│   │   ├── panda_00004.jpg
│   │   ├── panda_00005.jpg
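
Before training, it can help to confirm that each class folder holds a similar number of images. The sketch below is one way to do that with the standard library. `count_images_per_class` is a hypothetical helper (it is not part of the course `utils.py`), the list of extensions is an assumption, and the demo runs on a temporary folder tree rather than the real dataset.

```python
import tempfile
from pathlib import Path

def count_images_per_class(root, extensions=(".jpg", ".jpeg", ".png")):
    """Return {class_folder_name: number_of_image_files} for each subfolder of root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in extensions
            )
    return counts

# Self-contained demo on a temporary folder tree (not the real dataset)
with tempfile.TemporaryDirectory() as tmp:
    for name, n in [("cats", 3), ("dogs", 2)]:
        Path(tmp, name).mkdir()
        for i in range(n):
            Path(tmp, name, f"{name}_{i:05d}.jpg").touch()
    demo_counts = count_images_per_class(tmp)

print(demo_counts)
```

On the course dataset, the call would be `count_images_per_class('animals/animals')`, assuming the archive was extracted as shown above.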


### 3.3 Define Transformations
We will resize the images to 224x224 pixels (the standard input size for ResNet-18), convert them to tensors, and normalize each channel using the ImageNet mean and standard deviation.

In [38]:
# Define transformations for training and validation datasets
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
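
`transforms.Normalize(mean, std)` applies `(x - mean) / std` to each channel, after `ToTensor()` has scaled the pixel values to the range [0, 1]. The plain-Python sketch below works through that arithmetic for a single pixel and also shows the inverse operation, which is useful when displaying normalized images. The function names are illustrative and are not torchvision APIs.

```python
MEAN = [0.485, 0.456, 0.406]  # per-channel mean (R, G, B), as in the transform above
STD = [0.229, 0.224, 0.225]   # per-channel standard deviation

def normalize_pixel(rgb):
    """Apply (x - mean) / std per channel to one pixel with values in [0, 1]."""
    return [(x - m) / s for x, m, s in zip(rgb, MEAN, STD)]

def denormalize_pixel(rgb):
    """Invert the normalization: x * std + mean per channel."""
    return [x * s + m for x, m, s in zip(rgb, MEAN, STD)]

white = [1.0, 1.0, 1.0]                   # a pure white pixel after ToTensor()
normalized = normalize_pixel(white)       # values now lie outside [0, 1]
restored = denormalize_pixel(normalized)  # back to the original pixel

print([round(v, 3) for v in normalized])
print([round(v, 3) for v in restored])
```

Because normalized values fall outside [0, 1], an image must be denormalized before plotting, otherwise its colors appear distorted.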


### 3.4 Load and Split the Dataset
We will load the dataset using the ImageFolder class, which automatically assigns labels based on folder names.

In [45]:
# Load the dataset
data_dir = 'animals/animals'
full_dataset = datasets.ImageFolder(root=data_dir, transform=transform)

# Split the dataset into training (80%) and validation (20%) sets
train_size = int(0.8 * len(full_dataset))
val_size = len(full_dataset) - train_size
train_dataset, val_dataset = random_split(full_dataset, [train_size, val_size])

# Create DataLoader objects for training and validation sets
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)
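
`random_split` produces a different split each time the cell runs unless its random generator is seeded; it accepts a `generator` argument for that purpose. The plain-Python sketch below mirrors the size arithmetic above and shows how a fixed seed makes a shuffled index split repeatable. `split_indices` is a hypothetical helper, and the dataset size of 3000 is a made-up number for illustration.

```python
import random

def split_indices(n_samples, train_fraction=0.8, seed=42):
    """Shuffle indices 0..n_samples-1 with a fixed seed and split them in two."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    train_size = int(train_fraction * n_samples)  # same arithmetic as the cell above
    return indices[:train_size], indices[train_size:]

train_idx, val_idx = split_indices(3000)  # 3000 is a hypothetical dataset size
print(len(train_idx), len(val_idx))
```

In PyTorch, the equivalent is to pass `generator=torch.Generator().manual_seed(42)` to `random_split`, so that the validation set stays the same across runs.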


## 4. Implementing ResNet-18

### 4.1 Instantiate the Model (ResNet-18)

In [46]:
model = ResNet18().to(device)

### 4.2 Define Loss Function and Optimizer
We will use Cross-Entropy Loss and the Adam optimizer for training the model.

In [48]:
# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
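
`nn.CrossEntropyLoss` expects raw, unnormalized scores (logits) and integer class labels. It applies log-softmax internally, so the model should not end with a softmax layer. The sketch below computes the same quantity for a single sample in plain Python; the logits are made-up values.

```python
import math

def cross_entropy(logits, label):
    """Negative log of the softmax probability assigned to the true class."""
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - max_logit) for z in logits]
    prob_true = exps[label] / sum(exps)
    return -math.log(prob_true)

logits = [2.0, 0.5, -1.0]                    # hypothetical scores for three classes
loss_correct = cross_entropy(logits, 0)      # true class has the highest score
loss_wrong = cross_entropy(logits, 2)        # true class has the lowest score
loss_uniform = cross_entropy([0.0, 0.0, 0.0], 1)  # model with no preference

print(f"{loss_correct:.4f} {loss_wrong:.4f} {loss_uniform:.4f}")
```

With three classes, a model that assigns equal probability to every class has a loss of ln(3), about 1.0986, which is a useful baseline when reading the training loss.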


## 5. Training the Model 
### 5.1 Set Up TensorBoard
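We will set up TensorBoard to visualize training metrics such as loss and accuracy. The logs are written to `runs/ResNet_exercise`; to view them, run `tensorboard --logdir runs` in a terminal.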

In [49]:
# Set up TensorBoard
writer = SummaryWriter('runs/ResNet_exercise')

### 5.2 Training Loop
We will train the model for a fixed number of epochs, tracking the loss and accuracy. These metrics will be logged to TensorBoard after every epoch.

In [50]:
num_epochs = 10
for epoch in range(num_epochs):  # Number of epochs
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)  # Move to device

        optimizer.zero_grad()

        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

        # Calculate accuracy
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    epoch_loss = running_loss / len(train_loader)
    epoch_acc = 100 * correct / total

    # Log the metrics to TensorBoard
    writer.add_scalar('Training Loss', epoch_loss, epoch)
    writer.add_scalar('Training Accuracy', epoch_acc, epoch)

    print(f'Epoch {epoch + 1}/{num_epochs}, Loss: {epoch_loss:.4f}, Accuracy: {epoch_acc:.2f}%')


Epoch 1/10, Loss: 0.9395, Accuracy: 57.29%


KeyboardInterrupt: 
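
The loop above averages the per-batch mean losses, which gives a smaller final batch the same weight as a full one. The difference is usually small, but a sample-weighted average is exact. A plain-Python sketch with made-up batch losses and sizes:

```python
def batch_mean(batch_losses):
    """Average of the per-batch mean losses (what the loop above computes)."""
    return sum(batch_losses) / len(batch_losses)

def sample_weighted_mean(batch_losses, batch_sizes):
    """Average loss per sample: weight each batch mean by its batch size."""
    total = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)

losses = [1.0, 1.0, 2.0]  # hypothetical mean loss of each batch
sizes = [32, 32, 8]       # the last batch is smaller than the others

unweighted = batch_mean(losses)
weighted = sample_weighted_mean(losses, sizes)
print(f"{unweighted:.4f} {weighted:.4f}")
```

In the training loop, this would correspond to accumulating `loss.item() * inputs.size(0)` and dividing by `total` at the end of the epoch.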

## 6. Evaluating the Model
After training, we will evaluate the model on the validation dataset to measure its performance.

In [None]:
model.eval()
correct = 0
total = 0
with torch.no_grad():
    for inputs, labels in val_loader:
        inputs, labels = inputs.to(device), labels.to(device)  # Move to device
        outputs = model(inputs)
        _, predicted = torch.max(outputs, 1)  # .data is unnecessary inside torch.no_grad()
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

val_acc = 100 * correct / total

# Log the validation accuracy
writer.add_scalar('Validation Accuracy', val_acc, num_epochs)

print(f'Accuracy of the network on the validation images: {val_acc:.2f}%')
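
Overall accuracy can hide a class that the model handles poorly. A confusion matrix counts, for each true class, how often each class was predicted. The sketch below is plain Python and assumes the predictions and labels have been collected into lists of integer class indices; the example values are invented.

```python
def confusion_matrix(labels, preds, num_classes):
    """matrix[t][p] = number of samples with true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(labels, preds):
        matrix[t][p] += 1
    return matrix

def per_class_accuracy(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]

labels = [0, 0, 1, 1, 2, 2, 2]  # hypothetical true classes
preds = [0, 1, 1, 1, 2, 0, 2]   # hypothetical predictions

cm = confusion_matrix(labels, preds, num_classes=3)
acc = per_class_accuracy(cm)
print(cm)
print(acc)
```

Inside the evaluation loop, the two lists could be filled with `labels.tolist()` and `predicted.tolist()` for each batch.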


## 7. Visualizing Results
We will visualize some validation images along with their predicted and actual labels to see how well the model is performing.

In [None]:
# Function to visualize some validation images and predictions
def visualize_predictions(model, data_loader):
    model.eval()
    data_iter = iter(data_loader)
    images, labels = next(data_iter)  # iterators have no .next() method in Python 3
    images, labels = images.to(device), labels.to(device)
    
    with torch.no_grad():
        outputs = model(images)
        _, preds = torch.max(outputs, 1)
    
    # Mean and std used in transforms.Normalize, needed to undo the normalization
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    
    # Plot some images with their predictions
    fig, axes = plt.subplots(1, 5, figsize=(15, 3))
    for i in range(5):
        img = np.transpose(images[i].cpu().numpy(), (1, 2, 0))  # CHW -> HWC
        img = np.clip(img * std + mean, 0, 1)  # back to the [0, 1] range for display
        axes[i].imshow(img)
        axes[i].set_title(f'Pred: {preds[i].item()}, True: {labels[i].item()}')
        axes[i].axis('off')
    plt.show()

# Visualize the results
visualize_predictions(model, val_loader)
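
The integer labels and predictions used throughout are class indices. `ImageFolder` sorts the class folder names alphabetically and numbers them from 0, and it exposes the sorted names as `full_dataset.classes`. The plain-Python sketch below reproduces that mapping for the folder names in this dataset.

```python
folders = ["panda", "cats", "dogs"]  # folder names, listed in arbitrary order

classes = sorted(folders)                                   # alphabetical order
class_to_idx = {name: i for i, name in enumerate(classes)}  # name -> index

pred_index = 2                      # a hypothetical predicted index
pred_name = classes[pred_index]     # look up the readable class name

print(classes)
print(class_to_idx)
print(pred_name)
```

In the plot titles, `full_dataset.classes[preds[i].item()]` would show the class name instead of the index.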


## 8. Saving and Loading the Model
After training, it’s important to save the model so that you can load it later for inference or further training.

In [None]:
# Save the model
torch.save(model.state_dict(), 'resnet18_cats_dogs_pandas.pth')

# To load the model
# model.load_state_dict(torch.load('resnet18_cats_dogs_pandas.pth'))
# model.eval()


## 9. Conclusion and Further Work
### 9.1 Summary
In this exercise, we built and trained a ResNet-18 model from scratch to classify images of cats, dogs, and pandas. We went through the entire process from implementing the architecture to data preparation, model training, evaluation, and visualization.

### 9.2 Further Work
- Data Augmentation: Experiment with different data augmentation techniques to improve the model's performance.
- Hyperparameter Tuning: Try different learning rates, batch sizes, and optimizers.