<a href="https://colab.research.google.com/github/RealBJr/sign-language-classifier/blob/training-initial-steps/model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1) Environment Setup

In [3]:
import torch
import torch.nn as nn
import torchvision.models as models
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader

# 2) Data

## 2.1) Create Datasets

Create a dummy dataset to check that the pipeline works end to end:

In [4]:
class DummyImageDataset(Dataset):
    def __init__(self, num_samples=1000, num_classes=10, image_size=(3, 224, 224)):
        self.num_samples = num_samples
        self.num_classes = num_classes
        self.image_size = image_size

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        image = torch.randn(self.image_size)          # fake image
        label = torch.randint(0, self.num_classes, (1,)).item()  # fake class
        return image, label

## 2.2) Data Splitting

Dummy training and validation splits, used to test that the pipeline works:

In [5]:
dummy_train_dataset = DummyImageDataset(num_samples=500)
dummy_val_dataset   = DummyImageDataset(num_samples=100)

## 2.3) Data Loaders

Dummy data loaders, used to test that the pipeline works:

In [6]:
dummy_train_loader = DataLoader(dummy_train_dataset, batch_size=32, shuffle=True)
dummy_val_loader   = DataLoader(dummy_val_dataset, batch_size=32)
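
To make the batching behaviour concrete, here is a minimal sketch of what a `DataLoader` yields for a dataset that returns `(image, label)` pairs. `TinyDummyDataset` is a hypothetical, smaller stand-in for `DummyImageDataset` (70 samples, 32x32 images) chosen only so the check runs quickly. It is not part of the project code.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class TinyDummyDataset(Dataset):
    """Hypothetical, smaller stand-in for DummyImageDataset."""

    def __init__(self, num_samples=70, num_classes=10, image_size=(3, 32, 32)):
        self.num_samples = num_samples
        self.num_classes = num_classes
        self.image_size = image_size

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        image = torch.randn(self.image_size)                      # fake image
        label = torch.randint(0, self.num_classes, (1,)).item()   # fake class (Python int)
        return image, label


tiny_loader = DataLoader(TinyDummyDataset(), batch_size=32)

# Collect the shape of every batch: 70 samples with batch_size=32
# give two full batches and one final batch of 6 samples.
batch_shapes = [
    (tuple(images.shape), tuple(labels.shape)) for images, labels in tiny_loader
]

# The default collate function stacks images and turns the int labels
# into a single integer tensor, which is what nn.CrossEntropyLoss expects.
first_images, first_labels = next(iter(tiny_loader))
```

Because `drop_last` defaults to `False`, the last batch is smaller than `batch_size`, so any per-batch averaging should weight by the actual batch size.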

## 2.4) Data Preprocessing

# 3) Training Procedure

## 3.1) Load Models

For our project, we are required to compare the performance of pre-trained models fine-tuned with transfer learning against models trained from scratch. Specifically, we will train:

- 3 models using Transfer learning
- 9 models from scratch

Note: All models will use similar hyperparameters. The 9 from-scratch models come from combining **3 different datasets** with **3 different neural network architectures**.

### 3.1.1) Pre-trained models

Pre-trained models are expected to reach better accuracy with less training, because they start from general visual features already learned on ImageNet instead of learning them all over again. Pre-trained models are available [here](https://docs.pytorch.org/vision/main/models.html).

In [7]:
model_resnet_tl = models.resnet18(weights="IMAGENET1K_V1")
model_mobilenet_tl = models.mobilenet_v2(weights="IMAGENET1K_V1")
model_vgg_tl = models.vgg16(weights="IMAGENET1K_V1")

To get the most benefit out of pre-trained models, one strategy involves **"freezing"** the feature extractor layers. Gradients are then not computed for the layers that encode general-purpose computer vision features, so their weights are not updated. As a result, **training focuses on updating the classifier layer** instead of re-learning everything related to image recognition.

Note: This code freezes the entire network. Later on, in **3.2) Classifier Head Replacement**, the classifier layer is replaced by a new layer whose parameters are trainable (unfrozen).

In [8]:
for param in model_resnet_tl.parameters():
    param.requires_grad = False

for param in model_mobilenet_tl.parameters():
    param.requires_grad = False

for param in model_vgg_tl.parameters():
    param.requires_grad = False
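
A quick way to confirm what freezing does is to count trainable parameters before and after swapping the last layer. The sketch below uses a small hypothetical stand-in network (`toy_model`) rather than the real ResNet, MobileNet, or VGG models, and `count_parameters` is a helper introduced here for illustration only.

```python
import torch.nn as nn


def count_parameters(model):
    """Return (trainable, total) parameter counts for a model."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total


# Hypothetical stand-in for a pre-trained network: a "feature extractor"
# layer followed by a 4-class "head".
toy_model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 4),
)

# Freeze everything, exactly as done for the transfer learning models.
for param in toy_model.parameters():
    param.requires_grad = False
frozen_counts = count_parameters(toy_model)

# A freshly constructed layer has requires_grad=True by default, so
# replacing the head is what "unfreezes" it.
toy_model[-1] = nn.Linear(16, 26)
unfrozen_counts = count_parameters(toy_model)
```

After the replacement, only the new head's weights and biases are trainable, while the first layer stays frozen.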

#### Debug and Test

In [24]:
# print(model_resnet_tl)
# print(model_mobilenet_tl)
# print(model_vgg_tl)

# Check if all weights and biases (aka parameters) do not require gradients
# for name, param in model_resnet_tl.named_parameters():
#     print(name, param.requires_grad)

### 3.1.2) Models from scratch

In [9]:
# To be trained on Dataset A
model_resnet_A = models.resnet18(weights=None)
model_mobilenet_A = models.mobilenet_v2(weights=None)
model_vgg_A = models.vgg16(weights=None)

# To be trained on Dataset B
model_resnet_B = models.resnet18(weights=None)
model_mobilenet_B = models.mobilenet_v2(weights=None)
model_vgg_B = models.vgg16(weights=None)

# To be trained on Dataset C
model_resnet_C = models.resnet18(weights=None)
model_mobilenet_C = models.mobilenet_v2(weights=None)
model_vgg_C = models.vgg16(weights=None)

#### Debug and Test

In [27]:
# print(model_resnet_A)
# print(model_mobilenet_A)
# print(model_vgg_A)

# print(model_resnet_B)
# print(model_mobilenet_B)
# print(model_vgg_B)

# print(model_resnet_C)
# print(model_mobilenet_C)
# print(model_vgg_C)

### 3.1.3) Create a Models Dictionary

In [10]:
# Reuse the instances created above instead of building new ones.
# New instances of the transfer learning models would not be frozen,
# because the freezing in 3.1.1 only applies to the objects it looped over.
models_dict = {
    # Dataset A: from scratch
    "resnet_A": model_resnet_A,
    "mobilenet_A": model_mobilenet_A,
    "vgg_A": model_vgg_A,

    # Dataset B: from scratch
    "resnet_B": model_resnet_B,
    "mobilenet_B": model_mobilenet_B,
    "vgg_B": model_vgg_B,

    # Dataset C: from scratch
    "resnet_C": model_resnet_C,
    "mobilenet_C": model_mobilenet_C,
    "vgg_C": model_vgg_C,

    # Transfer learning models (frozen in 3.1.1)
    "resnet_tl": model_resnet_tl,
    "mobilenet_tl": model_mobilenet_tl,
    "vgg_tl": model_vgg_tl,
}

## 3.2) Classifier Head Replacement

Replacing the classifier head means tailoring the last layer(s) of the network to the number of classes the classifier has to predict.

In [11]:
# Utility function to replace the classifier (aka head) of a model
def replace_classifier(model, num_classes):
    if hasattr(model, "fc"):  # ResNet
        model.fc = nn.Linear(model.fc.in_features, num_classes)

    elif hasattr(model, "classifier"):
        # MobileNetV2 or VGG16
        if isinstance(model.classifier, nn.Sequential):
            last_layer = model.classifier[-1]
            model.classifier[-1] = nn.Linear(last_layer.in_features, num_classes)
        else:
            raise ValueError("Unknown classifier structure")

    else:
        raise ValueError("Unsupported model architecture")

    return model

In [12]:
num_classes = 26

for name, model in models_dict.items():
    models_dict[name] = replace_classifier(model, num_classes)

### Debug and Test

In [None]:
# images, labels = next(iter(dummy_train_loader))

# print("--- Output Shape Check ---")
# with torch.no_grad():
#     for name, model in models_dict.items():
#         model.eval()
#         outputs = model(images)
#         print(f"{name}: {outputs.shape}")
#         del outputs

## 3.3) Optimizer + Loss Function Setup

The optimizer is the algorithm used to update the learnable parameters. When passing parameters to the optimizer, we need to make sure we pass only the unfrozen ones, i.e. those that can be updated.

In [14]:
# Loss function
criterion = nn.CrossEntropyLoss()

# Optimizers for our models
optimizers = {}
for name, model in models_dict.items():
    optimizers[name] = optim.AdamW(
        filter(lambda p: p.requires_grad, model.parameters()),
        lr=0.001,
        weight_decay=1e-4
    )
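
As a sanity check on the filtering idea, the sketch below builds an optimizer for a small hypothetical network (`toy_model`, not one of the project models) whose first layer is frozen, and inspects which tensors the optimizer actually holds.

```python
import torch.nn as nn
import torch.optim as optim

# Hypothetical stand-in: first layer plays the frozen feature extractor,
# last layer plays the trainable head.
toy_model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 4),
)
for param in toy_model[0].parameters():
    param.requires_grad = False

# Keep only the parameters that can be updated. A list is used here
# instead of filter(...) so it can be inspected after the optimizer
# has consumed it.
trainable_params = [p for p in toy_model.parameters() if p.requires_grad]

toy_optimizer = optim.AdamW(trainable_params, lr=0.001, weight_decay=1e-4)

# The optimizer stores its parameters in param_groups; with a single
# group, this is the head's weight and bias only.
num_tensors_in_optimizer = len(toy_optimizer.param_groups[0]["params"])
```

For the from-scratch models nothing is frozen, so the same filter simply passes every parameter through.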

### Debug and Test

In [None]:
# images, labels = next(iter(dummy_train_loader))

# print("--- Loss Sanity Check ---")
# with torch.no_grad():
#     for name, model in models_dict.items():
#         model.eval() # Set to evaluation mode
#         outputs = model(images)
#         loss = criterion(outputs, labels)
#         print(f"{name}: loss = {loss.item():.4f}")
#         # Clear large objects from memory
#         del outputs

# print("\n--- Optimizers Loaded ---")
# print(f"Total optimizers: {len(optimizers)}")

## 3.4) Forward Pass and Loss Calculation

This function is meant to be used within the training loop to calculate the loss for a given model.

In [17]:
def calculate_loss(model, images, labels):
    outputs = model(images)
    loss = criterion(outputs, labels)
    return loss, outputs
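
To show where a forward pass and loss calculation sit inside a training loop, here is a minimal sketch of a single training step. `train_step` is a hypothetical helper, not part of the notebook. It takes the criterion and optimizer as arguments instead of relying on globals, and it runs on a tiny stand-in model with random data.

```python
import torch
import torch.nn as nn
import torch.optim as optim


def train_step(model, images, labels, criterion, optimizer):
    """Run one optimization step and return the batch loss as a float."""
    model.train()                      # enable training-mode layers (dropout, batch norm)
    optimizer.zero_grad()              # clear gradients from the previous step
    outputs = model(images)            # forward pass
    loss = criterion(outputs, labels)  # loss calculation
    loss.backward()                    # backpropagation
    optimizer.step()                   # weight update
    return loss.item()


torch.manual_seed(0)

# Hypothetical stand-in model: flatten a small image and classify into 26 classes.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 26))
toy_criterion = nn.CrossEntropyLoss()
toy_optimizer = optim.AdamW(toy_model.parameters(), lr=0.001, weight_decay=1e-4)

toy_images = torch.randn(4, 3, 8, 8)
toy_labels = torch.randint(0, 26, (4,))

weights_before = toy_model[1].weight.detach().clone()
step_loss = train_step(toy_model, toy_images, toy_labels, toy_criterion, toy_optimizer)
weights_changed = not torch.equal(weights_before, toy_model[1].weight)
```

Calling `zero_grad()` before the forward pass matters because PyTorch accumulates gradients across `backward()` calls by default.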

### Debug and Test

In [None]:
test_images, test_labels = next(iter(dummy_train_loader))
test_model = models_dict['resnet_A']
test_model.eval()

with torch.no_grad():
    loss_value, outputs_value = calculate_loss(test_model, test_images, test_labels)

print(f'Test Loss: {loss_value.item():.4f}')
print(f'Output Shape: {outputs_value.shape}')

## 3.5) Backpropagation

## 3.6) Weight Update Step

## 3.7) Validation Phase
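
One possible shape for this phase, as a sketch under stated assumptions rather than the project's final implementation: `evaluate` is a hypothetical helper that computes the average loss and accuracy over a loader with gradients disabled. The model and data below are small random stand-ins.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader, criterion):
    """Return (average loss, accuracy) over all samples in the loader."""
    model.eval()  # evaluation mode for dropout / batch norm
    total_loss, correct, seen = 0.0, 0, 0
    with torch.no_grad():  # no gradients needed for validation
        for images, labels in loader:
            outputs = model(images)
            batch_size = labels.size(0)
            # criterion returns the batch mean, so weight it by the batch size
            total_loss += criterion(outputs, labels).item() * batch_size
            correct += (outputs.argmax(dim=1) == labels).sum().item()
            seen += batch_size
    return total_loss / seen, correct / seen


torch.manual_seed(0)

# Hypothetical stand-in data and model: 20 random "images", 26 classes.
toy_val_data = TensorDataset(torch.randn(20, 3, 8, 8), torch.randint(0, 26, (20,)))
toy_val_loader = DataLoader(toy_val_data, batch_size=8)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 26))

val_loss, val_accuracy = evaluate(toy_model, toy_val_loader, nn.CrossEntropyLoss())
```

Weighting each batch loss by its size keeps the average correct when the last batch is smaller than the others.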

## 3.8) Optimization

## 3.9) Train and Save Models

# 4) Model Evaluation & Analysis