<a href="https://colab.research.google.com/github/thestbobo/DLML-Labs/blob/main/MLDL_Lab02.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab 02: Training a Custom Model


**Objective of this lab**: training a small custom model on the Tiny-ImageNet dataset.

## Dataset preparation

In [35]:
!wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
!unzip tiny-imagenet-200.zip -d tiny-imagenet

Streaming output truncated to the last 5000 lines.
  inflating: tiny-imagenet/tiny-imagenet-200/val/images/val_3979.JPEG  
  inflating: tiny-imagenet/tiny-imagenet-200/val/images/val_3963.JPEG  
  inflating: tiny-imagenet/tiny-imagenet-200/val/images/val_7199.JPEG  
  inflating: tiny-imagenet/tiny-imagenet-200/val/images/val_2752.JPEG  
  inflating: tiny-imagenet/tiny-imagenet-200/val/images/val_9687.JPEG  
  inflating: tiny-imagenet/tiny-imagenet-200/val/images/val_9407.JPEG  
  inflating: tiny-imagenet/tiny-imagenet-200/val/images/val_3603.JPEG  
  inflating: tiny-imagenet/tiny-imagenet-200/val/images/val_3412.JPEG  
  inflating: tiny-imagenet/tiny-imagenet-200/val/images/val_6982.JPEG  
  inflating: tiny-imagenet/tiny-imagenet-200/val/images/val_8496.JPEG  
  inflating: tiny-imagenet/tiny-imagenet-200/val/images/val_7332.JPEG  
  inflating: tiny-imagenet/tiny-imagenet-200/val/images/val_9241.JPEG  
  inflating: tiny-imagenet/tiny-imagenet-200/val/images/val_4196.JPEG  

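`ImageFolder` expects one subdirectory per class (`root/<class>/<image>`), but the val split keeps all images in a single `images/` folder and lists their labels in `val_annotations.txt`. We therefore copy each validation image into a folder named after its class.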

In [36]:
import os
import shutil

val_dir = 'tiny-imagenet/tiny-imagenet-200/val'

# Each line of val_annotations.txt holds: file name, class id, then bounding-box fields
with open(f'{val_dir}/val_annotations.txt') as f:
    for line in f:
        fn, cls, *_ = line.split('\t')
        os.makedirs(f'{val_dir}/{cls}', exist_ok=True)
        # Copy the image into the folder of its class
        shutil.copyfile(f'{val_dir}/images/{fn}', f'{val_dir}/{cls}/{fn}')

# Remove the flat images/ folder so that ImageFolder does not treat it as a class
shutil.rmtree(f'{val_dir}/images')
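
One way to sanity-check the restructured split is to count the files in each class folder. The sketch below is an illustration rather than part of the original lab: `count_images_per_class` is a hypothetical helper, and the demo builds a toy directory (with made-up class names) so that it runs without the dataset.

```python
import os
import tempfile

def count_images_per_class(root):
    """Return {class_dir_name: number_of_files} for an ImageFolder-style root."""
    counts = {}
    for entry in sorted(os.scandir(root), key=lambda e: e.name):
        if entry.is_dir():
            counts[entry.name] = sum(1 for f in os.scandir(entry.path) if f.is_file())
    return counts

# Toy layout standing in for tiny-imagenet/tiny-imagenet-200/val (names are made up)
with tempfile.TemporaryDirectory() as root:
    for cls, n in [("class_a", 3), ("class_b", 2)]:
        os.makedirs(os.path.join(root, cls))
        for i in range(n):
            open(os.path.join(root, cls, f"img_{i}.JPEG"), "w").close()
    toy_counts = count_images_per_class(root)

print(toy_counts)
```

On the real split, calling the helper on `tiny-imagenet/tiny-imagenet-200/val` would be expected to report 200 class folders with 50 images each, in line with the counts printed later in this lab.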

In [44]:
from torchvision.datasets import ImageFolder
import torchvision.transforms as T

transform = T.Compose([
    T.Resize((64, 64)),  # Resize to fit the input dimensions of the network
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),  # ImageNet channel statistics
])

# ImageFolder expects the layout root/{classX}/x001.jpg

tiny_imagenet_dataset_train = ImageFolder(root='tiny-imagenet/tiny-imagenet-200/train', transform=transform)
tiny_imagenet_dataset_val = ImageFolder(root='tiny-imagenet/tiny-imagenet-200/val', transform=transform)
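
To make the effect of the transform concrete: `T.ToTensor()` scales pixel values to [0, 1], and `T.Normalize` then applies `(x - mean) / std` to each channel. The following is a minimal pure-Python sketch of that arithmetic for a single pixel; `normalize_pixel` is a hypothetical helper introduced only for illustration.

```python
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Apply (x - mean) / std per channel to one RGB pixel with values in [0, 1]."""
    return [(x - m) / s for x, m, s in zip(rgb, mean, std)]

black = normalize_pixel([0.0, 0.0, 0.0])  # darkest possible input
white = normalize_pixel([1.0, 1.0, 1.0])  # brightest possible input

print([round(v, 3) for v in black])
print([round(v, 3) for v in white])
```

After normalization the inputs are roughly centered around zero, spanning about -2.1 to 2.6 depending on the channel.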

In [45]:
print(f"Length of train dataset: {len(tiny_imagenet_dataset_train)}")
print(f"Length of val dataset: {len(tiny_imagenet_dataset_val)}")

# Count the samples per class from the stored labels (avoids loading every image)
from collections import Counter

class_counts = Counter(tiny_imagenet_dataset_val.targets)
for class_label, count in class_counts.items():
    print(f"Class {class_label}: {count} entries")


Length of train dataset: 100000
Length of val dataset: 10000
Class 0: 50 entries
Class 1: 50 entries
Class 2: 50 entries
Class 3: 50 entries
Class 4: 50 entries
Class 5: 50 entries
Class 6: 50 entries
Class 7: 50 entries
Class 8: 50 entries
Class 9: 50 entries
Class 10: 50 entries
Class 11: 50 entries
Class 12: 50 entries
Class 13: 50 entries
Class 14: 50 entries
Class 15: 50 entries
Class 16: 50 entries
Class 17: 50 entries
Class 18: 50 entries
Class 19: 50 entries
Class 20: 50 entries
Class 21: 50 entries
Class 22: 50 entries
Class 23: 50 entries
Class 24: 50 entries
Class 25: 50 entries
Class 26: 50 entries
Class 27: 50 entries
Class 28: 50 entries
Class 29: 50 entries
Class 30: 50 entries
Class 31: 50 entries
Class 32: 50 entries
Class 33: 50 entries
Class 34: 50 entries
Class 35: 50 entries
Class 36: 50 entries
Class 37: 50 entries
Class 38: 50 entries
Class 39: 50 entries
Class 40: 50 entries
Class 41: 50 entries
Class 42: 50 entries
Class 43: 50 entries
Class 44: 50 entries
...

In [46]:
import torch
train_loader = torch.utils.data.DataLoader(tiny_imagenet_dataset_train, batch_size=32, shuffle=True, num_workers=8)
val_loader = torch.utils.data.DataLoader(tiny_imagenet_dataset_val, batch_size=32, shuffle=False)
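
As a quick check on what these loaders produce: with the default `drop_last=False`, a `DataLoader` yields `ceil(num_samples / batch_size)` batches per epoch. The sketch below reproduces that arithmetic in plain Python; `num_batches` is a hypothetical helper, not a PyTorch function.

```python
import math

def num_batches(num_samples, batch_size, drop_last=False):
    """Number of batches per epoch for a given dataset size and batch size."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)

train_batches = num_batches(100_000, 32)  # train split: 100000 images
val_batches = num_batches(10_000, 32)     # val split: 10000 images
last_val_batch = 10_000 - (val_batches - 1) * 32  # size of the final, partial batch

print(train_batches, val_batches, last_val_batch)
```

The training count matches the `3125` shown in the batch counter of the training log, and the last validation batch is smaller than the others (16 images instead of 32).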

## Custom model definition

In [47]:
import torch
from torch import nn

# Define the custom neural network
class CustomNet(nn.Module):
    def __init__(self, num_classes = 200):
        super(CustomNet, self).__init__()
        # Define layers of the neural network
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1, stride=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.conv3 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
        self.conv4 = nn.Conv2d(128, 128, kernel_size=3, padding=1)
        self.conv5 = nn.Conv2d(128, 256, kernel_size=3, padding=1)
        self.conv6 = nn.Conv2d(256, 256, kernel_size=3, padding=1)
        self.conv7 = nn.Conv2d(256, 256, kernel_size=3, padding=1)

        # ACTIVATION layer
        self.relu = nn.ReLU()


        # POOLING layer
        self.max_pool = nn.MaxPool2d(kernel_size=2, stride=2)

        # Fully connected classifier head (three 2x2 poolings reduce 64x64 to 8x8)
        self.fc1 = nn.Linear(256 * 8 * 8, 512)
        self.fc2 = nn.Linear(512, num_classes)

    def forward(self, x):
        # Define forward pass

        # INPUT: B x 3 x 64 x 64

        x = self.relu(self.conv1(x)) # B x 32 x 64 x 64
        x = self.relu(self.conv2(x)) # B x 64 x 64 x 64

        x = self.max_pool(x) # B x 64 x 32 x 32

        x = self.relu(self.conv3(x)) # B x 128 x 32 x 32
        x = self.relu(self.conv4(x)) # B x 128 x 32 x 32

        x = self.max_pool(x) # B x 128 x 16 x 16

        x = self.relu(self.conv5(x)) # B x 256 x 16 x 16
        x = self.relu(self.conv6(x)) # B x 256 x 16 x 16
        x = self.relu(self.conv7(x)) # B x 256 x 16 x 16

        x = self.max_pool(x) # B x 256 x 8 x 8

        x = torch.flatten(x, start_dim=1) # B x (256*8*8) = B x 16384
        x = self.relu(self.fc1(x)) # B x 512

        x = self.fc2(x)    # B x num_classes


        return x
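
The input size of `fc1` follows from the spatial sizes inside the network: a 3x3 convolution with padding 1 and stride 1 keeps the height and width, while each 2x2 max pooling with stride 2 halves them. A minimal sketch of that bookkeeping, assuming 64x64 inputs as produced by the `T.Resize((64, 64))` transform (`conv_out` is a hypothetical helper using the standard output-size formula):

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 64  # input height/width
trace = []
# (layer name, kernel, stride, padding) in the order used by forward()
layers = [
    ("conv1", 3, 1, 1), ("conv2", 3, 1, 1), ("pool", 2, 2, 0),
    ("conv3", 3, 1, 1), ("conv4", 3, 1, 1), ("pool", 2, 2, 0),
    ("conv5", 3, 1, 1), ("conv6", 3, 1, 1), ("conv7", 3, 1, 1), ("pool", 2, 2, 0),
]
for name, k, s, p in layers:
    size = conv_out(size, k, s, p)
    trace.append((name, size))

# conv7 outputs 256 channels, so the flattened vector has 256 * size * size entries
flat_features = 256 * size * size
print(trace)
print(flat_features)
```

The final spatial size is 8, so the flattened feature vector has 256 * 8 * 8 = 16384 entries, which is the `in_features` value of `fc1`. Feeding images of a different resolution would require changing that value.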

In [48]:
def train(epoch, model, train_loader, criterion, optimizer):
    model.train()
    running_loss = 0.0
    correct = 0
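    total = 0
    # Use the device the model lives on (GPU if available, otherwise CPU)
    device = next(model.parameters()).device

    for batch_idx, (inputs, targets) in enumerate(train_loader):
        # Move the batch to the model's device
        inputs, targets = inputs.to(device), targets.to(device)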

        # Reset gradients to zero
        optimizer.zero_grad()

        # Forward pass, compute prediction
        outputs = model(inputs)

        # Compute the loss between predictions and true labels
        loss = criterion(outputs, targets)

        # Backpropagation: compute gradients
        loss.backward()

        # Update model parameters based on gradients
        optimizer.step()

        running_loss += loss.item()
        _, predicted = outputs.max(1)
        total += targets.size(0)
        correct += predicted.eq(targets).sum().item()

        if batch_idx % 50 == 0:
            current_loss = running_loss / (batch_idx + 1)
            current_accuracy = 100. * correct / total
            print(f'Epoch [{epoch}] - Batch [{batch_idx}/{len(train_loader)}] '
                  f'Loss: {current_loss:.4f} | Acc: {current_accuracy:.2f}%')

    train_loss = running_loss / len(train_loader)
    train_accuracy = 100. * correct / total
    print(f'Train Epoch: {epoch} Loss: {train_loss:.6f} Acc: {train_accuracy:.2f}%')



In [49]:
# Validation loop
def validate(model, val_loader, criterion):
    model.eval()
    val_loss = 0

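    correct, total = 0, 0
    # Use the device the model lives on (GPU if available, otherwise CPU)
    device = next(model.parameters()).device

    with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(val_loader):
            # Move the batch to the model's device
            inputs, targets = inputs.to(device), targets.to(device)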

            # Forward pass: compute predictions
            outputs = model(inputs)

            # compute loss
            loss = criterion(outputs, targets)

            val_loss += loss.item()

            _, predicted = outputs.max(1)
            total += targets.size(0)
            correct += predicted.eq(targets).sum().item()

    val_loss = val_loss / len(val_loader)
    val_accuracy = 100. * correct / total

    print(f'Validation Loss: {val_loss:.6f} Acc: {val_accuracy:.2f}%')
    return val_accuracy
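
When reading the loss and accuracy that these loops print, it helps to know the chance-level reference. As a rough sketch (added for orientation, not part of the original lab): `nn.CrossEntropyLoss` averages `-log p(target)` over the batch, so a model that predicts a uniform distribution over 200 classes scores `ln(200)`, and with balanced classes random guessing is right 0.5% of the time.

```python
import math

num_classes = 200
chance_loss = math.log(num_classes)    # cross-entropy of a uniform prediction
chance_accuracy = 100.0 / num_classes  # percent, assuming balanced classes

print(f"Chance-level loss: {chance_loss:.4f} | chance-level accuracy: {chance_accuracy:.2f}%")
```

Values that stay close to these references suggest that the network has not yet learned anything beyond chance.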

## Putting everything together

In [50]:
import torch
import torch.nn as nn

if torch.cuda.is_available():
    device = torch.device("cuda")
    torch.cuda.empty_cache()

    print(f"✅ GPU available: {torch.cuda.get_device_name(0)}")
else:
    device = torch.device("cpu")
    print("⚠️ GPU NOT available, using the CPU!")

model = CustomNet(num_classes=200).to(device)
criterion = nn.CrossEntropyLoss().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

best_acc = 0

# Run the training process for {num_epochs} epochs
num_epochs = 10
for epoch in range(1, num_epochs + 1):
    print(f"\nEpoch {epoch}/{num_epochs}:")
    train(epoch, model, train_loader, criterion, optimizer)

    # At the end of each training epoch, perform a validation step
    val_accuracy = validate(model, val_loader, criterion)

    # Best validation accuracy
    best_acc = max(best_acc, val_accuracy)


print(f'Best validation accuracy: {best_acc:.2f}%')


✅ GPU available: Tesla T4

Epoch 1/10:
Epoch [1] - Batch [0/3125] Loss: 5.3072 | Acc: 0.00%
Epoch [1] - Batch [50/3125] Loss: 5.2991 | Acc: 0.43%
Epoch [1] - Batch [100/3125] Loss: 5.2989 | Acc: 0.37%
Epoch [1] - Batch [150/3125] Loss: 5.2989 | Acc: 0.41%
Epoch [1] - Batch [200/3125] Loss: 5.2990 | Acc: 0.40%
Epoch [1] - Batch [250/3125] Loss: 5.2990 | Acc: 0.41%
Epoch [1] - Batch [300/3125] Loss: 5.2988 | Acc: 0.44%
Epoch [1] - Batch [350/3125] Loss: 5.2988 | Acc: 0.44%
Epoch [1] - Batch [400/3125] Loss: 5.2987 | Acc: 0.44%
Epoch [1] - Batch [450/3125] Loss: 5.2987 | Acc: 0.43%
Epoch [1] - Batch [500/3125] Loss: 5.2988 | Acc: 0.42%
Epoch [1] - Batch [550/3125] Loss: 5.2988 | Acc: 0.45%
Epoch [1] - Batch [600/3125] Loss: 5.2987 | Acc: 0.44%
Epoch [1] - Batch [650/3125] Loss: 5.2987 | Acc: 0.44%
Epoch [1] - Batch [700/3125] Loss: 5.2987 | Acc: 0.46%
Epoch [1] - Batch [750/3125] Loss: 5.2987 | Acc: 0.47%
Epoch [1] - Batch [800/3125] Loss: 5.2986 | Acc: 0.48%
Epoch [1] - Batch [850/3125