    # Assignment 2

This assignment serves as a comprehensive evaluation of your machine learning skills, encompassing not only the technical aspects of model development but also your ability to analyze, interpret, and present data insights effectively. As such, it's essential to ensure that your submission is complete, functional, and devoid of any obvious gaps, as if you were delivering this project to a client.

To achieve this, leverage the full capabilities of Markdown and the interactive visualization tools available in Jupyter notebooks to craft a well-structured and visually appealing report of your findings. Your report should clearly communicate the insights you've gained from the exploratory data analysis, the rationale behind your data preprocessing and feature engineering decisions, and a thorough analysis of feature importance. High-quality visualizations and well-organized documentation will not only support your analysis but also make your results more accessible and understandable to your audience.

Remember, the ability to present complex results in an intuitive and engaging manner is a crucial skill, almost as important as the technical proficiency in model building and data analysis. Treat this assignment as an opportunity to showcase your skills in both areas.

## Instructions
- Your submission should be a `.ipynb` file with your name,
  like `FirstnameLastname.ipynb`. It should include the answers to the questions in markdown cells, your data analysis and results.
- You are expected to follow the best practices for code writing and model
training. Poor coding style will be penalized.
- You are allowed to discuss ideas with your peers, but no sharing of code.
Plagiarism in the code will result in failing. If you use code from the
internet, cite it by adding the source of the code as a comment in the first line of the code cell. [Academic misconduct policy](https://wiki.innopolis.university/display/DOE/Academic+misconduct+policy)
- In real life clients can give unclear goals or requirements. So, if the instructions seem vague, use common sense to make reasonable assumptions and decisions.

## Self-Reliance and Exploration
In this task, you're encouraged to rely on your resourcefulness and creativity. Dive into available resources, experiment with various solutions, and learn from every outcome. While our team is here to clarify task details and offer conceptual guidance, we encourage you to first seek answers independently. This approach is vital for developing your problem-solving skills in machine learning.



# Task 2: Image Classification with CNNs (50%)

In this task, you'll dive into the world of Convolutional Neural Networks (CNNs) by working with the CIFAR-10 dataset, a staple in image classification challenges. Your goal is to build and evaluate two different CNN models to classify images into one of the ten categories accurately.

The dataset is availabel in pytorch and keras.

## Part 1: Custom CNN Model (20%)

- Design and train a CNN model from scratch tailored for the CIFAR-10 dataset.
- Focus on the architecture that you believe will perform best for this specific task.
- Integrate various techniques such as batch normalization, dropout, learning rate schedulers, and early stopping to improve model training. Experiment with these methods and finetune them to see how they affect training stability, convergence speed, and overall performance.

## Part 2: Transfer Learning Model (20%)

- Implement a transfer learning approach using a pre-trained model of your choice.
- Fine-tune the model on the CIFAR-10 dataset to achieve the best possible performance.

## Evaluation (10%)

Ensure that both models are robust and generalized well to unseen data.

After training both models, you will evaluate them on a provided test dataset.

Compare your models based on:
- **AUC-ROC**: How well does each model discriminate between classes?
- **Model Size**: Consider the trade-offs in model complexity.
- **Inference Speed**: Evaluate how quickly your model can predict classes for new images.

Reflect on the performance, size, and inference speed of both models. What insights can you draw from these comparisons?

### Learning Objectives

- Understand and apply CNNs for image classification.
- Explore the impact of model architecture on performance and efficiency.
- Learn the process and benefits of transfer learning in deep learning.

Remember, the key to this task is not just about achieving the highest accuracy but also understanding the strengths and limitations of different approaches in machine learning model development.

In [3]:
# Lab 10 code
import torch
from torch.utils import data
from torchvision import datasets, transforms

train_batch_size = 128
test_batch_size = 128

# We need this transofrmations to make our model more generalized on the unknown data and avoid overfitting
train_transforms = transforms.Compose([
    transforms.RandomCrop(32, padding=2),# add zeros to save initial size of an image after transformations
    transforms.RandomHorizontalFlip(), # FLips the image w.r.t horizontal axis
    transforms.RandomRotation(10),     #Rotates the image to a specified angel
    transforms.RandomAffine(0, shear=10, scale=(0.8,1.2)), #Performs actions like zooms, change shear angles.
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# We do not modify test transforms, because it will corrupt test data
test_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])
# Load train data
train_dataset = datasets.CIFAR10(root='cifar10',
                                 train=True,
                                 transform=train_transforms,
                                 download=True)

train_data_loader = data.DataLoader(train_dataset,
                                    batch_size=train_batch_size,
                                    shuffle=True,
                                    drop_last=True,
                                    num_workers=2)


# Load test data
test_dataset = datasets.CIFAR10(root='cifar10',
                                 train=False,
                                 transform=test_transforms,
                                 download=True)

test_data_loader = data.DataLoader(test_dataset,
                                    batch_size=test_batch_size,
                                    shuffle=False,
                                    num_workers=2)
print(train_dataset[0][0].shape)

Files already downloaded and verified
Files already downloaded and verified
torch.Size([3, 32, 32])


In [4]:
# Let's build a model
# Lab10 code, but I change activation function to LeakyRelu. It allows to avoid gradient vanishing.
# for Rely everything < 0 is 0, but leaky Relu assigns small values for these variables
# Reference: https://www.baeldung.com/cs/relu-vs-leakyrelu-vs-prelu
import torch
import torch.nn as nn
import torch.nn.functional as F
# !nvidia-smi
# !pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
# !pip list
# !pip install torch torchvision torchaudio
# from tensorflow.python.client import device_lib
# print(device_lib.list_local_devices())
class Cifar10_model(nn.Module):
    def __init__(self):
        super(Cifar10_model, self).__init__()
        # 1st convolutional layer
        self.conv1 = nn.Sequential(
            nn.Conv2d(
                in_channels=3,
                out_channels=16,
                kernel_size=3),
            nn.LeakyReLU(),
            nn.MaxPool2d(2, 2),
            nn.BatchNorm2d(16)
        )
        # 2nd convolutional layer
        self.conv2 = nn.Sequential(
            nn.Conv2d(16, 32, 3),
            nn.LeakyReLU(),
            nn.BatchNorm2d(32),
            nn.Dropout(0.25)
        )
        # 3rd convolutional layer
        self.conv3 = nn.Sequential(
            nn.Conv2d(32, 64, 3),
            nn.LeakyReLU(),
            nn.BatchNorm2d(64),
        )
        # output layer
        self.linear1 = nn.Sequential(
            nn.Linear(64*11*11, 256),
            nn.LeakyReLU(),
            nn.Dropout(0.1),
            nn.Linear(256, 10)
        )


    def forward(self, x):
        # Propagate x through the network
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = torch.flatten(x, 1)
        x = self.linear1(x)
        return F.log_softmax(x, dim=1)

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
model = Cifar10_model().to(device)

print(f'Device: {device}')

print(model)

Device: cuda
Cifar10_model(
  (conv1): Sequential(
    (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1))
    (1): LeakyReLU(negative_slope=0.01)
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (conv2): Sequential(
    (0): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1))
    (1): LeakyReLU(negative_slope=0.01)
    (2): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.25, inplace=False)
  )
  (conv3): Sequential(
    (0): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
    (1): LeakyReLU(negative_slope=0.01)
    (2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (linear1): Sequential(
    (0): Linear(in_features=7744, out_features=256, bias=True)
    (1): LeakyReLU(negative_slope=0.01)
    (2): Dropout(p=0.1, inplace=False)
    (3): Linear(in_features=256, out_

In [5]:
# Lab 10 code
import operator
import numpy as np

class EarlyStopping():
    def __init__(self, tolerance=5, min_delta=0, mode='min'):
        '''
        :param tolerance: number of epochs that the metric doesn't improve
        :param min_delta: minimum improvement
        :param mode: 'min' or 'max' to minimize or maximize the metric
        '''

        '''
        You should keep these parameters,
        define a counter of __call__ falses and the previous best value of metric
        '''
        self.tolerance = tolerance
        self.min_delta = min_delta
        self.mode = mode
        self.counter = 0
        self.early_stop = False
        self.prev_metric = np.inf if mode == 'min' else -np.inf
        self.operation = operator.gt if mode == 'min' else operator.lt


    def __call__(self, metric)->bool:
        ''' This function should return True if `metric` is not improving for
            'tolerance' calls
        '''
        delta = (metric - self.prev_metric)

        if self.operation(delta, self.min_delta):
            self.counter +=1
        else:
            self.counter = 0
            self.prev_metric = metric

        if self.counter >= self.tolerance:
            self.early_stop = True
        return self.early_stop

In [6]:
# Lab 10 code, learning rate scheduler. It's applied to choose the best lr during
# the training process
from torch.optim import lr_scheduler
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
# Reference: https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
LRs = {"ReduceLROnPlateau": lr_scheduler.ReduceLROnPlateau(optimizer, 'min', factor=0.3,
                                                           patience=10, verbose=True,min_lr=0.001),
       "Step LR": lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5),
       "Exponent LR": lr_scheduler.ExponentialLR(optimizer, gamma=0.9),
       "Cyclic LR":lr_scheduler.CyclicLR(optimizer, base_lr=0.01, max_lr=0.2,
                                         cycle_momentum=False, step_size_up=10)}




In [7]:
# Lab 10 code train and test functions
# ! pip install tqdm
from time import time
from tqdm import tqdm


def train(model, device, train_loader, criterion, optimizer, epoch):
    model.train()
    epoch_loss = 0
    start_time = time()
    correct = 0
    iteration = 0
    bar = tqdm(train_loader)
    for data, target in bar:
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        # Get the index of the max log-probability
        pred = output.argmax(dim=1, keepdim=True)
        correct += pred.eq(target.view_as(pred)).sum().item()

        loss = criterion(output, target)
        loss.backward()
        optimizer.step()

        epoch_loss += loss.item()
        iteration += 1
        bar.set_postfix({"Loss": format(epoch_loss/iteration, '.6f')})

    acc = 100. * correct / len(train_loader.dataset)
    print(f'\rTrain Epoch: {epoch}, elapsed time:{time()-start_time:.2f}s')
    return epoch_loss, acc


def test(model, device, test_loader, criterion):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += criterion(output, target).item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    acc = 100. * correct / len(test_loader.dataset)
    return test_loss, acc

In [8]:
from torch.optim import SGD
from copy import deepcopy

# Define hyperparams
epochs = 100
criterion = nn.CrossEntropyLoss()
optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9)
# Choosing LR
scheduler = lr_scheduler.ReduceLROnPlateau(optimizer, 'min', factor=0.3,
                                                           patience=3, verbose=True, min_lr=0.001)
early_stopping = EarlyStopping(tolerance=7, mode='min')
best_model_wts = deepcopy(model.state_dict())

In [9]:
# Lab10 code, training function
import torch.optim as optim
from torch.utils.tensorboard import SummaryWriter
import copy

def training(writing=False):
    if writing:
        writer = SummaryWriter(log_dir='runs/model')
    best_acc = 0.0
    for epoch in range(1, epochs + 1):
        train_loss, train_acc = train(model, device, train_data_loader, criterion, optimizer, epoch)
        # Update learning rate if needed
        scheduler.step(train_loss)

        test_loss, test_acc = test(model, device, test_data_loader, criterion)
        # Terminate training if loss stopped to decrease
        if early_stopping(test_loss):
            print('\nEarly stopping\n')
            break
        # Deep copy the weight of model if its accuracy is the best for now
        if test_acc > best_acc:
            best_acc = test_acc
            best_model_wts = copy.deepcopy(model.state_dict())
        if writing:
            writer.add_scalars('Loss',
                            {
                                'train': train_loss,
                                'test': test_loss
                            },
                            epoch)

            writer.add_scalars('Accuracy',
                            {
                                'train': train_acc,
                                'test': test_acc
                            },
                            epoch)
        else:
            print(f"Training accuracy {train_acc}, test accuracy {test_acc}")
            print(f"Training loss {train_loss}, test loss {test_loss}")

    torch.save(model.state_dict(), "model_task2.pt")
    model.load_state_dict(best_model_wts)
    torch.save(model.state_dict(), "best_model_task2.pt")
    if writing:
        writer.close()
# Train and save model that performs the best
training()

100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.09it/s, Loss=1.759929]


Train Epoch: 1, elapsed time:19.42s
Training accuracy 36.992, test accuracy 46.74
Training loss 686.3724775314331, test loss 118.79534685611725


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.42it/s, Loss=1.474310]


Train Epoch: 2, elapsed time:19.10s
Training accuracy 48.036, test accuracy 56.84
Training loss 574.9807734489441, test loss 96.85349214076996


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 19.55it/s, Loss=1.413511]


Train Epoch: 3, elapsed time:19.95s
Training accuracy 50.72, test accuracy 55.63
Training loss 551.2694227695465, test loss 99.55080050230026


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.47it/s, Loss=1.382717]


Train Epoch: 4, elapsed time:19.05s
Training accuracy 52.04, test accuracy 59.76
Training loss 539.2594690322876, test loss 94.69908374547958


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.48it/s, Loss=1.323562]


Train Epoch: 5, elapsed time:19.05s
Training accuracy 54.23, test accuracy 58.37
Training loss 516.1890986561775, test loss 100.48216331005096


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:20<00:00, 19.08it/s, Loss=1.266691]


Train Epoch: 6, elapsed time:20.45s
Training accuracy 55.796, test accuracy 64.07
Training loss 494.00963670015335, test loss 82.98117882013321


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 19.54it/s, Loss=1.212559]


Train Epoch: 7, elapsed time:19.96s
Training accuracy 57.624, test accuracy 63.69
Training loss 472.8981402516365, test loss 82.84106373786926


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:20<00:00, 19.36it/s, Loss=1.163508]


Train Epoch: 8, elapsed time:20.15s
Training accuracy 59.316, test accuracy 63.74
Training loss 453.7681695818901, test loss 79.37410682439804


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:21<00:00, 18.43it/s, Loss=1.137337]


Train Epoch: 9, elapsed time:21.16s
Training accuracy 60.306, test accuracy 66.35
Training loss 443.5612493753433, test loss 75.68856209516525


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 19.65it/s, Loss=1.103366]


Train Epoch: 10, elapsed time:19.85s
Training accuracy 61.35, test accuracy 68.75
Training loss 430.3127566576004, test loss 70.1827597618103


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.61it/s, Loss=1.072607]


Train Epoch: 11, elapsed time:18.92s
Training accuracy 62.386, test accuracy 69.58
Training loss 418.31668573617935, test loss 69.09827655553818


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.54it/s, Loss=1.051317]


Train Epoch: 12, elapsed time:18.99s
Training accuracy 63.212, test accuracy 70.49
Training loss 410.0134735107422, test loss 66.27765420079231


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.33it/s, Loss=1.036140]


Train Epoch: 13, elapsed time:19.19s
Training accuracy 63.832, test accuracy 71.59
Training loss 404.0945081114769, test loss 66.14741206169128


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.51it/s, Loss=1.003804]


Train Epoch: 14, elapsed time:19.02s
Training accuracy 64.888, test accuracy 71.59
Training loss 391.48339158296585, test loss 62.793825507164


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.24it/s, Loss=0.996574]


Train Epoch: 15, elapsed time:19.28s
Training accuracy 65.164, test accuracy 72.79
Training loss 388.66377317905426, test loss 63.978801250457764


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.51it/s, Loss=0.988772]


Train Epoch: 16, elapsed time:19.02s
Training accuracy 65.524, test accuracy 72.6
Training loss 385.62103176116943, test loss 62.828800678253174


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.64it/s, Loss=0.976183]


Train Epoch: 17, elapsed time:18.90s
Training accuracy 65.884, test accuracy 72.15
Training loss 380.7114289999008, test loss 65.20055288076401


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.55it/s, Loss=0.967950]


Train Epoch: 18, elapsed time:18.98s
Training accuracy 66.416, test accuracy 72.93
Training loss 377.5005541443825, test loss 63.37118446826935


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.58it/s, Loss=0.951899]


Train Epoch: 19, elapsed time:18.95s
Training accuracy 66.85, test accuracy 75.03
Training loss 371.2407942414284, test loss 59.16467934846878


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.54it/s, Loss=0.954532]


Train Epoch: 20, elapsed time:18.99s
Training accuracy 66.63, test accuracy 73.63
Training loss 372.2675054073334, test loss 60.28387135267258


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.42it/s, Loss=0.949706]


Train Epoch: 21, elapsed time:19.10s
Training accuracy 67.13, test accuracy 73.52
Training loss 370.3854983150959, test loss 61.892553210258484


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.54it/s, Loss=0.932456]


Train Epoch: 22, elapsed time:18.99s
Training accuracy 67.552, test accuracy 73.23
Training loss 363.6579886674881, test loss 61.370517402887344


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.39it/s, Loss=0.932453]


Train Epoch: 23, elapsed time:19.13s
Training accuracy 67.508, test accuracy 73.63
Training loss 363.6568278670311, test loss 59.50195360183716


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.55it/s, Loss=0.915039]


Train Epoch: 24, elapsed time:18.98s
Training accuracy 68.238, test accuracy 74.43
Training loss 356.8652173280716, test loss 60.36752462387085


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.48it/s, Loss=0.914151]


Train Epoch: 25, elapsed time:19.05s
Training accuracy 68.656, test accuracy 75.83
Training loss 356.5188881158829, test loss 56.36476814746857


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.51it/s, Loss=0.908252]


Train Epoch: 26, elapsed time:19.02s
Training accuracy 68.416, test accuracy 74.18
Training loss 354.218409717083, test loss 58.37699991464615


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.57it/s, Loss=0.906390]


Train Epoch: 27, elapsed time:18.97s
Training accuracy 68.554, test accuracy 75.89
Training loss 353.49191880226135, test loss 59.52516037225723


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.39it/s, Loss=0.891354]


Train Epoch: 28, elapsed time:19.13s
Training accuracy 69.11, test accuracy 75.8
Training loss 347.6280128955841, test loss 56.995022654533386


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.44it/s, Loss=0.896702]


Train Epoch: 29, elapsed time:19.08s
Training accuracy 68.68, test accuracy 75.16
Training loss 349.71378725767136, test loss 61.40827256441116


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.56it/s, Loss=0.892294]


Train Epoch: 30, elapsed time:18.97s
Training accuracy 69.098, test accuracy 76.55
Training loss 347.99453753232956, test loss 56.24958062171936


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.50it/s, Loss=0.889331]


Train Epoch: 31, elapsed time:19.02s
Training accuracy 69.264, test accuracy 75.89
Training loss 346.8389399051666, test loss 57.61089268326759


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.56it/s, Loss=0.875154]


Train Epoch: 32, elapsed time:18.97s
Training accuracy 69.52, test accuracy 75.66
Training loss 341.3101750612259, test loss 56.38264226913452


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.09it/s, Loss=0.871805]


Train Epoch: 33, elapsed time:19.42s
Training accuracy 69.926, test accuracy 75.63
Training loss 340.00379210710526, test loss 55.783997893333435


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.29it/s, Loss=0.868242]


Train Epoch: 34, elapsed time:19.23s
Training accuracy 69.858, test accuracy 74.13
Training loss 338.61432310938835, test loss 58.233831226825714


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.42it/s, Loss=0.881464]


Train Epoch: 35, elapsed time:19.10s
Training accuracy 69.624, test accuracy 76.23
Training loss 343.7708355784416, test loss 54.443907141685486


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.55it/s, Loss=0.866576]


Train Epoch: 36, elapsed time:18.98s
Training accuracy 69.968, test accuracy 76.35
Training loss 337.9647201895714, test loss 54.750521540641785


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.54it/s, Loss=0.862750]


Train Epoch: 37, elapsed time:18.99s
Training accuracy 70.072, test accuracy 76.21
Training loss 336.47255742549896, test loss 54.901883363723755


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.43it/s, Loss=0.855293]


Train Epoch: 38, elapsed time:19.09s
Training accuracy 70.472, test accuracy 76.46
Training loss 333.56410759687424, test loss 57.83837714791298


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.48it/s, Loss=0.866643]


Train Epoch: 39, elapsed time:19.04s
Training accuracy 70.236, test accuracy 76.19
Training loss 337.990831553936, test loss 57.16347414255142


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.52it/s, Loss=0.844675]


Train Epoch: 40, elapsed time:19.00s
Training accuracy 70.666, test accuracy 77.86
Training loss 329.4233305454254, test loss 51.673425287008286


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.27it/s, Loss=0.852303]


Train Epoch: 41, elapsed time:19.24s
Training accuracy 70.502, test accuracy 76.33
Training loss 332.3979831337929, test loss 53.6647364795208


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.41it/s, Loss=0.850912]


Train Epoch: 42, elapsed time:19.11s
Training accuracy 70.908, test accuracy 75.95
Training loss 331.8557026386261, test loss 55.18781340122223


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.42it/s, Loss=0.835919]


Train Epoch: 43, elapsed time:19.10s
Training accuracy 71.216, test accuracy 75.66
Training loss 326.0084396004677, test loss 62.09466230869293


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.47it/s, Loss=0.842795]


Train Epoch: 44, elapsed time:19.05s
Training accuracy 70.964, test accuracy 75.77
Training loss 328.69006472826004, test loss 57.252180606126785


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.38it/s, Loss=0.835392]


Train Epoch: 45, elapsed time:19.14s
Training accuracy 71.276, test accuracy 77.66
Training loss 325.8028270602226, test loss 55.63011571764946


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:19<00:00, 20.36it/s, Loss=0.828060]


Train Epoch: 46, elapsed time:19.16s
Training accuracy 71.376, test accuracy 77.8
Training loss 322.94333946704865, test loss 53.88644599914551


100%|█████████████████████████████████████████████████████████████████| 390/390 [00:18<00:00, 20.54it/s, Loss=0.825159]


Train Epoch: 47, elapsed time:18.99s

Early stopping


Let's evaluate this model on train dataset

In [11]:
# let's evaluate cnn model on test data
import numpy as np
import tensorflow as tf
from torch.utils.data import TensorDataset, DataLoader
# Upload, normalize, and convert to tensor test images
test_images = np.load('task_2_test_images.npy')
# we need to reshape the inputs to pass it to the neural network
test_images_reshaped = np.moveaxis(test_images, 3, 1)
test_labels = np.load('task_2_test_labels.npy')
test_images_tensor = torch.tensor(test_images_reshaped, dtype=torch.float32)
test_labels_tensor = torch.tensor(test_labels, dtype=torch.long)
test_images_tensor /= 255
#Reference https://www.geeksforgeeks.org/save-and-load-models-in-pytorch/
# Let's load CNN model
cnn_model = Cifar10_model()
cnn_model.load_state_dict(torch.load('best_model_task2.pt'))
output = cnn_model(test_images_tensor)
pred = output.argmax(dim=1, keepdim=True)

In [17]:
# Let's print AUC-ROC score
# Reference: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelBinarizer.html
# We need to perform it due to multi-class classification
from sklearn.preprocessing import LabelBinarizer
from sklearn import metrics
lb = LabelBinarizer()
lb.fit(test_labels)
test_labels_bin = lb.transform(test_labels)
pred_bin = lb.transform(pred.numpy())
auc_score_array = metrics.roc_auc_score(test_labels_bin, pred_bin, average=None, multi_class='ovo')
print("Mean AUC-ROC score for CNN model:", np.mean(auc_score_array))

Mean AUC-ROC score for CNN model: 0.8002385042738105


In [None]:
# Let's build and finetune pretrained model

In [19]:
# Lab 10 self-practise code
import torchvision.models as models
batch_size = 32
# Create pretrained model
pretrained_model =  models.resnet50(pretrained=True)
pretrained_model



ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

In [21]:
# Change input layer according to cifar10
pretrained_model.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)

# replace number of output classes
num_features = pretrained_model.fc.in_features
pretrained_model.fc = nn.Linear(num_features, 10)

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
pretrained_model = pretrained_model.to(device)

In [23]:
# Lab 10 self-practise code
from torch.optim import SGD
from torch.optim import lr_scheduler
from copy import deepcopy

# Again, define hyperparameters (we don't need large number of epochs)
epochs = 10
criterion = nn.CrossEntropyLoss()
optimizer = SGD(pretrained_model.parameters(), lr=0.01, momentum=0.9)
scheduler = lr_scheduler.ReduceLROnPlateau(optimizer, 'min', factor=0.3, patience=2, verbose=True, min_lr=0.0001)




In [25]:
# Lab 10 code
# Train finetunde model
def training():
    for epoch in range(0, epochs):
        train_loss, train_acc = train(pretrained_model, device, train_data_loader, criterion, optimizer, epoch)
        # Update learning rate if needed
        scheduler.step(train_loss)
        test_loss, test_acc = test(pretrained_model, device, test_data_loader, criterion)
        print(f"Training accuracy {train_acc}, test accuracy {test_acc}")
        print(f"Training loss {train_loss}, test loss {test_loss}")

    torch.save(pretrained_model.state_dict(), "model_finetuned.pt")

training()

100%|█████████████████████████████████████████████████████████████████| 390/390 [01:00<00:00,  6.50it/s, Loss=1.251244]


Train Epoch: 0, elapsed time:60.03s
Training accuracy 55.544, test accuracy 70.62
Training loss 487.9851709008217, test loss 74.36300724744797


100%|█████████████████████████████████████████████████████████████████| 390/390 [01:00<00:00,  6.48it/s, Loss=0.735274]


Train Epoch: 1, elapsed time:60.18s
Training accuracy 74.116, test accuracy 80.66
Training loss 286.7569063901901, test loss 43.37174245715141


100%|█████████████████████████████████████████████████████████████████| 390/390 [01:00<00:00,  6.49it/s, Loss=0.585380]


Train Epoch: 2, elapsed time:60.11s
Training accuracy 79.494, test accuracy 83.07
Training loss 228.2980423271656, test loss 38.59309211373329


100%|█████████████████████████████████████████████████████████████████| 390/390 [01:00<00:00,  6.48it/s, Loss=0.494268]


Train Epoch: 3, elapsed time:60.21s
Training accuracy 82.5, test accuracy 85.69
Training loss 192.76445445418358, test loss 32.54959836602211


100%|█████████████████████████████████████████████████████████████████| 390/390 [01:00<00:00,  6.47it/s, Loss=0.430460]


Train Epoch: 4, elapsed time:60.28s
Training accuracy 84.752, test accuracy 87.2
Training loss 167.8792096376419, test loss 29.644398786127567


100%|█████████████████████████████████████████████████████████████████| 390/390 [01:00<00:00,  6.48it/s, Loss=0.382085]


Train Epoch: 5, elapsed time:60.22s
Training accuracy 86.732, test accuracy 87.97
Training loss 149.0130054950714, test loss 27.017413392663002


100%|█████████████████████████████████████████████████████████████████| 390/390 [01:01<00:00,  6.38it/s, Loss=0.347548]


Train Epoch: 6, elapsed time:61.18s
Training accuracy 87.662, test accuracy 86.95
Training loss 135.54390709102154, test loss 30.403337463736534


100%|█████████████████████████████████████████████████████████████████| 390/390 [01:00<00:00,  6.48it/s, Loss=0.324499]


Train Epoch: 7, elapsed time:60.20s
Training accuracy 88.576, test accuracy 88.74
Training loss 126.55458089709282, test loss 27.252283424139023


100%|█████████████████████████████████████████████████████████████████| 390/390 [01:00<00:00,  6.47it/s, Loss=0.296667]


Train Epoch: 8, elapsed time:60.25s
Training accuracy 89.296, test accuracy 89.19
Training loss 115.70014302432537, test loss 25.384615778923035


100%|█████████████████████████████████████████████████████████████████| 390/390 [01:00<00:00,  6.46it/s, Loss=0.279331]


Train Epoch: 9, elapsed time:60.38s
Training accuracy 90.026, test accuracy 89.61
Training loss 108.9392597079277, test loss 24.53753024339676




Model size of resnet50 is much bigger that for my CNN model. It requires more memory, more CPU/GPU power, takes more time to learn (we can see it in the training process (number of iterations per second)). Inference speed of CNN implemented from scratch is much faster as I have less layers and less complex structure than resNet50. Resnet50 is a very large model for complex classification tasks (10000 classes), so usually it is used for more complex tasks.

Let's obtain predictions on test data

In [27]:
from PIL import Image
from torch.utils.data import TensorDataset, DataLoader
test_images = np.load('task_2_test_images.npy')
test_labels = np.load('task_2_test_labels.npy')
# apply same transofrm as for cifar10 dataset
train_transforms = transforms.Compose([
    transforms.RandomCrop(32, padding=2),# add zeros to save initial size of an image after transformations
    transforms.RandomHorizontalFlip(), # FLips the image w.r.t horizontal axis
    transforms.RandomRotation(10),     #Rotates the image to a specified angel
    transforms.RandomAffine(0, shear=10, scale=(0.8,1.2)), #Performs actions like zooms, change shear angles.
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])
transformed_images = []
# Process all images one by one (I asked chatGPT to help me with this for loop. I tried to process images alltogether, but it suggests to process it one by one if for loop)
for image_array in test_images:
    # Convert numpy array to PIL image
    image_pil = Image.fromarray(np.uint8(image_array))

    # Apply transformations
    transformed_image = train_transforms(image_pil)

    # Convert transformed image back to numpy array
    transformed_image_array = np.array(transformed_image)

    # Append transformed image to list
    transformed_images.append(transformed_image_array)
# Convert list of transformed images to numpy array
transformed_images_array = np.array(transformed_images)
# Convert to tensor
test_images_tensor = torch.tensor(transformed_images_array, dtype=torch.float32)
test_labels_tensor = torch.tensor(test_labels, dtype=torch.long)
# Create Tensor test dataset
test_dataset_transformed = TensorDataset(test_images_tensor, test_labels_tensor)
# Create test data loader
test_dataset_loader = DataLoader(test_dataset_transformed, batch_size=128, shuffle=False)


In [29]:
# Calculate AUC-ROC score for pretrained model
from sklearn import metrics
pretrained_model.eval()
test_labels_ans = []
test_predictions = []
with torch.no_grad():
    for data, target in test_dataset_loader:
        data, target = data.to(device), target.to(device)
        output = pretrained_model(data)
        probabilities = nn.functional.softmax(output, dim=1)
        test_labels_ans.extend(target.cpu().numpy())
        test_predictions.extend(probabilities.cpu().numpy())
test_labels_ans_onehot = np.eye(10)[test_labels_ans]
micro_auc = metrics.roc_auc_score(test_labels_ans_onehot, np.array(test_predictions), average='micro')
print(f"Micro-average AUC: {micro_auc}")


Micro-average AUC: 0.9696829625639631


My custom CNN shows not bad performance on test dataset given for the task, but AUC-ROC score of the complex model is almost 1. It means that pretrained model performs well on test data and this is reasonable to use such approach.
