## Assessment 1: Deep Learning

1) Answer all questions.
2) This assessment is open-book. You are allowed to refer to any references including online materials, books, notes, codes, github links, etc.
3) Copy this notebook to your google drive (click **FILE** > **save a copy in Drive**)
4) Upload the answer notebook to your github. 
5) Submit the assessment by sharing the link to your answer notebook. 





**QUESTION 1** 

One day while wandering around a clothing store at KL East Mall, you stumbled upon a pretty girl who is choosing a dress for Hari Raya. It turns out that the girl is visually impaired and had a hard time distinguishing between an abaya and a kebaya. To help people with the similar situation, you then decided to develop an AI system to identify the type of clothes using a Convolutional Neural Networks (ConvNet). In order to train the network, you decide to use the Fashion MNIST dataset which is freely available on Pytorch.


a) Given the problem, what is the most appropriate loss function to use? Justify your answer. **[5 marks]**


<span style="color:blue">
    ANSWER: YOUR ANSWER HERE (Cross Entropy Loss)</span> 

b) Create and train a ConvNet corresponding to the following CNN architecture (with a modification of the final layer to address the number of classes). Please include **[10 marks]**:

    1) The dataloader to load the train and test datasets.

    2) The model definition (either using sequential method OR pytorch class method).

    3) Define your training loop.

    4) Output the mean accuracy for the whole testing dataset.

    

<div>
<img src="https://vitalflux.com/wp-content/uploads/2021/11/VGG16-CNN-Architecture.png" width="550"/>
</div>


In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import torch, torchvision
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import time
import numpy as np
import matplotlib.pyplot as plt
import os
import cv2
import glob
import numpy
import random
import pandas as pd
import torchvision
import torchvision.transforms as transforms

from PIL import Image
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms
from torchsummary import summary
from torchvision.transforms import ToTensor

###############################################
######## THE REST OF YOUR CODES HERE ##########
###############################################

# Applying Transforms to the Data
image_transforms = {
    'train': transforms.Compose([
        transforms.Resize(32),        # Resize the image to 32x32 (supposedly resized to 224 but changed to 32 to reduce training time)
        #transforms.Grayscale(num_output_channels=3),    # Convert image to return 3 channel output image
        #transforms.RandomResizedCrop(size=32, scale=(0.8, 1.0)),
        #transforms.RandomRotation(degrees=15),
        #transforms.RandomHorizontalFlip(),
        #transforms.CenterCrop(size=32),
        transforms.ToTensor(),
        transforms.Lambda(lambda x: x.repeat(3, 1, 1) if x.size(0)==1 else x),
        transforms.Normalize([0.5,0.5,0.5],
                             [0.5,0.5,0.5])      # 3 Channel Gray image normalization
    ]),

    'test': transforms.Compose([
        transforms.Resize(size=32),   # Resize the image to 32x32 (supposedly resized to 224 but changed to 32 to reduce training time)
        #transforms.Grayscale(num_output_channels=3),    # Convert image to return 3 channel output image
        #transforms.CenterCrop(size=32),
        transforms.ToTensor(),
        transforms.Lambda(lambda x: x.repeat(3, 1, 1) if x.size(0)==1 else x),
        transforms.Normalize([0.5,0.5,0.5],
                             [0.5,0.5,0.5])       # 3 Channel Gray image normalization
    ])
}

# Loading the Data
batch_size = 32

trainset = torchvision.datasets.FashionMNIST(root='/content/drive/MyDrive/Assesment1/Fashion_MNIST/', train=True,
                                        download=True, transform=image_transforms['train'])
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.FashionMNIST(root='/content/drive/MyDrive/Assesment1/Fashion_MNIST/', train=False,
                                       download=True, transform=image_transforms['test'])
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                         shuffle=False, num_workers=2)

classes = ('T-Shirt', 'Trouser', 'Pullover', 'Dress',
       'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')

train_data_size = len(trainloader.dataset)
test_data_size = len(testloader.dataset)

print(train_data_size)
print(test_data_size)

# Creating the Model
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        #self.conv3 = nn.Conv2d(128, 256, 1)
        #self.conv4 = nn.Conv2d(256, 512, 1)
        #self.conv5 = nn.Conv2d(512, 512, 1)
        self.pool = nn.MaxPool2d(2, 2)              # kernel size=1x1, stride=2
        self.fc6 = nn.Linear(16 * 5 * 5, 200)     # output image size=1x1, channels=512   (if use 224 size image, output size is 7)
        self.fc7 = nn.Linear(200, 100)
        self.fc8 = nn.Linear(100, 10)              # output class=10
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.pool(self.relu(self.conv2(x)))
        #x = self.pool(self.relu(self.conv3(x)))
        #x = self.pool(self.relu(self.conv4(x)))
        #x = self.pool(self.relu(self.conv5(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = self.relu(self.fc6(x))
        x = self.relu(self.fc7(x))
        x = self.relu(self.fc8(x))
        return x

model = CNN()

# Loss and Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Move the model to GPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model.to(device)

# Calculate the training time
import time 

def train_and_validate(model, loss_criterion, optimizer, epochs=25):
    '''
    Function to train and validate
    Parameters
        :param model: Model to train and validate
        :param loss_criterion: Loss Criterion to minimize
        :param optimizer: Optimizer for computing gradients
        :param epochs: Number of epochs (default=25)
  
    Returns
        model: Trained Model with best validation accuracy
        history: (dict object): Having training loss, accuracy and validation loss, accuracy
    '''
    
    start = time.time()
    history = []
    best_acc = 0.0

    for epoch in range(epochs):
        epoch_start = time.time()
        print("Epoch: {}/{}".format(epoch+1, epochs))
        
        # Set to training mode
        model.train()
        
        # Loss and Accuracy within the epoch
        train_loss = 0.0
        train_acc = 0.0
        
        valid_loss = 0.0
        valid_acc = 0.0
        
        for i, (inputs, labels) in enumerate(trainloader):

            inputs = inputs.to(device)
            labels = labels.to(device)
            
            # Clean existing gradients
            optimizer.zero_grad()
            
            # Forward pass - compute outputs on input data using the model
            outputs = model(inputs)
            
            # Compute loss
            loss = loss_criterion(outputs, labels)
            
            # Backpropagate the gradients
            loss.backward()
            
            # Update the parameters
            optimizer.step()
            
            # Compute the total loss for the batch and add it to train_loss
            train_loss += loss.item() * inputs.size(0)
            
            # Compute the accuracy
            ret, predictions = torch.max(outputs.data, 1)
            correct_counts = predictions.eq(labels.data.view_as(predictions))
            
            # Convert correct_counts to float and then compute the mean
            acc = torch.mean(correct_counts.type(torch.FloatTensor))
            
            # Compute total accuracy in the whole batch and add to train_acc
            train_acc += acc.item() * inputs.size(0)
            
            #print("Batch number: {:03d}, Training: Loss: {:.4f}, Accuracy: {:.4f}".format(i, loss.item(), acc.item()))

            
        # Validation - No gradient tracking needed
        with torch.no_grad():

            # Set to evaluation mode
            model.eval()

            # Validation loop
            for j, (inputs, labels) in enumerate(testloader):
                inputs = inputs.to(device)
                labels = labels.to(device)

                # Forward pass - compute outputs on input data using the model
                outputs = model(inputs)

                # Compute loss
                loss = loss_criterion(outputs, labels)

                # Compute the total loss for the batch and add it to valid_loss
                valid_loss += loss.item() * inputs.size(0)

                # Calculate validation accuracy
                ret, predictions = torch.max(outputs.data, 1)
                correct_counts = predictions.eq(labels.data.view_as(predictions))

                # Convert correct_counts to float and then compute the mean
                acc = torch.mean(correct_counts.type(torch.FloatTensor))

                # Compute total accuracy in the whole batch and add to valid_acc
                valid_acc += acc.item() * inputs.size(0)

                #print("Validation Batch number: {:03d}, Validation: Loss: {:.4f}, Accuracy: {:.4f}".format(j, loss.item(), acc.item()))
            
        # Find average training loss and training accuracy
        avg_train_loss = train_loss/train_data_size 
        avg_train_acc = train_acc/train_data_size

        # Find average training loss and training accuracy
        avg_test_loss = valid_loss/test_data_size 
        avg_test_acc = valid_acc/test_data_size

        history.append([avg_train_loss, avg_test_loss, avg_train_acc, avg_test_acc])
                
        epoch_end = time.time()
    
        print("Epoch : {:03d}, Training: Loss: {:.4f}, Accuracy: {:.4f}%, \n\t\tValidation : Loss : {:.4f}, Accuracy: {:.4f}%, Time: {:.4f}s".format(epoch, avg_train_loss, avg_train_acc*100, avg_test_loss, avg_test_acc*100, epoch_end-epoch_start))
            
    return model, history

# Train the model for 10 epochs
num_epochs = 10
trained_model, history = train_and_validate(model, criterion, optimizer, num_epochs)

60000
10000
Epoch: 1/10
Epoch : 000, Training: Loss: 1.4114, Accuracy: 50.9267%, 
		Validation : Loss : 0.6491, Accuracy: 74.7300%, Time: 22.3493s
Epoch: 2/10
Epoch : 001, Training: Loss: 0.5492, Accuracy: 79.4350%, 
		Validation : Loss : 0.5379, Accuracy: 79.9800%, Time: 22.4672s
Epoch: 3/10
Epoch : 002, Training: Loss: 0.4573, Accuracy: 83.1250%, 
		Validation : Loss : 0.4737, Accuracy: 82.6900%, Time: 22.1978s
Epoch: 4/10
Epoch : 003, Training: Loss: 0.4077, Accuracy: 85.0000%, 
		Validation : Loss : 0.4282, Accuracy: 84.4700%, Time: 23.4096s
Epoch: 5/10
Epoch : 004, Training: Loss: 0.3716, Accuracy: 86.2567%, 
		Validation : Loss : 0.3948, Accuracy: 85.3300%, Time: 21.8772s
Epoch: 6/10
Epoch : 005, Training: Loss: 0.3505, Accuracy: 86.9967%, 
		Validation : Loss : 0.3722, Accuracy: 86.3200%, Time: 22.0673s
Epoch: 7/10
Epoch : 006, Training: Loss: 0.3320, Accuracy: 87.8183%, 
		Validation : Loss : 0.3636, Accuracy: 86.7300%, Time: 21.8310s
Epoch: 8/10
Epoch : 007, Training: Loss: 0.

c) Replace your defined CNN in b) with a pre-trained model. Then, proceed with a transfer learning and finetune the model for the Fashion MNIST dataset. **[10 marks]**

In [None]:
###############################################
###############YOUR CODES HERE ################
###############################################

model_pretrained = torchvision.models.resnet152(pretrained=True)
# 2. LOSS AND OPTIMIZER
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model_pretrained.parameters(), lr=0.001, momentum=0.9)

# 3. move the model to GPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model_pretrained.to(device)

# Train the model for 10 epochs
num_epochs = 10
trained_model, history = train_and_validate(model_pretrained, criterion, optimizer, num_epochs)

  f"The parameter '{pretrained_param}' is deprecated since 0.13 and will be removed in 0.15, "
Downloading: "https://download.pytorch.org/models/resnet152-394f9c45.pth" to /root/.cache/torch/hub/checkpoints/resnet152-394f9c45.pth


  0%|          | 0.00/230M [00:00<?, ?B/s]

Epoch: 1/10
Epoch : 000, Training: Loss: 0.4469, Accuracy: 85.9033%, 
		Validation : Loss : 0.2600, Accuracy: 90.3500%, Time: 171.3074s
Epoch: 2/10
Epoch : 001, Training: Loss: 0.2348, Accuracy: 91.5933%, 
		Validation : Loss : 0.2638, Accuracy: 90.8800%, Time: 169.9983s
Epoch: 3/10
Epoch : 002, Training: Loss: 0.1902, Accuracy: 93.0683%, 
		Validation : Loss : 0.2478, Accuracy: 91.0400%, Time: 170.6333s
Epoch: 4/10
Epoch : 003, Training: Loss: 0.1685, Accuracy: 93.8050%, 
		Validation : Loss : 0.2322, Accuracy: 91.9700%, Time: 170.7317s
Epoch: 5/10
Epoch : 004, Training: Loss: 0.1381, Accuracy: 94.8033%, 
		Validation : Loss : 0.2222, Accuracy: 92.2800%, Time: 170.6529s
Epoch: 6/10
Epoch : 005, Training: Loss: 0.1149, Accuracy: 95.8083%, 
		Validation : Loss : 0.2275, Accuracy: 92.5800%, Time: 170.0805s
Epoch: 7/10
Epoch : 006, Training: Loss: 0.0959, Accuracy: 96.4600%, 
		Validation : Loss : 0.2506, Accuracy: 92.2100%, Time: 170.2837s
Epoch: 8/10
Epoch : 007, Training: Loss: 0.0842,

d) Using model-centric methods, propose two (2) strategies that can be used to increase the accuracy of the model on the testing dataset. **[5 marks]**


<span style="color:blue">
    Two model-centric techniques that I propose are: 
    1. Fine Tune the lower layer of the pre-trained model
    2. PyTorch batch normalization </span>

e) Next, implement the two proposed model-centric techniques for the same problem as in the previous question. **[15 marks]**

In [None]:
###############################################
###############YOUR CODES HERE ################
###############################################

# 1. Fine Tune the lower layer of the pre-trained model
model_conv = torchvision.models.resnet152(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False

#Parameters of newly constructed modules have requires_grad=True by default
num_ftrs = model_conv.fc.in_features
print(num_ftrs)
model_conv.fc = nn.Sequential(nn.Linear(num_ftrs, (num_ftrs//2)),nn.ReLU(),
                            nn.Linear((num_ftrs//2), (num_ftrs//4)), nn.ReLU(),
                            nn.Linear((num_ftrs//4), (num_ftrs//8)), nn.ReLU(),
                            nn.Linear((num_ftrs//8), (num_ftrs//16)), nn.ReLU(),
                            nn.Linear((num_ftrs//16),10)
                            )

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model_conv.to(device)

# Loss and Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model_conv.parameters(), lr=0.001, momentum=0.9)

# Train the model for 10 epochs
num_epochs = 10
trained_model, history = train_and_validate(model_conv, criterion, optimizer, num_epochs)

  f"The parameter '{pretrained_param}' is deprecated since 0.13 and will be removed in 0.15, "


2048
Epoch: 1/10
Epoch : 000, Training: Loss: 1.8915, Accuracy: 34.6867%, 
		Validation : Loss : 1.8451, Accuracy: 61.4000%, Time: 75.8402s
Epoch: 2/10
Epoch : 001, Training: Loss: 0.8950, Accuracy: 67.5733%, 
		Validation : Loss : 2.2217, Accuracy: 72.1700%, Time: 77.8057s
Epoch: 3/10
Epoch : 002, Training: Loss: 0.7710, Accuracy: 72.5100%, 
		Validation : Loss : 1.6655, Accuracy: 74.7900%, Time: 77.1594s
Epoch: 4/10
Epoch : 003, Training: Loss: 0.7076, Accuracy: 74.5567%, 
		Validation : Loss : 1.2607, Accuracy: 76.4400%, Time: 76.4139s
Epoch: 5/10
Epoch : 004, Training: Loss: 0.6668, Accuracy: 75.7367%, 
		Validation : Loss : 2.1212, Accuracy: 75.4600%, Time: 79.0639s
Epoch: 6/10
Epoch : 005, Training: Loss: 0.6436, Accuracy: 76.4200%, 
		Validation : Loss : 1.5421, Accuracy: 76.2900%, Time: 77.3594s
Epoch: 7/10
Epoch : 006, Training: Loss: 0.6195, Accuracy: 77.3633%, 
		Validation : Loss : 1.3750, Accuracy: 77.9800%, Time: 76.7156s
Epoch: 8/10
Epoch : 007, Training: Loss: 0.5989, A

In [None]:
# 2. PyTorch batch normalization

class CNN_batch(nn.Module):
    def __init__(self):
        super(CNN_batch, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.pool = nn.MaxPool2d(2, 2)              # kernel size=1x1, stride=2
        self.batchnorm1 = nn.BatchNorm2d(6)         # adding batchnorm to the model
        self.batchnorm2 = nn.BatchNorm2d(16)
        self.fc6 = nn.Linear(16 * 5 * 5, 200)     # output image size=1x1, channels=512   (if use 224 size image, output size is 7)
        self.fc7 = nn.Linear(200, 100)
        self.fc8 = nn.Linear(100, 10)              # output class=10
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.batchnorm1(x)
        x = self.pool(self.relu(self.conv2(x)))
        x = self.batchnorm2(x)
        x = x.view(-1, 16 * 5 * 5)
        x = self.relu(self.fc6(x))
        x = self.relu(self.fc7(x))
        x = self.relu(self.fc8(x))
        return x

model_batchnorm = CNN_batch()

# Loss and Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model_batchnorm.parameters(), lr=0.001, momentum=0.9)

# Move the model to GPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model_batchnorm.to(device)

# Train the model for 10 epochs
num_epochs = 10
trained_model, history = train_and_validate(model_conv, criterion, optimizer, num_epochs)

Epoch: 1/10
Epoch : 000, Training: Loss: 0.5443, Accuracy: 79.9783%, 
		Validation : Loss : 0.8624, Accuracy: 79.6000%, Time: 78.0406s
Epoch: 2/10
Epoch : 001, Training: Loss: 0.5445, Accuracy: 80.0867%, 
		Validation : Loss : 0.9642, Accuracy: 79.5000%, Time: 76.5401s
Epoch: 3/10
Epoch : 002, Training: Loss: 0.5425, Accuracy: 80.2667%, 
		Validation : Loss : 0.6397, Accuracy: 80.6300%, Time: 77.9094s
Epoch: 4/10
Epoch : 003, Training: Loss: 0.5416, Accuracy: 80.1233%, 
		Validation : Loss : 1.0285, Accuracy: 78.8500%, Time: 76.6242s
Epoch: 5/10
Epoch : 004, Training: Loss: 0.5411, Accuracy: 80.2417%, 
		Validation : Loss : 0.7390, Accuracy: 79.2800%, Time: 74.6013s
Epoch: 6/10
Epoch : 005, Training: Loss: 0.5472, Accuracy: 79.8433%, 
		Validation : Loss : 0.9099, Accuracy: 79.9800%, Time: 75.2368s
Epoch: 7/10
Epoch : 006, Training: Loss: 0.5398, Accuracy: 80.1100%, 
		Validation : Loss : 1.0097, Accuracy: 79.4000%, Time: 75.6008s
Epoch: 8/10
Epoch : 007, Training: Loss: 0.5400, Accura

f) Do you see any accuracy improvement? Whether it is a "yes" or "no", discuss the possible reasons contributing to the accuracy improvement/ unimprovement. **[5 marks]**

<span style="color:blue">
    Your answer here: NO. Batch normalization may cause inaccurate estimation of batch statistics when we have a small batch size. This increases the model error. In tasks such as image segmentation, the batch size is usually too small. BN needs a sufficiently large batch size. </span>

g) In real applications, data-centric strategies are essential to train robust deep learning models. Give two (2) examples of such strategies and discuss how the strategies helps improving the model accuracy. **[5 marks]**

<span style="color:blue">
    Your answer here </span>

h) Next, implement the two proposed data-centric techniques for the same problem as in the previous question. **[10 marks]**
1. Data Augmentation - Random resized crop & Center Crop
2. 2. Data Augmentation - Random rotation & Random horizontal flip

In [10]:
###############################################
##############YOUR CODES HERE #################
###############################################
import torch, torchvision
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import time
import numpy as np
import matplotlib.pyplot as plt
import os
import cv2
import glob
import numpy
import random
import pandas as pd
import torchvision
import torchvision.transforms as transforms

from PIL import Image
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms
from torchsummary import summary
from torchvision.transforms import ToTensor

# 1. Data Augmentation - Random resized crop & Center Crop

image_transforms_2 = {
    'train': transforms.Compose([
        transforms.Resize(32),        # Resize the image to 32x32 (supposedly resized to 224 but changed to 32 to reduce training time)
        transforms.RandomResizedCrop(size=32, scale=(0.8, 1.0)),
        transforms.CenterCrop(size=32),
        transforms.ToTensor(),
        transforms.Lambda(lambda x: x.repeat(3, 1, 1) if x.size(0)==1 else x),
        transforms.Normalize([0.5,0.5,0.5],
                             [0.5,0.5,0.5])      # 3 Channel Gray image normalization
    ]),

    'test': transforms.Compose([
        transforms.Resize(size=32),   # Resize the image to 32x32 (supposedly resized to 224 but changed to 32 to reduce training time)
        transforms.CenterCrop(size=32),
        transforms.ToTensor(),
        transforms.Lambda(lambda x: x.repeat(3, 1, 1) if x.size(0)==1 else x),
        transforms.Normalize([0.5,0.5,0.5],
                             [0.5,0.5,0.5])       # 3 Channel Gray image normalization
    ])
}

batch_size = 32

classes = ('T-Shirt', 'Trouser', 'Pullover', 'Dress',
       'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')

trainset2 = torchvision.datasets.FashionMNIST(root='/content/drive/MyDrive/Assesment1/Fashion_MNIST/', train=True,
                                        download=True, transform=image_transforms_2['train'])
trainloader2 = torch.utils.data.DataLoader(trainset2, batch_size=batch_size,
                                          shuffle=True, num_workers=2)

testset2 = torchvision.datasets.FashionMNIST(root='/content/drive/MyDrive/Assesment1/Fashion_MNIST/', train=False,
                                       download=True, transform=image_transforms_2['test'])
testloader2 = torch.utils.data.DataLoader(testset2, batch_size=batch_size,
                                         shuffle=False, num_workers=2)

train_data_size2 = len(trainloader2.dataset)
test_data_size2 = len(testloader2.dataset)

# 2. Data Augmentation - Random rotation & Random horizontal flip

image_transforms_3 = {
    'train': transforms.Compose([
        transforms.Resize(32),        # Resize the image to 32x32 (supposedly resized to 224 but changed to 32 to reduce training time)
        transforms.RandomRotation(degrees=15),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Lambda(lambda x: x.repeat(3, 1, 1) if x.size(0)==1 else x),
        transforms.Normalize([0.5,0.5,0.5],
                             [0.5,0.5,0.5])      # 3 Channel Gray image normalization
    ]),

    'test': transforms.Compose([
        transforms.Resize(size=32),   # Resize the image to 32x32 (supposedly resized to 224 but changed to 32 to reduce training time)
        transforms.ToTensor(),
        transforms.Lambda(lambda x: x.repeat(3, 1, 1) if x.size(0)==1 else x),
        transforms.Normalize([0.5,0.5,0.5],
                             [0.5,0.5,0.5])       # 3 Channel Gray image normalization
    ])
}

trainset3 = torchvision.datasets.FashionMNIST(root='/content/drive/MyDrive/Assesment1/Fashion_MNIST/', train=True,
                                        download=True, transform=image_transforms_3['train'])
trainloader3 = torch.utils.data.DataLoader(trainset2, batch_size=batch_size,
                                          shuffle=True, num_workers=2)

testset3 = torchvision.datasets.FashionMNIST(root='/content/drive/MyDrive/Assesment1/Fashion_MNIST/', train=False,
                                       download=True, transform=image_transforms_3['test'])
testloader3 = torch.utils.data.DataLoader(testset2, batch_size=batch_size,
                                         shuffle=False, num_workers=2)

train_data_size3 = len(trainloader3.dataset)
test_data_size3 = len(testloader3.dataset)

# Define the Model for both Augmentation
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        #self.conv3 = nn.Conv2d(128, 256, 1)
        #self.conv4 = nn.Conv2d(256, 512, 1)
        #self.conv5 = nn.Conv2d(512, 512, 1)
        self.pool = nn.MaxPool2d(2, 2)              # kernel size=1x1, stride=2
        self.fc6 = nn.Linear(16 * 5 * 5, 200)     # output image size=1x1, channels=512   (if use 224 size image, output size is 7)
        self.fc7 = nn.Linear(200, 100)
        self.fc8 = nn.Linear(100, 10)              # output class=10
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.pool(self.relu(self.conv2(x)))
        #x = self.pool(self.relu(self.conv3(x)))
        #x = self.pool(self.relu(self.conv4(x)))
        #x = self.pool(self.relu(self.conv5(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = self.relu(self.fc6(x))
        x = self.relu(self.fc7(x))
        x = self.relu(self.fc8(x))
        return x

model = CNN()

# Loss and Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Move the model to GPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model.to(device)

CNN(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc6): Linear(in_features=400, out_features=200, bias=True)
  (fc7): Linear(in_features=200, out_features=100, bias=True)
  (fc8): Linear(in_features=100, out_features=10, bias=True)
  (relu): ReLU()
)

In [11]:
# Training for 1. Data Augmentation - Random resized crop & Center Crop
import time 

def train_and_validate(model, loss_criterion, optimizer, epochs=25):
    '''
    Function to train and validate
    Parameters
        :param model: Model to train and validate
        :param loss_criterion: Loss Criterion to minimize
        :param optimizer: Optimizer for computing gradients
        :param epochs: Number of epochs (default=25)
  
    Returns
        model: Trained Model with best validation accuracy
        history: (dict object): Having training loss, accuracy and validation loss, accuracy
    '''
    
    start = time.time()
    history2 = []
    best_acc = 0.0

    for epoch in range(epochs):
        epoch_start = time.time()
        print("Epoch: {}/{}".format(epoch+1, epochs))
        
        # Set to training mode
        model.train()
        
        # Loss and Accuracy within the epoch
        train_loss = 0.0
        train_acc = 0.0
        
        valid_loss = 0.0
        valid_acc = 0.0
        
        for i, (inputs, labels) in enumerate(trainloader2):

            inputs = inputs.to(device)
            labels = labels.to(device)
            
            # Clean existing gradients
            optimizer.zero_grad()
            
            # Forward pass - compute outputs on input data using the model
            outputs = model(inputs)
            
            # Compute loss
            loss = loss_criterion(outputs, labels)
            
            # Backpropagate the gradients
            loss.backward()
            
            # Update the parameters
            optimizer.step()
            
            # Compute the total loss for the batch and add it to train_loss
            train_loss += loss.item() * inputs.size(0)
            
            # Compute the accuracy
            ret, predictions = torch.max(outputs.data, 1)
            correct_counts = predictions.eq(labels.data.view_as(predictions))
            
            # Convert correct_counts to float and then compute the mean
            acc = torch.mean(correct_counts.type(torch.FloatTensor))
            
            # Compute total accuracy in the whole batch and add to train_acc
            train_acc += acc.item() * inputs.size(0)
            
            #print("Batch number: {:03d}, Training: Loss: {:.4f}, Accuracy: {:.4f}".format(i, loss.item(), acc.item()))

            
        # Validation - No gradient tracking needed
        with torch.no_grad():

            # Set to evaluation mode
            model.eval()

            # Validation loop
            for j, (inputs, labels) in enumerate(testloader2):
                inputs = inputs.to(device)
                labels = labels.to(device)

                # Forward pass - compute outputs on input data using the model
                outputs = model(inputs)

                # Compute loss
                loss = loss_criterion(outputs, labels)

                # Compute the total loss for the batch and add it to valid_loss
                valid_loss += loss.item() * inputs.size(0)

                # Calculate validation accuracy
                ret, predictions = torch.max(outputs.data, 1)
                correct_counts = predictions.eq(labels.data.view_as(predictions))

                # Convert correct_counts to float and then compute the mean
                acc = torch.mean(correct_counts.type(torch.FloatTensor))

                # Compute total accuracy in the whole batch and add to valid_acc
                valid_acc += acc.item() * inputs.size(0)

                #print("Validation Batch number: {:03d}, Validation: Loss: {:.4f}, Accuracy: {:.4f}".format(j, loss.item(), acc.item()))
            
        # Find average training loss and training accuracy
        avg_train_loss = train_loss/train_data_size2 
        avg_train_acc = train_acc/train_data_size2

        # Find average training loss and training accuracy
        avg_test_loss = valid_loss/test_data_size2 
        avg_test_acc = valid_acc/test_data_size2

        history2.append([avg_train_loss, avg_test_loss, avg_train_acc, avg_test_acc])
                
        epoch_end = time.time()
    
        print("Epoch : {:03d}, Training: Loss: {:.4f}, Accuracy: {:.4f}%, \n\t\tValidation : Loss : {:.4f}, Accuracy: {:.4f}%, Time: {:.4f}s".format(epoch, avg_train_loss, avg_train_acc*100, avg_test_loss, avg_test_acc*100, epoch_end-epoch_start))
            
    return model, history2

# Train the model for 10 epochs
num_epochs = 10
trained_model, history2 = train_and_validate(model, criterion, optimizer, num_epochs)

Epoch: 1/10
Epoch : 000, Training: Loss: 1.9386, Accuracy: 28.0817%, 
		Validation : Loss : 1.3386, Accuracy: 49.3900%, Time: 31.3927s
Epoch: 2/10
Epoch : 001, Training: Loss: 1.2180, Accuracy: 53.5467%, 
		Validation : Loss : 1.0023, Accuracy: 60.5500%, Time: 30.3998s
Epoch: 3/10
Epoch : 002, Training: Loss: 0.9983, Accuracy: 60.2783%, 
		Validation : Loss : 0.9633, Accuracy: 61.9300%, Time: 31.2639s
Epoch: 4/10
Epoch : 003, Training: Loss: 0.9494, Accuracy: 62.1550%, 
		Validation : Loss : 0.9170, Accuracy: 62.9800%, Time: 30.4971s
Epoch: 5/10
Epoch : 004, Training: Loss: 0.9170, Accuracy: 63.3983%, 
		Validation : Loss : 0.8661, Accuracy: 65.5500%, Time: 30.4044s
Epoch: 6/10
Epoch : 005, Training: Loss: 0.8910, Accuracy: 64.5200%, 
		Validation : Loss : 0.8385, Accuracy: 66.5800%, Time: 30.3296s
Epoch: 7/10
Epoch : 006, Training: Loss: 0.8702, Accuracy: 65.1767%, 
		Validation : Loss : 0.8215, Accuracy: 67.2400%, Time: 30.9831s
Epoch: 8/10
Epoch : 007, Training: Loss: 0.8515, Accura

In [12]:
# Training for 2. Data Augmentation - Random rotation & Random horizontal flip
import time 

def train_and_validate(model, loss_criterion, optimizer, epochs=25):
    '''
    Function to train and validate
    Parameters
        :param model: Model to train and validate
        :param loss_criterion: Loss Criterion to minimize
        :param optimizer: Optimizer for computing gradients
        :param epochs: Number of epochs (default=25)
  
    Returns
        model: Trained Model with best validation accuracy
        history: (dict object): Having training loss, accuracy and validation loss, accuracy
    '''
    
    start = time.time()
    history3 = []
    best_acc = 0.0

    for epoch in range(epochs):
        epoch_start = time.time()
        print("Epoch: {}/{}".format(epoch+1, epochs))
        
        # Set to training mode
        model.train()
        
        # Loss and Accuracy within the epoch
        train_loss = 0.0
        train_acc = 0.0
        
        valid_loss = 0.0
        valid_acc = 0.0
        
        for i, (inputs, labels) in enumerate(trainloader3):

            inputs = inputs.to(device)
            labels = labels.to(device)
            
            # Clean existing gradients
            optimizer.zero_grad()
            
            # Forward pass - compute outputs on input data using the model
            outputs = model(inputs)
            
            # Compute loss
            loss = loss_criterion(outputs, labels)
            
            # Backpropagate the gradients
            loss.backward()
            
            # Update the parameters
            optimizer.step()
            
            # Compute the total loss for the batch and add it to train_loss
            train_loss += loss.item() * inputs.size(0)
            
            # Compute the accuracy
            ret, predictions = torch.max(outputs.data, 1)
            correct_counts = predictions.eq(labels.data.view_as(predictions))
            
            # Convert correct_counts to float and then compute the mean
            acc = torch.mean(correct_counts.type(torch.FloatTensor))
            
            # Compute total accuracy in the whole batch and add to train_acc
            train_acc += acc.item() * inputs.size(0)
            
            #print("Batch number: {:03d}, Training: Loss: {:.4f}, Accuracy: {:.4f}".format(i, loss.item(), acc.item()))

            
        # Validation - No gradient tracking needed
        with torch.no_grad():

            # Set to evaluation mode
            model.eval()

            # Validation loop
            for j, (inputs, labels) in enumerate(testloader3):
                inputs = inputs.to(device)
                labels = labels.to(device)

                # Forward pass - compute outputs on input data using the model
                outputs = model(inputs)

                # Compute loss
                loss = loss_criterion(outputs, labels)

                # Compute the total loss for the batch and add it to valid_loss
                valid_loss += loss.item() * inputs.size(0)

                # Calculate validation accuracy
                ret, predictions = torch.max(outputs.data, 1)
                correct_counts = predictions.eq(labels.data.view_as(predictions))

                # Convert correct_counts to float and then compute the mean
                acc = torch.mean(correct_counts.type(torch.FloatTensor))

                # Compute total accuracy in the whole batch and add to valid_acc
                valid_acc += acc.item() * inputs.size(0)

                #print("Validation Batch number: {:03d}, Validation: Loss: {:.4f}, Accuracy: {:.4f}".format(j, loss.item(), acc.item()))
            
        # Find average training loss and training accuracy
        avg_train_loss = train_loss/train_data_size3 
        avg_train_acc = train_acc/train_data_size3

        # Find average training loss and training accuracy
        avg_test_loss = valid_loss/test_data_size3 
        avg_test_acc = valid_acc/test_data_size3

        history3.append([avg_train_loss, avg_test_loss, avg_train_acc, avg_test_acc])
                
        epoch_end = time.time()
    
        print("Epoch : {:03d}, Training: Loss: {:.4f}, Accuracy: {:.4f}%, \n\t\tValidation : Loss : {:.4f}, Accuracy: {:.4f}%, Time: {:.4f}s".format(epoch, avg_train_loss, avg_train_acc*100, avg_test_loss, avg_test_acc*100, epoch_end-epoch_start))
            
    return model, history3

# Train the model for 10 epochs
num_epochs = 10
trained_model, history3 = train_and_validate(model, criterion, optimizer, num_epochs)

Epoch: 1/10
Epoch : 000, Training: Loss: 0.8120, Accuracy: 67.3750%, 
		Validation : Loss : 0.8271, Accuracy: 66.7600%, Time: 31.4669s
Epoch: 2/10
Epoch : 001, Training: Loss: 0.7852, Accuracy: 68.6517%, 
		Validation : Loss : 0.5781, Accuracy: 78.7200%, Time: 30.4748s
Epoch: 3/10
Epoch : 002, Training: Loss: 0.5982, Accuracy: 77.4367%, 
		Validation : Loss : 0.5674, Accuracy: 78.2000%, Time: 30.3609s
Epoch: 4/10
Epoch : 003, Training: Loss: 0.5864, Accuracy: 77.6150%, 
		Validation : Loss : 0.5521, Accuracy: 79.1800%, Time: 30.5604s
Epoch: 5/10
Epoch : 004, Training: Loss: 0.5779, Accuracy: 77.8717%, 
		Validation : Loss : 0.5691, Accuracy: 78.1700%, Time: 31.6651s
Epoch: 6/10
Epoch : 005, Training: Loss: 0.5684, Accuracy: 78.1850%, 
		Validation : Loss : 0.5413, Accuracy: 78.9900%, Time: 30.5012s
Epoch: 7/10
Epoch : 006, Training: Loss: 0.5619, Accuracy: 78.3450%, 
		Validation : Loss : 0.5556, Accuracy: 78.3100%, Time: 30.7709s
Epoch: 8/10
Epoch : 007, Training: Loss: 0.3595, Accura

**QUESTION 2** **[35 marks]**

Firstly, watch this video:

https://drive.google.com/file/d/1bsypahR7I3f_R3DXkfw_tf0BrbCHxE_O/view?usp=sharing

This video shows an example of masked face recognition where the deep learning model is able to detect and classify your face even when wearing a face mask. Using the end-to-end object detection pipeline that you have learned, develop your own masked face recognition such that the model should recognize your face even on face mask while recognize other persons as "others".

Deliverables for this question are:

- the model file. Change the name to <your_name>.pt file (e.g. hasan.pt).
- a short video (~10 secs) containing your face and your friends faces (for inference).

In [13]:
#clone YOLOv5 and 
!git clone https://github.com/ultralytics/yolov5  # clone repo
%cd yolov5
%pip install -qr requirements.txt # install dependencies
%pip install -q roboflow

import torch
import os
from IPython.display import Image, clear_output  # to display images

print(f"Setup complete. Using torch {torch.__version__} ({torch.cuda.get_device_properties(0).name if torch.cuda.is_available() else 'CPU'})")

from roboflow import Roboflow
rf = Roboflow(model_format="yolov5", notebook="ultralytics")

# set up environment
os.environ["DATASET_DIRECTORY"] = "/content/datasets"

#after following the link above, recieve python code with these fields filled in

from roboflow import Roboflow
rf = Roboflow(api_key="5VlKA4iBf9T2rMQ3dx3b")
project = rf.workspace("ahmad-firdaus").project("person_recognition")
dataset = project.version(1).download("yolov5")

Cloning into 'yolov5'...
remote: Enumerating objects: 12351, done.[K
remote: Counting objects: 100% (133/133), done.[K
remote: Compressing objects: 100% (87/87), done.[K
remote: Total 12351 (delta 77), reused 89 (delta 46), pack-reused 12218[K
Receiving objects: 100% (12351/12351), 12.40 MiB | 26.66 MiB/s, done.
Resolving deltas: 100% (8447/8447), done.
/content/yolov5/yolov5
Setup complete. Using torch 1.12.0+cu113 (Tesla T4)
upload and label your dataset, and get an API KEY here: https://app.roboflow.com/?model=yolov5&ref=ultralytics
loading Roboflow workspace...
loading Roboflow project...
Downloading Dataset Version Zip in /content/datasets/Person_Recognition-1 to yolov5pytorch: 100% [20978369 / 20978369] bytes


Extracting Dataset Version Zip to /content/datasets/Person_Recognition-1 in yolov5pytorch:: 100%|██████████| 952/952 [00:01<00:00, 908.26it/s] 


In [14]:
!python train.py --img 416 --batch 16 --epochs 150 --data {dataset.location}/data.yaml --weights yolov5s.pt --cache

[34m[1mtrain: [0mweights=yolov5s.pt, cfg=, data=/content/datasets/Person_Recognition-1/data.yaml, hyp=data/hyps/hyp.scratch-low.yaml, epochs=150, batch_size=16, imgsz=416, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket=, cache=ram, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train, name=exp, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, seed=0, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
[34m[1mgithub: [0mup to date with https://github.com/ultralytics/yolov5 ✅
YOLOv5 🚀 v6.1-347-g7b9cc32 Python-3.7.13 torch-1.12.0+cu113 CUDA:0 (Tesla T4, 15110MiB)

[34m[1mhyperparameters: [0mlr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0

In [15]:
!python detect.py --weights runs/train/exp/weights/best.pt --img 416 --conf 0.1 --source {dataset.location}/test/images

[34m[1mdetect: [0mweights=['runs/train/exp/weights/best.pt'], source=/content/datasets/Person_Recognition-1/test/images, data=data/coco128.yaml, imgsz=[416, 416], conf_thres=0.1, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5 🚀 v6.1-347-g7b9cc32 Python-3.7.13 torch-1.12.0+cu113 CUDA:0 (Tesla T4, 15110MiB)

Fusing layers... 
Model summary: 213 layers, 7015519 parameters, 0 gradients, 15.8 GFLOPs
image 1/20 /content/datasets/Person_Recognition-1/test/images/WIN_20220725_12_24_54_Pro_jpg.rf.00f1fef1300a406b60beacc8da2cbbaa.jpg: 416x416 1 Ahmad Firdaus, 4 Otherss, Done. (0.018s)
image 2/20 /content/datasets/Person_Recognition-1/test/images/WIN_20220725_12_24_55_Pro--4-_jpg.rf.f852a9073e68f102ccb7e23e4d007efe.j

In [16]:
#export your model's weights for future use
from google.colab import files
files.download('./runs/train/exp/weights/best.pt')

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

In [None]:
# Run on Conda Prompt #
#python detect.py --weights best.pt --source 0