# Assessed Exercise for Deep Learning (M)

This exercise must be submitted as a colab notebook. Deadline: Tuesday the 10th of March, 11:00am.





You will create a classifier and test it on a collection of images for a new task. While you are welcome to build a full network from scratch, most of you will not have access to the data and computational power that this requires, so you may instead provide a solution based on transfer learning from a pre-trained network, adapted to your new task.

PyTorch has a range of pre-trained models ready to use, which can be accessed from https://pytorch.org/docs/stable/torchvision/models.html. Information about the pre-trained versions is available at https://pytorch.org/docs/stable/hub.html#loading-models-from-hub. You can pass the parameter `pretrained=True` when setting up the model to load the pre-trained weights. The available models include popular ones such as `vgg16`, `inception_v3` and `resnext50_32x4d`.

A tutorial on fine-tuning pre-trained networks in PyTorch is available at https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html and gives you a good potential structure for the task. General notes on transfer learning are at https://cs231n.github.io/transfer-learning/.

If you use the pretrained model, a good way to start is by freezing all layers up to the last layer before the output. Adapt the output layer to fit your classification problem. You might then unfreeze some earlier layers for further fine tuning.
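
As a rough illustration of the freeze-then-replace pattern described above, the sketch below uses a tiny hypothetical `nn.Sequential` network as a stand-in for a real pre-trained model (so nothing is downloaded). The layer sizes and `num_classes = 5` are assumptions chosen purely for the example.

```python
import torch.nn as nn

# Hypothetical stand-in for a pre-trained network: a small feature
# extractor followed by a final classification layer.
model = nn.Sequential(
    nn.Linear(16, 8),
    nn.ReLU(),
    nn.Linear(8, 1000),  # plays the role of the original output layer
)

# 1. Freeze every existing parameter.
for param in model.parameters():
    param.requires_grad = False

# 2. Replace the output layer. Newly created layers have
#    requires_grad=True by default, so only this layer will be trained.
num_classes = 5  # assumed number of classes in the new task
model[2] = nn.Linear(model[2].in_features, num_classes)

# 3. List the parameters that an optimiser would actually update.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

To unfreeze an earlier layer later on, you could set `requires_grad = True` on its parameters and pass them to the optimiser, typically with a smaller learning rate.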

You will need to create a training set (at least 100 images per class), potentially classifying e.g. type of location or activity. If you were training a full network from scratch you would need orders of magnitude more data, but this amount will work for transfer learning on an existing network. It might be sensible to start off testing and demonstrating your approach by using an existing dataset, e.g. the [flowers one](http://www.robots.ox.ac.uk/~vgg/data/flowers/102/). You can find other interesting datasets at:
*   [Google's dataset search tool](https://datasetsearch.research.google.com/)
*   [http://deeplearning.net/datasets](http://deeplearning.net/datasets/)  
*   [UCI ML collection](https://archive.ics.uci.edu/ml/datasets.html)
*   [https://www.visualdata.io](https://www.visualdata.io)
*   [https://ai.google/tools/datasets](https://ai.google/tools/datasets/)

You should not just work with a single pre-created dataset. You can augment your own data with data from one or more other datasets, but there should be a significant level of novelty in the dataset you create. Students who put more effort into creating and analysing an interesting dataset will tend to do better in marks for sections 1 and 2 below.

Write a pre-processing step that will resize and crop the images to the right size ((224, 224) is the default for ResNet50 and (299, 299) is the default for Inception). Consider how you can apply data augmentation techniques to your new dataset, and design appropriate pre-processing functions.
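
To make the resize-and-crop geometry concrete, here is a minimal sketch in plain Python. The helper name `resize_then_center_crop_box` is hypothetical, and the rounding used here is one reasonable choice that may differ by a pixel from what a particular library does. It scales the shorter side to the target size and then takes a square crop from the centre.

```python
def resize_then_center_crop_box(width, height, target):
    """Return the resized (width, height) and the centre-crop box.

    The shorter side is scaled to `target`, preserving the aspect ratio,
    and a target x target square is then cut from the centre. The box is
    given as (left, top, right, bottom) in resized-image coordinates.
    """
    scale = target / min(width, height)
    # max(...) guards against rounding leaving a side one pixel short.
    new_w = max(target, round(width * scale))
    new_h = max(target, round(height * scale))
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)


# Example: a 640x480 photo prepared for a 224x224 network input.
size, box = resize_then_center_crop_box(640, 480, 224)
print(size, box)
```

A centre crop like this is a common choice for validation images; for training, a random crop position is one simple form of data augmentation.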

In your submission you should have the following structure (share of the AX marks given in brackets at the end of each part):

1.   Analysis of the problem. What are you trying to solve, and what are the challenges in the task? (15%)
2.   Visualisation and analysis of the data type, quality and class distributions. You may want to design some data augmentation in your system. (20%)
3.   Creation of multiple candidate network architectures. Include your justification of the design decisions. You should include one very simple baseline model (e.g. a linear model, or a simple two layer Densely connected model). (15%)
4.   Training. This should include code for hyperparameter search, regularisation methods. (15%)
5.   Empirical evaluation of performance, and potentially visualisation and  analysis of the trained network. This should make good use of graphs and tables of results, confusion matrices etc to represent the relative performance of the different models.  Explain why you chose the metrics you use. (20%)
6.   Report on the performance, discussing the suitability of the final network for use. (15%)

For each of the design decisions, make sure you describe in detail the motivation behind them. 

**Submission process**

You should submit the colab notebook with *all* code needed to run your model and all visualisations of results in place (I don't want to have to run 80 projects :-) ). This exercise must be submitted as a colab notebook. Deadline Tuesday the 10th of March, 11:00. If you have your own training data, make sure that any links to that are accessible by 3rd parties (this will be restricted to the markers, myself and Josh Mitton - we won't share the links with anyone else).

Share the Colab link (click on Share on top right of the Colab notebook, then 'Get shareable link') with me by e-mail [Roderick.Murray-Smith@glasgow.ac.uk](mailto:Roderick.Murray-Smith@glasgow.ac.uk?subject=Deep%20Learning%20AX%202020) and make sure that the Subject of the e-mail is *exactly* `'Deep Learning AX 2020'` (automatically generated if you click on the e-mail hyperlink above).

# Marking Scheme

To help you understand what is expected, we are sharing the marking scheme, and will return your mark profile, so that you have a link to the types of issues associated with the mark range you get for each topic.

**Analysis of the problem. What are you trying to solve, and what are the challenges in the task? (15%)**

Marks and expected Feedback:
* 10-15 : Good choice of problem. Clear explanation and introduction to the problem. Extra marks for more challenging problems.
* 5-9: Acceptable problem, collated by the student independently. Possibly a less complete explanation of the problem.
* 0-4: Problem too simple. Possibly just downloaded a pre-canned dataset.


**Visualisation and analysis of the data type, quality and class distributions. You may want to design some data augmentation in your system. (20%)**

Marks and expected Feedback:

* 16-20: Used innovative/customised data augmentation, re-scaled data, good visualisation of task challenge and completeness of data.
* 10-15: Used some basic data augmentation. Visualised the data well enough that a reader had a good sense of the task challenge and completeness of the data. Appropriate re-scaling of data. 
* 0-9: Did not use data augmentation. Did not visualise a range of examples in such a way that a reader had a good sense of the task challenge, and suitability of the data set. Basic visualisation of the classes in the data. No re-scaling of data.
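
As one possible starting point for analysing class distributions, the sketch below counts images per class and reports a simple imbalance ratio. The class names and counts are invented for illustration, and the helper names `class_distribution` and `imbalance_ratio` are hypothetical.

```python
from collections import Counter


def class_distribution(labels):
    """Map each class to (count, fraction of the dataset)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: (n, n / total) for cls, n in sorted(counts.items())}


def imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest class size."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Invented example labels: one label per image in the dataset.
labels = ["beach"] * 120 + ["forest"] * 100 + ["city"] * 60
dist = class_distribution(labels)
ratio = imbalance_ratio(labels)

for cls, (count, fraction) in dist.items():
    print("{:<8} {:>4} images ({:.1%})".format(cls, count, fraction))
print("Imbalance ratio: {:.2f}".format(ratio))
```

A ratio well above 1 suggests that accuracy alone may be a misleading metric, and that class-weighted losses or targeted augmentation of the smaller classes might be worth considering.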

**Creation of multiple candidate network architectures. Include your justification of the design decisions. You should include one very simple baseline model (e.g. a linear model, or a simple two layer Densely connected model). (15%)**

Marks and expected Feedback:

* 10-15: Good choice of architectures, and sensible comparison with baseline and good use of pre-trained networks.
* 5-9: Used a pre-trained network correctly. Some basic architectures compared.
* 0-4: No use of pre-trained networks. No attempt to compare network architectures.

**Training. This should include code for hyperparameter search, regularisation methods. (15%)**

Marks and expected Feedback:
* 10-15: Advanced use of hyperparameter tuning. Well-motivated regularisation methods were used. Good use of k-fold methods.
* 5-9: Basic hyperparameter tuning implemented and tested. Maybe only randomised hyperparameter search. Basic regularisation implemented and tested.
* 0-4: No good use of hyperparameter tuning. No regularisation methods used. Stopped training too early to see sensible results.
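
For the k-fold methods mentioned above, a minimal sketch of how the train/validation index splits could be generated is shown below. The function name `k_fold_indices` and the fixed seed are assumptions for illustration; in practice you might prefer an existing library implementation, ideally a stratified one for imbalanced classes.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, val_indices) for each of k folds.

    Indices are shuffled once with a fixed seed, then dealt into k folds
    so that every sample is used for validation exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


# Example: 10 samples split into 5 folds.
splits = list(k_fold_indices(10, 5))
for fold, (train, val) in enumerate(splits):
    print("fold", fold, "train size", len(train), "val indices", sorted(val))
```

Each hyperparameter setting would then be trained once per fold and scored by the average validation performance across folds.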


**Empirical evaluation of performance, and potentially visualisation and analysis of the trained network. This should make good use of graphs and tables of results, confusion matrices etc to represent the relative performance of the different models. Explain why you chose the metrics you use. (20%)**

Marks and expected Feedback:
* 14-20: Good visualisation of images and presentation of performance  results. Good explanation of metric choice in evaluation. Visualisation of network hidden layers.
* 10-13: Some basic visualisation of images and analysis and presentation of performance results.
* 0-9: No visualisation of images for misclassification results. Poor explanation of metric choice in evaluation. Minimal evaluation and analysis of results.
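
As a small worked example of the kind of evaluation described above, the sketch below builds a confusion matrix and derives per-class precision and recall from it. The labels and predictions are invented, and the helper names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_precision_recall(matrix):
    """Return a list of (precision, recall) tuples, one per class."""
    n = len(matrix)
    results = []
    for c in range(n):
        tp = matrix[c][c]
        predicted = sum(matrix[r][c] for r in range(n))  # column sum
        actual = sum(matrix[c])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        results.append((precision, recall))
    return results


# Invented labels and predictions for a three-class problem.
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 0, 2]
cm = confusion_matrix(y_true, y_pred, 3)
metrics = per_class_precision_recall(cm)

for row in cm:
    print(row)
for c, (precision, recall) in enumerate(metrics):
    print("class {}: precision {:.2f}, recall {:.2f}".format(c, precision, recall))
```

The off-diagonal cells show which pairs of classes are confused, which is a natural guide to which misclassified images to visualise.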


**Report on the performance, discussing the suitability of the final network for use. (15%)**

Marks and expected Feedback:
Remember to add your name to the report!
* 12-15: Excellent, well-presented report. Good use of figures, data well presented and clearly communicated. Each of the design decisions is described in detail, including the motivation behind them.
* 10-12: Reasonably well presented report with figures and data presented, but some shortcomings, and less well explained designs and motivation. 
* 5-9:  Report: Code, text and figures presented, but final product is still unpolished in terms of structure or  presentation, and has gaps in explanation of designs and motivation.
* 0-4: Significant issues with English spelling and grammar in the report. Lack of explanation of the design or motivation.


In [1]:
from __future__ import print_function
from __future__ import division
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import time
import os
import copy
print("PyTorch Version: ",torch.__version__)
print("Torchvision Version: ",torchvision.__version__)

ModuleNotFoundError: No module named 'torch'

In [None]:
#FineTuning helper functions found at https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html
def train_model(model, dataloaders, criterion, optimizer, num_epochs=25, is_inception=False):
    since = time.time()

    val_acc_history = []

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    # Get model outputs and calculate loss
                    # Special case for inception because in training it has an auxiliary output. In train
                    #   mode we calculate the loss by summing the final output and the auxiliary output
                    #   but in testing we only consider the final output.
                    if is_inception and phase == 'train':
                        # From https://discuss.pytorch.org/t/how-to-optimize-inception-model-with-auxiliary-classifiers/7958
                        outputs, aux_outputs = model(inputs)
                        loss1 = criterion(outputs, labels)
                        loss2 = criterion(aux_outputs, labels)
                        loss = loss1 + 0.4*loss2
                    else:
                        outputs = model(inputs)
                        loss = criterion(outputs, labels)

                    _, preds = torch.max(outputs, 1)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            epoch_loss = running_loss / len(dataloaders[phase].dataset)
            epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset)

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
            if phase == 'val':
                val_acc_history.append(epoch_acc)

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:.4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model, val_acc_history


def set_parameter_requires_grad(model, feature_extracting):
    if feature_extracting:
        for param in model.parameters():
            param.requires_grad = False

def initialize_model(model_name, num_classes, feature_extract, use_pretrained=True):
    # Initialize these variables which will be set in this if statement. Each of these
    #   variables is model specific.
    model_ft = None
    input_size = 0

    if model_name == "resnet":
        """ Resnet18
        """
        model_ft = models.resnet18(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.fc.in_features
        model_ft.fc = nn.Linear(num_ftrs, num_classes)
        input_size = 224

    elif model_name == "alexnet":
        """ Alexnet
        """
        model_ft = models.alexnet(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier[6].in_features
        model_ft.classifier[6] = nn.Linear(num_ftrs,num_classes)
        input_size = 224

    elif model_name == "vgg":
        """ VGG11_bn
        """
        model_ft = models.vgg11_bn(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier[6].in_features
        model_ft.classifier[6] = nn.Linear(num_ftrs,num_classes)
        input_size = 224

    elif model_name == "squeezenet":
        """ Squeezenet
        """
        model_ft = models.squeezenet1_0(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        model_ft.classifier[1] = nn.Conv2d(512, num_classes, kernel_size=(1,1), stride=(1,1))
        model_ft.num_classes = num_classes
        input_size = 224

    elif model_name == "densenet":
        """ Densenet
        """
        model_ft = models.densenet121(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_ftrs = model_ft.classifier.in_features
        model_ft.classifier = nn.Linear(num_ftrs, num_classes)
        input_size = 224

    elif model_name == "inception":
        """ Inception v3
        Be careful, expects (299,299) sized images and has auxiliary output
        """
        model_ft = models.inception_v3(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        # Handle the auxiliary net
        num_ftrs = model_ft.AuxLogits.fc.in_features
        model_ft.AuxLogits.fc = nn.Linear(num_ftrs, num_classes)
        # Handle the primary net
        num_ftrs = model_ft.fc.in_features
        model_ft.fc = nn.Linear(num_ftrs,num_classes)
        input_size = 299

    else:
        # Raise instead of calling exit(), which would kill the notebook kernel
        raise ValueError("Invalid model name: {}".format(model_name))

    return model_ft, input_size

In [None]:
# resnet18 = models.resnet18(pretrained=True)
# alexnet = models.alexnet(pretrained=True)
# squeezenet = models.squeezenet1_0(pretrained=True)
# vgg16 = models.vgg16(pretrained=True)
# densenet = models.densenet161(pretrained=True)
# inception = models.inception_v3(pretrained=True)
# googlenet = models.googlenet(pretrained=True)
# shufflenet = models.shufflenet_v2_x1_0(pretrained=True)
# mobilenet = models.mobilenet_v2(pretrained=True)
# resnext50_32x4d = models.resnext50_32x4d(pretrained=True)
# wide_resnet50_2 = models.wide_resnet50_2(pretrained=True)
# mnasnet = models.mnasnet1_0(pretrained=True)

# This includes popular models such as vgg16 and inception and resnext50_32x4d
vgg16 = models.vgg16(pretrained=True) #(224x224)
# inception = models.inception_v3(pretrained=True)  #(299x299)
# resnext50_32x4d = models.resnext50_32x4d(pretrained=True)  #(224x224)

# Models to choose from [resnet, alexnet, vgg, squeezenet, densenet, inception]
model_name = "vgg"

# Number of classes in the dataset
num_classes = 102

# Batch size for training (change depending on how much memory you have)
batch_size = 8

# Number of epochs to train for
num_epochs = 15

# Flag for feature extracting. When False, we finetune the whole model,
#   when True we only update the reshaped layer params
feature_extract = True


In [None]:
# When feature extracting, we only want to update the parameters of the last layer, or in other words, the parameters
# of the layer(s) we are reshaping. We therefore do not need to compute the gradients of the parameters that we are not
# changing, so for efficiency we set their .requires_grad attribute to False. This matters because the attribute is True
# by default. When we then initialize the new layer, its parameters have .requires_grad=True by default, so only the new
# layer's parameters will be updated. When we are finetuning we can leave every .requires_grad at its default of True.

# Initialize the model for this run
model_ft, input_size = initialize_model(model_name, num_classes, feature_extract, use_pretrained=True)

# Print the model we just instantiated
print(model_ft)

In [17]:
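import csv

# Raw string so that backslashes in the Windows path (e.g. "\t") are not treated as escapes
fileNeigh = r"D:\Computer-Science\Year-4\DL\data\test.txt"
allNeighboursInOrder = []
# newline="" is recommended by the csv module; the with-block closes the file for us
with open(fileNeigh, "r", newline="") as f:
    for row in csv.reader(f):
        allNeighboursInOrder += row
print(allNeighboursInOrder)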

FileNotFoundError: ignored

In [16]:
# Top level data directory. Here we assume the format of the directory conforms
#   to the ImageFolder structure
data_dir = "D:/Computer-Science/Year-4/DL/data"
#"./data"

# Data augmentation and normalization for training
# Just normalization for validation
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(input_size),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(input_size),
        transforms.CenterCrop(input_size),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

print("Initializing Datasets and Dataloaders...")

# Create training and validation datasets
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ['train', 'val']}
# Create training and validation dataloaders
dataloaders_dict = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size, shuffle=True, num_workers=4) for x in ['train', 'val']}

# Detect if we have a GPU available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

Initializing Datasets and Dataloaders...


FileNotFoundError: ignored

In [0]:
# Call model.train() before training and model.eval() before evaluation.
# The images have to be loaded into a range of [0, 1] and then normalized using:
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])