# RESOURCES RAY
CIFAR: https://www.cs.toronto.edu/~kriz/cifar.html

MY GOOGLE DOC: https://docs.google.com/document/d/1B8ariJmySLXtpVoA5bQymO4XImhqPopynKJYOa9_z7k/edit

ASSIGNMENT DISCUSSION: https://canvas.sussex.ac.uk/courses/28083/discussion_topics/358640

USEFUL TUTORIAL: https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

CIFAR DATASET on PYTORCH: https://pytorch.org/vision/stable/generated/torchvision.datasets.CIFAR10.html#torchvision.datasets.CIFAR10


# Assignment overview
The overarching goal of this assignment is to produce a research report in which you implement, analyse, and discuss various Neural Network techniques. You will be guided through the process of producing this report, which will provide you with experience in report writing that will be useful in any research project you might be involved in later in life.

All of your report, including code and Markdown/text, ***must*** be written up in ***this*** notebook. This is not typical for research, but is solely for the purpose of this assignment. Please make sure you change the title of this file so that XXXXXX is replaced by your candidate number. You can use code cells to write code to implement, train, test, and analyse your NNs, as well as to generate figures to plot data and the results of your experiments. You can use Markdown/text cells to describe and discuss the modelling choices you make, the methods you use, and the experiments you conduct. So that we can mark your reports with greater consistency, please ***do not***:

* rearrange the sequence of cells in this notebook.
* delete any cells, including the ones explaining what you need to do.

If you want to add more code cells, for example to help organise the figures you want to show, then please add them directly after the code cells that have already been provided.

Please provide verbose comments throughout your code so that it is easy for us to interpret what you are attempting to achieve with your code. Long comments are useful at the beginning of a block of code. Short comments, e.g. to explain the purpose of a new variable, or one of several steps in some analyses, are useful on every few lines of code, if not on every line. Please do not use the code cells for writing extensive sentences/paragraphs that should instead be in the Markdown/text cells.

# Abstract/Introduction (instructions) - 15 MARKS
Use the next Markdown/text cell to write a short introduction to your report. This should include:
* a brief description of the topic (image classification) and of the dataset being used (CIFAR10 dataset). (2 MARKS)
* a brief description of how the CIFAR10 dataset has aided the development of neural network techniques, with examples. (3 MARKS)
* a descriptive overview of what the goal of your report is, including what you investigated. (5 MARKS)
* a summary of your major findings. (3 MARKS)
* two or more relevant references. (2 MARKS)

### RD! *INTRO/ABSTRACT HERE*


# Methodology (instructions) - 55 MARKS
Use the next cells in this Methodology section to describe and demonstrate the details of what you did, in practice, for your research. Cite at least two academic papers that support your model choices. The overarching principle of writing the Methodology is to ***provide sufficient details for someone to replicate your model and to reproduce your results, without having to resort to your code***. You must include at least these components in the Methodology:
* Data - Describe the dataset, including how it is divided into training, validation, and test sets. Describe any pre-processing you perform on the data, and explain any advantages or disadvantages to your choice of pre-processing.
* Architecture - Describe the architecture of your model, including all relevant hyperparameters. The architecture must include 3 convolutional layers followed by two fully connected layers. Include a figure with labels to illustrate the architecture.
* Loss function - Describe the loss function(s) you are using, and explain any advantages or disadvantages there are with respect to the classification task.
* Optimiser - Describe the optimiser(s) you are using, including its hyperparameters, and explain any advantages or disadvantages there are to using that optimiser.
* Experiments - Describe how you conducted each experiment, including any changes made to the baseline model that has already been described in the other Methodology sections. Explain the methods used for training the model and for assessing its performance on validation/test data.


## Data (7 MARKS)
RD! *Describe the dataset and any pre-processing here*

## Architecture (17 MARKS)
RD! *Describe the architecture here*

## Loss function (3 MARKS)
RD! *Describe the loss function here*

## Optimiser (4 MARKS)
RD! *Describe the optimiser here*

## Experiments
### Experiment 1 (8 MARKS)
RD! *Describe how you went about conducting experiment 1 here*
### Experiment 2 (8 MARKS)
RD! *Describe how you went about conducting experiment 2 here*
### Experiment 3 (8 MARKS)
RD! *Describe how you went about conducting experiment 3 here*

In [15]:
############################################
### Code for building the baseline model ###
############################################

'''
The code in this cell does a number of things; some are relevant to the whole notebook, others only to building the baseline model:
1. imports the required packages for the entire notebook (according to convention)
2. establishes the compute available and assigns it to the 'device' variable so as to utilise any GPU resources
3. establishes the environment and creates a variable IN_COLAB to handle whether the notebook is being run locally or in Google Colab
4. imports the data for the task, namely the CIFAR10 dataset, which is available via the torchvision.datasets package
  - basic train, validation, and test sets used for the baseline model.
'''


# 1. imports
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import torch.nn.functional as F
import torch.optim as optim
from torchsummary import summary
from torch.utils.data import DataLoader, random_split




# 2. set up for using GPU if available (with printed confirmation)
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)
print(f"Using {device} device")

# 3. checking environment
try:
    from google.colab import drive
    drive.mount('/content/drive')
    IN_COLAB = True
except ImportError:  # google.colab can only be imported inside Colab
    IN_COLAB = False
print(f"IN_COLAB= {IN_COLAB}")

# 4. get the data for the task (for the baseline, only ToTensor and Normalize are applied; 5,000 training images are held out for validation)
# This is the equivalent of building an instance of the PyTorch 'Dataset' class using the CIFAR10 dataset
# Each dataset can be indexed, and each individual sample is a tuple of the form (image, target), where target is the index of the target class; ref https://pytorch.org/vision/stable/generated/torchvision.datasets.CIFAR10.html#torchvision.datasets.CIFAR10
# a simple train/validation/test split

batch_size = 16

transform = transforms.Compose([transforms.ToTensor(),transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])


train_data = torchvision.datasets.CIFAR10(root='./data', train=True, transform=transform, download=True)
test_data = torchvision.datasets.CIFAR10(root='./data', train=False, transform=transform)

num_validation_samples = 5000
num_train_samples = len(train_data) - num_validation_samples

train_data, val_data = random_split(train_data, [num_train_samples, num_validation_samples])



print(len(train_data)) # 45000 training examples
print(len(val_data)) # 5000 validation examples
print(len(test_data)) # 10000 test examples

train_dataloader = DataLoader(train_data, batch_size=batch_size, shuffle=True, num_workers=2)
val_dataloader = DataLoader(val_data, batch_size=batch_size, shuffle=True, num_workers=2)
test_dataloader = DataLoader(test_data, batch_size=batch_size, shuffle=False, num_workers=2)



Using cpu device
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
IN_COLAB= True
Files already downloaded and verified
45000
5000
10000
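
The `transform` defined in the cell above chains `ToTensor()`, which scales 8-bit pixel values to [0, 1], with `Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))`. As a rough illustration, the plain-Python sketch below applies the same per-channel arithmetic, (x - mean) / std, to a few sample pixel values. The helper name `normalise_pixel` is hypothetical and exists only for this example.

```python
def normalise_pixel(raw_value, mean=0.5, std=0.5):
    """Mimic ToTensor() followed by Normalize(mean, std) for one 8-bit pixel value."""
    scaled = raw_value / 255.0       # ToTensor: uint8 [0, 255] -> float [0.0, 1.0]
    return (scaled - mean) / std     # Normalize: subtract the mean, divide by the std

# Darkest, mid-grey, and brightest pixel values
for raw in (0, 128, 255):
    print(raw, "->", round(normalise_pixel(raw), 4))
```

With a mean and standard deviation of 0.5, the inputs end up in the range [-1, 1], centred near zero. These are convenience constants rather than the per-channel statistics of CIFAR10.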


In [None]:

# The baseline version of your CNN must have an input, three convolutional hidden layers, then two fully connected layers, where the
# second fully connected layer is 10-dimensional and provides the output classifications of your network. Each convolutional layer
# block must include pooling and a non-linear activation function of your choice. You can use PyTorch (recommended), or any

# I NEED
# - POOL AFTER CONV2,
# - CONV3
# - pool after conv 3
# - only two fully connected
# - relu for all

class BaselineNet(nn.Module):
    def __init__(self):
        super().__init__()

        # First Convolutional Layer
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5, stride=1, padding=0)
        # Pooling Layer
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        # Second Convolutional Layer
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5, stride=1, padding=0)  # in_channels must match conv1's out_channels

        # First Fully Connected Layer
        self.fc1 = nn.Linear(in_features=16 * 5 * 5, out_features=120)
        # Second Fully Connected Layer
        self.fc2 = nn.Linear(in_features=120, out_features=84)
        # Output Layer
        self.fc3 = nn.Linear(in_features=84, out_features=10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # 1st convolutional layer + ReLU + pooling
        x = self.pool(F.relu(self.conv2(x)))  # 2nd convolutional layer + ReLU + pooling
        x = torch.flatten(x, 1)  # flatten all dimensions except batch
        x = F.relu(self.fc1(x))  # 1st fully connected layer + ReLU
        x = F.relu(self.fc2(x))  # 2nd fully connected layer + ReLU
        x = self.fc3(x)  # 3rd fully connected layer (output)
        return x
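
To check that `in_features=16 * 5 * 5` matches what the convolutional stack produces for 32x32 CIFAR10 images, the spatial size can be traced layer by layer with the standard formula floor((n + 2p - k) / s) + 1. The sketch below is plain Python. `out_size` and `trace` are hypothetical helpers, and the three-block configuration (3x3 kernels with padding 1) is only one assumed way of meeting the three-convolution requirement, not a configuration this notebook has settled on.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1

def trace(n, blocks):
    """For each (conv kernel, conv padding) block, apply the conv then a 2x2 max pool with stride 2."""
    for kernel, padding in blocks:
        n = out_size(n, kernel, stride=1, padding=padding)  # convolution
        n = out_size(n, kernel=2, stride=2)                 # max pooling
    return n

# Two 5x5 convolutions without padding, as in BaselineNet above
two_block = trace(32, [(5, 0), (5, 0)])
print(two_block, 16 * two_block * two_block)

# Assumed three-block alternative: 3x3 convolutions with padding 1
three_block = trace(32, [(3, 1), (3, 1), (3, 1)])
print(three_block)
```

Under that assumed three-block layout, a third convolution with, say, 64 output channels would need a first fully connected layer with `in_features = 64 * 4 * 4`.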

In [None]:
# Hyperparameters
# Have good starting ones from hiS NOTES!!!

num_epochs = 10
num_classes = 10
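batch_size = 100  # NOTE: the DataLoaders above were already built with batch_size = 16; rebuild them for this value to take effect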
learning_rate = 0.0001

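# LOSS AND OPTIMIZER
# Instantiate the model and move it to the selected device; parameters() must be called on an instance, not on the class
model = BaselineNet().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=0.9)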
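
`nn.CrossEntropyLoss` expects raw logits and applies log-softmax internally, which is why `forward` returns the output of the last linear layer without a softmax. As an illustration of the per-sample quantity it computes, here is a plain-Python sketch. `cross_entropy_from_logits` is a hypothetical helper and the logits are made-up numbers.

```python
import math

def cross_entropy_from_logits(logits, target):
    """Negative log-softmax probability of the target class for one sample."""
    m = max(logits)  # subtract the max before exponentiating, for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# All ten classes equally likely: the loss equals ln(10)
uniform = cross_entropy_from_logits([0.0] * 10, target=3)

# One class strongly favoured and correct: the loss is close to zero
confident = cross_entropy_from_logits([0.0] * 9 + [10.0], target=9)

print(uniform, confident)
```

A freshly initialised 10-class network with near-uniform outputs should therefore start with a loss of roughly ln 10 ≈ 2.30, which can serve as a sanity check on the first training iterations.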

In [8]:
# RUN THE TRAINING OF THE MODEL AND THEN EVALUATE!!!!!!!!!!!!!!!!
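
One possible shape for the training and evaluation step is sketched below. It is a minimal example under stated assumptions, not the notebook's final procedure: random tensors stand in for CIFAR10 so that it runs without a download, the one-layer model is a placeholder, and `run_epoch` is a hypothetical helper name.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Random stand-in data with CIFAR10's shapes: 3x32x32 images, 10 classes
images = torch.randn(64, 3, 32, 32)
labels = torch.randint(0, 10, (64,))
loader = DataLoader(TensorDataset(images, labels), batch_size=16, shuffle=True)

# Placeholder model (an assumption); swap in the notebook's own network
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

def run_epoch(model, loader, criterion, optimizer=None):
    """One pass over loader; trains if an optimizer is given, otherwise only evaluates."""
    training = optimizer is not None
    model.train(training)  # train mode when updating weights, eval mode otherwise
    total_loss, correct, seen = 0.0, 0, 0
    with torch.set_grad_enabled(training):
        for x, y in loader:
            logits = model(x)
            loss = criterion(logits, y)
            if training:
                optimizer.zero_grad()  # clear gradients from the previous step
                loss.backward()        # backpropagate
                optimizer.step()       # update the weights
            total_loss += loss.item() * y.size(0)
            correct += (logits.argmax(dim=1) == y).sum().item()
            seen += y.size(0)
    return total_loss / seen, correct / seen

for epoch in range(2):
    train_loss, train_acc = run_epoch(model, loader, criterion, optimizer)
    # The same loader is reused for evaluation purely for illustration
    val_loss, val_acc = run_epoch(model, loader, criterion)
    print(f"epoch {epoch + 1}: train loss {train_loss:.3f}, val loss {val_loss:.3f}, val acc {val_acc:.2f}")
```

In the notebook itself, the training pass would use `train_dataloader` and the evaluation pass `val_dataloader`, with each batch moved to `device` before the forward pass.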

# Results (instructions) - 55 MARKS
Use the Results section to summarise your findings from the experiments. For each experiment, use the Markdown/text cell to describe and explain your results, and use the code cell (and additional code cells if necessary) to conduct the experiment and produce figures to show your results.


### Experiment 1 (17 MARKS)
*Write up results for Experiment 1 here*

In [None]:
#############################
### Code for Experiment 1 ###
#############################

### Experiment 2 (19 MARKS)
*Write up results for Experiment 2 here*

In [None]:
#############################
### Code for Experiment 2 ###
#############################

### Experiment 3 (19 MARKS)
*Write up results for Experiment 3 here*

In [None]:
#############################
### Code for Experiment 3 ###
#############################

# Conclusions and Discussion (instructions) - 25 MARKS
In this section, you are expected to:
* briefly summarise and describe the conclusions from your experiments (8 MARKS).
* discuss whether or not your results are expected, providing scientific reasons (8 MARKS).
* discuss two or more alternative/additional methods that may enhance your model, with scientific reasons (4 MARKS).
* Reference two or more relevant academic publications that support your discussion. (4 MARKS)

*Write your Conclusions/Discussion here*

# References (instructions)
Use the cell below to add your references. A good format to use for references is like this:

[AB Name], [CD Name], [EF Name] ([year]), [Article title], [Journal/Conference Name] [volume], [page numbers] or [article number] or [doi]

Some examples:

JEM Bennett, A Phillipides, T Nowotny (2021), Learning with reinforcement prediction errors in a model of the Drosophila mushroom body, Nat. Comms 12:2569, doi: 10.1038/s41467-021-22592-4

SO Kaba, AK Mondal, Y Zhang, Y Bengio, S Ravanbakhsh (2023), Proc. 40th Int. Conf. Machine Learning, 15546-15566

*List your references here*