# Assignment 3: Convolutional Neural Networks (50 marks total)
### Due: October 17 at 11:59pm

### Name: Hiu Sum Yuen

The goal of this assignment is to apply Convolutional Neural Networks (CNNs) in PyTorch for image classification.

## Part 1: MLP vs. CNN

### Step 0: Import Libraries

In [None]:
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, random_split
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

### Step 1: Data Loading & Preprocessing (11 marks)

This assignment will use the CIFAR-10 dataset (*available via torchvision.datasets*). CIFAR-10 is a labeled subset of the 80 Million Tiny Images dataset and contains 60,000 32×32 color images in 10 classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck), split into 50,000 training images and 10,000 test images.

The first step is to define the transformations for the training and testing datasets. In this case, we will apply a random horizontal flip. Apply any other transformations that are required.

In [None]:
# TODO: define transforms for train and test sets (2 marks)
transform_train = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616))
])

transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616))
])
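
The constants passed to `transforms.Normalize` above appear to be the commonly quoted per-channel mean and standard deviation of the CIFAR-10 training images. The sketch below shows how such statistics can be computed. It uses a random stand-in tensor instead of the real dataset, and `channel_stats` is a hypothetical helper name, not part of torchvision.

```python
import torch

def channel_stats(images: torch.Tensor):
    """Return per-channel (mean, std) for a batch shaped (N, C, H, W) with values in [0, 1]."""
    # Reduce over the batch and both spatial dimensions, keeping the channel axis.
    mean = images.mean(dim=(0, 2, 3))
    std = images.std(dim=(0, 2, 3))
    return mean, std

# Stand-in data (assumption): 100 random "images". Real use would stack the
# ToTensor() outputs of the CIFAR-10 training set instead.
torch.manual_seed(0)
fake_images = torch.rand(100, 3, 32, 32)
mean, std = channel_stats(fake_images)
print(mean, std)
```

The statistics should come from the training images only, and the same values are then reused for the validation and test transforms.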

The next step is to load the training and testing datasets and split the training data into training and validation sets. Use 20% of the training dataset for validation and a batch size of 64 for all data loaders.

In [None]:
# TODO: Load Datasets (1 mark)
train_dataset = torchvision.datasets.CIFAR10(root='./data', train=True, 
                                           download=True, transform=transform_train)
test_dataset = torchvision.datasets.CIFAR10(root='./data', train=False, 
                                          download=True, transform=transform_test)

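# TODO: Split train into train + validation (1 mark)
# Note: random_split returns views of the same dataset object, so val_dataset
# also passes through transform_train (random flip and random crop).
train_size = int(0.8 * len(train_dataset))
val_size = len(train_dataset) - train_size
train_dataset, val_dataset = random_split(train_dataset, [train_size, val_size])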

# TODO: Data loaders (1 mark)
batch_size = 64
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
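
Because `random_split` wraps a single dataset object, both subsets share that dataset's transform. One possible way to give the validation subset a deterministic transform is to build two views of the same data and index them with one shared permutation through `Subset`. This is only a sketch: `ToyDataset` is a hypothetical stand-in for `CIFAR10`, and the seed value is arbitrary.

```python
import torch
from torch.utils.data import Dataset, Subset

class ToyDataset(Dataset):
    """Hypothetical stand-in for CIFAR10: returns (transform(x), label)."""
    def __init__(self, n, transform=None):
        self.data = torch.arange(n, dtype=torch.float32)
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        x = self.data[idx]
        if self.transform is not None:
            x = self.transform(x)
        return x, int(idx) % 10

# Two views of the same underlying data, differing only in their transform.
train_view = ToyDataset(100, transform=lambda x: x + 0.5)  # stands in for augmentation
val_view = ToyDataset(100, transform=None)                 # stands in for the test transform

# One seeded permutation gives disjoint, reproducible index sets.
generator = torch.Generator().manual_seed(42)
perm = torch.randperm(len(train_view), generator=generator).tolist()
n_train = int(0.8 * len(perm))

train_subset = Subset(train_view, perm[:n_train])
val_subset = Subset(val_view, perm[n_train:])
print(len(train_subset), len(val_subset))
```

With the real data, the two views would be two `CIFAR10(train=True, ...)` objects, one built with `transform_train` and one with `transform_test`.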

For this assignment, we will compare the performance of a feed-forward network (MLP) with a convolutional neural network (CNN). We will need to define a separate class to represent each model.

In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Your MLP class should have the following minimum requirements:
- At least 3 layers with ReLU activations
- At least 500 hidden units per layer
- Softmax output layer for 10 classes

Print a summary of the model architecture (number of parameters, layer shapes).

In [None]:
# TODO: Define Neural Network Model (3 marks)
class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(32*32*3, 512)
        self.fc2 = nn.Linear(512, 512)
        self.fc3 = nn.Linear(512, 512)  # keeps at least 500 hidden units per layer
        self.fc4 = nn.Linear(512, 10)
        self.dropout = nn.Dropout(0.2)
        
    def forward(self, x):
        x = x.view(-1, 32*32*3)  # Flatten the image
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = F.relu(self.fc2(x))
        x = self.dropout(x)
        x = F.relu(self.fc3(x))
        x = self.dropout(x)
        x = self.fc4(x)
        return F.log_softmax(x, dim=1)

# Create model and print summary
mlp_model = MLP().to(device)
print("MLP Model Summary:")
print(mlp_model)
print(f"Total parameters: {sum(p.numel() for p in mlp_model.parameters())}")
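
As a quick check on the parameter total printed above, the size of each fully connected layer can be worked out by hand: an `nn.Linear(in, out)` layer holds `in * out` weights plus `out` biases. The sketch below compares that formula with PyTorch's own count for the first MLP layer. `linear_params` is a hypothetical helper name.

```python
import torch.nn as nn

def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features

# First MLP layer: flattened 32x32x3 image -> 512 hidden units.
layer = nn.Linear(32 * 32 * 3, 512)
counted = sum(p.numel() for p in layer.parameters())
print(linear_params(32 * 32 * 3, 512), counted)  # both are 1573376
```

Most of the MLP's parameters sit in this first layer, because every one of the 3,072 input values connects to every hidden unit.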

Your CNN class should have the following minimum requirements:
- At least 3 convolutional layers with ReLU activations
  - Example output for 3 layers: 32 feature maps -> 64 feature maps -> 128 feature maps
- At least 2 max-pooling layers
- At least 1 fully connected layer before the output
- Softmax output layer for 10 classes
- Use dropout to improve generalization

Print a summary of the model architecture (number of parameters, layer shapes).

In [None]:
# TODO: Define CNN Model (3 marks)
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.conv3 = nn.Conv2d(64, 128, 3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(128 * 4 * 4, 512)
        self.fc2 = nn.Linear(512, 10)
        self.dropout = nn.Dropout(0.3)
        
    def forward(self, x):
        # Convolutional layers
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        
        # Flatten
        x = x.view(-1, 128 * 4 * 4)
        
        # Fully connected layers
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

# Create model and print summary
cnn_model = CNN().to(device)
print("CNN Model Summary:")
print(cnn_model)
print(f"Total parameters: {sum(p.numel() for p in cnn_model.parameters())}")
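
The `128 * 4 * 4` input size of the first fully connected layer follows from the layer shapes: each 3×3 convolution with `padding=1` keeps the spatial size, and each 2×2 max-pool halves it (32 → 16 → 8 → 4). A minimal sketch that traces a dummy image through stand-alone copies of the same layers:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-alone copies of the layer definitions used in the CNN class.
conv1 = nn.Conv2d(3, 32, 3, padding=1)
conv2 = nn.Conv2d(32, 64, 3, padding=1)
conv3 = nn.Conv2d(64, 128, 3, padding=1)
pool = nn.MaxPool2d(2, 2)

x = torch.zeros(1, 3, 32, 32)  # one dummy CIFAR-10-sized image
shapes = []
with torch.no_grad():
    for conv in (conv1, conv2, conv3):
        x = pool(F.relu(conv(x)))
        shapes.append(tuple(x.shape))
        print(shapes[-1])

flat_features = x.flatten(1).shape[1]
print(flat_features)  # 128 * 4 * 4 = 2048
```

Running a dummy tensor through the layers like this is a cheap way to catch a mismatched flatten size before training starts.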

### Step 3: Define Training and Testing Loops (7 marks)

Next, you will need to define the loss criterion and the optimizer. Select an appropriate criterion. For the optimizer, use Adam with a constant learning rate of 0.001.
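
Both models above return `F.log_softmax` outputs, so one reasonable criterion is `nn.NLLLoss`, which expects log-probabilities. The sketch below, on made-up logits, shows that this pairing gives the same value as `nn.CrossEntropyLoss` applied to raw logits, and how Adam can be created with a constant learning rate of 0.001. `toy_model` is a hypothetical stand-in for the MLP or CNN.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)
logits = torch.randn(4, 10)            # stand-in for raw scores of 4 images
targets = torch.tensor([1, 0, 4, 9])   # stand-in class labels

# NLLLoss expects log-probabilities, which is what log_softmax produces.
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), targets)
# CrossEntropyLoss applies log_softmax internally, so it expects raw logits.
ce = nn.CrossEntropyLoss()(logits, targets)
print(nll.item(), ce.item())  # the two values agree

# Adam with a constant learning rate of 0.001, shown on a stand-in module.
toy_model = nn.Linear(10, 10)
optimizer = optim.Adam(toy_model.parameters(), lr=0.001)
```

Each model needs its own optimizer instance, since an optimizer only updates the parameters it was constructed with.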

In [None]:
# TODO: Define loss criterion and optimizer (2 marks)


Since we are evaluating the performance of two different models, we can create functions for both the training and validation loops. For each loop, you should print the average loss and the accuracy for each epoch.
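
One possible shape for these two functions is sketched below. The names `train_one_epoch` and `evaluate` are hypothetical, and the smoke run at the end uses a tiny stand-in model on random data (reusing one loader for both phases) rather than the CIFAR-10 loaders.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def train_one_epoch(model, loader, criterion, optimizer, device):
    """Run one pass with gradient updates; return (average loss, accuracy)."""
    model.train()  # enables dropout
    total_loss, correct, seen = 0.0, 0, 0
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
        # Weight by batch size so a smaller final batch does not skew the average.
        total_loss += loss.item() * inputs.size(0)
        correct += (outputs.argmax(dim=1) == targets).sum().item()
        seen += inputs.size(0)
    return total_loss / seen, correct / seen

def evaluate(model, loader, criterion, device):
    """Run one pass without gradient updates; return (average loss, accuracy)."""
    model.eval()  # disables dropout
    total_loss, correct, seen = 0.0, 0, 0
    with torch.no_grad():
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            total_loss += criterion(outputs, targets).item() * inputs.size(0)
            correct += (outputs.argmax(dim=1) == targets).sum().item()
            seen += inputs.size(0)
    return total_loss / seen, correct / seen

# Smoke run on random stand-in data (assumption: 8 features, 3 classes).
torch.manual_seed(0)
device = torch.device("cpu")
data = TensorDataset(torch.randn(64, 8), torch.randint(0, 3, (64,)))
loader = DataLoader(data, batch_size=16)
toy_model = nn.Sequential(nn.Linear(8, 3), nn.LogSoftmax(dim=1)).to(device)
criterion = nn.NLLLoss()
optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.001)

for epoch in range(2):
    train_loss, train_acc = train_one_epoch(toy_model, loader, criterion, optimizer, device)
    val_loss, val_acc = evaluate(toy_model, loader, criterion, device)
    print(f"Epoch {epoch + 1}: train loss {train_loss:.4f}, acc {train_acc:.2%} | "
          f"val loss {val_loss:.4f}, acc {val_acc:.2%}")
```

Switching between `model.train()` and `model.eval()` matters here because both models use dropout, which should be active only while training.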

In [None]:
# TODO: Training loop (3 marks)


In [None]:
# TODO: Validation loop (2 marks)


### Step 4: Train, Evaluate and Visualize (11 marks)

We can also create functions for plotting the training and validation losses over the epochs and for creating a confusion matrix.
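
A minimal sketch of what these two helpers could look like, using the `matplotlib` and `sklearn.metrics` imports from Step 0. The function names `plot_losses` and `plot_confusion` are hypothetical, and the numbers in the example calls are made up for illustration only.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

def plot_losses(train_losses, val_losses, title="Loss per epoch"):
    """Plot training and validation loss against the epoch number."""
    epochs = range(1, len(train_losses) + 1)
    fig, ax = plt.subplots()
    ax.plot(epochs, train_losses, label="Training loss")
    ax.plot(epochs, val_losses, label="Validation loss")
    ax.set_xlabel("Epoch")
    ax.set_ylabel("Average loss")
    ax.set_title(title)
    ax.legend()
    return fig, ax

def plot_confusion(y_true, y_pred, class_names):
    """Draw a confusion matrix (rows: true class, columns: predicted class)."""
    cm = confusion_matrix(y_true, y_pred, labels=list(range(len(class_names))))
    fig, ax = plt.subplots()
    ConfusionMatrixDisplay(cm, display_labels=class_names).plot(ax=ax, xticks_rotation=45)
    return cm, fig

# Example calls with made-up values.
fig, ax = plot_losses([1.9, 1.5, 1.2], [1.8, 1.6, 1.5])
cm, cm_fig = plot_confusion([0, 1, 2, 2], [0, 2, 2, 2], ["cat", "dog", "ship"])
```

A validation curve that flattens or rises while the training curve keeps falling is the usual visual sign of overfitting.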

In [None]:
# TODO: Create plotting functions (2 marks)


Now that we have defined our required functions, we can train, evaluate and visualize the results for both models. You can use 20 epochs for both models.

In [None]:
# TODO: Run training and validation loops for MLP (2 marks)


In [None]:
# TODO: Plot losses for MLP (1 mark)


In [None]:
# TODO: Run Training and validation loops for CNN (2 marks)


In [None]:
# TODO: Plot losses for CNN (1 mark)


### Step 5: Model Testing (3 marks)

Now that we have compared the two models, we can select the best one and use the testing data to see how well this model generalizes.
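
A test loop can look much like the validation loop, with no gradient updates. The sketch below also tracks per-class accuracy, which helps when reading the confusion matrix. `evaluate_on_test` is a hypothetical name, and the example uses `nn.Identity` on one-hot inputs as a stand-in model so the result is easy to verify by hand.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

def evaluate_on_test(model, loader, num_classes, device):
    """Return overall accuracy and a per-class accuracy tensor for loader."""
    model.eval()
    correct = torch.zeros(num_classes)
    total = torch.zeros(num_classes)
    with torch.no_grad():
        for inputs, targets in loader:
            preds = model(inputs.to(device)).argmax(dim=1).cpu()
            # Count how often each class appears and how often it is predicted correctly.
            total += torch.bincount(targets, minlength=num_classes)
            correct += torch.bincount(targets[preds == targets], minlength=num_classes)
    overall = (correct.sum() / total.sum()).item()
    per_class = correct / total.clamp(min=1)  # avoid dividing by zero for absent classes
    return overall, per_class

# Stand-in setup: the "model" echoes its one-hot input, so predictions are known.
targets = torch.tensor([0, 1, 2, 0, 1, 2])
predicted = torch.tensor([0, 1, 2, 0, 1, 1])  # last sample is misclassified
inputs = F.one_hot(predicted, num_classes=3).float()
loader = DataLoader(TensorDataset(inputs, targets), batch_size=4)

overall, per_class = evaluate_on_test(nn.Identity(), loader, 3, torch.device("cpu"))
print(overall, per_class)
```

The test set should be evaluated once, after the model has been chosen using the validation results.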

In [None]:
# TODO: Test loop (3 marks)


### Questions (12 marks)
1. How do your training, validation, and test accuracies compare for your best model? What does this tell you about the model's generalization?
1. Examine your results for both models. Do you see signs of overfitting or underfitting? Explain what indicates this.
1. How can we further improve the results? Provide two suggestions.
1. If your model performs poorly on certain classes (check your confusion matrix), what does that suggest about those images or their features?
1. What is the role of dropout in the CNN? How might removing dropout change your results?
1. What other data transformations did you include and why? Are there any other data augmentation methods that we could use for this dataset?

*ANSWER HERE*

### Process Description (4 marks)
Please describe the process you used to create your code. Cite any websites or generative AI tools used. You can use the following questions as guidance:
1. Where did you source your code?
1. In what order did you complete the steps?
1. If you used generative AI, what prompts did you use? Did you need to modify the code at all? Why or why not?
1. Did you have any challenges? If yes, what were they? If not, what helped you to be successful?

*DESCRIBE YOUR PROCESS HERE - BE SPECIFIC*

## Part 2: Reflection (2 marks)
Include a sentence or two about what you liked or disliked, and what you found interesting, confusing, challenging, or motivating while working on this assignment.


*ADD YOUR THOUGHTS HERE*