<a href="https://colab.research.google.com/github/cnwokoye1/Image-Classification-Using-CNNs/blob/main/A5_CNN_Nwokoye_Christopher.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# YOUR NAME HERE: Christopher Nwokoye

# A5 Convolutional Neural Network (Total 150pts)


## 1. Import libraries (Total 6pts)

### 1.1 Import torch, torchvision, torchvision.transforms, torch.utils.data and torch.nn (6pts)

In [None]:
import torch
import torchvision
import torchvision.transforms
import torch.utils.data as data
import torch.nn as nn

## 2. Data Preparation (Total 32pts)


### 2.1 Image Transformation (12pts)
Define a transformation pipeline using torchvision.transforms.Compose.

In the pipeline, use **ColorJitter, GaussianBlur, RandomHorizontalFlip, ToTensor and Normalize** from the transforms library.

For the first four transformations, use suitable parameters of your informed choice. At the end, normalize the images with mean 0.5 and variance 0.5.

Read about these transformations here: https://pytorch.org/vision/0.9/transforms.html
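
As a rough illustration of what the final normalization step does to pixel values, the sketch below applies the per-pixel formula `(x - mean) / std` in plain Python. The helper names `normalize` and `denormalize` are hypothetical and only mirror the arithmetic; they are not part of torchvision. It assumes `mean=0.5` and `std=0.5`, the values passed to `Normalize` in the cell below.

```python
def normalize(x, mean=0.5, std=0.5):
    """Per-pixel arithmetic of a mean/std normalization: (x - mean) / std."""
    return (x - mean) / std


def denormalize(y, mean=0.5, std=0.5):
    """Inverse mapping, handy before displaying a normalized image."""
    return y * std + mean


# Example pixel intensities in the [0, 1] range that ToTensor produces
pixels = [0.0, 0.25, 0.5, 1.0]
normalized = [normalize(p) for p in pixels]
restored = [denormalize(v) for v in normalized]

print(normalized)  # [-1.0, -0.5, 0.0, 1.0]
print(restored)    # [0.0, 0.25, 0.5, 1.0]
```

With these values the [0, 1] range is mapped to [-1, 1], which is why a normalized image should be mapped back before it is passed to `imshow()`.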

In [None]:
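trans_pipeline = torchvision.transforms.Compose(
    [
      # mild augmentation; the exact parameter values are a judgment call
      torchvision.transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
      torchvision.transforms.GaussianBlur(kernel_size=3, sigma=(0.1, 1.0)),
      torchvision.transforms.RandomHorizontalFlip(p=0.5),
      torchvision.transforms.ToTensor(),  # PIL image -> float tensor in [0, 1]
      # one mean/std value per RGB channel maps [0, 1] to [-1, 1]
      torchvision.transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)),
    ]
)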


### 2.2 Prepare train and test set by loading CIFAR10 dataset from torchvision.datasets. (4pts)
Make sure you are using the **transform** pipeline (you just wrote in task #2.1) on both train and test set.

**Hint:** Preparing train and test sets can be directly achieved by utilizing the class parameters.


Read about CIFAR10 dataset class in PyTorch: https://pytorch.org/vision/0.9/datasets.html#cifar

In [None]:
# import needed library
import torchvision.datasets as datasets

# initialize the CIFAR training and test sets
cifar_train = datasets.CIFAR10(root='./data', train=True, download=True, transform=trans_pipeline)
cifar_test = datasets.CIFAR10(root='./data', train=False, download=True, transform=trans_pipeline)

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz


  0%|          | 196608/170498071 [00:00<01:27, 1952646.17it/s]

### 2.3 Use torch.utils.data.random_split() to make a validation set from the training set with 80:20 split. (3pts)

Make sure the training set you use after this point excludes the validation images.


In [None]:
# Random split
train_set_size = int(len(cifar_train) * 0.8)
valid_set_size = len(cifar_train) - train_set_size

train_set, valid_set = data.random_split(cifar_train, [train_set_size, valid_set_size])

### 2.4 Prepare three dataloaders for train, validation and test set. Use an appropriate batchsize of your choice. (1+2+2+2 =7pts)


**Hints:**
1. Remember that choosing a batchsize is always a trade-off between efficiency and generalizability. With large batchsize, your model learns more and better in each forward pass, but each pass will require larger computation. On the other hand, with small batchsize, it might converge quicker, but each forward pass teaches features from a smaller subset, which may not be a good representation of the overall data; leading to jittery convergence.
2. During training, you will use the train and validation sets to track the loss and avoid overfitting. The test set will be held out until you are ready to evaluate a trained model on new data.

Read about pytorch Dataloaders here:
https://pytorch.org/tutorials/beginner/basics/data_tutorial.html#preparing-your-data-for-training-with-dataloaders
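
To make the batch-size trade-off in the hints a little more concrete, here is a minimal sketch that counts how many batches (and therefore optimizer steps) one epoch contains for a few batch sizes. The function name `batches_per_epoch` is hypothetical, and the figure of 40,000 training images assumes the 80:20 split of CIFAR-10's 50,000 training images from task 2.3.

```python
import math


def batches_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches a loader yields in one pass over the data."""
    if drop_last:
        # an incomplete final batch is discarded
        return num_samples // batch_size
    # an incomplete final batch is kept
    return math.ceil(num_samples / batch_size)


train_size = 40_000  # assumed: 80% of the 50,000 CIFAR-10 training images

steps = {bs: batches_per_epoch(train_size, bs) for bs in (20, 64, 256)}
print(steps)  # {20: 2000, 64: 625, 256: 157}
```

Smaller batches mean more, noisier parameter updates per epoch; larger batches mean fewer updates, each computed from more images.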

In [None]:
# import needed module
from torch.utils.data import DataLoader

# TODO: set a batch size
size = 20

# TODO: write dataloader for train set
train_loader = DataLoader(train_set, batch_size=size, shuffle=True)

# TODO: write dataloader for test set
test_loader = DataLoader(cifar_test, batch_size=size, shuffle=True)

# TODO: write dataloader for validation set
valid_loader = DataLoader(valid_set, batch_size=size, shuffle=True)


### 2.5 Load a random batch of images from the training set using the trainloader. Then use *make_grid()*  from *torchvision.utils* and *imshow()* from *matplotlib.pyplot* to show the images. Also, print the corresponding true labels for those image samples. (6pts)
Hint: you may need to reshape the *make_grid()* output to comply with the format *imshow()* accepts.

In [None]:
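# import needed modules
from torchvision.utils import make_grid
import matplotlib.pyplot as plt

# TODO: load a random batch of training set images
images, labels = next(iter(train_loader))

# TODO: show the images
grid = make_grid(images) * 0.5 + 0.5  # undo Normalize(mean=0.5, std=0.5)
# imshow() expects (H, W, C); make_grid() returns (C, H, W)
plt.imshow(grid.permute(1, 2, 0).clamp(0, 1))
plt.axis("off")
plt.show()

# TODO: print the ground truth class labels for these images
print([cifar_train.classes[i] for i in labels.tolist()])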


## 3. Model Design (Total 22pts)

### 3.1 Define a neural network model: (2+7+7 =16pts)
- Name the model class with your first name
- In the following sequential order, the model should consist of:

    (1) a 2D convolution layer with 6 filters, dimension of each filter is (5, 5), stride=1, no zero padding
    
    (2) a Max Pool layer with filter size (2, 2), stride=2
    
    (3) a 2D convolution layer with 16 filters, dimension of each filter is (5, 5), stride=1, no zero padding

    (4) a 2D Max Pool layer with filter size (2, 2), stride=2
    
    ~ a flatten layer ~

    (5) a Dense/Fully-connected layer with 120 neurons
    
    ~ a ReLU activation ~
    
    ~ a Dropout Layer ~

    (6) a Dense/Fully-connected layer with 80 neurons
    
    ~ a ReLU activation ~

    (7) a Dense/Fully-connected layer with 10 neurons

Note:
1. Flatten, ReLU and Dropout are not really "layers". They are operations with a specific purpose. But when constructing a model in PyTorch, they are abstracted as layers.
    
    Flatten is used to convert the 4th layer output to a 1D tensor so that it can be passed through the next fully-connected layer. Since each forward pass takes a batch of data, use the *start_dim* parameter of *torch.flatten()* appropriately to keep the batch dimension intact.
    
    ReLU is an activation that transforms the Dense Layer's linear output to a non-linear "active" output.
    
    Dropout is a regularization technique. Read more in slides. In this assignment, you can drop neurons with 50% probability.

2. This dataset has 10 classes, hence the final layer consists of 10 neurons.

3. The model architecture is similar to the one you saw in in-class Quiz 2, with an extra dense layer in the end.

    Read about building your custom model in pytorch here: https://pytorch.org/tutorials/beginner/introyt/modelsyt_tutorial.html

    The official PyTorch documentation pages on conv, flatten, ReLU and dense (linear) layers are also useful resources.
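
The first fully-connected layer needs to know how many features the flatten step produces. The sketch below works that out with the standard output-size formula `floor((n + 2p - k) / s) + 1` for convolution and pooling, assuming 32x32 CIFAR-10 inputs and the layer parameters listed above. The helper name `conv_out` is hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output height/width of a conv or pool layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32                                  # CIFAR-10 images are 32x32
size = conv_out(size, kernel=5)            # (1) conv, 5x5, stride 1, no padding
after_conv1 = size
size = conv_out(size, kernel=2, stride=2)  # (2) max pool, 2x2, stride 2
size = conv_out(size, kernel=5)            # (3) conv, 5x5, stride 1, no padding
size = conv_out(size, kernel=2, stride=2)  # (4) max pool, 2x2, stride 2

flat_features = 16 * size * size           # 16 channels after the second conv
print(after_conv1, size, flat_features)    # 28 5 400
```

Under these assumptions the flatten step yields 400 features per image, so layer (5) maps 400 inputs to 120 outputs.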


In [None]:
class Net(nn.Module):
  def __init__(self):
    super().__init__()
    # TODO: Initialize the layers
    self.conv1 = nn.Conv2d(3, 6, kernel_size=5, stride=1)   # 3x32x32 -> 6x28x28
    self.pool = nn.MaxPool2d(kernel_size=2, stride=2)       # halves height and width
    self.conv2 = nn.Conv2d(6, 16, kernel_size=5, stride=1)  # 6x14x14 -> 16x10x10
    self.fc1 = nn.Linear(16 * 5 * 5, 120)                   # 16x5x5 after second pool
    self.relu = nn.ReLU()
    self.dropout = nn.Dropout(p=0.5)
    self.fc2 = nn.Linear(120, 80)
    self.fc3 = nn.Linear(80, 10)

  def forward(self, x):
    # TODO: Define the dataflow through the layers
    x = self.pool(self.conv1(x))
    x = self.pool(self.conv2(x))
    x = torch.flatten(x, start_dim=1)  # keep the batch dimension intact
    x = self.dropout(self.relu(self.fc1(x)))
    x = self.relu(self.fc2(x))
    return self.fc3(x)                 # raw logits, one per class

### 3.2 Create an instance of the model class that you just prepared. (2pts)

In [None]:
# TODO:
net = Net()

### 3.3 Set up Cross Entropy Loss as the loss function and *Adam* as the optimizer. Use a learning rate of your choice for the optimizer. (4pts)
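
For intuition about the loss, here is a minimal plain-Python sketch of cross entropy for a single sample, computed from raw scores (logits) as `log(sum(exp(logits))) - logits[target]`. The function name `cross_entropy` is hypothetical and this is not PyTorch's implementation; `nn.CrossEntropyLoss` likewise expects raw logits and, by default, averages this quantity over the batch.

```python
import math


def cross_entropy(logits, target):
    """Cross entropy of one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# A model that scores all 10 classes equally has loss ln(10)
uniform = cross_entropy([0.0] * 10, target=3)

# A confident prediction is cheap when right and expensive when wrong
confident = [8.0] + [0.0] * 9
right = cross_entropy(confident, target=0)
wrong = cross_entropy(confident, target=1)

print(round(uniform, 4))
print(right < uniform < wrong)
```

A loss near `ln(10)`, roughly 2.30, at the start of training is therefore what an untrained 10-class model would be expected to show.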


In [None]:
# import module
import torch.optim as optim

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=3e-4)

## 4. Training and Validation (Total 50pts)


### 4.1 Write a training loop to load data, compute model output, compute the loss and backpropagate it to update model parameters. (30pts)

The # TODO tags below contain further instructions.

In [None]:
# TODO: Define number of epochs
epochs = 10

# TODO: Initialize empty lists to store training loss, training accuracy, validation loss, validation accuracy
# You will use these lists to plot the loss history.
train_loss, train_acc, val_loss, val_acc = ([] for i in range(4))

# TODO: Loop through the number of epochs
for epoch in range(1, epochs + 1):
    # TODO: set model to train mode
    net.train()
    running_loss, correct, total = 0.0, 0, 0

    # TODO: iterate over the training data in batches
    for i, batch in enumerate(train_loader, 0):

        # TODO: get the image inputs and labels from current batch
        inputs, labels = batch

        # TODO: set the optimizer gradients to zero to avoid accumulation of gradients
        optimizer.zero_grad()

        # TODO: compute the output of the model
        outputs = net(inputs)

        # TODO: compute the loss on current batch
        loss = criterion(outputs, labels)

        # TODO: backpropagate the loss
        loss.backward()

        # TODO: update the optimizer parameters
        optimizer.step()

        # TODO: update the train loss and accuracy
        running_loss += loss.item() * labels.size(0)  # sum of per-sample losses
        correct += (outputs.argmax(dim=1) == labels).sum().item()
        total += labels.size(0)

    # TODO: compute the average training loss and accuracy and store in respective arrays
    train_loss.append(running_loss / total)
    train_acc.append(correct / total)

    # TODO: set the model to evaluation mode
    net.eval()
    running_loss, correct, total = 0.0, 0, 0

    # TODO: keeping the gradient computation turned off, loop over batches of data from validation set.
    with torch.no_grad():
        for i, batch in enumerate(valid_loader, 0):
            # TODO: compute output of the model
            inputs, labels = batch
            outputs = net(inputs)

            # TODO: compute the loss
            loss = criterion(outputs, labels)

            # TODO: compute the validation loss and accuracy
            running_loss += loss.item() * labels.size(0)
            correct += (outputs.argmax(dim=1) == labels).sum().item()
            total += labels.size(0)

    # TODO: compute the average validation loss and accuracy and store in respective lists
    val_loss.append(running_loss / total)
    val_acc.append(correct / total)

    # TODO: print the training loss, training accuracy, validation loss and validation accuracy at the end of each epoch
    print(f"epoch {epoch}: train loss {train_loss[-1]:.4f}, train acc {train_acc[-1]:.4f}, "
          f"val loss {val_loss[-1]:.4f}, val acc {val_acc[-1]:.4f}")

    # TODO: save the model parameters once in every 5 epochs
    if epoch % 5 == 0:
        torch.save(net.state_dict(), "model_{}".format(epoch))


### 4.2 Plot and compare (5+5 =10pts)
1. training and validation loss over the number of epochs
2. training and validation accuracy over the number of epochs

(Hint: Use plot() from *matplotlib.pyplot*, import it if you haven't already done so.)

In [None]:
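# TODO: plot the training and validation loss
plt.plot(train_loss, label="training loss")
plt.plot(val_loss, label="validation loss")
plt.xlabel("epoch")
plt.legend()
plt.show()

# TODO: plot the training and validation accuracy
plt.plot(train_acc, label="training accuracy")
plt.plot(val_acc, label="validation accuracy")
plt.xlabel("epoch")
plt.legend()
plt.show()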

### 4.3 Discussion: (2*5 = 10pts)
(1) Do the training loss and accuracy improve as the number of epochs increases?

(2) Do the validation loss and accuracy improve as the number of epochs increases?

(3) Are there any signs of overfitting in the results? If so, when did it start to occur?

(4) How many epochs did it take for the model to converge to a good solution?

(5) What enhancements to the architecture could be tried to further improve the validation performance?

1) Increasing the number of epochs did not improve the training loss or accuracy.

2) Increasing the number of epochs reduced the validation loss, but it did not improve the validation accuracy.

3) Overfitting appeared when I ran the model with an increased number of epochs.

4) Twenty epochs was enough for my model to converge to a reasonably good solution.

5) Training the model on more data is one enhancement that may further improve validation performance.
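
One common heuristic for question (3) is to look for the epoch where the validation loss is lowest: if it rises afterwards while the training loss keeps falling, overfitting has likely set in. The sketch below applies that heuristic to a list shaped like `val_loss`. The function name `best_epoch` is hypothetical, and the numbers are made up for illustration only; they are not results from this notebook.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1


# Made-up validation losses, one per epoch (illustration only)
example_val_loss = [1.90, 1.60, 1.45, 1.40, 1.43, 1.50]

epoch = best_epoch(example_val_loss)
print(epoch)  # 4
```

In this made-up history the validation loss bottoms out at epoch 4, so a checkpoint saved near that epoch would be the natural one to keep.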

## 5. Testing on new data (Total 40pts)


### 5.1 Load the best performing model (one with good validation accuracy and without overfitting) among the ones you saved. (4pts)

In [None]:
# TODO: instantiate a model
net = Net()

# TODO: load parameters from one of the saved model states
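# "model_10" assumes the epoch-10 checkpoint from task 4.1; pick the epoch with the best validation accuracy
net.load_state_dict(torch.load("model_10"))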

# TODO: set this model to evaluation mode
net.eval()


### 5.2 Take a random batch of images from test set and show the images. Print the corresponding ground truth class labels. Then compute model output (model selected at previous step) and the predicted labels for the images in this batch. (10pts)

This is similar to task #2.5, with the additional task of computing the model output and printing the predicted labels.

In [None]:
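# TODO: load a random batch of test set images
images, labels = next(iter(test_loader))

# TODO: show the images
grid = make_grid(images) * 0.5 + 0.5  # undo Normalize(mean=0.5, std=0.5)
plt.imshow(grid.permute(1, 2, 0).clamp(0, 1))  # (C, H, W) -> (H, W, C)
plt.axis("off")
plt.show()

# TODO: print the ground truth class labels for these images
print("truth:    ", [cifar_test.classes[i] for i in labels.tolist()])

# TODO: compute model output
with torch.no_grad():
    outputs = net(images)

# TODO: print the predicted class labels for these images
preds = outputs.argmax(dim=1)  # index of the largest logit per image
print("predicted:", [cifar_test.classes[i] for i in preds.tolist()])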


### 5.3 Compute the average accuracy on test data using this model. (4+2 =6pts)
Loop over the test set, compute accuracy on each batch, lastly print the average accuracy.
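
One detail worth keeping in mind: averaging per-batch accuracies matches the overall accuracy only when every batch has the same size. The sketch below compares the two on small made-up counts. The function names and the numbers are hypothetical, chosen only to show the difference.

```python
def mean_of_batch_accuracies(batches):
    """Average the accuracy of each (num_correct, batch_size) pair."""
    accuracies = [correct / size for correct, size in batches]
    return sum(accuracies) / len(accuracies)


def overall_accuracy(batches):
    """Total correct predictions divided by total samples."""
    return sum(c for c, _ in batches) / sum(s for _, s in batches)


equal = [(15, 20), (10, 20)]             # two full batches
ragged = [(15, 20), (10, 20), (5, 10)]   # last batch is smaller

print(mean_of_batch_accuracies(equal), overall_accuracy(equal))
print(mean_of_batch_accuracies(ragged), overall_accuracy(ragged))
```

The CIFAR-10 test set has 10,000 images, so a batch size that divides 10,000 evenly (such as 20) leaves no partial batch and both numbers agree.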

In [None]:
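# TODO: compute accuracy on each batch of test set
batch_acc = []
with torch.no_grad():
    for inputs, labels in test_loader:
        preds = net(inputs).argmax(dim=1)
        batch_acc.append((preds == labels).float().mean().item())

# TODO: print the average accuracy
print("average test accuracy:", sum(batch_acc) / len(batch_acc))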


### 5.4 Compute the average accuracy for each individual class. (8+4 =12pts)
Hint: similar to #5.3. During each loop, log the accuracy for each class separately (a python/numpy dictionary can help). Then print the individual accuracy for the 10 output classes.
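
The dictionary bookkeeping suggested in the hint can be sketched on plain lists of labels and predictions, as below. The function name `per_class_accuracy` is hypothetical, and the labels are made up for illustration; in the notebook they would come from the test loader and the model's argmax output.

```python
from collections import defaultdict


def per_class_accuracy(labels, preds):
    """Fraction of correctly predicted samples for each true class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p in zip(labels, preds):
        total[y] += 1              # count every sample of true class y
        correct[y] += int(y == p)  # count it as correct only on a match
    return {c: correct[c] / total[c] for c in sorted(total)}


# Made-up labels and predictions for three classes (illustration only)
labels = [0, 0, 1, 1, 1, 2]
preds = [0, 1, 1, 1, 0, 2]

result = per_class_accuracy(labels, preds)
print(result)
```

Accumulating the two dictionaries across all batches before dividing avoids biasing the result toward batches that happen to contain few samples of a class.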

In [None]:
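# TODO: compute per-class accuracy on each batch of test set
correct, total = [0] * 10, [0] * 10
with torch.no_grad():
    for inputs, labels in test_loader:
        preds = net(inputs).argmax(dim=1)
        for label, pred in zip(labels.tolist(), preds.tolist()):
            total[label] += 1
            correct[label] += int(label == pred)

# TODO: print per-class accuracy for 10 output classes
for i, name in enumerate(cifar_test.classes):
    print(name, correct[i] / total[i])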


### 5.5 Discussion: (2+2+4 =8pts)
1. Which class of images was detected with the highest accuracy?
2. Which class of images was hardest for the model to detect?
3. Explain 1-2 possible reasons why detection of some classes can be harder for the same model.

1) The third and fifth classes of images were detected with the highest accuracy.

2) The second and fourth classes of images were the most difficult for the model to detect.

3) An object viewed from different angles can look very different to the model. That variation within a class makes it harder for the model to detect the class reliably.