## COMP 4437 WEEK 8 Assignment

The task in this assignment is handwritten digit recognition: a classifier must predict the digit shown in an image. You are given code to load the dataset with on-the-fly preprocessing, evaluate performance, and plot the learning curves. The batch size is fixed at 64 in the DataLoader objects. You will build an MLP model and a CNN model. You are free to choose the network design, loss function, and optimizer, but your implementation must use PyTorch.

1. Design a simple MLP by implementing the class SimpleMLP. Then write the training and validation code. You MUST reach at least 95% accuracy on the validation set.
2. Design a simple CNN by implementing the class SimpleCNN. Then write the training and validation code. You MUST reach at least 96% accuracy on the validation set (a simple CNN is expected to outperform a simple MLP on this task).

Suggestion: A standard convolution block is sufficient, i.e., convolution -> max pooling -> ReLU. You can repeat this block and finish with either a single linear layer or, if you prefer, a small MLP on top of the convolutional layers.
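
The order of max pooling and ReLU inside such a block does not change the result: ReLU is non-decreasing, so taking the maximum of a window and then clamping at zero gives the same value as clamping first and then taking the maximum. Pooling first simply applies ReLU to fewer values. The plain-Python sketch below checks this on a few hypothetical 2x2 windows; `relu` and the window values are illustrative and not part of the provided code.

```python
def relu(v):
    """Rectified linear unit on a single number."""
    return max(0.0, v)

# Hypothetical 2x2 pooling windows taken from a convolution output
windows = [
    [-1.5, 0.3, 2.0, -0.7],    # mixed signs
    [-3.0, -0.2, -1.1, -4.0],  # all negative
    [0.0, 0.0, 0.0, 0.0],      # all zero
]

for window in windows:
    pool_then_relu = relu(max(window))
    relu_then_pool = max(relu(v) for v in window)
    # Both orders must agree for every window
    assert pool_then_relu == relu_then_pool

results = [relu(max(window)) for window in windows]
print(results)
```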

Remember, your implementations must use **PyTorch's functionality**, and you should justify your choices of network architecture, loss function, and optimizer. Consider how each choice affects both the final performance and the efficiency of the training process.


In [1]:
import numpy as np
import torch
import torchvision
import torchvision.transforms as transforms
from torch import nn, optim
import torch.nn.functional as F
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
import matplotlib.pyplot as plt

In [2]:
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)

In [3]:
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])

trainset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

validationset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)
validationloader = torch.utils.data.DataLoader(validationset, batch_size=64, shuffle=False)


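`transforms.ToTensor()` scales 8-bit pixel values to the range [0, 1], and `transforms.Normalize((0.5,), (0.5,))` then computes `(x - mean) / std` per channel, which maps that range to [-1, 1]. The arithmetic can be sketched in plain Python; `normalize` is a hypothetical helper written for illustration, not a torchvision function.

```python
def normalize(x, mean=0.5, std=0.5):
    """Same per-pixel arithmetic as Normalize: (x - mean) / std."""
    return (x - mean) / std

# Raw 8-bit pixel values and their ToTensor-scaled equivalents
raw_pixels = [0, 51, 255]
scaled = [p / 255 for p in raw_pixels]

# After normalization, black maps to -1 and white maps to +1
normalized = [normalize(s) for s in scaled]
print(normalized)
```
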
In [4]:
def plot_losses(train_losses, val_losses, num_epochs):
    """
    Plots the training and validation losses.

    Parameters:
    - train_losses (list of float): A list containing the training loss values.
    - val_losses (list of float): A list containing the validation loss values.
    - num_epochs (int): The number of epochs the model was trained for.

    Returns:
    None
    """
    plt.figure(figsize=(12, 5))
    plt.plot(range(1, num_epochs + 1), train_losses, label='Training Loss')
    plt.plot(range(1, num_epochs + 1), val_losses, label='Validation Loss')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.title('Training and Validation Losses')
    plt.legend()
    plt.show()


In [5]:
def plot_accuracies(train_accuracies, val_accuracies, num_epochs):
    """
    Plots the training and validation accuracies.

    Parameters:
    - train_accuracies (list of float): A list containing the training accuracy values.
    - val_accuracies (list of float): A list containing the validation accuracy values.
    - num_epochs (int): The number of epochs the model was trained for.

    Returns:
    None
    """
    plt.figure(figsize=(12, 5))
    plt.plot(range(1, num_epochs + 1), train_accuracies, label='Training Accuracy')
    plt.plot(range(1, num_epochs + 1), val_accuracies, label='Validation Accuracy')
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.title('Training and Validation Accuracies')
    plt.legend()
    plt.show()


In [7]:
import matplotlib.pyplot as plt

def plot_class_accuracies(class_correct, class_total, classes):
    """
    Plots the class-wise accuracies as a bar chart.

    Parameters:
    - class_correct (list): A list containing the number of correct predictions for each class.
    - class_total (list): A list containing the total number of predictions for each class.
    - classes (list): A list of classes.

    Returns:
    None
    """
    class_accuracies = [class_correct[i] / class_total[i] for i in classes]
    plt.figure(figsize=(12, 5))
    plt.bar(classes, class_accuracies)
    plt.xlabel('Class')
    plt.ylabel('Accuracy')
    plt.title('Class-wise Accuracies')
    plt.xticks(classes)
    plt.show()


---

In [8]:
class SimpleMLP(nn.Module):
    """
    A simple Multi-Layer Perceptron (MLP) for classification tasks. It inherits
    from nn.Module, the base class for all neural network modules in PyTorch.

    __init__ sets up the layers; forward defines the forward pass.

    forward parameters:
    - x (Tensor): The input tensor containing the batch of images. The images
      are flattened inside forward, so they can be passed in as loaded.

    forward returns:
    - logits (Tensor): The output tensor containing the logits for each class
      in the batch. To convert these logits to probabilities, apply a softmax
      function outside this method.
    """

    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        # No softmax at the end: nn.CrossEntropyLoss expects raw logits
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits


mlp = SimpleMLP()

# Sanity check on one untrained sample. trainset[0][0] has shape (1, 28, 28),
# so its leading dimension acts as a batch of one.
a = mlp(trainset[0][0])

# Softmax is applied here, outside the model, to turn logits into probabilities
print(torch.max(F.softmax(a, dim=1)))


tensor(0.1150, grad_fn=<MaxBackward1>)
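
`nn.CrossEntropyLoss` applies log-softmax to its input internally, which is why the docstring asks for logits rather than probabilities. If a model already ends in a softmax layer, the loss only ever sees values squeezed into [0, 1] and, for 10 classes, cannot fall below ln(1 + 9/e), about 1.46, however good the predictions are. The plain-Python sketch below shows that effect; `softmax` and `cross_entropy` are illustrative helpers, not PyTorch functions.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, target):
    """Cross-entropy for one sample: the negative log of the
    softmax probability assigned to the target class."""
    return -math.log(softmax(scores)[target])

num_classes = 10
target = 3

# A very confident, correct prediction expressed as logits
logits = [0.0] * num_classes
logits[target] = 20.0
loss_from_logits = cross_entropy(logits, target)

# The same prediction after an extra softmax inside the model
probs = softmax(logits)
loss_from_probs = cross_entropy(probs, target)

# Lowest loss reachable when the inputs are probabilities (perfect one-hot)
loss_floor = math.log(1 + (num_classes - 1) / math.e)

print(f"logits: {loss_from_logits:.8f}")
print(f"probabilities: {loss_from_probs:.4f} (floor {loss_floor:.4f})")
```

With logits the loss approaches zero as the prediction becomes more confident, while with probabilities it stalls at the floor and the gradients carry much less information.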


In [9]:
class SimpleCNN(nn.Module):
    """
    A simple CNN for classification tasks. It inherits from nn.Module, the
    base class for all neural network modules in PyTorch.

    __init__ sets up the layers; forward defines the forward pass.

    forward parameters:
    - x (Tensor): The input tensor containing the batch of images, with shape
      (N, 1, 28, 28). The images must NOT be flattened beforehand, because the
      convolutions need the 2D layout.

    forward returns:
    - x (Tensor): The output tensor containing the logits for each class in the
      batch. To convert these logits to probabilities, apply a softmax function
      outside this method.
    """

    def __init__(self):
        super().__init__()
        # Two 5x5 convolutions; padding=2 keeps the spatial size unchanged
        self.conv1 = nn.Conv2d(1, 32, kernel_size=5, stride=1, padding=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=5, stride=1, padding=2)
        # Each 2x2 max pool halves height and width: 28 -> 14 -> 7
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(64 * 7 * 7, 512)
        self.fc2 = nn.Linear(512, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.flatten(x)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)  # Raw logits, no softmax
        return x


cnn = SimpleCNN().to(device)
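
The `64 * 7 * 7` input size of `fc1` follows from the standard output-size formula `(size + 2 * padding - kernel) // stride + 1` applied to each convolution and pooling layer. The sketch below traces a 28x28 image through the layers and, as one input to the efficiency trade-off, counts the trainable parameters of both networks as they are defined in this notebook. The helper names are hypothetical and the code is plain Python, so it runs without PyTorch.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output height/width of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_params(in_ch, out_ch, kernel):
    """Weights plus biases of a square-kernel Conv2d layer."""
    return in_ch * out_ch * kernel * kernel + out_ch

def linear_params(in_features, out_features):
    """Weights plus biases of a Linear layer."""
    return in_features * out_features + out_features

# Trace the spatial size through SimpleCNN
size = 28
size = conv_out(size, kernel=5, stride=1, padding=2)  # conv1 keeps 28
size = conv_out(size, kernel=2, stride=2)             # pool gives 14
size = conv_out(size, kernel=5, stride=1, padding=2)  # conv2 keeps 14
size = conv_out(size, kernel=2, stride=2)             # pool gives 7
flat_features = 64 * size * size

cnn_params = (conv_params(1, 32, 5) + conv_params(32, 64, 5)
              + linear_params(flat_features, 512) + linear_params(512, 10))
mlp_params = (linear_params(28 * 28, 512) + linear_params(512, 512)
              + linear_params(512, 10))

print(size, flat_features, cnn_params, mlp_params)
```

Under these layer sizes the CNN has about 2.5 times as many parameters as the MLP, and nearly all of them sit in `fc1`; the convolutional layers themselves are small.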

In [10]:
"""
Add criterion (loss function) and optimizer
"""
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(cnn.parameters(), lr=1e-3)

In [11]:
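num_epochs = 10
train_losses = []
val_losses = []
train_accuracies = []
val_accuracies = []
class_correct = [0.0] * 10
class_total = [0.0] * 10
classes = range(10)

def train(dataloader, model, loss_fn, optimizer):
    """Run one training epoch and return (average loss, accuracy)."""
    size = len(dataloader.dataset)
    model.train()
    running_loss, correct = 0.0, 0
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

        # Accumulate statistics for the epoch-level metrics
        running_loss += loss.item()
        correct += (pred.argmax(1) == y).sum().item()

        if batch % 100 == 0:
            current = (batch + 1) * len(X)
            print(f"loss: {loss.item():>7f}  [{current:>5d}/{size:>5d}]")
    return running_loss / len(dataloader), correct / size

def test(dataloader, model, loss_fn):
    """Evaluate the model and return (average loss, accuracy)."""
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0.0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            hits = pred.argmax(1) == y
            correct += hits.sum().item()
            # Class-wise counts, accumulated over every validation batch and epoch
            for label, hit in zip(y.tolist(), hits.tolist()):
                class_total[label] += 1
                class_correct[label] += hit
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
    return test_loss, correct


for t in range(num_epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train_loss, train_acc = train(trainloader, cnn, loss_fn, optimizer)
    val_loss, val_acc = test(validationloader, cnn, loss_fn)
    # Record the per-epoch metrics used by the plotting functions
    train_losses.append(train_loss)
    train_accuracies.append(train_acc)
    val_losses.append(val_loss)
    val_accuracies.append(val_acc)
print("Done!")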

"""
    Epochs: An epoch is a single pass through the entire training dataset.
    num_epochs is set to 10.

    Loss and accuracy tracking: Separate lists are maintained to track training and validation losses
    (train_losses and val_losses) as well as accuracies (train_accuracies and val_accuracies).

    Class-wise correct predictions: The class_correct and class_total lists track the
    number of correct predictions and the total number of predictions, respectively, for each class
    across all validation batches.


    Training phase (per epoch)
    - Mode setting: The network is set to training mode (model.train()) so that dropout
      and batch normalization layers, if present, behave as they should during training.
    - Batch processing: The training dataset is processed in batches. For each batch:
        - Forward pass: The network computes the output from the input data.
        - Loss calculation: The criterion compares the output with the true labels.
        - Backward pass and optimization: The gradients are calculated by backpropagating the loss,
          and the optimizer updates the model's parameters.
        - Accuracy tracking: The total and correct predictions are counted to calculate the accuracy for the epoch.
    - Loss and accuracy calculation: The average loss and accuracy for the epoch are appended to their respective lists.


    Validation phase (per epoch)
    - Mode setting: The network is set to evaluation mode (model.eval()), which disables dropout
      and stops batch normalization layers from updating their statistics, so the model behaves consistently.
    - Batch processing: Similar to the training phase, with these differences:
        - No gradient calculations: Gradients are not computed (torch.no_grad()),
          which reduces memory consumption and speeds up evaluation.
        - Loss and accuracy tracking: As in training, but with class-wise accuracy as well.
        - Class-wise accuracy: For each prediction that matches the true label,
          the count of correct predictions for that class is incremented.

"""



Epoch 1
-------------------------------
loss: 2.296448  [   64/60000]
loss: 0.070572  [ 6464/60000]
loss: 0.135450  [12864/60000]
loss: 0.011585  [19264/60000]
loss: 0.112728  [25664/60000]
loss: 0.092556  [32064/60000]
loss: 0.043416  [38464/60000]
loss: 0.038307  [44864/60000]
loss: 0.014449  [51264/60000]
loss: 0.018752  [57664/60000]
Test Error: 
 Accuracy: 98.8%, Avg loss: 0.036713 

Epoch 2
-------------------------------
loss: 0.104605  [   64/60000]
loss: 0.026199  [ 6464/60000]
loss: 0.043216  [12864/60000]
loss: 0.010360  [19264/60000]
loss: 0.029768  [25664/60000]
loss: 0.019513  [32064/60000]
loss: 0.088705  [38464/60000]
loss: 0.061155  [44864/60000]
loss: 0.025440  [51264/60000]
loss: 0.012512  [57664/60000]
Test Error: 
 Accuracy: 98.8%, Avg loss: 0.035795 

Epoch 3
-------------------------------
loss: 0.001348  [   64/60000]
loss: 0.001878  [ 6464/60000]
loss: 0.014097  [12864/60000]
loss: 0.007091  [19264/60000]
loss: 0.029405  [25664/60000]
loss: 0.099096  [32064/600

KeyboardInterrupt: 

In [25]:
import torch
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def evaluate_model(loader, network):
    """
    Evaluates the performance of a neural network model on a dataset.

    Parameters:
    - loader (DataLoader): A DataLoader object that provides batches of the dataset for evaluation.
      It should yield two elements: inputs and labels.
    - network (nn.Module): The neural network model to be evaluated. This can be any model that is a
      subclass of torch.nn.Module, such as CNNs, MLPs, etc.

    Returns:
    - accuracy (float): The accuracy of the model on the dataset, defined as the proportion of
      correct predictions over the total number of instances evaluated.
    - precision (float): The macro-averaged precision of the model, which calculates precision
      (the ratio of true positive predictions to the total positive predictions) for each label,
      and then averages these values.
    - recall (float): The macro-averaged recall of the model, which calculates recall
      (the ratio of true positive predictions to the actual positive instances) for each label,
      and then averages these values.
    - f1 (float): The macro-averaged F1 score of the model, which is the harmonic mean of precision
      and recall. The F1 score is particularly useful in situations where there is an uneven class
      distribution or when false positives and false negatives have different costs.

    The function computes these metrics by first setting the model to evaluation mode and then
    iterating over the dataset, collecting the model's predictions. It leverages PyTorch's
    torch.no_grad() context to avoid calculating gradients during evaluation, which saves memory
    and computational resources. The sklearn.metrics library is used to compute the final metrics
    from the true labels and the predicted labels.

    """

    net = network
    net.eval()  # Set the model to evaluation mode
    # Run on whatever device the model lives on. Feeding CPU tensors to a CUDA
    # model raises "RuntimeError: Input type (torch.FloatTensor) and weight type
    # (torch.cuda.FloatTensor) should be the same".
    model_device = next(net.parameters()).device
    y_true = []  # List to store the true labels
    y_pred = []  # List to store the predicted labels

    with torch.no_grad():  # Do not compute gradients since we're only predicting
        for inputs, labels in loader:
            outputs = net(inputs.to(model_device))
            _, predicted = torch.max(outputs, 1)  # Index of the largest logit
            y_pred.extend(predicted.cpu().numpy())  # Move to CPU, then convert to numpy
            y_true.extend(labels.numpy())  # Labels from the loader are already on the CPU

    # Compute metrics
    accuracy = accuracy_score(y_true, y_pred)
    precision, recall, f1, _ = precision_recall_fscore_support(y_true, y_pred, average='macro')

    return accuracy, precision, recall, f1


"""
Note: Pass the model being evaluated (cnn or mlp) as the 'network' argument.
"""

accuracy, precision, recall, f1 = evaluate_model(validationloader, cnn)
print(f'Accuracy: {accuracy}\nPrecision: {precision}\nRecall: {recall}\nF1 Score: {f1}')
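
Macro averaging, as described in the `evaluate_model` docstring, computes each metric per class and then takes the unweighted mean, so every digit counts equally regardless of how often it appears. Note that the macro F1 is the mean of the per-class F1 scores, not the harmonic mean of the macro precision and macro recall. The sketch below reproduces the calculation in plain Python on a small made-up label set; `macro_metrics` is a hypothetical helper, and treating an empty denominator as 0 is an assumption chosen to match what scikit-learn's default setting does.

```python
def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Made-up labels for three classes, for illustration only
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
accuracy, precision, recall, f1 = macro_metrics(y_true, y_pred)
print(f"{accuracy:.4f} {precision:.4f} {recall:.4f} {f1:.4f}")
```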

In [None]:
"""
Call Plotting Functions Here
"""