Name: Arjun Bhan UNI: AB5666

# MNIST CNN Digit Recognition Network

For this problem, you will code a basic digit recognition network. The data are grayscale images of the handwritten digits 0 to 9, each stored as a (1, 28, 28) tensor. Each pixel of the raw image is an intensity between 0 and 255 (the `ToTensor` transform used below rescales these to the range [0, 1]), and together the (1, 28, 28) pixels can be visualized as a picture of a digit. The data is given to you as $\{(x^{(i)}, y^{(i)})\}_{i=1}^{N}$, where $y^{(i)}$ is the given label and $x^{(i)}$ is the (1, 28, 28) image. The data is loaded from `torchvision`, a library of computer vision datasets and models.

At a high level, the model and notebook go as follows:
*   You first download the data and specify the batch size of B = 16. Each image is then turned from a (1, 28, 28) volume into a series of other volumes, either via convolutional layers or max pooling layers.
*   You will pass the data through several layers to build a CNN classifier. Use the hints below to get the right dimensions and figure out what the layers should be. Be careful with the loss function. Add regularization (L1 and L2) manually.
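The dimension hints can be checked by hand with the standard output-size formula for a convolution or pooling window, $\lfloor (n + 2p - k)/s \rfloor + 1$. The sketch below is only an illustration: the helper names `conv_out` and `trace_sizes` are hypothetical, and the layer list is taken from the kernel sizes and strides used in the model later in this notebook.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_sizes(size, layers):
    """Return the spatial size after each (name, kernel, stride) layer."""
    sizes = [size]
    for _name, kernel, stride in layers:
        size = conv_out(size, kernel, stride)
        sizes.append(size)
    return sizes


# (name, kernel, stride) for each layer; padding is 0 throughout.
layers = [
    ("conv 3x3", 3, 1),
    ("maxpool 2x2", 2, 2),
    ("conv 3x3", 3, 1),
    ("maxpool 2x2", 2, 2),
    ("conv 1x1", 1, 1),
]

sizes = trace_sizes(28, layers)

# The last convolution has a single output channel, so flattening
# gives 1 * height * width input features for the linear layer.
flat_features = 1 * sizes[-1] * sizes[-1]
print(sizes, flat_features)
```

The final value is the input width that the linear layer needs, which is why the model uses `nn.Linear(25, 10)`.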







In [None]:
!pip install torchmetrics



In [None]:
import torchvision
from torchvision import transforms
import torch
from torch.utils.data import DataLoader, TensorDataset
import torch.nn as nn
import torchmetrics
from torchmetrics import Accuracy

In [None]:
SEED = 1
torch.manual_seed(SEED)
_FILL_ = '_FILL_'

In [None]:
image_path = './'

transform = transforms.Compose([transforms.ToTensor()])

mnist_train_dataset = torchvision.datasets.MNIST(
    root = image_path, train = True, transform = transform, download = True
  )

mnist_test_dataset = torchvision.datasets.MNIST(
    root = image_path, train = False, transform = transform, download = True
)

In [None]:
BATCH_SIZE = 16
LR = 0.1
L1_WEIGHT = 1e-10
L2_WEIGHT = 1e-12
EPOCHS = 20
train_dl = DataLoader(mnist_train_dataset, batch_size = BATCH_SIZE, shuffle = True)
test_dl = DataLoader(mnist_test_dataset, batch_size = BATCH_SIZE, shuffle = True)
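
`L1_WEIGHT` and `L2_WEIGHT` scale the manual penalties that are added to the loss in the training loop. As a minimal sketch of what those penalties compute, the hypothetical helper `regularization_penalty` below sums $|w|$ and $w^2$ over every parameter of a model; the toy `nn.Linear` model and the weights 0.1 and 0.01 are assumptions chosen only so the arithmetic is easy to follow.

```python
import torch
import torch.nn as nn


def regularization_penalty(model, l1_weight, l2_weight):
    """Return l1_weight * sum(|w|) + l2_weight * sum(w^2) over all parameters."""
    l1 = sum(p.abs().sum() for p in model.parameters())
    l2 = sum(p.pow(2).sum() for p in model.parameters())
    return l1_weight * l1 + l2_weight * l2


# Hypothetical toy model with hand-set parameters, for illustration only.
toy = nn.Linear(2, 1)
with torch.no_grad():
    toy.weight.copy_(torch.tensor([[1.0, -2.0]]))
    toy.bias.fill_(0.5)

# sum(|w|) = 1 + 2 + 0.5 = 3.5 and sum(w^2) = 1 + 4 + 0.25 = 5.25
penalty = regularization_penalty(toy, l1_weight=0.1, l2_weight=0.01)
print(penalty.item())
```

Because the penalty is built from the parameters themselves, it stays part of the autograd graph, so adding it to the data loss before `backward()` is enough for the optimizer to see it.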

In [None]:
mnist_train_dataset

Dataset MNIST
    Number of datapoints: 60000
    Root location: ./
    Split: Train
    StandardTransform
Transform: Compose(
               ToTensor()
           )

In [None]:
class CNNClassifier(nn.Module):

  def __init__(self):
    super().__init__()
    self.cnn1 = nn.Conv2d(1, 32, kernel_size= 3, stride=1, padding=0)
    self.cnn2 = nn.Conv2d(32, 16, kernel_size= 3, stride=1)
    self.cnn3 = nn.Conv2d(16, 1, kernel_size= 1)
    self.linear = nn.Linear(25, 10)

  def forward(self, x):
    assert(x.shape == (BATCH_SIZE, 1, 28, 28))

    x =  self.cnn1(x)
    assert(x.shape == (BATCH_SIZE, 32, 26, 26))


    maxPool1 = nn.MaxPool2d(kernel_size= 2, stride=2)
    x = maxPool1(x)
    assert(x.shape == (BATCH_SIZE, 32, 13, 13))
    m = nn.ReLU()
    x = m(x)


    x = self.cnn2(x)
    assert(x.shape == (BATCH_SIZE, 16, 11, 11))


    maxPool2 = nn.MaxPool2d(kernel_size= 2, stride=2)
    x = maxPool2(x)
    assert(x.shape == (BATCH_SIZE, 16, 5, 5))

    x = m(x)


    x = self.cnn3(x)
    assert(x.shape == (BATCH_SIZE, 1, 5, 5))

    x =  m(x)


    x = torch.flatten(x, start_dim = 1)
    assert(x.shape == (BATCH_SIZE, 25))


    x = self.linear(x)
    assert(x.shape == (BATCH_SIZE, 10))
    return x

model = CNNClassifier()
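
The model returns raw scores (logits) with no softmax at the end. That matches what `nn.CrossEntropyLoss` expects, since it applies a log-softmax internally. The sketch below, on made-up logits and targets, checks that the built-in loss agrees with a log-softmax followed by the negative log-likelihood of the true class.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up logits for 2 samples and 3 classes, for illustration only.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

# Built-in loss applied directly to the raw logits.
loss_builtin = nn.CrossEntropyLoss()(logits, targets)

# Manual version: log-softmax, pick the true-class entry, average, negate.
log_probs = F.log_softmax(logits, dim=1)
loss_manual = -log_probs[torch.arange(len(targets)), targets].mean()

print(torch.allclose(loss_builtin, loss_manual))
```

Applying a softmax inside `forward` and then passing the result to `nn.CrossEntropyLoss` would normalize twice, which is the usual pitfall with this loss.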

In [None]:

loss_fn = nn.CrossEntropyLoss()


optimizer = torch.optim.SGD(model.parameters(), lr = LR)

torch.manual_seed(SEED)


for epoch in range(EPOCHS):
    # Running totals for this epoch (weighted by batch size).
    accuracy_hist_train = 0
    loss_hist_train = 0
    for x_batch, y_batch in train_dl:
        y_pred = model(x_batch)

        loss = loss_fn(y_pred, y_batch)

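        # Manual L1 penalty: sum of absolute values of all parameters.
        l1_reg = torch.tensor(0.).to(y_batch.device)
        for param in model.parameters():
          l1_reg += param.abs().sum()

        # Manual L2 penalty: sum of squares of all parameters.
        l2_reg = torch.tensor(0.).to(y_batch.device)
        for param in model.parameters():
          l2_reg += param.pow(2).sum()

        # Add the weighted penalties to the cross-entropy loss.
        loss += l1_reg * L1_WEIGHT +  l2_reg * L2_WEIGHT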

        loss.backward()

        loss_hist_train += loss.item() * x_batch.size(0)

        optimizer.step()

        optimizer.zero_grad()


        is_correct = torchmetrics.Accuracy(task="multiclass", num_classes = 10)(y_pred, y_batch).item() * x_batch.size(0)

        accuracy_hist_train += is_correct
    accuracy_hist_train /= len(train_dl.dataset)
    loss_hist_train /= len(train_dl.dataset)
    print(f'Train Metrics Epoch {epoch} Loss {loss_hist_train:.4f} Accuracy {accuracy_hist_train:.4f}')

    accuracy_hist_test = 0
    loss_hist_test = 0

    # Evaluate on the test set without tracking gradients.
    with torch.no_grad():
      for x_batch, y_batch in test_dl:
          y_batch_pred = model(x_batch)

          loss = loss_fn(y_batch_pred, y_batch)

          l1_reg = torch.tensor(0.).to(y_batch.device)
          for i in model.parameters():
            l1_reg += i.abs().sum()

          l2_reg = torch.tensor(0.).to(y_batch.device)
          for i in model.parameters():
            l2_reg += i.pow(2).sum()

          loss += l1_reg * L1_WEIGHT +  l2_reg * L2_WEIGHT

          loss_hist_test += loss.item() * x_batch.size(0)

          is_correct = torchmetrics.Accuracy(task="multiclass", num_classes = 10)(y_batch_pred, y_batch).item() * x_batch.size(0)

          accuracy_hist_test += is_correct

      accuracy_hist_test /= len(test_dl.dataset)
      loss_hist_test /= len(test_dl.dataset)
      print(f'Test Metrics Epoch {epoch} Loss {loss_hist_test:.4f} Accuracy {accuracy_hist_test:.4f}')

Train Metrics Epoch 0 Loss 0.4118 Accuracy 0.8717
Test Metrics Epoch 0 Loss 0.2048 Accuracy 0.9362
Train Metrics Epoch 1 Loss 0.2293 Accuracy 0.9303
Test Metrics Epoch 1 Loss 0.1661 Accuracy 0.9477
Train Metrics Epoch 2 Loss 0.1966 Accuracy 0.9405
Test Metrics Epoch 2 Loss 0.1864 Accuracy 0.9450
Train Metrics Epoch 3 Loss 0.1813 Accuracy 0.9442
Test Metrics Epoch 3 Loss 0.1521 Accuracy 0.9548
Train Metrics Epoch 4 Loss 0.1691 Accuracy 0.9486
Test Metrics Epoch 4 Loss 0.1888 Accuracy 0.9394
Train Metrics Epoch 5 Loss 0.1600 Accuracy 0.9506
Test Metrics Epoch 5 Loss 0.1381 Accuracy 0.9574
Train Metrics Epoch 6 Loss 0.1527 Accuracy 0.9538
Test Metrics Epoch 6 Loss 0.1771 Accuracy 0.9434
Train Metrics Epoch 7 Loss 0.1477 Accuracy 0.9551
Test Metrics Epoch 7 Loss 0.1405 Accuracy 0.9549
Train Metrics Epoch 8 Loss 0.1434 Accuracy 0.9556
Test Metrics Epoch 8 Loss 0.1360 Accuracy 0.9584
Train Metrics Epoch 9 Loss 0.1395 Accuracy 0.9568
Test Metrics Epoch 9 Loss 0.1207 Accuracy 0.9616
Train Metr