For this notebook, wherever `_FILL_` appears, insert the code or logic needed to make the notebook work.



# MNIST CNN Digit Recognition Network

For this problem, you will code a basic digit recognition network. The data are grayscale images of the digits 0 to 9, each given as a (1, 28, 28) volume. Each pixel of the raw image is an intensity between 0 and 255 (the `ToTensor` transform used below rescales these to floats in [0, 1]), and together the (1, 28, 28) pixels can be visualized as a picture of a digit. The data is given to you as $\{(x^{(i)}, y^{(i)})\}_{i=1}^{N}$ where $y^{(i)}$ is the given label and $x^{(i)}$ is the (1, 28, 28) image. The data is loaded from `torchvision`, a library of computer vision datasets and models.

At a high level, the model and notebook go as follows:
*   You first download the data and specify the batch size of B = 16. Each image is then turned from a (1, 28, 28) volume into a series of other volumes via convolutional layers or max pooling layers.
*   You will pass the data through several layers to build a CNN classifier. Use the hints below to get the right dimensions and figure out what the layers should be. Be careful with the loss function. Add regularization (L1 and L2) manually.
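
As a sketch of what adding the regularization manually amounts to, the per-batch objective can be written as below. The symbols $z^{(i)}$ (the logits for sample $i$), $\theta_j$ (the $j$-th parameter tensor), $\lambda_1$ and $\lambda_2$ are notation assumed here; the last two stand for the `L1_WEIGHT` and `L2_WEIGHT` constants set later in the notebook. The L2 term is written unsquared because that is what `torch.norm(param, 2)` in the training cell computes; many references use the squared norm instead.

$$
\mathcal{L} = -\frac{1}{B} \sum_{i=1}^{B} \log \frac{\exp\left(z^{(i)}_{y^{(i)}}\right)}{\sum_{k=0}^{9} \exp\left(z^{(i)}_{k}\right)} + \lambda_1 \sum_j \|\theta_j\|_1 + \lambda_2 \sum_j \|\theta_j\|_2
$$

The first term is cross-entropy computed directly from the logits, which is why the network returns logits and has no softmax layer at the end.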

See the comments below and fill in the analysis where there is `_FILL_` specified. All asserts should pass and Test accuracy should be about 95%.
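
The shape asserts can be checked by hand before running anything. The following is a minimal sketch, assuming the PyTorch defaults of stride 1 and no padding for the convolutions and floor rounding for the max pooling; the helper name `conv_out` is hypothetical and not part of the notebook.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output side length of a convolution or pooling window (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, stride) for each spatial layer, in the order used by the model cell
layers = [
    ("cnn1", 3, 1),  # 3x3 convolution
    ("pool", 2, 2),  # 2x2 max pooling, stride 2
    ("cnn2", 3, 1),  # 3x3 convolution
    ("pool", 2, 2),  # 2x2 max pooling, stride 2
    ("cnn3", 1, 1),  # 1x1 convolution (changes channels only)
]

sizes = [28]  # MNIST images are 28 x 28
for name, kernel, stride in layers:
    sizes.append(conv_out(sizes[-1], kernel, stride))

# One output channel after cnn3, so the flattened feature count is 1 * side * side
flattened = 1 * sizes[-1] * sizes[-1]
print(sizes, flattened)
```

The side lengths match the spatial dimensions in the asserts of the `forward` method, and the flattened size is the input width of the final linear layer.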






In [1]:
!pip install portalocker
!pip install torchmetrics
!pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 torchtext==0.17.0 --index-url https://download.pytorch.org/whl/cu121
!pip install torchdata

Looking in indexes: https://download.pytorch.org/whl/cu121





In [3]:
import torchvision
from torchvision import transforms
import torch
from torch.utils.data import DataLoader, TensorDataset
import torch.nn as nn
import torchmetrics

In [4]:
SEED = 1
torch.manual_seed(SEED)
_FILL_ = '_FILL_'

In [5]:
image_path = './'

# Use ToTensor
transform = transforms.Compose([transforms.ToTensor()])

mnist_train_dataset = torchvision.datasets.MNIST(
    root=image_path,
    train=True,
    transform=transform,
    download=True
  )

mnist_test_dataset = torchvision.datasets.MNIST(
    root=image_path,
    train=False,
    transform=transform,
    download=True
)

In [6]:
BATCH_SIZE = 16
LR = 0.1
L1_WEIGHT = 1e-10
L2_WEIGHT = 1e-12
EPOCHS = 20
# Get the dataloader for train and test
train_dl = DataLoader(mnist_train_dataset, BATCH_SIZE, shuffle=True)
test_dl = DataLoader(mnist_test_dataset, BATCH_SIZE, shuffle=True)
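
The model below asserts that every batch has exactly `BATCH_SIZE` samples. A quick sketch of why that holds here: MNIST has 60,000 training and 10,000 test images, and both counts are divisible by 16. The helper `batch_layout` is a hypothetical name used only for this illustration.

```python
import math


def batch_layout(n_samples: int, batch_size: int) -> tuple[int, int]:
    """Return (number of batches, size of the final batch) without dropping any samples."""
    n_batches = math.ceil(n_samples / batch_size)
    last_batch = n_samples - (n_batches - 1) * batch_size
    return n_batches, last_batch


for name, n_samples in [("train", 60_000), ("test", 10_000)]:
    for batch_size in (16, 64):
        n_batches, last_batch = batch_layout(n_samples, batch_size)
        print(f"{name}: batch_size={batch_size} batches={n_batches} last_batch={last_batch}")
```

With a batch size that does not divide the dataset size (64, for example), the final batch is smaller and the shape asserts would fail unless the loader is created with `drop_last=True` or the asserts use `x.size(0)` instead of `BATCH_SIZE`.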

In [7]:
class CNNClassifier(nn.Module):

  def __init__(self):
    super().__init__()
    self.cnn1 = nn.Conv2d(1, 32, kernel_size=3)
    self.cnn2 = nn.Conv2d(32, 16, kernel_size=3)
    self.cnn3 = nn.Conv2d(16, 1, kernel_size=1)
    self.linear = nn.Linear(25, 10)
    self.pool = nn.MaxPool2d(2, 2)

  def forward(self, x):
    # Check the input shape: (batch, channels, height, width)
    assert(x.shape == (BATCH_SIZE, 1, 28, 28))

    # Pass through cnn layer 1
    # (28, 28, 1) -> (26, 26, 32)
    x = self.cnn1(x)
    assert(x.shape == (BATCH_SIZE, 32, 26, 26))

    # Pass through max pooling to give the result shape below
    # (26, 26, 32) -> (13, 13, 32)
    x = self.pool(x)
    assert(x.shape == (BATCH_SIZE, 32, 13, 13))

    # Apply ReLU
    x = torch.relu(x)

    # Pass through cnn layer 2 to give the result below
    # (13, 13, 32) -> (11, 11, 16)
    x = self.cnn2(x)
    assert(x.shape == (BATCH_SIZE, 16, 11, 11))

    # Pass through max pooling pool to give the result below
    # (11, 11, 16) -> (5, 5, 16)
    x = self.pool(x)
    assert(x.shape == (BATCH_SIZE, 16, 5, 5))

    # Apply ReLU
    x = torch.relu(x)

    # Pass through cnn layer 3 to give the result below
    # (5, 5, 16) -> (5, 5, 1)
    x = self.cnn3(x)
    assert(x.shape == (BATCH_SIZE, 1, 5, 5))

    # Apply ReLU
    x = torch.relu(x)

    # Flatten to get the result below
    # (5, 5, 1) -> (25, )
    x = x.view(x.size(0), -1)
    assert(x.shape == (BATCH_SIZE, 25))

    # Pass through linear layer to get the result below
    # (25, ) -> (10, )
    x = self.linear(x)
    assert(x.shape == (BATCH_SIZE, 10))

    # Return the logits
    return x

model = CNNClassifier()
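
To get a feel for how small this network is, the parameter count can be worked out by hand. This is a sketch assuming the default `bias=True` for every layer; the helper names are hypothetical.

```python
def conv2d_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weights (c_out * c_in * k * k) plus one bias per output channel."""
    return c_out * c_in * kernel * kernel + c_out


def linear_params(n_in: int, n_out: int) -> int:
    """Weights (n_out * n_in) plus one bias per output unit."""
    return n_out * n_in + n_out


counts = {
    "cnn1": conv2d_params(1, 32, 3),
    "cnn2": conv2d_params(32, 16, 3),
    "cnn3": conv2d_params(16, 1, 1),
    "linear": linear_params(25, 10),
}
total = sum(counts.values())
print(counts, total)
```

In the notebook, `sum(p.numel() for p in model.parameters())` should report the same total.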

In [8]:
# Get the loss function; remember you are outputting the logits
loss_fn = nn.CrossEntropyLoss()

# Set the optimizer to SGD and let the learning rate be LR
# Do not add L2 regularization; add it manually below ...
optimizer = torch.optim.SGD(model.parameters(), lr=LR)

torch.manual_seed(SEED)

for epoch in range(EPOCHS):

    # Running totals for this epoch
    accuracy_hist_train = 0
    loss_hist_train = 0

    # Loop through the x and y pairs of data
    for x_batch, y_batch in train_dl:
        # Get the model predictions
        y_pred = model(x_batch)

        # Get the loss
        loss = loss_fn(y_pred, y_batch)

        # Add an L1 regularization with a weight of L1_WEIGHT to the objective
        l1_reg = torch.tensor(0.0)
        for param in model.parameters():
            l1_reg += torch.norm(param, 1)

        # Add an L2 regularization with a weight of L2_WEIGHT to the objective
        # (torch.norm(param, 2) is the unsquared L2 norm of the flattened tensor)
        l2_reg = torch.tensor(0.0)
        for param in model.parameters():
            l2_reg += torch.norm(param, 2)

        # Add the regularizers to the objective
        loss += L1_WEIGHT * l1_reg + L2_WEIGHT * l2_reg

        # Get the gradients
        loss.backward()

        # Accumulate the loss
        # loss is a mean over the batch, so scale by the batch size to sum over samples
        loss_hist_train += loss.item() * x_batch.size(0)

        # Update the parameters
        optimizer.step()

        # Zero out the gradient
        optimizer.zero_grad()

        # Get the number of correct predictions, do this with torchmetrics
        is_correct = torchmetrics.functional.accuracy(y_pred, y_batch, task="multiclass", num_classes=10) * x_batch.size(0)
        accuracy_hist_train += is_correct

    accuracy_hist_train /= len(train_dl.dataset)
    loss_hist_train /= len(train_dl.dataset)
    print(f'Train Metrics Epoch {epoch} Loss {loss_hist_train:.4f} Accuracy {accuracy_hist_train:.4f}')

    accuracy_hist_test = 0
    loss_hist_test = 0
    # Get the average value of each metric across the test batches
    with torch.no_grad():
      # Loop through the x and y pairs of data
      for x_batch, y_batch in test_dl:
          # Get the model predictions
          y_batch_pred = model(x_batch)

          # Get the loss
          loss = loss_fn(y_batch_pred, y_batch)

          # Add an L1 regularization with a weight of L1_WEIGHT to the objective
          l1_reg = torch.tensor(0.0)
          for param in model.parameters():
              l1_reg += torch.norm(param, 1)

          # Add an L2 regularization with a weight of L2_WEIGHT to the objective
          l2_reg = torch.tensor(0.0)
          for param in model.parameters():
              l2_reg += torch.norm(param, 2)

          # Add the regularizers to the objective
          loss += L1_WEIGHT * l1_reg + L2_WEIGHT * l2_reg

          # Accumulate the loss
          # loss is a mean over the batch, so scale by the batch size to sum over samples
          loss_hist_test += loss.item() * x_batch.size(0)

          # Get the number of correct predictions via torchmetrics
          is_correct = torchmetrics.functional.accuracy(y_batch_pred, y_batch, task="multiclass", num_classes=10) * x_batch.size(0)

          # Get the accuracy
          accuracy_hist_test += is_correct

      # Normalize the metrics by the right number
      accuracy_hist_test /= len(test_dl.dataset)
      loss_hist_test /= len(test_dl.dataset)
      print(f'Test Metrics Epoch {epoch} Loss {loss_hist_test:.4f} Accuracy {accuracy_hist_test:.4f}')

Train Metrics Epoch 0 Loss 0.4138 Accuracy 0.8712
Test Metrics Epoch 0 Loss 0.2162 Accuracy 0.9352
Train Metrics Epoch 1 Loss 0.2322 Accuracy 0.9293
Test Metrics Epoch 1 Loss 0.1667 Accuracy 0.9494
Train Metrics Epoch 2 Loss 0.1948 Accuracy 0.9402
Test Metrics Epoch 2 Loss 0.1817 Accuracy 0.9460
Train Metrics Epoch 3 Loss 0.1782 Accuracy 0.9455
Test Metrics Epoch 3 Loss 0.1470 Accuracy 0.9547
Train Metrics Epoch 4 Loss 0.1668 Accuracy 0.9492
Test Metrics Epoch 4 Loss 0.1971 Accuracy 0.9371
Train Metrics Epoch 5 Loss 0.1571 Accuracy 0.9518
Test Metrics Epoch 5 Loss 0.1335 Accuracy 0.9589
Train Metrics Epoch 6 Loss 0.1527 Accuracy 0.9524
Test Metrics Epoch 6 Loss 0.1758 Accuracy 0.9462
Train Metrics Epoch 7 Loss 0.1470 Accuracy 0.9542
Test Metrics Epoch 7 Loss 0.1346 Accuracy 0.9570
Train Metrics Epoch 8 Loss 0.1433 Accuracy 0.9562
Test Metrics Epoch 8 Loss 0.1267 Accuracy 0.9594
Train Metrics Epoch 9 Loss 0.1397 Accuracy 0.9569
Test Metrics Epoch 9 Loss 0.1198 Accuracy 0.9611
Train Metr