# 4. DNNs vs. CNNs on the MNIST dataset

### About this notebook

This notebook was used in the 50.039 Deep Learning course at the Singapore University of Technology and Design.

**Author:** Matthieu DE MARI (matthieu_demari@sutd.edu.sg)

**Version:** 1.0 (27/12/2022)

**Requirements:**
- Python 3 (tested on v3.9.6)
- Matplotlib (tested on v3.5.1)
- Torch (tested on v1.12.1)
- Torchvision (tested on v0.13.1)

### Imports and CUDA

In [1]:
# Matplotlib
import matplotlib.pyplot as plt
from matplotlib import cm
# Torch
import torch
import torchvision
from torch.utils.data import Dataset
from torchvision import datasets
import torch.optim as optim
from torchvision.transforms import ToTensor, Compose, Normalize
from torchvision.datasets import MNIST
import torch.nn.functional as F
import torch.nn as nn

In [2]:
# Use GPU if available, else use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


### MNIST Dataset

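As in the previous notebooks, we load the MNIST dataset, convert each image to a tensor and normalize it with the dataset mean (0.1307) and standard deviation (0.3081).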

In [3]:
# Define transform to convert images to tensors and normalize them
transform_data = Compose([ToTensor(),
                          Normalize((0.1307,), (0.3081,))])

# Load the data
batch_size = 256
train_dataset = MNIST(root='./mnist/', train = True, download = True, transform = transform_data)
test_dataset = MNIST(root='./mnist/', train = False, download = True, transform = transform_data)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size = batch_size, shuffle = True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size = batch_size, shuffle = False)
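
To make the effect of `Normalize` concrete, here is a minimal sketch of the arithmetic it applies to each pixel once `ToTensor` has scaled the intensities to [0, 1]. The helper `normalize_pixel` is a hypothetical name used only for illustration; it is not part of torchvision.

```python
# Constants taken from the transform defined above.
MNIST_MEAN = 0.1307
MNIST_STD = 0.3081

def normalize_pixel(value, mean=MNIST_MEAN, std=MNIST_STD):
    """Mirror the per-pixel arithmetic of Normalize: (value - mean) / std."""
    return (value - mean) / std

# ToTensor maps pixel intensities to [0, 1], so these are the two extremes.
black = normalize_pixel(0.0)  # background pixel
white = normalize_pixel(1.0)  # fully lit pixel
print(f"black -> {black:.4f}, white -> {white:.4f}")
```

After normalization, the mostly black background sits slightly below zero and the digit strokes take positive values, which tends to make optimization better conditioned.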

### Our CNN model

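We reuse the CNN from the previous notebook: two 3x3 convolutional layers (with padding 1, so that the 28x28 resolution is preserved), followed by two fully connected layers.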

In [4]:
class MNIST_CNN(nn.Module):
    def __init__(self):
        super(MNIST_CNN, self).__init__()
        # Two convolutional layers
        self.conv1 = nn.Conv2d(1, 32, kernel_size = 3, stride = 1, padding = 1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size = 3, stride = 1, padding = 1)
        # Two fully connected layers
        self.fc1 = nn.Linear(64*28*28, 128) # 64*28*28 = 50176
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        # Pass input through first convolutional layer
        x = self.conv1(x)
        x = F.relu(x)
        # Pass output of first conv layer through second convolutional layer
        x = self.conv2(x)
        x = F.relu(x)
        # Flatten output of second conv layer
        x = x.view(-1, 64*28*28)
        # Pass flattened output through first fully connected layer
        x = self.fc1(x)
        x = F.relu(x)
        # Pass output of first fully connected layer through second fully connected layer
        x = self.fc2(x)
        return x
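
The value `64*28*28` used for the flattening step can be checked with the standard output-size formula for a convolution. The sketch below assumes a dilation of 1 (the Conv2d default); `conv2d_output_size` is a hypothetical helper written for this check, not a PyTorch function.

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution (dilation 1) along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1

side = 28  # MNIST images are 28x28
side = conv2d_output_size(side, kernel_size=3, stride=1, padding=1)  # after conv1
side = conv2d_output_size(side, kernel_size=3, stride=1, padding=1)  # after conv2
flattened = 64 * side * side  # conv2 produces 64 channels
print(side, flattened)
```

With a 3x3 kernel, stride 1 and padding 1, each convolution keeps the 28x28 resolution, which is why the first fully connected layer expects 50176 input features.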

### Writing a trainer function like before

Below is a trainer function for our CNN model.
- We will be using the Adam optimizer, like before.
- Loss is cross entropy, like before.
- We will keep track of the train losses, test losses, train accuracies and test accuracies, and display them in training performance curves later.
- For a given number of epochs, we will iterate over shuffled mini-batches of the train dataset. For each mini-batch, the **backward()** PyTorch method computes the gradients of the loss with respect to all the parameters of the Conv2d and fully connected layers, and the optimizer's **step()** method then uses these gradients to update the parameters. We will also update the losses and accuracies on the fly.
- At the end of each epoch, we will set the model in eval mode and compute the loss and accuracy on the test set.
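
One detail worth illustrating: the model's `forward` returns raw scores (logits) without a softmax, because `nn.CrossEntropyLoss` applies the log-softmax itself. The sketch below checks this on random stand-in logits; the tensors are made up for illustration and are not outputs of the model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
# Stand-in for a batch of model outputs: 4 samples, 10 classes, raw logits.
logits = torch.randn(4, 10)
labels = torch.tensor([3, 0, 7, 1])

# The loss used by the trainer.
loss_ce = nn.CrossEntropyLoss()(logits, labels)
# The same quantity written out: log-softmax followed by negative log-likelihood.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

print(loss_ce.item(), loss_manual.item())
```

Both values agree, so adding an explicit softmax at the end of the network would apply the normalization twice.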

In [5]:
def train(model, train_loader, test_loader, epochs = 10, lr = 0.001):
    # Use Adam optimizer to update model weights
    optimizer = optim.Adam(model.parameters(), lr=lr)
    # Use cross-entropy loss function
    criterion = nn.CrossEntropyLoss()
    # Performance curves data
    train_losses = []
    train_accuracies = []
    test_losses = []
    test_accuracies = []
    
    for epoch in range(epochs):
        # Set model to training mode
        model.train()
        # Initialize epoch loss and accuracy
        epoch_loss = 0.0
        correct = 0
        total = 0
        # Iterate over training data
        for batch_number, (inputs, labels) in enumerate(train_loader):
            # Get from dataloader and send to device
            inputs = inputs.to(device)
            labels = labels.to(device)
            # Zero out gradients
            optimizer.zero_grad()
            # Compute model output and loss
            outputs = model(inputs)
            predicted = outputs.argmax(dim=1)
            loss = criterion(outputs, labels)
            # Backpropagate loss and update model weights
            loss.backward()
            optimizer.step()
            # Accumulate loss and correct predictions for epoch
            epoch_loss += loss.item()
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
            print(f'Epoch {epoch+1}/{epochs}, Batch number: {batch_number}, Cumulated accuracy: {correct/total}')
        # Calculate epoch loss and accuracy
        epoch_loss /= len(train_loader)
        epoch_acc = correct/total
        train_losses.append(epoch_loss)
        train_accuracies.append(epoch_acc)
        print(f'--- Epoch {epoch+1}/{epochs}: Train loss: {epoch_loss:.4f}, Train accuracy: {epoch_acc:.4f}')
        
        # Set model to evaluation mode
        model.eval()
        # Initialize epoch loss and accuracy
        epoch_loss = 0.0
        correct = 0
        total = 0
        # Iterate over test data
        for inputs, labels in test_loader:
            # Get from dataloader and send to device
            inputs = inputs.to(device)
            labels = labels.to(device)
            # Compute model output and loss
            # (No grad computation here, as it is the test data)
            with torch.no_grad():
                outputs = model(inputs)
                predicted = outputs.argmax(dim=1)
                loss = criterion(outputs, labels)
            # Accumulate loss and correct predictions for epoch
            epoch_loss += loss.item()
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
        # Calculate epoch loss and accuracy
        epoch_loss /= len(test_loader)
        epoch_acc = correct/total
        test_losses.append(epoch_loss)
        test_accuracies.append(epoch_acc)
        print(f'--- Epoch {epoch+1}/{epochs}: Test loss: {epoch_loss:.4f}, Test accuracy: {epoch_acc:.4f}')
    
    return train_losses, train_accuracies, test_losses, test_accuracies
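
A small caveat about the epoch loss: dividing by `len(train_loader)` averages the per-batch mean losses, and the last batch is smaller than the others, so the result is close to, but not exactly, the mean loss per sample. The sketch below shows the batch sizes involved and a sample-weighted alternative. The helper names and the toy loss values are hypothetical and used only for illustration.

```python
def batch_sizes(num_samples, batch_size):
    """Sizes of the batches a DataLoader yields when drop_last is False (the default)."""
    full, remainder = divmod(num_samples, batch_size)
    return [batch_size] * full + ([remainder] if remainder else [])

train_batches = batch_sizes(60000, 256)  # MNIST training set
test_batches = batch_sizes(10000, 256)   # MNIST test set
print(len(train_batches), train_batches[-1])
print(len(test_batches), test_batches[-1])

def sample_weighted_mean(batch_means, sizes):
    """Average per-batch mean losses, weighting each batch by its size."""
    return sum(m * n for m, n in zip(batch_means, sizes)) / sum(sizes)

# Made-up per-batch losses, for illustration only.
toy_means = [0.5, 0.5, 2.0]
toy_sizes = [256, 256, 96]
plain = sum(toy_means) / len(toy_means)               # what the trainer computes
weighted = sample_weighted_mean(toy_means, toy_sizes)  # mean over individual samples
print(plain, weighted)
```

With 235 training batches the difference is negligible in practice, so the simpler average used in the trainer is a reasonable choice here.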

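We will train for only 3 epochs, which should already be enough for the model to converge. Note that the output recorded below comes from a run with 10 epochs that was stopped manually (`KeyboardInterrupt`) during the fifth epoch.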

In [6]:
model = MNIST_CNN().to(device)
train_losses, train_accuracies, test_losses, test_accuracies = train(model,
                                                                     train_loader,
                                                                     test_loader,
                                                                     epochs = 3,
                                                                     lr = 1e-3)

Epoch 1/10, Batch number: 0, Cumulated accuracy: 0.10546875
Epoch 1/10, Batch number: 1, Cumulated accuracy: 0.109375
Epoch 1/10, Batch number: 2, Cumulated accuracy: 0.13541666666666666
Epoch 1/10, Batch number: 3, Cumulated accuracy: 0.2373046875
Epoch 1/10, Batch number: 4, Cumulated accuracy: 0.2796875
Epoch 1/10, Batch number: 5, Cumulated accuracy: 0.3216145833333333
Epoch 1/10, Batch number: 6, Cumulated accuracy: 0.35435267857142855
Epoch 1/10, Batch number: 7, Cumulated accuracy: 0.38671875
Epoch 1/10, Batch number: 8, Cumulated accuracy: 0.421875
Epoch 1/10, Batch number: 9, Cumulated accuracy: 0.458203125
Epoch 1/10, Batch number: 10, Cumulated accuracy: 0.4889914772727273
Epoch 1/10, Batch number: 11, Cumulated accuracy: 0.5179036458333334
Epoch 1/10, Batch number: 12, Cumulated accuracy: 0.5420673076923077
Epoch 1/10, Batch number: 13, Cumulated accuracy: 0.5611049107142857
Epoch 1/10, Batch number: 14, Cumulated accuracy: 0.5802083333333333
[...]

Epoch 1/10, Batch number: 121, Cumulated accuracy: 0.9011590676229508
Epoch 1/10, Batch number: 122, Cumulated accuracy: 0.9016133130081301
Epoch 1/10, Batch number: 123, Cumulated accuracy: 0.9022177419354839
Epoch 1/10, Batch number: 124, Cumulated accuracy: 0.90265625
Epoch 1/10, Batch number: 125, Cumulated accuracy: 0.9031498015873016
Epoch 1/10, Batch number: 126, Cumulated accuracy: 0.9037278543307087
Epoch 1/10, Batch number: 127, Cumulated accuracy: 0.90423583984375
Epoch 1/10, Batch number: 128, Cumulated accuracy: 0.904766230620155
Epoch 1/10, Batch number: 129, Cumulated accuracy: 0.9052584134615385
Epoch 1/10, Batch number: 130, Cumulated accuracy: 0.9058027194656488
Epoch 1/10, Batch number: 131, Cumulated accuracy: 0.9063683712121212
Epoch 1/10, Batch number: 132, Cumulated accuracy: 0.9067199248120301
Epoch 1/10, Batch number: 133, Cumulated accuracy: 0.9073285914179104
Epoch 1/10, Batch number: 134, Cumulated accuracy: 0.9078703703703703
[...]

Epoch 2/10, Batch number: 3, Cumulated accuracy: 0.9794921875
Epoch 2/10, Batch number: 4, Cumulated accuracy: 0.978125
Epoch 2/10, Batch number: 5, Cumulated accuracy: 0.98046875
Epoch 2/10, Batch number: 6, Cumulated accuracy: 0.98046875
Epoch 2/10, Batch number: 7, Cumulated accuracy: 0.98095703125
Epoch 2/10, Batch number: 8, Cumulated accuracy: 0.9826388888888888
Epoch 2/10, Batch number: 9, Cumulated accuracy: 0.98359375
Epoch 2/10, Batch number: 10, Cumulated accuracy: 0.9822443181818182
Epoch 2/10, Batch number: 11, Cumulated accuracy: 0.982421875
Epoch 2/10, Batch number: 12, Cumulated accuracy: 0.9822716346153846
Epoch 2/10, Batch number: 13, Cumulated accuracy: 0.9827008928571429
Epoch 2/10, Batch number: 14, Cumulated accuracy: 0.9815104166666667
Epoch 2/10, Batch number: 15, Cumulated accuracy: 0.98046875
Epoch 2/10, Batch number: 16, Cumulated accuracy: 0.9809283088235294
Epoch 2/10, Batch number: 17, Cumulated accuracy: 0.9813368055555556
[...]

Epoch 2/10, Batch number: 124, Cumulated accuracy: 0.98621875
Epoch 2/10, Batch number: 125, Cumulated accuracy: 0.986297123015873
Epoch 2/10, Batch number: 126, Cumulated accuracy: 0.9862512303149606
Epoch 2/10, Batch number: 127, Cumulated accuracy: 0.986236572265625
Epoch 2/10, Batch number: 128, Cumulated accuracy: 0.9861615794573644
Epoch 2/10, Batch number: 129, Cumulated accuracy: 0.9861778846153846
Epoch 2/10, Batch number: 130, Cumulated accuracy: 0.9861641221374046
Epoch 2/10, Batch number: 131, Cumulated accuracy: 0.986032196969697
Epoch 2/10, Batch number: 132, Cumulated accuracy: 0.986078477443609
Epoch 2/10, Batch number: 133, Cumulated accuracy: 0.9859783115671642
Epoch 2/10, Batch number: 134, Cumulated accuracy: 0.9859375
Epoch 2/10, Batch number: 135, Cumulated accuracy: 0.9859260110294118
Epoch 2/10, Batch number: 136, Cumulated accuracy: 0.9858861770072993
Epoch 2/10, Batch number: 137, Cumulated accuracy: 0.9859035326086957
[...]

Epoch 3/10, Batch number: 6, Cumulated accuracy: 0.9933035714285714
Epoch 3/10, Batch number: 7, Cumulated accuracy: 0.99365234375
Epoch 3/10, Batch number: 8, Cumulated accuracy: 0.9930555555555556
Epoch 3/10, Batch number: 9, Cumulated accuracy: 0.99375
Epoch 3/10, Batch number: 10, Cumulated accuracy: 0.9943181818181818
Epoch 3/10, Batch number: 11, Cumulated accuracy: 0.994140625
Epoch 3/10, Batch number: 12, Cumulated accuracy: 0.9939903846153846
Epoch 3/10, Batch number: 13, Cumulated accuracy: 0.9938616071428571
Epoch 3/10, Batch number: 14, Cumulated accuracy: 0.9940104166666667
Epoch 3/10, Batch number: 15, Cumulated accuracy: 0.993896484375
Epoch 3/10, Batch number: 16, Cumulated accuracy: 0.9940257352941176
Epoch 3/10, Batch number: 17, Cumulated accuracy: 0.9939236111111112
Epoch 3/10, Batch number: 18, Cumulated accuracy: 0.9940378289473685
Epoch 3/10, Batch number: 19, Cumulated accuracy: 0.9943359375
Epoch 3/10, Batch number: 20, Cumulated accuracy: 0.9940476190476191
[...]

Epoch 3/10, Batch number: 127, Cumulated accuracy: 0.992645263671875
Epoch 3/10, Batch number: 128, Cumulated accuracy: 0.9926114341085271
Epoch 3/10, Batch number: 129, Cumulated accuracy: 0.9926081730769231
Epoch 3/10, Batch number: 130, Cumulated accuracy: 0.992575143129771
Epoch 3/10, Batch number: 131, Cumulated accuracy: 0.9925426136363636
Epoch 3/10, Batch number: 132, Cumulated accuracy: 0.9925693139097744
Epoch 3/10, Batch number: 133, Cumulated accuracy: 0.9925373134328358
Epoch 3/10, Batch number: 134, Cumulated accuracy: 0.9925057870370371
Epoch 3/10, Batch number: 135, Cumulated accuracy: 0.9924747242647058
Epoch 3/10, Batch number: 136, Cumulated accuracy: 0.992501140510949
Epoch 3/10, Batch number: 137, Cumulated accuracy: 0.9924422554347826
Epoch 3/10, Batch number: 138, Cumulated accuracy: 0.9924966276978417
Epoch 3/10, Batch number: 139, Cumulated accuracy: 0.9925502232142858
Epoch 3/10, Batch number: 140, Cumulated accuracy: 0.9925753546099291
[...]

Epoch 4/10, Batch number: 9, Cumulated accuracy: 0.996484375
Epoch 4/10, Batch number: 10, Cumulated accuracy: 0.9964488636363636
Epoch 4/10, Batch number: 11, Cumulated accuracy: 0.99609375
Epoch 4/10, Batch number: 12, Cumulated accuracy: 0.9963942307692307
Epoch 4/10, Batch number: 13, Cumulated accuracy: 0.9963727678571429
Epoch 4/10, Batch number: 14, Cumulated accuracy: 0.9966145833333333
Epoch 4/10, Batch number: 15, Cumulated accuracy: 0.99658203125
Epoch 4/10, Batch number: 16, Cumulated accuracy: 0.9967830882352942
Epoch 4/10, Batch number: 17, Cumulated accuracy: 0.9969618055555556
Epoch 4/10, Batch number: 18, Cumulated accuracy: 0.9969161184210527
Epoch 4/10, Batch number: 19, Cumulated accuracy: 0.9966796875
Epoch 4/10, Batch number: 20, Cumulated accuracy: 0.9966517857142857
Epoch 4/10, Batch number: 21, Cumulated accuracy: 0.9968039772727273
Epoch 4/10, Batch number: 22, Cumulated accuracy: 0.9969429347826086
Epoch 4/10, Batch number: 23, Cumulated accuracy: 0.996907552

Epoch 4/10, Batch number: 131, Cumulated accuracy: 0.9961233428030303
Epoch 4/10, Batch number: 132, Cumulated accuracy: 0.9960350093984962
Epoch 4/10, Batch number: 133, Cumulated accuracy: 0.996035447761194
Epoch 4/10, Batch number: 134, Cumulated accuracy: 0.9960069444444445
Epoch 4/10, Batch number: 135, Cumulated accuracy: 0.9960363051470589
Epoch 4/10, Batch number: 136, Cumulated accuracy: 0.9960652372262774
Epoch 4/10, Batch number: 137, Cumulated accuracy: 0.9960371376811594
Epoch 4/10, Batch number: 138, Cumulated accuracy: 0.9960375449640287
Epoch 4/10, Batch number: 139, Cumulated accuracy: 0.9960658482142857
Epoch 4/10, Batch number: 140, Cumulated accuracy: 0.9960383421985816
Epoch 4/10, Batch number: 141, Cumulated accuracy: 0.9960387323943662
Epoch 4/10, Batch number: 142, Cumulated accuracy: 0.9959571678321678
Epoch 4/10, Batch number: 143, Cumulated accuracy: 0.9959581163194444
Epoch 4/10, Batch number: 144, Cumulated accuracy: 0.9959590517241379
[...]

Epoch 5/10, Batch number: 13, Cumulated accuracy: 0.9983258928571429
Epoch 5/10, Batch number: 14, Cumulated accuracy: 0.9981770833333333
Epoch 5/10, Batch number: 15, Cumulated accuracy: 0.998291015625
Epoch 5/10, Batch number: 16, Cumulated accuracy: 0.9981617647058824
Epoch 5/10, Batch number: 17, Cumulated accuracy: 0.9982638888888888
Epoch 5/10, Batch number: 18, Cumulated accuracy: 0.9983552631578947
Epoch 5/10, Batch number: 19, Cumulated accuracy: 0.9984375
Epoch 5/10, Batch number: 20, Cumulated accuracy: 0.9985119047619048
Epoch 5/10, Batch number: 21, Cumulated accuracy: 0.9982244318181818
Epoch 5/10, Batch number: 22, Cumulated accuracy: 0.9981317934782609
Epoch 5/10, Batch number: 23, Cumulated accuracy: 0.9982096354166666
Epoch 5/10, Batch number: 24, Cumulated accuracy: 0.99828125
Epoch 5/10, Batch number: 25, Cumulated accuracy: 0.9981971153846154
Epoch 5/10, Batch number: 26, Cumulated accuracy: 0.9982638888888888
Epoch 5/10, Batch number: 27, Cumulated accuracy: 0.998

Epoch 5/10, Batch number: 133, Cumulated accuracy: 0.9971723414179104
Epoch 5/10, Batch number: 134, Cumulated accuracy: 0.997193287037037
Epoch 5/10, Batch number: 135, Cumulated accuracy: 0.9972139246323529
Epoch 5/10, Batch number: 136, Cumulated accuracy: 0.9972057481751825
Epoch 5/10, Batch number: 137, Cumulated accuracy: 0.9971127717391305
Epoch 5/10, Batch number: 138, Cumulated accuracy: 0.9970773381294964
Epoch 5/10, Batch number: 139, Cumulated accuracy: 0.9970982142857143
Epoch 5/10, Batch number: 140, Cumulated accuracy: 0.9971187943262412
Epoch 5/10, Batch number: 141, Cumulated accuracy: 0.9970840669014085
Epoch 5/10, Batch number: 142, Cumulated accuracy: 0.997104458041958
Epoch 5/10, Batch number: 143, Cumulated accuracy: 0.9970431857638888
Epoch 5/10, Batch number: 144, Cumulated accuracy: 0.997009698275862
Epoch 5/10, Batch number: 145, Cumulated accuracy: 0.9970034246575342
Epoch 5/10, Batch number: 146, Cumulated accuracy: 0.9970238095238095
[...]

KeyboardInterrupt: 

### Training curves
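
The plotting code is not shown in this notebook. A minimal sketch, assuming the four lists returned by `train` hold one value per epoch, could look as follows. The function name `plot_training_curves` is hypothetical.

```python
import matplotlib.pyplot as plt

def plot_training_curves(train_losses, train_accuracies, test_losses, test_accuracies):
    """Plot loss and accuracy per epoch, train vs. test, side by side."""
    epochs = range(1, len(train_losses) + 1)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))
    # Left panel: losses
    ax_loss.plot(epochs, train_losses, label="Train")
    ax_loss.plot(epochs, test_losses, label="Test")
    ax_loss.set_xlabel("Epoch")
    ax_loss.set_ylabel("Cross-entropy loss")
    ax_loss.legend()
    # Right panel: accuracies
    ax_acc.plot(epochs, train_accuracies, label="Train")
    ax_acc.plot(epochs, test_accuracies, label="Test")
    ax_acc.set_xlabel("Epoch")
    ax_acc.set_ylabel("Accuracy")
    ax_acc.legend()
    fig.tight_layout()
    return fig
```

Once training has completed, calling `plot_training_curves(train_losses, train_accuracies, test_losses, test_accuracies)` followed by `plt.show()` displays both panels.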

The training curves show that the model trains without difficulty. It eventually approaches 100% accuracy on the test set, whereas our DNN plateaued around 96-97%. This illustrates the advantage of CNN models on this task, and why the Conv2d operation is usually preferable when processing images.

The reason for that is rather simple.

Convolutional layers (Conv2d) are often preferred in image processing tasks because they extract spatial information in a way that is translation equivariant: the same small kernel is applied at every location, so a pattern is detected wherever it appears in the image, and shifting the input simply shifts the resulting feature map. Linear layers, on the other hand, do not have this property: they learn a separate weight for every pixel position, so a pattern learned at one location does not transfer to another. They are therefore better suited to inputs where each position carries a fixed meaning (and that is rarely the case with images!).
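
This property can be checked numerically. Strictly speaking, a convolution is translation equivariant: shifting the input shifts the feature map by the same amount. The sketch below uses a randomly initialized, untrained Conv2d and Linear layer on a made-up toy image, so it illustrates the operations themselves, not the trained model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv = nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)
linear = nn.Linear(28 * 28, 10)

# Toy "image": a small random patch on a black 28x28 background.
image = torch.zeros(1, 1, 28, 28)
image[0, 0, 5:9, 5:9] = torch.rand(4, 4)
# Same patch moved 3 pixels down and 4 pixels right (nothing wraps around,
# because the patch stays far from the borders).
shifted = torch.roll(image, shifts=(3, 4), dims=(2, 3))

with torch.no_grad():
    # Convolution: the feature map of the shifted image is the shifted feature map.
    conv_equivariant = torch.allclose(
        conv(shifted), torch.roll(conv(image), shifts=(3, 4), dims=(2, 3)), atol=1e-6)
    # Linear layer on flattened pixels: the two outputs simply differ.
    linear_same = torch.allclose(
        linear(image.flatten(1)), linear(shifted.flatten(1)), atol=1e-6)

print(conv_equivariant, linear_same)
```

The convolution treats the patch identically at both locations, whereas the linear layer would have to learn the pattern separately for every position.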

Additionally, because the same kernel weights are shared across all locations, convolutional layers can learn a large number of features with a small number of parameters, which is computationally efficient and helps to reduce overfitting.
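
The parameter savings can be quantified with a little arithmetic. The sketch below compares the first convolutional layer of `MNIST_CNN` with a hypothetical fully connected layer producing an output of the same size; `conv2d_params` and `linear_params` are helper names introduced here for illustration.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Number of learnable parameters in a 2D convolution with a square kernel."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)

def linear_params(in_features, out_features, bias=True):
    """Number of learnable parameters in a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)

# conv1 of MNIST_CNN: 1 input channel, 32 feature maps, 3x3 kernels.
conv1 = conv2d_params(1, 32, 3)
# Hypothetical dense layer mapping the flattened 28x28 image
# to an output of the same size as conv1's (32 x 28 x 28).
dense_equivalent = linear_params(28 * 28, 32 * 28 * 28)
print(conv1, dense_equivalent)

# Parameter budget of the full model defined earlier in this notebook.
total = (conv2d_params(1, 32, 3) + conv2d_params(32, 64, 3)
         + linear_params(64 * 28 * 28, 128) + linear_params(128, 10))
print(total)
```

The convolution needs 320 parameters where the dense equivalent would need close to 20 million. Note that in this particular model most of the roughly 6.4 million parameters still sit in `fc1`, since nothing reduces the 28x28 resolution before flattening.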

### What's next?

In the next notebook, we will investigate three additional operations, typically used in CNNs, along with Conv2d, namely the pooling, dropout and batchnorm operations.