# Import Libraries

In [1]:
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

## Data Transformations

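We start by defining our data transformations. We need to think about what our data is and how we can augment it so that it correctly represents images the model might not otherwise see. Here the training set gets a random rotation of up to 7 degrees in either direction, while the test set is only converted to tensors and normalized.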


In [2]:
# Train Phase transformations
train_transforms = transforms.Compose([
    transforms.RandomRotation((-7.0, 7.0), fill=(1,)),
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])

# Test Phase transformations
test_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])
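
To make the `Normalize((0.1307,), (0.3081,))` step concrete, here is a minimal pure-Python sketch of the arithmetic it applies to each pixel after `ToTensor()` has scaled values to the range 0 to 1. The constants are the commonly quoted mean and standard deviation of the MNIST training pixels; the helper name `normalize` is made up for this illustration and is not part of torchvision.

```python
# Mean and standard deviation used in the transforms above.
MEAN, STD = 0.1307, 0.3081


def normalize(pixel, mean=MEAN, std=STD):
    """Apply the per-pixel arithmetic of Normalize: (x - mean) / std."""
    return (pixel - mean) / std


# After ToTensor(), a black pixel is 0.0 and a white pixel is 1.0.
black = normalize(0.0)
white = normalize(1.0)
print(f"black -> {black:.4f}, white -> {white:.4f}")
```

Because most MNIST pixels are background, the normalized background sits slightly below zero while the strokes map to values well above it.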


# Dataset and Creating Train/Test Split

In [3]:
train = datasets.MNIST('./data', train=True, download=True, transform=train_transforms)
test = datasets.MNIST('./data', train=False, download=True, transform=test_transforms)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz


100%|██████████| 9912422/9912422 [00:00<00:00, 96500758.40it/s]


Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz


100%|██████████| 28881/28881 [00:00<00:00, 101284024.94it/s]

Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz



100%|██████████| 1648877/1648877 [00:00<00:00, 26049438.20it/s]


Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████| 4542/4542 [00:00<00:00, 19927331.35it/s]

Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw






# Dataloader Arguments & Test/Train Dataloaders


In [4]:
SEED = 1

# CUDA?
cuda = torch.cuda.is_available()
print("CUDA Available?", cuda)

# For reproducibility
torch.manual_seed(SEED)

if cuda:
    torch.cuda.manual_seed(SEED)

# dataloader arguments - something you'd typically fetch from the command line
dataloader_args = dict(shuffle=True, batch_size=128, num_workers=4, pin_memory=True) if cuda else dict(shuffle=True, batch_size=64)

# train dataloader
train_loader = torch.utils.data.DataLoader(train, **dataloader_args)

# test dataloader
test_loader = torch.utils.data.DataLoader(test, **dataloader_args)

CUDA Available? True
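
The batch size determines how many steps each epoch takes, which is where the `469/469` in the training progress bars comes from. The sketch below assumes the CUDA branch (batch size 128), the standard MNIST split of 60,000 training and 10,000 test images, and the `DataLoader` default of keeping the last partial batch. `num_batches` is a hypothetical helper, not a PyTorch function.

```python
import math


def num_batches(num_samples, batch_size, drop_last=False):
    """Number of batches a loader yields per epoch."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


train_batches = num_batches(60_000, 128)
test_batches = num_batches(10_000, 128)
last_train_batch = 60_000 - (train_batches - 1) * 128  # size of the final, partial batch

print(train_batches, test_batches, last_train_batch)
```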




# The model
Let's start with the model we first saw.

In [5]:
from models import model_v1

# Model Params
I can't emphasize enough how important it is to view the model summary.
Unfortunately, PyTorch has no built-in model visualizer, so we rely on an external package, `torchsummary`.

In [6]:
!pip install torchsummary
from torchsummary import summary
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
print(device)
model = model_v1().to(device)
summary(model, input_size=(1, 28, 28))

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
cuda
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1            [-1, 8, 26, 26]              72
              ReLU-2            [-1, 8, 26, 26]               0
       BatchNorm2d-3            [-1, 8, 26, 26]              16
           Dropout-4            [-1, 8, 26, 26]               0
            Conv2d-5           [-1, 16, 24, 24]           1,152
              ReLU-6           [-1, 16, 24, 24]               0
       BatchNorm2d-7           [-1, 16, 24, 24]              32
           Dropout-8           [-1, 16, 24, 24]               0
            Conv2d-9            [-1, 8, 24, 24]             128
        MaxPool2d-10            [-1, 8, 12, 12]               0
           Conv2d-11           [-1, 16, 10, 10]           1,152
             ReLU-12           [-1, 16, 10, 10]               0
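
The output shapes and parameter counts in the summary follow from simple arithmetic. The sketch below assumes 3x3 kernels with no padding and no bias for the 3x3 layers, a 1x1 kernel for `Conv2d-9`, and a 2x2 max pool with stride 2. These settings are inferred from the table, since the `model_v1` source is not shown here, and the helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_params(c_in, c_out, kernel, bias=False):
    """Weights of a square-kernel Conv2d, plus optional bias terms."""
    return c_in * c_out * kernel * kernel + (c_out if bias else 0)


def batchnorm_params(channels):
    """BatchNorm2d learns one scale and one shift per channel."""
    return 2 * channels


# Spatial sizes: 28 -> 26 -> 24 -> (1x1 conv) 24 -> (pool) 12 -> 10
s1 = conv_out(28, 3)
s2 = conv_out(s1, 3)
s3 = conv_out(s2, 1)
s4 = conv_out(s3, 2, stride=2)
s5 = conv_out(s4, 3)
print(s1, s2, s3, s4, s5)

# Parameter counts for the layers listed in the summary
print(conv_params(1, 8, 3), batchnorm_params(8))    # Conv2d-1, BatchNorm2d-3
print(conv_params(8, 16, 3), batchnorm_params(16))  # Conv2d-5, BatchNorm2d-7
print(conv_params(16, 8, 1))                        # Conv2d-9
print(conv_params(8, 16, 3))                        # Conv2d-11
```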

# Training and Testing

All right, so we have about 10.3 K params, as reported in the results at the end of this notebook. The purpose of this notebook is to set things right for our future experiments.

Looking at logs can be boring, so we'll introduce a **tqdm** progress bar to get cooler logs.

The train and test functions live in `utils`, so let's import them.

In [7]:
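# Note: these names shadow the `train`/`test` datasets defined earlier.
# The dataloaders already hold the datasets, so this is safe unless the
# dataloader cell is re-run after this import.
from utils import train, test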

In [8]:
from torch.optim.lr_scheduler import StepLR

model =  model_v1().to(device)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# scheduler = StepLR(optimizer, step_size=6, gamma=0.1)


EPOCHS = 20
for epoch in range(EPOCHS):
    print("EPOCH:", epoch)
    train(model, device, train_loader, optimizer, epoch)
    # scheduler.step()
    test(model, device, test_loader)

EPOCH: 0


Loss=0.0713736042380333 Batch_id=468 Accuracy=87.48: 100%|██████████| 469/469 [00:23<00:00, 19.92it/s]



Test set: Average loss: 0.0810, Accuracy: 9774/10000 (97.74%)

EPOCH: 1


Loss=0.044182419776916504 Batch_id=468 Accuracy=97.49: 100%|██████████| 469/469 [00:23<00:00, 19.62it/s]



Test set: Average loss: 0.0409, Accuracy: 9885/10000 (98.85%)

EPOCH: 2


Loss=0.06880763918161392 Batch_id=468 Accuracy=98.01: 100%|██████████| 469/469 [00:23<00:00, 19.59it/s]



Test set: Average loss: 0.0345, Accuracy: 9890/10000 (98.90%)

EPOCH: 3


Loss=0.08027249574661255 Batch_id=468 Accuracy=98.23: 100%|██████████| 469/469 [00:23<00:00, 20.10it/s]



Test set: Average loss: 0.0293, Accuracy: 9908/10000 (99.08%)

EPOCH: 4


Loss=0.049239251762628555 Batch_id=468 Accuracy=98.46: 100%|██████████| 469/469 [00:23<00:00, 20.18it/s]



Test set: Average loss: 0.0307, Accuracy: 9905/10000 (99.05%)

EPOCH: 5


Loss=0.09139256924390793 Batch_id=468 Accuracy=98.61: 100%|██████████| 469/469 [00:23<00:00, 20.37it/s]



Test set: Average loss: 0.0313, Accuracy: 9899/10000 (98.99%)

EPOCH: 6


Loss=0.028757015243172646 Batch_id=468 Accuracy=98.58: 100%|██████████| 469/469 [00:23<00:00, 20.35it/s]



Test set: Average loss: 0.0230, Accuracy: 9927/10000 (99.27%)

EPOCH: 7


Loss=0.021889401599764824 Batch_id=468 Accuracy=98.67: 100%|██████████| 469/469 [00:23<00:00, 20.09it/s]



Test set: Average loss: 0.0247, Accuracy: 9924/10000 (99.24%)

EPOCH: 8


Loss=0.05286281183362007 Batch_id=468 Accuracy=98.75: 100%|██████████| 469/469 [00:23<00:00, 19.85it/s]



Test set: Average loss: 0.0252, Accuracy: 9916/10000 (99.16%)

EPOCH: 9


Loss=0.05508983135223389 Batch_id=468 Accuracy=98.73: 100%|██████████| 469/469 [00:23<00:00, 20.13it/s]



Test set: Average loss: 0.0213, Accuracy: 9932/10000 (99.32%)

EPOCH: 10


Loss=0.05303044244647026 Batch_id=468 Accuracy=98.85: 100%|██████████| 469/469 [00:23<00:00, 20.29it/s]



Test set: Average loss: 0.0250, Accuracy: 9925/10000 (99.25%)

EPOCH: 11


Loss=0.01810811273753643 Batch_id=468 Accuracy=98.89: 100%|██████████| 469/469 [00:22<00:00, 20.57it/s]



Test set: Average loss: 0.0218, Accuracy: 9928/10000 (99.28%)

EPOCH: 12


Loss=0.012993051670491695 Batch_id=468 Accuracy=98.86: 100%|██████████| 469/469 [00:22<00:00, 20.83it/s]



Test set: Average loss: 0.0202, Accuracy: 9923/10000 (99.23%)

EPOCH: 13


Loss=0.05571891739964485 Batch_id=468 Accuracy=98.94: 100%|██████████| 469/469 [00:22<00:00, 20.76it/s]



Test set: Average loss: 0.0198, Accuracy: 9934/10000 (99.34%)

EPOCH: 14


Loss=0.013053487055003643 Batch_id=468 Accuracy=98.94: 100%|██████████| 469/469 [00:22<00:00, 20.90it/s]



Test set: Average loss: 0.0231, Accuracy: 9927/10000 (99.27%)

EPOCH: 15


Loss=0.022172359749674797 Batch_id=468 Accuracy=99.00: 100%|██████████| 469/469 [00:22<00:00, 21.12it/s]



Test set: Average loss: 0.0223, Accuracy: 9923/10000 (99.23%)

EPOCH: 16


Loss=0.07151781767606735 Batch_id=468 Accuracy=99.00: 100%|██████████| 469/469 [00:22<00:00, 20.65it/s]



Test set: Average loss: 0.0185, Accuracy: 9946/10000 (99.46%)

EPOCH: 17


Loss=0.04009398818016052 Batch_id=468 Accuracy=98.98: 100%|██████████| 469/469 [00:22<00:00, 20.55it/s]



Test set: Average loss: 0.0210, Accuracy: 9936/10000 (99.36%)

EPOCH: 18


Loss=0.0787324458360672 Batch_id=468 Accuracy=99.07: 100%|██████████| 469/469 [00:23<00:00, 20.29it/s]



Test set: Average loss: 0.0221, Accuracy: 9930/10000 (99.30%)

EPOCH: 19


Loss=0.039261095225811005 Batch_id=468 Accuracy=98.98: 100%|██████████| 469/469 [00:23<00:00, 20.12it/s]



Test set: Average loss: 0.0188, Accuracy: 9943/10000 (99.43%)
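
The training cell leaves a `StepLR(optimizer, step_size=6, gamma=0.1)` scheduler commented out, so the run above uses a constant learning rate of 0.01. As a rough sketch of what enabling it would do, the function below reproduces the step-decay rule (multiply the rate by `gamma` every `step_size` epochs), assuming `scheduler.step()` is called once at the end of each epoch. `step_lr` is a hypothetical helper written for this illustration, not the PyTorch class.

```python
def step_lr(base_lr, epoch, step_size=6, gamma=0.1):
    """Learning rate used during `epoch` (counted from 0) under step decay."""
    return base_lr * gamma ** (epoch // step_size)


# Learning rate for each of the 20 epochs, starting from lr=0.01
schedule = [step_lr(0.01, epoch) for epoch in range(20)]

for epoch in (0, 5, 6, 12, 18):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.0e}")
```

With these settings the rate would drop at epochs 6, 12 and 18, so the last two epochs would train at roughly 1e-05.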



## Target
**1. Add Augmentation**

**2. Results:**

        1. Parameters: 10.3 K
        2. Best Train Accuracy: 99.07 % (epoch 18, counting from 0)
        3. Best Test Accuracy: 99.46 % (epoch 16, counting from 0)

**3. Analysis:**

        1. The model ran for 20 epochs (numbered 0 to 19).
        2. Added dropout of 0.1.
        3. Augmentation and dropout make training harder: the best train accuracy (99.07 %) stays below the best test accuracy (99.46 %), which suggests there is still room to learn more.
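
The best-accuracy figures in the results can be cross-checked against the training log. The sketch below copies the per-epoch numbers from the log by hand (the train accuracy shown at the last batch of each epoch and the number of correct test predictions) and picks the maxima. `best_epoch` is a helper introduced only for this check.

```python
# Per-epoch figures copied from the training log (epochs 0..19).
train_acc = [87.48, 97.49, 98.01, 98.23, 98.46, 98.61, 98.58, 98.67, 98.75, 98.73,
             98.85, 98.89, 98.86, 98.94, 98.94, 99.00, 99.00, 98.98, 99.07, 98.98]
test_correct = [9774, 9885, 9890, 9908, 9905, 9899, 9927, 9924, 9916, 9932,
                9925, 9928, 9923, 9934, 9927, 9923, 9946, 9936, 9930, 9943]
TEST_SIZE = 10_000  # size of the MNIST test set, as shown in the log


def best_epoch(values):
    """Return (epoch index, value) of the first maximum, counting from 0."""
    epoch = max(range(len(values)), key=values.__getitem__)
    return epoch, values[epoch]


best_train_epoch, best_train_acc = best_epoch(train_acc)
best_test_epoch, best_correct = best_epoch(test_correct)
best_test_acc = 100.0 * best_correct / TEST_SIZE

print(f"best train accuracy: {best_train_acc:.2f}% at epoch {best_train_epoch}")
print(f"best test accuracy:  {best_test_acc:.2f}% at epoch {best_test_epoch}")
```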