# Import Libraries

In [None]:
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

## Data Transformations

We start by defining our data transformations. We need to think about what our data looks like and how we can augment it so that training covers images the model might not otherwise see.


In [None]:
# Train Phase transformations
train_transforms = transforms.Compose([
                                       transforms.RandomRotation((-7.0, 7.0), fill=(1,)),
                                       transforms.ToTensor(),
                                       transforms.Normalize((0.1307,), (0.3081,))
                                       ])

# Test Phase transformations
test_transforms = transforms.Compose([
                                       transforms.ToTensor(),
                                       transforms.Normalize((0.1307,), (0.3081,))
                                       ])
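To make the effect of `Normalize((0.1307,), (0.3081,))` concrete, here is a minimal sketch of the per-pixel formula it applies, `(x - mean) / std`, written in plain Python. It assumes pixel values have already been scaled to the range [0, 1] by `ToTensor()`; the helper name `normalize_pixel` is only illustrative and is not part of torchvision.

```python
MEAN = 0.1307  # mean passed to transforms.Normalize above
STD = 0.3081   # std passed to transforms.Normalize above


def normalize_pixel(x, mean=MEAN, std=STD):
    """Apply the per-pixel formula used by Normalize: (x - mean) / std."""
    return (x - mean) / std


# After ToTensor(), a fully dark pixel is 0.0 and a fully bright pixel is 1.0.
dark = normalize_pixel(0.0)
bright = normalize_pixel(1.0)
at_mean = normalize_pixel(MEAN)

print(f"dark -> {dark:.4f}, bright -> {bright:.4f}, mean -> {at_mean:.4f}")
```

A pixel equal to the mean maps to zero, darker pixels become negative, and brighter pixels become positive, which is the zero-centred input the network sees in both the train and test pipelines.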


# Dataset and Creating Train/Test Split

In [None]:
train = datasets.MNIST('./data', train=True, download=True, transform=train_transforms)
test = datasets.MNIST('./data', train=False, download=True, transform=test_transforms)

# Dataloader Arguments & Test/Train Dataloaders


In [None]:
SEED = 1

# CUDA?
cuda = torch.cuda.is_available()
print("CUDA Available?", cuda)

# For reproducibility
torch.manual_seed(SEED)

if cuda:
    torch.cuda.manual_seed(SEED)

# dataloader arguments - typically you'd fetch these from the command line
dataloader_args = dict(shuffle=True, batch_size=128, num_workers=4, pin_memory=True) if cuda else dict(shuffle=True, batch_size=64)

# train dataloader
train_loader = torch.utils.data.DataLoader(train, **dataloader_args)

# test dataloader
test_loader = torch.utils.data.DataLoader(test, **dataloader_args)

CUDA Available? True
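As a quick sanity check on the batch size, the sketch below works out how many batches each loader yields per epoch. It assumes the standard MNIST split of 60,000 training and 10,000 test images and the default `drop_last=False`; the helper `num_batches` is a hypothetical name introduced here, not a PyTorch function.

```python
import math


def num_batches(dataset_size, batch_size, drop_last=False):
    """Batches per epoch for a map-style dataset of the given size."""
    if drop_last:
        return dataset_size // batch_size
    return math.ceil(dataset_size / batch_size)


TRAIN_SIZE = 60_000  # MNIST training images
TEST_SIZE = 10_000   # MNIST test images

gpu_train_batches = num_batches(TRAIN_SIZE, 128)  # CUDA branch of dataloader_args
cpu_train_batches = num_batches(TRAIN_SIZE, 64)   # CPU fallback branch
gpu_test_batches = num_batches(TEST_SIZE, 128)

# The final batch is smaller because 60,000 is not a multiple of 128.
last_batch_size = TRAIN_SIZE - (gpu_train_batches - 1) * 128

print(gpu_train_batches, cpu_train_batches, gpu_test_batches, last_batch_size)
```

With a batch size of 128 this gives 469 training batches, which matches the `469/469` progress bars in the training logs further down.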


# The model
Let's start with the model we first saw

In [None]:
from models import model_v3 

# Model Params
I can't emphasize enough how important it is to view the model summary.
Unfortunately, there is no built-in model visualizer, so we have to rely on an external package.

In [None]:
!pip install torchsummary
from torchsummary import summary
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
print(device)
model = model_v3().to(device)
summary(model, input_size=(1, 28, 28))

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
cuda
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 16, 26, 26]             144
              ReLU-2           [-1, 16, 26, 26]               0
       BatchNorm2d-3           [-1, 16, 26, 26]              32
            Conv2d-4           [-1, 20, 24, 24]           2,880
              ReLU-5           [-1, 20, 24, 24]               0
       BatchNorm2d-6           [-1, 20, 24, 24]              40
            Conv2d-7           [-1, 10, 24, 24]             200
         MaxPool2d-8           [-1, 10, 12, 12]               0
            Conv2d-9           [-1, 10, 12, 12]             900
             ReLU-10           [-1, 10, 12, 12]               0
      BatchNorm2d-11           [-1, 10, 12, 12]              20
          Dropout-12           [-1, 10, 12, 12]               0
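The parameter counts in the summary rows shown above can be reproduced by hand. The sketch below does so under assumptions inferred from the numbers, since the source of `model_v3` is not shown here: the convolutions use no bias, the kernels are 3x3 except for the 1x1 in `Conv2d-7`, and `Conv2d-9` uses a padding of 1. The helper names are illustrative only.

```python
def conv2d_params(in_ch, out_ch, kernel, bias=False):
    """Weights of a square-kernel Conv2d layer, plus an optional bias per output channel."""
    weights = in_ch * out_ch * kernel * kernel
    return weights + (out_ch if bias else 0)


def batchnorm2d_params(channels):
    """BatchNorm2d learns one scale and one shift per channel."""
    return 2 * channels


def conv_out_size(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# Layer names follow the torchsummary rows above; the shapes are assumptions.
counts = {
    "Conv2d-1": conv2d_params(1, 16, 3),
    "BatchNorm2d-3": batchnorm2d_params(16),
    "Conv2d-4": conv2d_params(16, 20, 3),
    "BatchNorm2d-6": batchnorm2d_params(20),
    "Conv2d-7": conv2d_params(20, 10, 1),
    "Conv2d-9": conv2d_params(10, 10, 3),
    "BatchNorm2d-11": batchnorm2d_params(10),
}

for name, count in counts.items():
    print(f"{name:>15}: {count:,}")
print("subtotal for the rows shown:", sum(counts.values()))

# Spatial sizes: 28 -> 26 -> 24 without padding, then 12 -> 12 with padding=1.
print(conv_out_size(28, 3), conv_out_size(26, 3), conv_out_size(12, 3, padding=1))
```

Only the layers visible in the summary are covered, so the subtotal is a lower bound on the full model size rather than the total.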

# Training and Testing

All right, so we have about 8K params (see the results at the end of this notebook). The purpose of this notebook is to set things right for our future experiments.

Looking at logs can be boring, so we'll introduce a **tqdm** progress bar to get cooler logs.

Let's bring in the train and test functions.

In [None]:
from utils import train, test
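The `utils` module itself is not shown in this notebook. As a rough idea of what such helpers might look like, here is a minimal sketch whose output mimics the log format seen in the training output below. The names `train_one_epoch` and `evaluate` are hypothetical, the signatures differ from the imported `train` and `test` (which also take an `epoch` argument), and the use of `F.cross_entropy` assumes the model returns logits or log-probabilities.

```python
import torch
import torch.nn.functional as F
from tqdm import tqdm


def train_one_epoch(model, device, loader, optimizer):
    """Run one training epoch; return the running training accuracy in percent."""
    model.train()
    correct, seen = 0, 0
    pbar = tqdm(loader)
    for batch_idx, (data, target) in enumerate(pbar):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()            # clear gradients from the previous step
        output = model(data)
        loss = F.cross_entropy(output, target)
        loss.backward()
        optimizer.step()
        correct += (output.argmax(dim=1) == target).sum().item()
        seen += len(data)
        pbar.set_description(
            f"Loss={loss.item()} Batch_id={batch_idx} "
            f"Accuracy={100.0 * correct / seen:.2f}"
        )
    return 100.0 * correct / seen


def evaluate(model, device, loader):
    """Evaluate on a loader; return (average loss, accuracy in percent)."""
    model.eval()
    total_loss, correct = 0.0, 0
    with torch.no_grad():                # no gradients needed for evaluation
        for data, target in loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            total_loss += F.cross_entropy(output, target, reduction="sum").item()
            correct += (output.argmax(dim=1) == target).sum().item()
    n = len(loader.dataset)
    avg_loss = total_loss / n
    accuracy = 100.0 * correct / n
    print(f"\nTest set: Average loss: {avg_loss:.4f}, "
          f"Accuracy: {correct}/{n} ({accuracy:.2f}%)\n")
    return avg_loss, accuracy
```

Summing the loss with `reduction="sum"` and dividing by the dataset size gives a per-sample average that does not depend on the size of the last batch.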

In [None]:
from torch.optim.lr_scheduler import StepLR

model =  model_v3().to(device)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# scheduler = StepLR(optimizer, step_size=6, gamma=0.1)


EPOCHS = 16
for epoch in range(EPOCHS):
    print("EPOCH:", epoch)
    train(model, device, train_loader, optimizer, epoch)
    # scheduler.step()
    test(model, device, test_loader)

EPOCH: 0


Loss=0.10085026174783707 Batch_id=468 Accuracy=91.19: 100%|██████████| 469/469 [00:26<00:00, 17.69it/s]



Test set: Average loss: 0.0621, Accuracy: 9826/10000 (98.26%)

EPOCH: 1


Loss=0.03887877240777016 Batch_id=468 Accuracy=98.03: 100%|██████████| 469/469 [00:22<00:00, 20.41it/s]



Test set: Average loss: 0.0432, Accuracy: 9874/10000 (98.74%)

EPOCH: 2


Loss=0.0618673674762249 Batch_id=468 Accuracy=98.43: 100%|██████████| 469/469 [00:22<00:00, 20.96it/s]



Test set: Average loss: 0.0370, Accuracy: 9881/10000 (98.81%)

EPOCH: 3


Loss=0.11553207039833069 Batch_id=468 Accuracy=98.75: 100%|██████████| 469/469 [00:21<00:00, 21.44it/s]



Test set: Average loss: 0.0437, Accuracy: 9863/10000 (98.63%)

EPOCH: 4


Loss=0.10878857225179672 Batch_id=468 Accuracy=98.78: 100%|██████████| 469/469 [00:21<00:00, 21.44it/s]



Test set: Average loss: 0.0329, Accuracy: 9904/10000 (99.04%)

EPOCH: 5


Loss=0.06395141780376434 Batch_id=468 Accuracy=98.89: 100%|██████████| 469/469 [00:21<00:00, 21.70it/s]



Test set: Average loss: 0.0281, Accuracy: 9915/10000 (99.15%)

EPOCH: 6


Loss=0.03760405629873276 Batch_id=468 Accuracy=98.98: 100%|██████████| 469/469 [00:21<00:00, 21.45it/s]



Test set: Average loss: 0.0294, Accuracy: 9912/10000 (99.12%)

EPOCH: 7


Loss=0.04193821921944618 Batch_id=468 Accuracy=99.05: 100%|██████████| 469/469 [00:22<00:00, 21.22it/s]



Test set: Average loss: 0.0277, Accuracy: 9915/10000 (99.15%)

EPOCH: 8


Loss=0.043630387634038925 Batch_id=468 Accuracy=99.12: 100%|██████████| 469/469 [00:21<00:00, 21.40it/s]



Test set: Average loss: 0.0309, Accuracy: 9900/10000 (99.00%)

EPOCH: 9


Loss=0.06975232809782028 Batch_id=468 Accuracy=99.14: 100%|██████████| 469/469 [00:22<00:00, 20.80it/s]



Test set: Average loss: 0.0239, Accuracy: 9931/10000 (99.31%)

EPOCH: 10


Loss=0.046166181564331055 Batch_id=468 Accuracy=99.15: 100%|██████████| 469/469 [00:23<00:00, 19.62it/s]



Test set: Average loss: 0.0244, Accuracy: 9924/10000 (99.24%)

EPOCH: 11


Loss=0.007830402813851833 Batch_id=468 Accuracy=99.13: 100%|██████████| 469/469 [00:22<00:00, 20.72it/s]



Test set: Average loss: 0.0243, Accuracy: 9925/10000 (99.25%)

EPOCH: 12


Loss=0.006819351110607386 Batch_id=468 Accuracy=99.25: 100%|██████████| 469/469 [00:22<00:00, 20.46it/s]



Test set: Average loss: 0.0244, Accuracy: 9923/10000 (99.23%)

EPOCH: 13


Loss=0.06264819949865341 Batch_id=468 Accuracy=99.23: 100%|██████████| 469/469 [00:22<00:00, 20.57it/s]



Test set: Average loss: 0.0210, Accuracy: 9932/10000 (99.32%)

EPOCH: 14


Loss=0.0030536248814314604 Batch_id=468 Accuracy=99.29: 100%|██████████| 469/469 [00:22<00:00, 20.57it/s]



Test set: Average loss: 0.0235, Accuracy: 9924/10000 (99.24%)

EPOCH: 15


Loss=0.05123462900519371 Batch_id=468 Accuracy=99.27: 100%|██████████| 469/469 [00:23<00:00, 20.11it/s]



Test set: Average loss: 0.0202, Accuracy: 9941/10000 (99.41%)
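The training cell leaves the `StepLR(optimizer, step_size=6, gamma=0.1)` scheduler commented out, so the run above uses a constant learning rate of 0.01. As an illustration of what enabling it would do, the sketch below evaluates the step-decay rule in plain Python, assuming `scheduler.step()` is called once at the end of every epoch. The function `step_lr` is a hypothetical helper that mirrors the closed form of the schedule, not a call into PyTorch.

```python
def step_lr(base_lr, epoch, step_size=6, gamma=0.1):
    """Learning rate in effect during `epoch` under a step-decay schedule."""
    return base_lr * gamma ** (epoch // step_size)


EPOCHS = 16
schedule = [step_lr(0.01, epoch) for epoch in range(EPOCHS)]

for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch:2d}: lr = {lr:.6f}")
```

Under these settings the learning rate would drop tenfold at epoch 6 and again at epoch 12, so the last four epochs would train at roughly 0.0001.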



## Target :- 
**1. Add Augmentation** 

**2. Results :**

        1. Parameters : 8K
        2. Best Train Accuracy : 99.29% (epoch 14)
        3. Best Test Accuracy : 99.41% (epoch 15, the last of 16)

**3. Analysis :**

        1. The model ran for 16 epochs (epochs 0 to 15).
        2. Since the model was underfitting and training accuracy was not increasing, almost 90% of the dropout layers were removed, after which the model achieved good accuracy.
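The headline numbers can be checked directly against the log. The sketch below copies the per-epoch accuracies from the training output above and finds the best epochs; the helper `best` is an illustrative name, and the training accuracy is taken as the value displayed on the progress bar at the end of each epoch.

```python
# Per-epoch accuracies (percent) copied from the training log above.
train_acc = [91.19, 98.03, 98.43, 98.75, 98.78, 98.89, 98.98, 99.05,
             99.12, 99.14, 99.15, 99.13, 99.25, 99.23, 99.29, 99.27]
test_acc = [98.26, 98.74, 98.81, 98.63, 99.04, 99.15, 99.12, 99.15,
            99.00, 99.31, 99.24, 99.25, 99.23, 99.32, 99.24, 99.41]


def best(values):
    """Return (epoch_index, value) of the maximum; ties go to the earliest epoch."""
    epoch = max(range(len(values)), key=values.__getitem__)
    return epoch, values[epoch]


best_train = best(train_acc)
best_test = best(test_acc)

# Count the epochs in which test accuracy is ahead of training accuracy.
test_ahead = sum(te > tr for tr, te in zip(train_acc, test_acc))

print("best train:", best_train)
print("best test:", best_test)
print(f"test ahead of train in {test_ahead} of {len(test_acc)} epochs")
```

Test accuracy stays ahead of training accuracy in most epochs, which fits a model trained with augmentation that is not overfitting.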