## `Model_1`
### Target:
- Getting the skeleton right
- Perform MaxPooling once the receptive field reaches 5
- Include dropout of 10% after all convolutional layers
- Use a fully convolutional output head instead of a fully connected layer
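
To see where the receptive field reaches 5, here is a minimal sketch of the usual receptive-field recurrence. The layer list is an assumption for illustration (two 3x3 convolutions with stride 1, then a 2x2 max pool with stride 2), and `receptive_field` is a hypothetical helper, not something from `model`.

```python
def receptive_field(layers):
    """Return the receptive field after a stack of (kernel_size, stride) layers."""
    rf, jump = 1, 1  # one input pixel sees itself; neighbouring outputs are 1 pixel apart
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump  # each layer widens the field by (k - 1) input-space steps
        jump *= stride                  # strided layers push later outputs further apart
    return rf


# Assumed block: two 3x3 convolutions, then 2x2 max pooling with stride 2.
two_convs = [(3, 1), (3, 1)]
print(receptive_field(two_convs))                     # 5
print(receptive_field(two_convs + [(2, 2)]))          # 6
print(receptive_field(two_convs + [(2, 2), (3, 1)]))  # 10
```

A 1x1 convolution, written as `(1, 1)` here, leaves the receptive field unchanged, so it can sit before the pooling layer without affecting this target.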

### Results:
- Parameters: 8.2 K
- Best Train Accuracy: 98.07
- Best Test Accuracy: 98.78 (epoch 18); final epoch: 98.69

### Analysis:
- The model is underfitting: the current learning rate of 0.001 might be too low; an LR scheduler could help
- Adam might perform better than the SGD optimizer
- Consider adding more channels in the initial convolutional layers
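
The optimizer and scheduler ideas above could be tried along these lines. This is only a sketch: the learning rate, `step_size` and `gamma` are assumed values chosen for illustration, the small `nn.Conv2d` stands in for `Net()` so the snippet runs on its own, and whether the combination actually helps is untested here.

```python
import torch.nn as nn
import torch.optim as optim

# Stand-in module so the sketch is self-contained; in the notebook this would be Net().
model = nn.Conv2d(1, 8, kernel_size=3)

# Assumed hyperparameters, for illustration only.
optimizer = optim.Adam(model.parameters(), lr=0.01)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=6, gamma=0.1)

learning_rates = []
for epoch in range(20):
    learning_rates.append(optimizer.param_groups[0]["lr"])
    # train(...) and test(...) would run here
    optimizer.step()   # normally called once per batch inside train()
    scheduler.step()   # decay the learning rate once per epoch

print(learning_rates[0], learning_rates[6], learning_rates[12], learning_rates[18])
```

With these settings the learning rate is multiplied by 0.1 every 6 epochs, so the early epochs take larger steps than the fixed 0.001 used in this run.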

# Import Libraries

In [2]:
! nvidia-smi

zsh:1: command not found: nvidia-smi


In [3]:
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

## Data Transformations

We start by defining our data transformations. We need to think about what our data is and how we can augment it to correctly represent images the model might not otherwise see.


In [4]:
# Train Phase transformations
train_transforms = transforms.Compose(
    [
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))
        ])

# Test Phase transformations
test_transforms = transforms.Compose(
    [
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))
        ])
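
To make the effect of `Normalize` concrete, here is the per-pixel arithmetic it applies, assuming `ToTensor()` has already scaled pixel values to the range [0, 1]. The `normalize` helper is hypothetical and only mirrors the formula `(x - mean) / std`.

```python
MNIST_MEAN, MNIST_STD = 0.1307, 0.3081  # the constants passed to transforms.Normalize


def normalize(pixel):
    """Apply the per-pixel operation of Normalize to a value already scaled to [0, 1]."""
    return (pixel - MNIST_MEAN) / MNIST_STD


print(round(normalize(0.0), 4))  # -0.4242 (black background pixel)
print(round(normalize(1.0), 4))  # 2.8215 (fully white pixel)
```

So the inputs the network sees lie roughly between -0.42 and 2.82 rather than between 0 and 1.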


# Dataset and Creating Train/Test Split

In [5]:
from utils import prepare_mnist_data
train_loader, test_loader = prepare_mnist_data(
    train_transforms, test_transforms, batch_size=256)

CUDA Available? False
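
As a quick sanity check on the loader, the number of batches per epoch follows directly from the dataset size. The sketch below assumes the standard 60,000-image MNIST training split and that the last partial batch is kept; `batches_per_epoch` is a hypothetical helper.

```python
import math

MNIST_TRAIN_SIZE = 60_000  # size of the standard MNIST training split


def batches_per_epoch(num_samples, batch_size):
    """Number of batches per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


print(batches_per_epoch(MNIST_TRAIN_SIZE, 256))  # 235
print(batches_per_epoch(MNIST_TRAIN_SIZE, 128))  # 469
```

The progress bars in the training log report 469 iterations per epoch, which matches a batch size of 128 rather than 256. One possible explanation, which we have not verified, is that `prepare_mnist_data` picks its own batch size when CUDA is unavailable.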


# The model

# Model Params
It is hard to overstate how important it is to view the model summary.
Unfortunately, PyTorch has no built-in model summary, so we rely on the external `torchsummary` package.

In [6]:
!pip install torchsummary
from torchsummary import summary
from model import Net
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
print(device)
model = Net(
    list('CCcPCCCCGc'),
    [16, 16, 10, 10,
     16, 8, 8, 16,
     16, 10],
    dropout_value=0.03).to(device)
summary(model, input_size=(1, 28, 28))

cpu
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 16, 28, 28]             144
              ReLU-2           [-1, 16, 28, 28]               0
       BatchNorm2d-3           [-1, 16, 28, 28]              32
           Dropout-4           [-1, 16, 28, 28]               0
            Conv2d-5           [-1, 16, 28, 28]           2,304
              ReLU-6           [-1, 16, 28, 28]               0
       BatchNorm2d-7           [-1, 16, 28, 28]              32
           Dropout-8           [-1, 16, 28, 28]               0
            Conv2d-9           [-1, 10, 28, 28]             160
        MaxPool2d-10           [-1, 10, 14, 14]               0
           Conv2d-11           [-1, 16, 14, 14]           1,440
             ReLU-12           [-1, 16, 14, 14]               0
      BatchNorm2d-13           [-1, 16, 14, 14]              32
          Dropout-14           [-1,
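
The `Param #` column can be checked by hand. The sketch below reproduces the counts shown in the summary; the kernel sizes and the absence of convolution biases are assumptions inferred from those counts, not read from `model`.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=False):
    """Parameter count of a square-kernel Conv2d layer."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def batchnorm2d_params(channels):
    """Parameter count of BatchNorm2d: one scale and one shift per channel."""
    return 2 * channels


print(conv2d_params(1, 16, 3))   # 144   (Conv2d-1)
print(conv2d_params(16, 16, 3))  # 2304  (Conv2d-5)
print(conv2d_params(16, 10, 1))  # 160   (Conv2d-9, consistent with a 1x1 kernel)
print(conv2d_params(10, 16, 3))  # 1440  (Conv2d-11)
print(batchnorm2d_params(16))    # 32    (BatchNorm2d-3)
```

ReLU, Dropout and MaxPool2d have no learnable parameters, which is why they show 0.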

# Training and Testing

Looking at logs can be boring, so we'll use a **tqdm** progress bar to get more readable logs.

Let's write train and test functions

# Let's Train and test our model

In [14]:
from model import train, test
model = Net().to(device)
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

EPOCHS = 20
for epoch in range(EPOCHS):
    print("EPOCH:", epoch)
    train(model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)

EPOCH: 0


Loss=1.36038076877594 Batch_id=468 Accuracy=39.93: 100%|██████████| 469/469 [00:18<00:00, 24.99it/s]



Test set: Average loss: 1.3963, Accuracy: 5309/10000 (53.09%)

EPOCH: 1


Loss=0.6103813052177429 Batch_id=468 Accuracy=76.34: 100%|██████████| 469/469 [00:19<00:00, 23.56it/s]



Test set: Average loss: 0.6982, Accuracy: 8568/10000 (85.68%)

EPOCH: 2


Loss=0.317636638879776 Batch_id=468 Accuracy=91.28: 100%|██████████| 469/469 [00:19<00:00, 24.44it/s]



Test set: Average loss: 0.3241, Accuracy: 9394/10000 (93.94%)

EPOCH: 3


Loss=0.21827930212020874 Batch_id=468 Accuracy=94.11: 100%|██████████| 469/469 [00:18<00:00, 26.01it/s]



Test set: Average loss: 0.2121, Accuracy: 9566/10000 (95.66%)

EPOCH: 4


Loss=0.13848648965358734 Batch_id=468 Accuracy=95.25: 100%|██████████| 469/469 [00:18<00:00, 26.02it/s]



Test set: Average loss: 0.1696, Accuracy: 9623/10000 (96.23%)

EPOCH: 5


Loss=0.18899254500865936 Batch_id=468 Accuracy=95.86: 100%|██████████| 469/469 [00:18<00:00, 25.95it/s]



Test set: Average loss: 0.1250, Accuracy: 9694/10000 (96.94%)

EPOCH: 6


Loss=0.11657620221376419 Batch_id=468 Accuracy=96.29: 100%|██████████| 469/469 [00:17<00:00, 26.15it/s]



Test set: Average loss: 0.0991, Accuracy: 9755/10000 (97.55%)

EPOCH: 7


Loss=0.0870925784111023 Batch_id=468 Accuracy=96.60: 100%|██████████| 469/469 [00:18<00:00, 25.11it/s]



Test set: Average loss: 0.0951, Accuracy: 9757/10000 (97.57%)

EPOCH: 8


Loss=0.14262831211090088 Batch_id=468 Accuracy=96.86: 100%|██████████| 469/469 [00:18<00:00, 25.85it/s]



Test set: Average loss: 0.0811, Accuracy: 9782/10000 (97.82%)

EPOCH: 9


Loss=0.0745767131447792 Batch_id=468 Accuracy=97.07: 100%|██████████| 469/469 [00:20<00:00, 23.28it/s]



Test set: Average loss: 0.0683, Accuracy: 9817/10000 (98.17%)

EPOCH: 10


Loss=0.07470259815454483 Batch_id=468 Accuracy=97.33: 100%|██████████| 469/469 [00:17<00:00, 26.19it/s]



Test set: Average loss: 0.0689, Accuracy: 9805/10000 (98.05%)

EPOCH: 11


Loss=0.12477552890777588 Batch_id=468 Accuracy=97.43: 100%|██████████| 469/469 [00:19<00:00, 24.47it/s]



Test set: Average loss: 0.0623, Accuracy: 9828/10000 (98.28%)

EPOCH: 12


Loss=0.07088539749383926 Batch_id=468 Accuracy=97.43: 100%|██████████| 469/469 [00:17<00:00, 26.35it/s]



Test set: Average loss: 0.0603, Accuracy: 9829/10000 (98.29%)

EPOCH: 13


Loss=0.0866084098815918 Batch_id=468 Accuracy=97.64: 100%|██████████| 469/469 [00:19<00:00, 24.67it/s]



Test set: Average loss: 0.0529, Accuracy: 9854/10000 (98.54%)

EPOCH: 14


Loss=0.06816496700048447 Batch_id=468 Accuracy=97.75: 100%|██████████| 469/469 [00:17<00:00, 26.50it/s]



Test set: Average loss: 0.0508, Accuracy: 9846/10000 (98.46%)

EPOCH: 15


Loss=0.07287620007991791 Batch_id=468 Accuracy=97.91: 100%|██████████| 469/469 [00:18<00:00, 24.83it/s]



Test set: Average loss: 0.0489, Accuracy: 9862/10000 (98.62%)

EPOCH: 16


Loss=0.07862820476293564 Batch_id=468 Accuracy=97.91: 100%|██████████| 469/469 [00:18<00:00, 25.00it/s]



Test set: Average loss: 0.0491, Accuracy: 9855/10000 (98.55%)

EPOCH: 17


Loss=0.12939494848251343 Batch_id=468 Accuracy=97.98: 100%|██████████| 469/469 [00:18<00:00, 25.08it/s]



Test set: Average loss: 0.0469, Accuracy: 9870/10000 (98.70%)

EPOCH: 18


Loss=0.04437898471951485 Batch_id=468 Accuracy=98.06: 100%|██████████| 469/469 [00:17<00:00, 26.70it/s]



Test set: Average loss: 0.0435, Accuracy: 9878/10000 (98.78%)

EPOCH: 19


Loss=0.05325952544808388 Batch_id=468 Accuracy=98.07: 100%|██████████| 469/469 [00:18<00:00, 25.20it/s]



Test set: Average loss: 0.0447, Accuracy: 9869/10000 (98.69%)
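
Rather than reading the best epoch off the log by eye, the test lines can be parsed. This is a small sketch over an excerpt copied from the log above (the last four epochs); the regular expression and variable names are our own.

```python
import re

# Excerpt copied from the training log above (epochs 16 to 19).
log_excerpt = """
Test set: Average loss: 0.0491, Accuracy: 9855/10000 (98.55%)
Test set: Average loss: 0.0469, Accuracy: 9870/10000 (98.70%)
Test set: Average loss: 0.0435, Accuracy: 9878/10000 (98.78%)
Test set: Average loss: 0.0447, Accuracy: 9869/10000 (98.69%)
"""
FIRST_EPOCH_IN_EXCERPT = 16

# Capture "correct/total" from each test line and convert to a percentage.
pattern = re.compile(r"Accuracy: (\d+)/(\d+)")
accuracies = [100 * int(c) / int(t) for c, t in pattern.findall(log_excerpt)]

best_offset = max(range(len(accuracies)), key=accuracies.__getitem__)
best_epoch = FIRST_EPOCH_IN_EXCERPT + best_offset
print(best_epoch, accuracies[best_offset])
```

On this excerpt the best test accuracy falls on epoch 18, one epoch before the final one, so the last epoch is not necessarily the best checkpoint.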

