Target:
* Reduce overfitting by adding Batch Normalization to every convolution block up to the Adaptive Average Pool layer

Result:
* The parameter count rose from ~9000 to ~9200 because each batch norm layer adds two learnable parameters (scale and shift) per channel.
* Max train accuracy (in 15 epochs): 99.4%
* Max test accuracy (in 15 epochs): 99.4%
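
The parameter increase can be checked by hand. The sketch below reproduces the per-layer counts shown in the model summary, assuming 3x3 convolutions without bias (which is what the counts of 90, 900 and 1,440 imply). The helper names `conv_params` and `batch_norm_params` are made up for this illustration and are not part of `s7_model`.

```python
def conv_params(in_channels, out_channels, kernel_size=3, bias=False):
    """Learnable parameters of a 2D convolution layer."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def batch_norm_params(channels):
    """Learnable parameters of BatchNorm2d: one scale and one shift per channel.

    The running mean and variance are buffers, so they are not counted here.
    """
    return 2 * channels


print(conv_params(1, 10))     # Conv2d-1 in the summary
print(conv_params(10, 10))    # Conv2d-4 and Conv2d-7
print(conv_params(10, 16))    # Conv2d-11
print(batch_norm_params(10))  # BatchNorm2d after a 10-channel convolution
print(batch_norm_params(16))  # BatchNorm2d after a 16-channel convolution
```

Each batch norm layer costs only 20 to 32 parameters here, which is why the total grows by roughly 200 rather than by thousands.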

Analysis:
* Compared to the previous model, the magnitude of overfitting has reduced considerably.
* The training accuracy plateaus at 99.4% after epoch 14.
* The model still has ~1200 more parameters than the desired target, so I need to reduce the count in the next model iteration.
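
One way to put a number on the "magnitude of overfit" is the per-epoch gap between train and test accuracy. This is a minimal sketch, assuming the accuracy lists hold one percentage value per epoch; the helpers `overfit_gap` and `best_epoch` and the sample values are hypothetical and only illustrate the calculation.

```python
def overfit_gap(train_acc, test_acc):
    """Train minus test accuracy per epoch, in percentage points."""
    return [round(tr - te, 2) for tr, te in zip(train_acc, test_acc)]


def best_epoch(acc):
    """Return (1-based epoch, value) of the highest accuracy."""
    best = max(acc)
    return acc.index(best) + 1, best


# Hypothetical values for illustration only, not the results of this run.
sample_train = [98.0, 99.0, 99.4]
sample_test = [98.5, 99.1, 99.4]

print(overfit_gap(sample_train, sample_test))
print(best_epoch(sample_test))
```

A gap that stays near zero or negative suggests the model is not overfitting. A growing positive gap suggests it is.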

In [11]:
!git clone "https://github.com/jyanivaddi/ERA_V1.git"
!git -C ERA_V1 pull

Cloning into 'ERA_V1'...
remote: Enumerating objects: 190, done.
remote: Counting objects: 100% (190/190), done.
remote: Compressing objects: 100% (152/152), done.
remote: Total 190 (delta 84), reused 111 (delta 33), pack-reused 0
Receiving objects: 100% (190/190), 3.49 MiB | 2.03 MiB/s, done.
Resolving deltas: 100% (84/84), done.
Already up to date.


Add all the imports

In [1]:
from __future__ import print_function
import sys
sys.path.append("ERA_V1/session_7")
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import transforms
from s7_utils import load_mnist_data, preview_batch_images, plot_statistics
from s7_model import Model_4_Net, model_summary, model_train, model_test

Allocate GPU

In [2]:
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
mnist_model = Model_4_Net().to(device)
model_summary(mnist_model, input_size=(1,28,28))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 10, 26, 26]              90
       BatchNorm2d-2           [-1, 10, 26, 26]              20
              ReLU-3           [-1, 10, 26, 26]               0
            Conv2d-4           [-1, 10, 24, 24]             900
       BatchNorm2d-5           [-1, 10, 24, 24]              20
              ReLU-6           [-1, 10, 24, 24]               0
            Conv2d-7           [-1, 10, 22, 22]             900
       BatchNorm2d-8           [-1, 10, 22, 22]              20
              ReLU-9           [-1, 10, 22, 22]               0
        MaxPool2d-10           [-1, 10, 11, 11]               0
           Conv2d-11             [-1, 16, 9, 9]           1,440
      BatchNorm2d-12             [-1, 16, 9, 9]              32
             ReLU-13             [-1, 16, 9, 9]               0
           Conv2d-14             [-1, 1
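
The summary shows a repeating Conv2d, BatchNorm2d, ReLU pattern. The actual `Model_4_Net` definition lives in `s7_model` and is not shown here, so the following is only a sketch of how one such block might be written. The helper `conv_bn_relu` is hypothetical, and the choice of a 3x3 kernel with no padding and no bias is inferred from the output shapes and parameter counts in the summary.

```python
import torch
import torch.nn as nn


def conv_bn_relu(in_channels, out_channels):
    """Hypothetical block: 3x3 convolution (no padding, no bias), batch norm, ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, bias=False),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(),
    )


block = conv_bn_relu(1, 10)

# A dummy batch of four 28x28 single-channel images.
x = torch.randn(4, 1, 28, 28)
out = block(x)

# Count only learnable parameters (conv weights plus batch norm scale and shift).
num_params = sum(p.numel() for p in block.parameters())

print(tuple(out.shape))
print(num_params)
```

For a 1-to-10 channel block this should give a 26x26 output and 90 + 20 parameters, matching the first two rows of the summary.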

Define Transforms

In [None]:
train_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,),(0.3081,))
])
test_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,),(0.3081,))
])
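
To see what these two transforms do to a single pixel: `ToTensor` scales 8-bit values from [0, 255] to [0, 1], and `Normalize` then subtracts the mean and divides by the standard deviation. The sketch below repeats that arithmetic in plain Python. `normalize_pixel` is a hypothetical helper written for illustration, not a torchvision function.

```python
MNIST_MEAN = 0.1307
MNIST_STD = 0.3081


def normalize_pixel(raw_value, mean=MNIST_MEAN, std=MNIST_STD):
    """Apply the ToTensor scaling, then the Normalize step, to one 8-bit pixel."""
    scaled = raw_value / 255.0     # ToTensor: [0, 255] -> [0, 1]
    return (scaled - mean) / std   # Normalize: (x - mean) / std


print(normalize_pixel(0))    # black background pixel
print(normalize_pixel(255))  # brightest stroke pixel
```

Background pixels end up slightly below zero (about -0.42) and the brightest pixels at about 2.82.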

Download Data

In [None]:
train_data, test_data = load_mnist_data(train_transforms, test_transforms)

Define train and test loaders

In [None]:
torch.manual_seed(1)
batch_size = 128
kwargs = {'num_workers': 2, 'pin_memory': True} if use_cuda else {}
train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size, shuffle=True, **kwargs)
# Shuffling is not needed for evaluation; the test metrics do not depend on batch order.
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, shuffle=False, **kwargs)
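
As a quick sanity check on the loaders, the number of batches per epoch follows from the dataset size and the batch size. This sketch assumes `load_mnist_data` returns the standard MNIST split of 60,000 training and 10,000 test images; `batches_per_epoch` is a hypothetical helper.

```python
import math


def batches_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches a DataLoader yields per pass over the data."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


print(batches_per_epoch(60000, 128))  # training set, assuming 60,000 images
print(batches_per_epoch(10000, 128))  # test set, assuming 10,000 images
```

With a batch size of 128 the last training batch is smaller than the rest (96 images), because 60,000 is not a multiple of 128.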

Preview data

In [None]:
preview_batch_images(train_loader)

Train and test the model

In [None]:
model = Model_4_Net().to(device)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
train_losses = []
test_losses = []
train_acc = []
test_acc = []
for epoch in range(1, 16):  # 15 epochs, matching the results reported above
    print(f"epoch: {epoch}")
    model_train(model, device, train_loader, optimizer, train_acc, train_losses)
    model_test(model, device, test_loader, test_acc, test_losses)


Plot Statistics

In [None]:
plot_statistics(train_losses, train_acc, test_losses, test_acc)