Target:
* Add dropout regularization with a probability of 10% after each layer (except the final layer) to reduce overfitting
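
As a rough illustration of the target, the sketch below builds one Conv -> BatchNorm -> ReLU -> Dropout block in the order shown by the model summary further down. The helper name `conv_block`, the 3x3 kernel, and `bias=False` are assumptions inferred from the 108-parameter first layer; the actual `Model_6_Net` lives in `s7_model` and may differ.

```python
import torch
import torch.nn as nn


def conv_block(in_channels: int, out_channels: int, dropout_p: float = 0.1) -> nn.Sequential:
    """Hypothetical block: Conv -> BatchNorm -> ReLU -> Dropout(p=0.1)."""
    return nn.Sequential(
        # 3x3 kernel without bias is an assumption taken from the parameter counts
        nn.Conv2d(in_channels, out_channels, kernel_size=3, bias=False),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(),
        nn.Dropout(dropout_p),  # zeroes 10% of activations while training
    )


block = conv_block(1, 12)
out = block(torch.randn(4, 1, 28, 28))
n_params = sum(p.numel() for p in block.parameters())
print(out.shape, n_params)
```

The final layer would omit the `nn.Dropout` step so that the class scores are not randomly zeroed.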

Result:
* Max train accuracy (in 15 epochs): 98.67%
* Max test accuracy (in 15 epochs): 99%

Analysis:
* Dropout reduced both train and test accuracy, which is expected. However, test accuracy is generally higher than train accuracy.
* I need to strengthen the training set with data transformations and augmentations. The next iteration will add image rotation to try to improve train accuracy.
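
One plausible reason test accuracy sits above train accuracy is that dropout is active only in training mode. The sketch below shows the difference on a standalone `nn.Dropout` layer. It assumes `model_train` measures accuracy while the model is in training mode, which the notebook does not confirm.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.1)
x = torch.ones(1, 1000)

drop.train()          # training mode: random units are zeroed
y_train = drop(x)

drop.eval()           # evaluation mode: dropout is a no-op
y_eval = drop(x)

zero_fraction = (y_train == 0).float().mean().item()
kept_value = y_train[y_train != 0][0].item()  # survivors are scaled by 1 / (1 - p)
print(f"zeroed in train mode: {zero_fraction:.3f}, kept value: {kept_value:.4f}")
print("eval output unchanged:", torch.equal(y_eval, x))
```

During training the network predicts with roughly 10% of its activations removed, while at test time it uses all of them.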

In [None]:
!git clone "https://github.com/jyanivaddi/ERA_V1.git"
# Pull inside the cloned repository, not the current working directory
!git -C ERA_V1 pull

Add all the imports

In [1]:
from __future__ import print_function
import sys
sys.path.append("ERA_V1/session_7")
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import transforms
from s7_utils import load_mnist_data, preview_batch_images, plot_statistics
from s7_model import Model_6_Net, model_summary, model_train, model_test

Select the device and print the model summary

In [2]:
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
mnist_model = Model_6_Net().to(device)
model_summary(mnist_model, input_size=(1,28,28))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 12, 26, 26]             108
       BatchNorm2d-2           [-1, 12, 26, 26]              24
              ReLU-3           [-1, 12, 26, 26]               0
           Dropout-4           [-1, 12, 26, 26]               0
            Conv2d-5           [-1, 12, 24, 24]           1,296
       BatchNorm2d-6           [-1, 12, 24, 24]              24
              ReLU-7           [-1, 12, 24, 24]               0
           Dropout-8           [-1, 12, 24, 24]               0
            Conv2d-9           [-1, 10, 22, 22]           1,080
      BatchNorm2d-10           [-1, 10, 22, 22]              20
             ReLU-11           [-1, 10, 22, 22]               0
          Dropout-12           [-1, 10, 22, 22]               0
        MaxPool2d-13           [-1, 10, 11, 11]               0
           Conv2d-14             [-1, 1
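
The parameter counts and output shapes in the visible rows can be checked by hand. The sketch below assumes 3x3 kernels, no padding, stride 1, and no convolution bias; those settings are inferred from the numbers in the summary, not read from `s7_model`.

```python
def conv_params(in_ch: int, out_ch: int, kernel: int = 3, bias: bool = False) -> int:
    """Weights in a square-kernel Conv2d layer (bias omitted by assumption)."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)


def conv_out_size(size: int, kernel: int = 3, padding: int = 0, stride: int = 1) -> int:
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def batchnorm_params(channels: int) -> int:
    """One scale and one shift parameter per channel."""
    return 2 * channels


conv_counts = [conv_params(1, 12), conv_params(12, 12), conv_params(12, 10)]
bn_counts = [batchnorm_params(12), batchnorm_params(12), batchnorm_params(10)]

sizes = [28]
for _ in range(3):                     # three 3x3 convolutions
    sizes.append(conv_out_size(sizes[-1]))
sizes.append(conv_out_size(sizes[-1], kernel=2, stride=2))  # 2x2 max pool

print(conv_counts, bn_counts, sizes)
```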

Define Transforms

In [None]:
train_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,),(0.3081,))
])
test_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,),(0.3081,))
])

Download Data

In [None]:
train_data, test_data = load_mnist_data(train_transforms, test_transforms)

Define train and test loaders

In [None]:
torch.manual_seed(1)
batch_size = 128
kwargs = {'num_workers': 2, 'pin_memory': True} if use_cuda else {}
train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size, shuffle=True, **kwargs)
# Shuffling is not needed for evaluation
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, shuffle=False, **kwargs)

Preview data

In [None]:
preview_batch_images(train_loader)

Train the model

In [None]:
model = Model_6_Net().to(device)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# Lower the learning rate when the monitored loss stops decreasing
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, 'min')
train_losses = []
test_losses = []
train_acc = []
test_acc = []
for epoch in range(1, 20):  # runs epochs 1 through 19
    print(f"epoch: {epoch}")
    this_train_loss = model_train(model, device, train_loader, optimizer, train_acc, train_losses)
    this_test_loss = model_test(model, device, test_loader, test_acc, test_losses)
    # The scheduler monitors the training loss returned by model_train
    scheduler.step(this_train_loss)
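
To see how `ReduceLROnPlateau` reacts to the loss passed to `scheduler.step`, the sketch below feeds it a loss that never improves. It uses a single dummy parameter instead of the real model and a short `patience=2` (an illustrative value) so the drop shows up within a few steps.

```python
import torch
import torch.optim as optim

# A single dummy parameter stands in for the real model
param = torch.nn.Parameter(torch.zeros(1))
optimizer = optim.SGD([param], lr=0.01, momentum=0.9)
scheduler = optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=2
)

lrs = []
for loss in [1.0] * 6:          # a loss that has plateaued
    scheduler.step(loss)        # the scheduler only sees this number
    lrs.append(optimizer.param_groups[0]["lr"])

print(lrs)
```

With the default `patience=10` used in the training cell, the learning rate is reduced only after the monitored loss fails to improve for more than 10 consecutive epochs.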


Plot Statistics

In [None]:
plot_statistics(train_losses, train_acc, test_losses, test_acc)