Target:

A basic MNIST convolutional neural network that uses Batch Normalization, Dropout, and a Global Average Pooling (GAP) layer.

Results:



*   Parameters: 5,088
*   Best Train Accuracy: 97.93% (epoch 13; 97.88% at the final epoch)
*   Best Test Accuracy: 98.78% (epoch 10; 98.77% at the final epoch)

Analysis:

# Import Libraries

Let's first import all the necessary libraries

In [1]:
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

# Plotting setup, used to visualize images inline in the notebook
%matplotlib inline
import matplotlib.pyplot as plt

# Defining Model
 Create a CNN Model Skeleton

In [5]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()


        # Block 1
        self.conv1 = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=0, bias=False),   # 28x28 input, 26x26 output, RF 3x3
            nn.BatchNorm2d(8),
            nn.Dropout(0.10),
            nn.ReLU(),

            nn.Conv2d(8, 16, 3, padding=0, bias=False),  # 26x26 input, 24x24 output, RF 5x5
            nn.BatchNorm2d(16),
            nn.Dropout(0.10),
            nn.ReLU(),
        )

        # Transition Block (MaxPool + 1x1 convolution)
        self.trans1 = nn.Sequential(
            nn.MaxPool2d(2, 2),                # 24x24 input, 12x12 output, RF 6x6
            nn.Conv2d(16, 8, 1, bias=False),   # 1x1 convolution: 12x12 output, RF 6x6
            nn.BatchNorm2d(8),
            nn.Dropout(0.10),
            nn.ReLU(),
        )

        # Block 2
        self.conv2 = nn.Sequential(
            nn.Conv2d(8, 10, 3, padding=0, bias=False),   # 12x12 input, 10x10 output, RF 10x10
            nn.BatchNorm2d(10),
            nn.Dropout(0.10),
            nn.ReLU(),

            nn.Conv2d(10, 16, 3, padding=0, bias=False),  # 10x10 input, 8x8 output, RF 14x14
            nn.BatchNorm2d(16),
            nn.Dropout(0.10),
            nn.ReLU(),

            nn.Conv2d(16, 10, 3, padding=0, bias=False),  # 8x8 input, 6x6 output, RF 18x18
            nn.BatchNorm2d(10),
            nn.Dropout(0.10),
            nn.ReLU(),
        )
        # Global average pooling: 6x6 input, 1x1 output, RF 28x28
        self.avgpool2d = nn.AvgPool2d(kernel_size=6)

        

    def forward(self, x):
        x = self.conv1(x)
        x = self.trans1(x)
        x = self.conv2(x)
        x = self.avgpool2d(x)   # 6x6 -> 1x1 per channel (global average pooling)
        x = x.view(-1, 10)      # flatten [N, 10, 1, 1] to [N, 10]

        return F.log_softmax(x,dim=1)
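
The output sizes and receptive fields (RF) annotated in the model can be checked by hand. The following is a minimal sketch in plain Python, assuming no padding and the kernel sizes and strides used in `Net`; the `layers` list and the `trace` helper are illustrative names, not part of the notebook.

```python
# Trace spatial size and receptive field (RF) through the layers of Net.
# Each entry: (name, kernel_size, stride); every layer uses padding=0.
layers = [
    ("conv 1->8, 3x3", 3, 1),
    ("conv 8->16, 3x3", 3, 1),
    ("maxpool 2x2", 2, 2),
    ("conv 16->8, 1x1", 1, 1),
    ("conv 8->10, 3x3", 3, 1),
    ("conv 10->16, 3x3", 3, 1),
    ("conv 16->10, 3x3", 3, 1),
    ("avgpool 6x6", 6, 6),   # AvgPool2d defaults its stride to the kernel size
]

def trace(input_size, layers):
    """Return (name, output_size, receptive_field) for each layer."""
    size, rf, jump = input_size, 1, 1
    rows = []
    for name, k, s in layers:
        size = (size - k) // s + 1   # output size without padding
        rf = rf + (k - 1) * jump     # RF grows by (k - 1) steps of the current jump
        jump = jump * s              # stride widens the step between neighbouring outputs
        rows.append((name, size, rf))
    return rows

for name, size, rf in trace(28, layers):
    print(f"{name:18s} output {size}x{size}, RF {rf}x{rf}")
```

Under these assumptions the final 6x6 average pool reduces each of the 10 channels to a single value whose receptive field covers the full 28x28 image, which is why it can stand in for a fully connected classifier head.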

# Model Summary
 View the model's layers and its trainable parameter counts.

In [6]:
!pip install torchsummary
from torchsummary import summary
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
model = Net().to(device)
summary(model, input_size=(1, 28, 28))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1            [-1, 8, 26, 26]              72
       BatchNorm2d-2            [-1, 8, 26, 26]              16
           Dropout-3            [-1, 8, 26, 26]               0
              ReLU-4            [-1, 8, 26, 26]               0
            Conv2d-5           [-1, 16, 24, 24]           1,152
       BatchNorm2d-6           [-1, 16, 24, 24]              32
           Dropout-7           [-1, 16, 24, 24]               0
              ReLU-8           [-1, 16, 24, 24]               0
         MaxPool2d-9           [-1, 16, 12, 12]               0
           Conv2d-10            [-1, 8, 12, 12]             128
      BatchNorm2d-11            [-1, 8, 12, 12]              16
          Dropout-12            [-1, 8, 12, 12]               0
             ReLU-13            [-1, 8, 12, 12]               0
           Conv2d-14           [-1, 10,
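
The parameter total reported under Results can be reproduced from the layer definitions alone. This is a small sketch, assuming bias-free convolutions and BatchNorm layers with a learnable scale and shift per channel (as in `Net`); `convs`, `conv_params`, and `bn_params` are illustrative names.

```python
# (in_channels, out_channels, kernel_size) for each Conv2d in Net, all with bias=False
convs = [(1, 8, 3), (8, 16, 3), (16, 8, 1), (8, 10, 3), (10, 16, 3), (16, 10, 3)]

def conv_params(c_in, c_out, k):
    """Weights of a k x k convolution without a bias term."""
    return c_in * c_out * k * k

def bn_params(channels):
    """BatchNorm2d learns one scale and one shift per channel."""
    return 2 * channels

# Every convolution in Net is followed by a BatchNorm2d over its output channels.
total = sum(conv_params(*c) + bn_params(c[1]) for c in convs)
print(total)  # 5088
```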

# The Model


In [7]:
model.eval()

Net(
  (conv1): Sequential(
    (0): Conv2d(1, 8, kernel_size=(3, 3), stride=(1, 1), bias=False)
    (1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): Dropout(p=0.1, inplace=False)
    (3): ReLU()
    (4): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), bias=False)
    (5): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (6): Dropout(p=0.1, inplace=False)
    (7): ReLU()
  )
  (trans1): Sequential(
    (0): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (1): Conv2d(16, 8, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (2): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.1, inplace=False)
    (4): ReLU()
  )
  (conv2): Sequential(
    (0): Conv2d(8, 10, kernel_size=(3, 3), stride=(1, 1), bias=False)
    (1): BatchNorm2d(10, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): Dropout(p=0.1, inplace=Fal

## Load and Prepare Dataset

MNIST contains 70,000 images of handwritten digits: 60,000 for training and 10,000 for testing. The images are grayscale, 28x28 pixels.

We load the PIL images using torchvision.datasets.MNIST. While loading, we transform each image to a tensor and normalize it with the mean and standard deviation of the MNIST images.
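
To make the normalization concrete, here is a minimal sketch of the per-pixel arithmetic, assuming the pixel has already been scaled to [0, 1] (which is what `ToTensor` does). The `normalize` helper is illustrative and only mirrors the `(x - mean) / std` formula that `transforms.Normalize` applies.

```python
MNIST_MEAN, MNIST_STD = 0.1307, 0.3081   # the values passed to transforms.Normalize

def normalize(pixel):
    """Standardize a single pixel value that lies in [0, 1]."""
    return (pixel - MNIST_MEAN) / MNIST_STD

print(round(normalize(0.0), 4))   # black background pixel -> -0.4242
print(round(normalize(1.0), 4))   # brightest stroke pixel -> 2.8215
```

Because most MNIST pixels are background, the mean is low and the normalized range is asymmetric around zero.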

In [8]:
torch.manual_seed(1)
batch_size = 128

kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
# Named *_dataset so they are not shadowed by the train()/test() functions defined later
train_dataset = datasets.MNIST('../data', train=True, download=True,
                    transform=transforms.Compose([
                        transforms.ToTensor(),
                        transforms.Normalize((0.1307,), (0.3081,))
                    ]))

test_dataset = datasets.MNIST('../data', train=False, transform=transforms.Compose([
                        transforms.ToTensor(),
                        transforms.Normalize((0.1307,), (0.3081,))
                    ]))
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True, **kwargs)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False, **kwargs)


Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ../data/MNIST/raw/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz to ../data/MNIST/raw/train-images-idx3-ubyte.gz




Extracting ../data/MNIST/raw/train-images-idx3-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz to ../data/MNIST/raw/train-labels-idx1-ubyte.gz




Extracting ../data/MNIST/raw/train-labels-idx1-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw/t10k-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw/t10k-images-idx3-ubyte.gz




Extracting ../data/MNIST/raw/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz




Extracting ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw

Processing...
Done!




# Training & Testing Functions
 Creating Training and Testing functions.
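
The model ends in `F.log_softmax` and both functions score it with `F.nll_loss`; together these amount to cross-entropy. The sketch below shows that relationship for a single sample in plain Python. The helper functions and the three-class `logits` are hypothetical and only mimic the per-sample arithmetic, not the PyTorch API.

```python
import math

def log_softmax(logits):
    """Log of the softmax probabilities, computed stably."""
    m = max(logits)   # subtract the max so exp() cannot overflow
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class for one sample."""
    return -log_probs[target]

logits = [2.0, 0.5, -1.0]   # hypothetical raw scores for a 3-class example
log_probs = log_softmax(logits)
loss = nll_loss(log_probs, target=0)
prediction = max(range(len(log_probs)), key=log_probs.__getitem__)
print(loss, prediction)
```

The predicted class is the index of the largest log-probability, which is what `output.argmax(dim=1)` computes for a whole batch.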

In [9]:
from tqdm import tqdm
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    epoch_loss=0
    correct = 0
    pbar = tqdm(train_loader)
    for batch_idx, (data, target) in enumerate(pbar):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        epoch_loss += loss.item()
        loss.backward()
        optimizer.step()

        pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
        correct += pred.eq(target.view_as(pred)).sum().item()

        pbar.set_description(desc= f'epoch={epoch} Loss={loss.item()} batch_id={batch_idx:05d}')


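    # Note: epoch_loss sums per-batch *mean* losses, so dividing by the number of
    # samples gives a value roughly batch_size times smaller than the per-sample
    # average that test() reports; divide by len(train_loader) for a comparable figure.
    train_loss = epoch_loss / len(train_loader.dataset)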

    print('Train set: Average loss: {:.4f}, Accuracy: {}/{} ({:.2f}%)\n'.format(
        train_loss, correct, len(train_loader.dataset),
        100. * correct / len(train_loader.dataset)))
    train_acc=100.*correct/len(train_loader.dataset)
    return train_loss,train_acc


def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('Test set: Average loss: {:.4f}, Accuracy: {}/{} ({:.2f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
    test_acc=100. * correct / len(test_loader.dataset)
    return test_loss,test_acc
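
The two functions average their losses differently: `train` sums per-batch mean losses and divides by the dataset size, while `test` sums per-sample losses (`reduction='sum'`) before dividing. The sketch below illustrates the resulting scale gap with a made-up constant batch loss of 0.3; the variable names are illustrative.

```python
import math

dataset_size, batch_size = 60000, 128
num_batches = math.ceil(dataset_size / batch_size)   # 469 batches per epoch

# Hypothetical epoch in which every batch has a mean loss of 0.3
batch_mean_losses = [0.3] * num_batches
epoch_loss = sum(batch_mean_losses)

as_printed_by_train = epoch_loss / dataset_size   # sum of batch means / number of samples
mean_of_batch_means = epoch_loss / num_batches    # comparable to a per-sample average

print(num_batches)
print(round(as_printed_by_train, 6))
print(round(mean_of_batch_means, 6))
```

This is why the training losses in the log are about two orders of magnitude smaller than the test losses even though the accuracies are close.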

# Train & Test our Model
 Let's train and test our model
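
The training cell uses `optim.SGD` with `lr=0.01` and `momentum=0.9`. As a rough illustration of what one update does, here is a scalar sketch of the momentum rule, assuming no dampening, weight decay, or Nesterov term (the defaults). The function name and the toy objective `f(p) = p**2` are hypothetical.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update for a single scalar parameter."""
    velocity = momentum * velocity + grad   # running accumulation of gradients
    param = param - lr * velocity           # step along the accumulated direction
    return param, velocity

# Toy objective f(p) = p**2, whose gradient is 2 * p
p, v = 1.0, 0.0
for _ in range(3):
    grad = 2.0 * p
    p, v = sgd_momentum_step(p, grad, v)
print(p)
```

The velocity term lets consecutive gradients that point the same way build up speed, so each step moves further than plain SGD would with the same learning rate.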

In [10]:
model =  Net().to(device)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
EPOCHS = 15
for epoch in range(EPOCHS):
    print("EPOCH:", epoch)
    train(model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)

EPOCH: 0
epoch=0 Loss=0.3383824825286865 batch_id=00468: 100%|██████████| 469/469 [00:12<00:00, 38.56it/s]
Train set: Average loss: 0.0068, Accuracy: 49804/60000 (83.01%)
Test set: Average loss: 0.2673, Accuracy: 9531/10000 (95.31%)

EPOCH: 1
epoch=1 Loss=0.19722819328308105 batch_id=00468: 100%|██████████| 469/469 [00:12<00:00, 38.45it/s]
Train set: Average loss: 0.0020, Accuracy: 57087/60000 (95.14%)
Test set: Average loss: 0.1590, Accuracy: 9673/10000 (96.73%)

EPOCH: 2
epoch=2 Loss=0.1373169720172882 batch_id=00468: 100%|██████████| 469/469 [00:12<00:00, 38.31it/s]
Train set: Average loss: 0.0014, Accuracy: 57727/60000 (96.21%)
Test set: Average loss: 0.1091, Accuracy: 9723/10000 (97.23%)

EPOCH: 3
epoch=3 Loss=0.11332836747169495 batch_id=00468: 100%|██████████| 469/469 [00:12<00:00, 38.75it/s]
Train set: Average loss: 0.0011, Accuracy: 58042/60000 (96.74%)
Test set: Average loss: 0.0858, Accuracy: 9802/10000 (98.02%)

EPOCH: 4
epoch=4 Loss=0.08680078387260437 batch_id=00468: 100%|██████████| 469/469 [00:11<00:00, 39.66it/s]
Train set: Average loss: 0.0010, Accuracy: 58205/60000 (97.01%)
Test set: Average loss: 0.0728, Accuracy: 9834/10000 (98.34%)

EPOCH: 5
epoch=5 Loss=0.05757491663098335 batch_id=00468: 100%|██████████| 469/469 [00:12<00:00, 39.04it/s]
Train set: Average loss: 0.0009, Accuracy: 58360/60000 (97.27%)
Test set: Average loss: 0.0701, Accuracy: 9819/10000 (98.19%)

EPOCH: 6
epoch=6 Loss=0.2014341950416565 batch_id=00468: 100%|██████████| 469/469 [00:12<00:00, 38.71it/s]
Train set: Average loss: 0.0008, Accuracy: 58395/60000 (97.33%)
Test set: Average loss: 0.0629, Accuracy: 9833/10000 (98.33%)

EPOCH: 7
epoch=7 Loss=0.10797689110040665 batch_id=00468: 100%|██████████| 469/469 [00:12<00:00, 38.20it/s]
Train set: Average loss: 0.0008, Accuracy: 58500/60000 (97.50%)
Test set: Average loss: 0.0530, Accuracy: 9862/10000 (98.62%)

EPOCH: 8
epoch=8 Loss=0.09269193559885025 batch_id=00468: 100%|██████████| 469/469 [00:12<00:00, 37.39it/s]
Train set: Average loss: 0.0007, Accuracy: 58611/60000 (97.69%)
Test set: Average loss: 0.0634, Accuracy: 9818/10000 (98.18%)

EPOCH: 9
epoch=9 Loss=0.09947828203439713 batch_id=00468: 100%|██████████| 469/469 [00:13<00:00, 35.59it/s]
Train set: Average loss: 0.0007, Accuracy: 58585/60000 (97.64%)
Test set: Average loss: 0.0541, Accuracy: 9844/10000 (98.44%)

EPOCH: 10
epoch=10 Loss=0.07908926159143448 batch_id=00468: 100%|██████████| 469/469 [00:13<00:00, 35.12it/s]
Train set: Average loss: 0.0007, Accuracy: 58649/60000 (97.75%)
Test set: Average loss: 0.0457, Accuracy: 9878/10000 (98.78%)

EPOCH: 11
epoch=11 Loss=0.0856403112411499 batch_id=00468: 100%|██████████| 469/469 [00:12<00:00, 36.54it/s]
Train set: Average loss: 0.0007, Accuracy: 58658/60000 (97.76%)
Test set: Average loss: 0.0487, Accuracy: 9877/10000 (98.77%)

EPOCH: 12
epoch=12 Loss=0.0876322016119957 batch_id=00468: 100%|██████████| 469/469 [00:12<00:00, 37.21it/s]
Train set: Average loss: 0.0006, Accuracy: 58742/60000 (97.90%)
Test set: Average loss: 0.0462, Accuracy: 9870/10000 (98.70%)

EPOCH: 13
epoch=13 Loss=0.11492786556482315 batch_id=00468: 100%|██████████| 469/469 [00:12<00:00, 38.50it/s]
Train set: Average loss: 0.0006, Accuracy: 58757/60000 (97.93%)
Test set: Average loss: 0.0465, Accuracy: 9869/10000 (98.69%)

EPOCH: 14
epoch=14 Loss=0.12077370285987854 batch_id=00468: 100%|██████████| 469/469 [00:12<00:00, 38.43it/s]
Train set: Average loss: 0.0006, Accuracy: 58725/60000 (97.88%)
Test set: Average loss: 0.0453, Accuracy: 9877/10000 (98.77%)
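
`train` and `test` already return `(loss, accuracy)`, but the training loop discards those values. If they were collected into lists, the best epoch could be picked out directly. The sketch below does this with the accuracies copied from the log above; `best_epoch` is an illustrative helper, not part of the notebook.

```python
# Per-epoch accuracies (%) copied from the training log, epochs 0 to 14
train_acc = [83.01, 95.14, 96.21, 96.74, 97.01, 97.27, 97.33, 97.50,
             97.69, 97.64, 97.75, 97.76, 97.90, 97.93, 97.88]
test_acc = [95.31, 96.73, 97.23, 98.02, 98.34, 98.19, 98.33, 98.62,
            98.18, 98.44, 98.78, 98.77, 98.70, 98.69, 98.77]

def best_epoch(history):
    """Return (epoch_index, value) of the highest entry in a per-epoch history."""
    epoch = max(range(len(history)), key=history.__getitem__)
    return epoch, history[epoch]

print("best train:", best_epoch(train_acc))   # (13, 97.93)
print("best test: ", best_epoch(test_acc))    # (10, 98.78)
```

Test accuracy stays above train accuracy throughout, which is consistent with dropout being active only in training mode.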

