<a href="https://colab.research.google.com/github/rssubramaniyan1/EVA8/blob/main/EVA8_Assignment4_Attempt_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Attempt 1**

**Target:**

> Get the set-up right - Done

> Set Transforms - Done

> Set Data Loader - Done

> Set Basic Working Code - Done

> Set Basic Training & Test Loop - Done

> Start with a network with fewer than 10k parameters - Done (a good starting point)

**Results:** 

> Parameters: 9198

> Best Training Accuracy: 98.9267% (epoch 14)

> Best Test Accuracy: 99.31% (epoch 11)
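The parameter count can be checked by hand. The sketch below assumes the layer shapes of the `Net` class defined later in this notebook (nine `Conv2d` layers with bias, each except the last followed by `BatchNorm2d`); the helper names are illustrative and not part of the notebook.

```python
# (in_channels, out_channels, kernel_size) for every Conv2d, in forward order.
CONV_LAYERS = [
    (1, 8, 3), (8, 10, 3), (10, 12, 3),   # conv block 1
    (12, 16, 3), (16, 16, 3),             # conv block 2
    (16, 8, 1), (8, 10, 3), (10, 12, 3),  # conv block 3
    (12, 10, 3),                          # output conv (no BatchNorm after it)
]


def conv_params(c_in, c_out, k):
    """Weights (c_in * c_out * k * k) plus one bias per output channel."""
    return c_in * c_out * k * k + c_out


def batchnorm_params(channels):
    """One learnable scale and one learnable shift per channel."""
    return 2 * channels


def total_params(layers):
    total = 0
    for i, (c_in, c_out, k) in enumerate(layers):
        total += conv_params(c_in, c_out, k)
        if i < len(layers) - 1:  # every conv except the last is followed by BatchNorm2d
            total += batchnorm_params(c_out)
    return total


print(total_params(CONV_LAYERS))  # 9198
```

ReLU, Dropout and MaxPool2d contribute no learnable parameters, so they are left out of the sum.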

**Analysis:**

> Training accuracy (98.93%) stays below test accuracy (99.31%), so the model shows no sign of overfitting

> The next step changes the model to raise training accuracy, and with it test accuracy, while keeping the parameter count the same or lower
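One likely contributor to training accuracy trailing test accuracy is that the `Dropout(0.1)` layers are active only in training mode; `model.eval()` turns them into the identity. The sketch below imitates that behaviour on a plain Python list. It is an illustration of inverted dropout under these assumptions, not the PyTorch implementation, and the function name and sample values are made up.

```python
import random


def dropout(values, p=0.1, training=True, rng=random):
    """Inverted dropout on a list of floats (illustrative sketch)."""
    if not training or p == 0.0:
        return list(values)           # eval mode: pass everything through
    keep_scale = 1.0 / (1.0 - p)      # rescale survivors so the expected value is unchanged
    return [0.0 if rng.random() < p else v * keep_scale for v in values]


activations = [1.0] * 10_000
rng = random.Random(1)
train_out = dropout(activations, p=0.1, training=True, rng=rng)
eval_out = dropout(activations, p=0.1, training=False)

dropped_fraction = train_out.count(0.0) / len(train_out)
print(f"dropped in train mode: {dropped_fraction:.3f}")
print(f"eval output unchanged: {eval_out == activations}")
```

During training roughly 10% of activations are zeroed in every forward pass, so the network is scored while handicapped; at test time it is scored with all activations intact.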




In [1]:
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torchsummary import summary
from torch.optim import lr_scheduler
from tqdm import tqdm

In [2]:
torch.manual_seed(1)
batch_size = 64
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,
                    transform=transforms.Compose([
                        transforms.ToTensor(),
                        transforms.Normalize((0.1307,), (0.3081,))
                    ])),
    batch_size=batch_size, shuffle=True, **kwargs)
test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=False, transform=transforms.Compose([
                        transforms.ToTensor(),
                        transforms.Normalize((0.1307,), (0.3081,))
                    ])),
    batch_size=batch_size, shuffle=False, **kwargs)  # sample order does not affect test metrics



def train(model, device, train_loader, optimizer, epoch):
    model.train()
    pbar = tqdm(train_loader)
    correct = 0
    processed = 0
    for batch_idx, (data, target) in enumerate(pbar):
        # get samples
        data, target = data.to(device), target.to(device)

        # Init
        optimizer.zero_grad()
        # Predict
        y_pred = model(data)

        # Calculate loss
        loss = F.nll_loss(y_pred, target)
        #train_losses.append(loss)

        # Backpropagation
        loss.backward()
        optimizer.step()

        # Update pbar-tqdm
        pred = y_pred.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
        correct += pred.eq(target.view_as(pred)).sum().item()
        processed += len(data)

        pbar.set_description(desc= f'loss={loss.item()} batch_id={batch_idx} Accuracy={100*correct/processed:0.4f}')

def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.4f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
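`test` sums the per-sample losses with `reduction='sum'` and divides once by the dataset size. A short sketch of why that differs from averaging per-batch means: with 10,000 test samples and a batch size of 64, the last batch holds only 16 samples, and a mean of batch means would over-weight it. The loss values below are hypothetical and chosen only to make the effect visible.

```python
def batches(items, batch_size):
    """Yield consecutive slices of `items`, the last one possibly shorter."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical per-sample losses: 156 full batches of easy samples,
# followed by 16 harder samples that form the short final batch.
losses = [0.02] * 9984 + [0.50] * 16

# What test() does: sum every sample's loss, divide by the dataset size.
sum_then_divide = sum(sum(b) for b in batches(losses, 64)) / len(losses)

# The alternative: average each batch, then average those averages.
batch_means = [sum(b) / len(b) for b in batches(losses, 64)]
mean_of_means = sum(batch_means) / len(batch_means)

print(len(batch_means))             # 157
print(f"{sum_then_divide:.6f}")     # true per-sample mean
print(f"{mean_of_means:.6f}")       # biased toward the short last batch
```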

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ../data/MNIST/raw/train-images-idx3-ubyte.gz

Extracting ../data/MNIST/raw/train-images-idx3-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ../data/MNIST/raw/train-labels-idx1-ubyte.gz

Extracting ../data/MNIST/raw/train-labels-idx1-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw/t10k-images-idx3-ubyte.gz

Extracting ../data/MNIST/raw/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz

Extracting ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw
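Two numbers from the data-loading cell can be sanity-checked without PyTorch: the range that `Normalize((0.1307,), (0.3081,))` maps pixels to, and the number of batches per epoch. This is a small sketch; the function names are illustrative, and the dataset sizes (60,000 training and 10,000 test images) are the standard MNIST split.

```python
import math

MNIST_MEAN, MNIST_STD = 0.1307, 0.3081  # constants passed to transforms.Normalize


def normalize(pixel):
    """Map a ToTensor()-scaled pixel in [0, 1] to (pixel - mean) / std."""
    return (pixel - MNIST_MEAN) / MNIST_STD


def batches_per_epoch(num_samples, batch_size):
    """DataLoader keeps the short final batch by default, hence the ceiling."""
    return math.ceil(num_samples / batch_size)


print(round(normalize(0.0), 4))       # -0.4242 (black pixel)
print(round(normalize(1.0), 4))       # 2.8215  (white pixel)
print(batches_per_epoch(60_000, 64))  # 938
print(batches_per_epoch(10_000, 64))  # 157
```

The 938 training batches match the `938/938` shown by the tqdm progress bar in the training log.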



In [3]:

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()

        #CONV BLOCK 1

        self.conv1 = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), #input size 28x28x1, output size 28x28x8, RF 3x3
            nn.ReLU(),
            nn.BatchNorm2d(8),
            nn.Dropout(0.1),

            nn.Conv2d(8, 10, 3, padding=1),#input size 28x28x8, output size 28x28x10, RF 5x5
            nn.ReLU(),
            nn.BatchNorm2d(10),
            nn.Dropout(0.1),

            nn.Conv2d(10, 12, 3, padding=1),#input size 28x28x10, output size 28x28x12, RF 7x7
            nn.ReLU(),
            nn.BatchNorm2d(12),
            nn.Dropout(0.1)

        )

        # MAX POOL - 1

        self.pool1 = nn.MaxPool2d(2, 2) #input size 28x28x12, output size 14x14x12, RF 8x8

        #CONV BLOCK 2

        self.conv2 = nn.Sequential(
            nn.Conv2d(12, 16, 3, padding=1), #input size 14x14x12, output size 14x14x16, RF 12x12
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Dropout(0.1),

            nn.Conv2d(16, 16, 3, padding=1), #input size 14x14x16, output size 14x14x16, RF 16x16
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Dropout(0.1)
        )

        #MAX POOL - 2

        self.pool2 = nn.MaxPool2d(2, 2)   #input size 14x14x16, output size 7x7x16, RF 18x18

        #CONV BLOCK 3
        self.conv3 = nn.Sequential(
            nn.Conv2d(16, 8, 1, padding=0), #input size 7x7x16, output size 7x7x8, RF 18x18 (1x1 conv does not grow the RF)
            nn.ReLU(),
            nn.BatchNorm2d(8),
            nn.Dropout(0.1),

            nn.Conv2d(8, 10, 3), #input size 7x7x8, output size 5x5x10, RF 26x26
            nn.ReLU(),
            nn.BatchNorm2d(10),
            nn.Dropout(0.1),

            nn.Conv2d(10, 12, 3), #input size 5x5x10, output size 3x3x12, RF 34x34
            nn.ReLU(),
            nn.BatchNorm2d(12),
            nn.Dropout(0.1)
        )

        #OUTPUT BLOCK

        self.conv4 = nn.Sequential(
            nn.Conv2d(12, 10, 3) #input size 3x3x12, output size 1x1x10, RF 42x42
        )

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.pool2(x)
        x = self.conv3(x)
        
        x = self.conv4(x)
        x = x.view(-1, 10)
        return F.log_softmax(x,dim=1)



model = Net().to(device)
summary(model, input_size=(1, 28, 28))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1            [-1, 8, 28, 28]              80
              ReLU-2            [-1, 8, 28, 28]               0
       BatchNorm2d-3            [-1, 8, 28, 28]              16
           Dropout-4            [-1, 8, 28, 28]               0
            Conv2d-5           [-1, 10, 28, 28]             730
              ReLU-6           [-1, 10, 28, 28]               0
       BatchNorm2d-7           [-1, 10, 28, 28]              20
           Dropout-8           [-1, 10, 28, 28]               0
            Conv2d-9           [-1, 12, 28, 28]           1,092
             ReLU-10           [-1, 12, 28, 28]               0
      BatchNorm2d-11           [-1, 12, 28, 28]              24
          Dropout-12           [-1, 12, 28, 28]               0
        MaxPool2d-13           [-1, 12, 14, 14]               0
           Conv2d-14           [-1, 16,
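The spatial size and receptive field (RF) after each layer follow from two recurrences: `size_out = (size_in + 2*padding - kernel) // stride + 1` and `rf_out = rf_in + (kernel - 1) * jump_in`, where `jump` is the product of all strides so far. The sketch below applies them to the kernel, stride and padding values of `Net`; the layer labels are made up for readability and do not appear in the model.

```python
# (label, kernel, stride, padding) in forward order; labels are illustrative only.
LAYERS = [
    ("conv1a", 3, 1, 1), ("conv1b", 3, 1, 1), ("conv1c", 3, 1, 1),
    ("pool1", 2, 2, 0),
    ("conv2a", 3, 1, 1), ("conv2b", 3, 1, 1),
    ("pool2", 2, 2, 0),
    ("conv3a", 1, 1, 0), ("conv3b", 3, 1, 0), ("conv3c", 3, 1, 0),
    ("conv4", 3, 1, 0),
]


def trace(layers, size=28):
    """Return (label, output_size, receptive_field) for every layer."""
    rf, jump = 1, 1
    rows = []
    for label, kernel, stride, padding in layers:
        size = (size + 2 * padding - kernel) // stride + 1
        rf = rf + (kernel - 1) * jump   # growth is scaled by the accumulated stride
        jump = jump * stride
        rows.append((label, size, rf))
    return rows


for label, size, rf in trace(LAYERS):
    print(f"{label:7s} output {size:2d}x{size:<2d}  RF {rf}x{rf}")
```

The final layer reduces the feature map to 1x1 with a receptive field of 42x42, which is larger than the 28x28 input, so every output logit can depend on the whole image.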

In [4]:
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
#scheduler = lr_scheduler.StepLR(optimizer, step_size=6, gamma=0.1)

for epoch in range(1, 15):
    print('Epoch:', epoch)
    train(model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)
    #scheduler.step()
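The scheduler lines are commented out, so every epoch of this run uses `lr=0.01`. For reference, the sketch below shows the learning rates that `StepLR(optimizer, step_size=6, gamma=0.1)` would produce if it were enabled and `scheduler.step()` ran once at the end of each epoch. It is a plain-Python restatement of the step-decay rule under that assumption, not a call into PyTorch.

```python
def step_lr(base_lr, epoch, step_size=6, gamma=0.1):
    """Learning rate in effect during 1-indexed `epoch` under step decay."""
    completed_steps = epoch - 1  # scheduler.step() calls made before this epoch
    return base_lr * gamma ** (completed_steps // step_size)


schedule = {epoch: step_lr(0.01, epoch) for epoch in range(1, 15)}
for epoch, lr in schedule.items():
    print(f"epoch {epoch:2d}: lr = {lr:.6f}")
```

Under this schedule epochs 1 to 6 would train at 0.01, epochs 7 to 12 at 0.001, and epochs 13 and 14 at 0.0001.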


Epoch: 1


loss=0.17531421780586243 batch_id=937 Accuracy=93.9367: 100%|██████████| 938/938 [00:22<00:00, 42.50it/s]



Test set: Average loss: 0.0600, Accuracy: 9810/10000 (98.1000%)

Epoch: 2


loss=0.059755243360996246 batch_id=937 Accuracy=97.7133: 100%|██████████| 938/938 [00:21<00:00, 43.48it/s]



Test set: Average loss: 0.0352, Accuracy: 9888/10000 (98.8800%)

Epoch: 3


loss=0.06495413184165955 batch_id=937 Accuracy=98.0700: 100%|██████████| 938/938 [00:22<00:00, 41.94it/s]



Test set: Average loss: 0.0344, Accuracy: 9887/10000 (98.8700%)

Epoch: 4


loss=0.2087959200143814 batch_id=937 Accuracy=98.2450: 100%|██████████| 938/938 [00:21<00:00, 43.65it/s]



Test set: Average loss: 0.0422, Accuracy: 9867/10000 (98.6700%)

Epoch: 5


loss=0.26374122500419617 batch_id=937 Accuracy=98.4400: 100%|██████████| 938/938 [00:21<00:00, 43.01it/s]



Test set: Average loss: 0.0300, Accuracy: 9895/10000 (98.9500%)

Epoch: 6


loss=0.019635550677776337 batch_id=937 Accuracy=98.5117: 100%|██████████| 938/938 [00:21<00:00, 43.14it/s]



Test set: Average loss: 0.0348, Accuracy: 9881/10000 (98.8100%)

Epoch: 7


loss=0.0021518259309232235 batch_id=937 Accuracy=98.5767: 100%|██████████| 938/938 [00:21<00:00, 42.71it/s]



Test set: Average loss: 0.0242, Accuracy: 9914/10000 (99.1400%)

Epoch: 8


loss=0.006935893092304468 batch_id=937 Accuracy=98.6683: 100%|██████████| 938/938 [00:21<00:00, 42.98it/s]



Test set: Average loss: 0.0267, Accuracy: 9914/10000 (99.1400%)

Epoch: 9


loss=0.011555458419024944 batch_id=937 Accuracy=98.7400: 100%|██████████| 938/938 [00:22<00:00, 41.88it/s]



Test set: Average loss: 0.0265, Accuracy: 9913/10000 (99.1300%)

Epoch: 10


loss=0.0011945656733587384 batch_id=937 Accuracy=98.7783: 100%|██████████| 938/938 [00:21<00:00, 43.05it/s]



Test set: Average loss: 0.0254, Accuracy: 9916/10000 (99.1600%)

Epoch: 11


loss=0.1523539274930954 batch_id=937 Accuracy=98.7767: 100%|██████████| 938/938 [00:21<00:00, 43.15it/s]



Test set: Average loss: 0.0223, Accuracy: 9931/10000 (99.3100%)

Epoch: 12


loss=0.009935079142451286 batch_id=937 Accuracy=98.8817: 100%|██████████| 938/938 [00:21<00:00, 43.04it/s]



Test set: Average loss: 0.0262, Accuracy: 9910/10000 (99.1000%)

Epoch: 13


loss=0.07501953095197678 batch_id=937 Accuracy=98.8717: 100%|██████████| 938/938 [00:21<00:00, 43.21it/s]



Test set: Average loss: 0.0258, Accuracy: 9921/10000 (99.2100%)

Epoch: 14


loss=0.040603213012218475 batch_id=937 Accuracy=98.9267: 100%|██████████| 938/938 [00:22<00:00, 41.65it/s]



Test set: Average loss: 0.0252, Accuracy: 9924/10000 (99.2400%)
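The best-epoch figures can be read off the log programmatically. The snippet below uses the per-epoch accuracies copied from the output above; the variable names are illustrative.

```python
# Per-epoch accuracies (%) copied from the training log, epochs 1 to 14.
train_acc = [93.9367, 97.7133, 98.0700, 98.2450, 98.4400, 98.5117, 98.5767,
             98.6683, 98.7400, 98.7783, 98.7767, 98.8817, 98.8717, 98.9267]
test_acc = [98.10, 98.88, 98.87, 98.67, 98.95, 98.81, 99.14,
            99.14, 99.13, 99.16, 99.31, 99.10, 99.21, 99.24]

best_train_epoch = max(range(len(train_acc)), key=train_acc.__getitem__) + 1
best_test_epoch = max(range(len(test_acc)), key=test_acc.__getitem__) + 1

# Positive gap means test accuracy is ahead of training accuracy.
gaps = [round(te - tr, 4) for tr, te in zip(train_acc, test_acc)]

print(f"best train: {max(train_acc):.4f}% at epoch {best_train_epoch}")
print(f"best test:  {max(test_acc):.2f}% at epoch {best_test_epoch}")
print(f"test ahead of train in every epoch: {all(g > 0 for g in gaps)}")
```

Note that the training figure is a running accuracy accumulated over the whole epoch while the weights are still being updated (and with dropout active), whereas the test figure is measured once with the end-of-epoch weights, so the two are not strictly like-for-like.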

