<a href="https://colab.research.google.com/github/gremlin97/EVA-8/blob/main/S5/Eva3_Step2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Target:**

Added an LR scheduler with gamma=0.1 and step=6, and reduced the channel size by using fewer kernels in each convolution block, which pushed the parameter count below 10k. Increased the learning rate to speed up learning within the 15-epoch budget and to offset the regularization. Removed padding=1 from the first two convolutions to shrink the feature maps faster. Added random rotation of 7 degrees to the images.
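To make the effect of the step decay concrete, here is a minimal sketch of the learning rate per epoch, using the closed form `base_lr * gamma ** (epoch // step_size)`. The base rate of 0.02 is an assumption taken from the `optim.SGD` call in the training cell, and `step_lr` is a hypothetical helper, not a PyTorch function. Note that the training cell further down passes `step_size=10` rather than 6, so the schedule actually used depends on which value was active for a given run.

```python
def step_lr(base_lr, gamma, step_size, epoch):
    """Learning rate in effect after `epoch` completed epochs (0-indexed)."""
    return base_lr * gamma ** (epoch // step_size)

base_lr = 0.02  # assumption: lr passed to optim.SGD in the training cell

# Values from the Target text: gamma=0.1, step=6, over 15 epochs
schedule = [step_lr(base_lr, gamma=0.1, step_size=6, epoch=e) for e in range(15)]

for e, lr in enumerate(schedule, start=1):
    print(f"epoch {e:2d}: lr = {lr:.5f}")
```

Under these assumptions the rate stays at 0.02 for epochs 1 to 6, drops by a factor of 10 for epochs 7 to 12, and drops again for epochs 13 to 15.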

**Results:**

* Parameters: 9,866
* Best Train Accuracy: 98.68%
* Best Test Accuracy: 99.23%

**Analysis:**

I was able to bring the model below 10k parameters by reducing the number of filters and keeping the number of output channels at 16 after each convolution (except the last one, which outputs 32). The train accuracy was lower than the test accuracy by about 0.5 percentage points (98.68 vs. 99.23), which indicates that the model is not overfitting and can still learn more and reach a higher accuracy. Learning has become harder due to the multiple forms of regularization (dropout, random rotations), so I need to find a way to increase the model's learning.
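The parameter budget can be checked by hand. The sketch below counts learnable parameters from the layer shapes used in the `Net` class defined further down; the helper functions are hypothetical names introduced only for this illustration.

```python
def conv_params(c_in, c_out, k):
    # k*k*c_in weights per output channel, plus one bias each
    return (k * k * c_in + 1) * c_out

def bn_params(channels):
    # learnable scale (weight) and shift (bias) per channel
    return 2 * channels

def linear_params(n_in, n_out):
    # weight matrix plus one bias per output
    return (n_in + 1) * n_out

layers = {
    "conv1": conv_params(1, 16, 3),
    "bn1": bn_params(16),
    "conv3": conv_params(16, 16, 3),
    "bn2": bn_params(16),
    "conv5": conv_params(16, 16, 3),
    "bn3": bn_params(16),
    "conv6": conv_params(16, 32, 3),
    "lin": linear_params(32, 10),
}

total = sum(layers.values())
for name, count in layers.items():
    print(f"{name:6s} {count:5d}")
print("total ", total)
```

The total comes to 9,866, matching the figure in the results. Almost half of it (4,640) sits in the final 16-to-32 convolution, so that layer is the first place to look if the budget needs to shrink further.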

In [1]:
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

In [44]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, 3, padding=0) # RF:1+(3-1)1=3; ji=1,jo=1; 28x28x1 -> 26x26x16
        self.bn1 = nn.BatchNorm2d(16)
        self.pool1 = nn.MaxPool2d(2, 2) # RF:3+(2-1)1=4; ji=1,jo=2; 26x26x16 -> 13x13x16
        self.drop1 = nn.Dropout(0.1)
        self.conv3 = nn.Conv2d(16, 16, 3, padding=0) # RF:4+(3-1)2=8; ji=2,jo=2; 13x13x16 -> 11x11x16
        self.bn2 = nn.BatchNorm2d(16)
        self.pool2 = nn.MaxPool2d(2, 2) # RF:8+(2-1)2=10; ji=2,jo=4; 11x11x16 -> 5x5x16
        self.drop2 = nn.Dropout(0.1)
        self.conv5 = nn.Conv2d(16, 16, 3, padding=1) # RF:10+(3-1)4=18; ji=4,jo=4; 5x5x16 -> 5x5x16
        self.bn3 = nn.BatchNorm2d(16) 
        self.pool3 = nn.MaxPool2d(2, 2) # RF:18+(2-1)4=22; ji=4,jo=8; 5x5x16 -> 2x2x16
        self.drop3 = nn.Dropout(0.1)
        self.conv6 = nn.Conv2d(16, 32, 3, padding=1) # RF:22+(3-1)8=38; ji=8,jo=8; 2x2x16 -> 2x2x32

        self.gap = nn.AdaptiveAvgPool2d((1,1)) 

        self.lin = nn.Linear(32, 10)

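    def forward(self, x):
        # Block 1: conv -> ReLU -> BN -> pool -> dropout
        x = self.drop1(self.pool1(self.bn1(F.relu(self.conv1(x)))))
        # Extra ReLU on the dropout output, kept from the original chain
        x = F.relu(x)
        # Block 2: conv -> ReLU -> BN -> pool -> dropout
        x = self.drop2(self.pool2(self.bn2(F.relu(self.conv3(x)))))
        # Block 3: conv -> ReLU -> BN -> pool -> dropout
        x = self.drop3(self.pool3(self.bn3(F.relu(self.conv5(x)))))
        # Final conv has no activation or BN before global average pooling
        x = self.conv6(x)
        x = self.gap(x)      # 2x2x32 -> 1x1x32
        x = x.view(-1, 32)   # flatten to (batch, 32)
        x = self.lin(x)
        # dim=1 is explicit to avoid the implicit-dimension deprecation warning
        return F.log_softmax(x, dim=1)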
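The receptive-field comments in `__init__` follow the usual recurrence: `RF_out = RF_in + (k - 1) * j_in` and `j_out = j_in * stride`, where `j` is the jump (`ji`/`jo` in the comments). As a sketch, the same bookkeeping can be automated. The layer list below is transcribed from the class above, and `trace` is a hypothetical helper.

```python
# (name, kernel, stride, padding) for each conv/pool layer, in forward order
layers = [
    ("conv1", 3, 1, 0),
    ("pool1", 2, 2, 0),
    ("conv3", 3, 1, 0),
    ("pool2", 2, 2, 0),
    ("conv5", 3, 1, 1),
    ("pool3", 2, 2, 0),
    ("conv6", 3, 1, 1),
]

def trace(layers, size=28):
    """Return (name, output size, receptive field, output jump) per layer."""
    rf, jump = 1, 1
    rows = []
    for name, k, s, p in layers:
        size = (size + 2 * p - k) // s + 1  # floor mode, as in MaxPool2d default
        rf = rf + (k - 1) * jump            # uses the jump of the layer's input
        jump = jump * s
        rows.append((name, size, rf, jump))
    return rows

rows = trace(layers)
for name, size, rf, jump in rows:
    print(f"{name:6s} out={size:2d}x{size:<2d} RF={rf:2d} jump={jump}")
```

This reproduces the values in the comments: receptive fields of 3, 4, 8, 10, 18, 22, and 38, with feature maps shrinking from 28x28 to 2x2. The final receptive field of 38 exceeds the 28x28 input, so the last layer already sees the whole image.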

In [45]:
model = Net()
out = model(torch.randn(1,1,28,28))
print(out.shape)

torch.Size([1, 10])



In [46]:
model

Net(
  (conv1): Conv2d(1, 16, kernel_size=(3, 3), stride=(1, 1))
  (bn1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (drop1): Dropout(p=0.1, inplace=False)
  (conv3): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1))
  (bn2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (drop2): Dropout(p=0.1, inplace=False)
  (conv5): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (bn3): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (pool3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (drop3): Dropout(p=0.1, inplace=False)
  (conv6): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (gap): AdaptiveAvgPool2d(output_size=(1, 1))
  (lin): Linear(in_featur

In [47]:
!pip install torchsummary
from torchsummary import summary
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
model = Net().to(device)
summary(model, input_size=(1, 28, 28))

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 16, 26, 26]             160
       BatchNorm2d-2           [-1, 16, 26, 26]              32
         MaxPool2d-3           [-1, 16, 13, 13]               0
           Dropout-4           [-1, 16, 13, 13]               0
            Conv2d-5           [-1, 16, 11, 11]           2,320
       BatchNorm2d-6           [-1, 16, 11, 11]              32
         MaxPool2d-7             [-1, 16, 5, 5]               0
           Dropout-8             [-1, 16, 5, 5]               0
            Conv2d-9             [-1, 16, 5, 5]           2,320
      BatchNorm2d-10             [-1, 16, 5, 5]              32
        MaxPool2d-11             [-1, 16, 2, 2]               0
          Dropout-12             [-1, 16, 2, 2]               0
    


In [48]:
torch.manual_seed(1)
batch_size = 32

kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,
                    transform=transforms.Compose([
                        transforms.RandomRotation((-5.0, 5.0), fill=(1,)),
                        transforms.ToTensor(),
                        transforms.Normalize((0.1307,), (0.3081,))
                    ])),
    batch_size=batch_size, shuffle=True, **kwargs)
test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=False, transform=transforms.Compose([
                        transforms.RandomRotation((-7.0, 7.0), fill=(1,)),
                        transforms.ToTensor(),
                        transforms.Normalize((0.1307,), (0.3081,))
                    ])),
    batch_size=batch_size, shuffle=True, **kwargs)
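Two numbers follow directly from this loader configuration: the number of batches per epoch and the value range after `Normalize`. The sketch below works them out in plain Python. The dataset sizes (60,000 train and 10,000 test images) are the standard MNIST split and are stated here as an assumption, and `normalize` is a hypothetical helper mirroring `(x - mean) / std`.

```python
import math

train_size, test_size = 60_000, 10_000  # assumption: standard MNIST split
batch_size = 32

train_batches = math.ceil(train_size / batch_size)
test_batches = math.ceil(test_size / batch_size)
print("train batches per epoch:", train_batches)
print("test batches per epoch:", test_batches)

mean, std = 0.1307, 0.3081  # values passed to transforms.Normalize

def normalize(pixel):
    """Apply the same affine map as transforms.Normalize to one pixel in [0, 1]."""
    return (pixel - mean) / std

print("black pixel (0.0) ->", round(normalize(0.0), 4))
print("white pixel (1.0) ->", round(normalize(1.0), 4))
```

The train batch count agrees with the `1875/1875` shown in the progress bars of the training log.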


In [49]:
from tqdm import tqdm
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    correct = 0
    pbar = tqdm(train_loader)
    for batch_idx, (data, target) in enumerate(pbar):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
        correct += pred.eq(target.view_as(pred)).sum().item()
        pbar.set_description(desc= f'loss={loss.item()} batch_id={batch_idx} Train Accuracy={100. * correct / len(train_loader.dataset)}')

def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Average loss: {:.4f}, Test Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
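One detail of `train` is worth spelling out: the progress bar divides the running `correct` count by `len(train_loader.dataset)`, so the displayed value climbs from near zero during the epoch and only equals the epoch accuracy on the last batch. The sketch below contrasts that with dividing by the samples seen so far. The numbers are hypothetical and chosen only to illustrate the difference.

```python
def accuracy_pct(correct, total):
    return 100.0 * correct / total

dataset_size = 60_000
batch_size = 32

# Hypothetical state after 100 batches with every prediction correct
batches_done = 100
seen_so_far = batches_done * batch_size
correct_so_far = seen_so_far

shown_in_progress_bar = accuracy_pct(correct_so_far, dataset_size)
running_accuracy = accuracy_pct(correct_so_far, seen_so_far)

print(f"divided by dataset size: {shown_in_progress_bar:.2f}%")
print(f"divided by samples seen: {running_accuracy:.2f}%")
```

Either denominator gives the same number at the end of an epoch, so the final per-epoch train accuracy in the logs is unaffected. Only the mid-epoch readout differs.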

In [51]:
model = Net().to(device)
optimizer = optim.SGD(model.parameters(), lr=0.02, momentum=0.9)
scheduler =  optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

#lr=0.01

for epoch in range(15):
    print("Epoch: ",epoch+1)
    train(model, device, train_loader, optimizer, epoch)
    scheduler.step()
    test(model, device, test_loader)

Epoch:  1


loss=0.17063243687152863 batch_id=1874 Train Accuracy=92.93166666666667: 100%|██████████| 1875/1875 [00:30<00:00, 60.75it/s]



Test set: Average loss: 0.0894, Test Accuracy: 9712/10000 (97%)

Epoch:  2


loss=0.03233985975384712 batch_id=1874 Train Accuracy=96.065: 100%|██████████| 1875/1875 [00:30<00:00, 61.40it/s]



Test set: Average loss: 0.0744, Test Accuracy: 9777/10000 (98%)

Epoch:  3


loss=0.5031666159629822 batch_id=1874 Train Accuracy=96.65: 100%|██████████| 1875/1875 [00:30<00:00, 60.69it/s]



Test set: Average loss: 0.0731, Test Accuracy: 9773/10000 (98%)

Epoch:  4


loss=0.1628304421901703 batch_id=1874 Train Accuracy=96.945: 100%|██████████| 1875/1875 [00:30<00:00, 61.13it/s]



Test set: Average loss: 0.0544, Test Accuracy: 9836/10000 (98%)

Epoch:  5


loss=0.06440648436546326 batch_id=1874 Train Accuracy=97.0: 100%|██████████| 1875/1875 [00:30<00:00, 60.85it/s]



Test set: Average loss: 0.0684, Test Accuracy: 9784/10000 (98%)

Epoch:  6


loss=0.14367364346981049 batch_id=1874 Train Accuracy=97.16166666666666: 100%|██████████| 1875/1875 [00:30<00:00, 61.28it/s]



Test set: Average loss: 0.0623, Test Accuracy: 9825/10000 (98%)

Epoch:  7


loss=0.04679401218891144 batch_id=1874 Train Accuracy=98.38333333333334: 100%|██████████| 1875/1875 [00:30<00:00, 61.35it/s]



Test set: Average loss: 0.0343, Test Accuracy: 9885/10000 (99%)

Epoch:  8


loss=0.02947418764233589 batch_id=1874 Train Accuracy=98.525: 100%|██████████| 1875/1875 [00:30<00:00, 60.76it/s]



Test set: Average loss: 0.0367, Test Accuracy: 9890/10000 (99%)

Epoch:  9


loss=0.07044631987810135 batch_id=1874 Train Accuracy=98.56: 100%|██████████| 1875/1875 [00:30<00:00, 60.84it/s]



Test set: Average loss: 0.0338, Test Accuracy: 9895/10000 (99%)

Epoch:  10


loss=0.13199523091316223 batch_id=1874 Train Accuracy=98.50666666666666: 100%|██████████| 1875/1875 [00:31<00:00, 60.16it/s]



Test set: Average loss: 0.0333, Test Accuracy: 9892/10000 (99%)

Epoch:  11


loss=0.006351368501782417 batch_id=1874 Train Accuracy=98.50333333333333: 100%|██████████| 1875/1875 [00:30<00:00, 61.18it/s]



Test set: Average loss: 0.0320, Test Accuracy: 9907/10000 (99%)

Epoch:  12


loss=0.0238038320094347 batch_id=1874 Train Accuracy=98.58666666666667: 100%|██████████| 1875/1875 [00:30<00:00, 60.70it/s]



Test set: Average loss: 0.0305, Test Accuracy: 9895/10000 (99%)

Epoch:  13


loss=0.15996260941028595 batch_id=1874 Train Accuracy=98.61166666666666: 100%|██████████| 1875/1875 [00:32<00:00, 57.68it/s]



Test set: Average loss: 0.0313, Test Accuracy: 9892/10000 (99%)

Epoch:  14


loss=0.11246927827596664 batch_id=1874 Train Accuracy=98.67333333333333: 100%|██████████| 1875/1875 [00:32<00:00, 58.25it/s]



Test set: Average loss: 0.0312, Test Accuracy: 9895/10000 (99%)

Epoch:  15


loss=0.004021728876978159 batch_id=1874 Train Accuracy=98.63333333333334: 100%|██████████| 1875/1875 [00:31<00:00, 60.40it/s]



Test set: Average loss: 0.0303, Test Accuracy: 9903/10000 (99%)
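Because `test` prints the percentage with `{:.0f}`, every epoch from 7 onward reads as 99%. The exact values can be recovered from the correct counts in the log above, as in this small sketch (the counts are copied from the 15 "Test set" lines of this run).

```python
# Correct predictions out of 10,000 per epoch, copied from the log above
test_correct = [9712, 9777, 9773, 9836, 9784, 9825, 9885, 9890,
                9895, 9892, 9907, 9895, 9892, 9895, 9903]
test_total = 10_000

test_acc = [100.0 * c / test_total for c in test_correct]

best_index = max(range(len(test_acc)), key=test_acc.__getitem__)
best_epoch = best_index + 1
best_acc = test_acc[best_index]

print(f"best test accuracy in this log: {best_acc:.2f}% at epoch {best_epoch}")
```

For this logged run the peak is 9907/10000 at epoch 11. The 99.23 figure quoted in the results is higher than anything in this log, so it may come from a different run of the notebook.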



In [52]:
Training_Logs = '''



'''