# EVA 5 - Session 6

## 1. With L1 + BN: 

## 2. With L2 + BN: 

## 3. With L1 and L2 with BN: 

## 4. With GBN: 

## 5. With L1 and L2 with GBN: 

## Results

1. Parameters: 9,608
2. Best Training Accuracy: 99.25%
3. Best Test Accuracy: 99.48%

# Import Libraries

In [1]:
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

## Data Transformations

We start by defining our data transformations. We need to think about what our data is and how we can augment it so that it correctly represents images the model might not otherwise see.


In [2]:
# Train Phase transformations
train_transforms = transforms.Compose([
                                      #  transforms.Resize((28, 28)),
                                      #  transforms.ColorJitter(brightness=0.10, contrast=0.1, saturation=0.10, hue=0.1),
                                       transforms.RandomRotation((-7.0, 7.0), fill=(1,)),
                                       transforms.ToTensor(),
                                       transforms.Normalize((0.1307,), (0.3081,)) # The mean and std have to be sequences (e.g., tuples), therefore you should add a comma after the values. 
                                       # Note the difference between (0.1307) and (0.1307,)
                                       ])

# Test Phase transformations
test_transforms = transforms.Compose([
                                      #  transforms.Resize((28, 28)),
                                      #  transforms.ColorJitter(brightness=0.10, contrast=0.1, saturation=0.10, hue=0.1),
                                       transforms.ToTensor(),
                                       transforms.Normalize((0.1307,), (0.3081,))
                                       ])
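To make the effect of `Normalize` concrete, here is a minimal sketch of the per-pixel arithmetic, assuming the input has already been scaled to [0, 1] by `ToTensor`. The helper name `normalize` is hypothetical and only used for illustration; it also shows why the trailing comma in `(0.1307,)` matters.

```python
MEAN, STD = 0.1307, 0.3081  # the values passed to transforms.Normalize above


def normalize(pixel, mean=MEAN, std=STD):
    """Per-pixel arithmetic of Normalize: subtract the mean, divide by the std."""
    return (pixel - mean) / std


black = normalize(0.0)  # a background pixel after ToTensor
white = normalize(1.0)  # the brightest possible pixel after ToTensor
print(f"black -> {black:.4f}, white -> {white:.4f}")

# Parentheses alone do not create a tuple; the trailing comma does.
print(type((0.1307)).__name__, type((0.1307,)).__name__)
```

After normalization the background sits slightly below zero and bright strokes sit well above it, so the network sees roughly zero-centered inputs.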


# Dataset and Creating Train/Test Split

In [3]:
train = datasets.MNIST('./data', train=True, download=True, transform=train_transforms)
test = datasets.MNIST('./data', train=False, download=True, transform=test_transforms)

# Dataloader Arguments & Test/Train Dataloaders


In [4]:
SEED = 1

# CUDA?
cuda = torch.cuda.is_available()
print("CUDA Available?", cuda)

# For reproducibility
torch.manual_seed(SEED)

if cuda:
    torch.cuda.manual_seed(SEED)

# dataloader arguments - typically you would read these from the command line
dataloader_args = dict(shuffle=True, batch_size=64, num_workers=4, pin_memory=True) if cuda else dict(shuffle=True, batch_size=64)
g_dataloader_args = dict(shuffle=True, batch_size=256, num_workers=4, pin_memory=True) if cuda else dict(shuffle=True, batch_size=256)

# train dataloader
train_loader = torch.utils.data.DataLoader(train, **dataloader_args)
g_train_loader = torch.utils.data.DataLoader(train, **g_dataloader_args)

# test dataloader
test_loader = torch.utils.data.DataLoader(test, **dataloader_args)
g_test_loader = torch.utils.data.DataLoader(test, **g_dataloader_args)

CUDA Available? True
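The two batch sizes determine how many batches each loader yields per epoch, which is the total that tqdm reports in the training logs (938 and 235). A minimal sketch of that arithmetic, assuming the standard MNIST split sizes of 60,000 training and 10,000 test images; `batches_per_epoch` is a hypothetical helper, not part of PyTorch.

```python
import math


def batches_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches a DataLoader yields for one pass over the dataset."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


MNIST_TRAIN, MNIST_TEST = 60_000, 10_000  # assumed standard MNIST split sizes

for name, batch_size in [("train_loader", 64), ("g_train_loader", 256)]:
    print(name, batches_per_epoch(MNIST_TRAIN, batch_size))
```

With the default `drop_last=False`, the last batch of an epoch is smaller than the others whenever the dataset size is not a multiple of the batch size.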


# Data Statistics

It is important to know your data very well. Let's check some statistics of our data and see what it actually looks like.
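The statistics cell itself is not part of this copy, so the following is only a sketch of how such numbers could be computed. It assumes the raw images are available as a `uint8` tensor (for torchvision's MNIST that would be `train.data`); `pixel_stats` is a hypothetical helper, and a synthetic tensor stands in for the dataset so the snippet runs without a download.

```python
import torch


def pixel_stats(images):
    """Mean and std of a uint8 image tensor after scaling it to [0, 1]."""
    scaled = images.float() / 255.0
    return scaled.mean().item(), scaled.std().item()


# Synthetic stand-in data; with the dataset loaded above one could
# pass `train.data` instead (assumption: a uint8 tensor of shape N x 28 x 28).
fake_images = torch.randint(0, 256, (100, 28, 28), dtype=torch.uint8)
mean, std = pixel_stats(fake_images)
print(f"mean={mean:.4f} std={std:.4f}")
```

On the real MNIST training images this kind of computation should give values close to the 0.1307 and 0.3081 used in `Normalize`.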

## MORE

It is important to view as many images as possible. This gives us some idea of which image augmentations to apply later on.

# The model
Let's start with the model we first saw

In [5]:
import torch.nn.functional as F
dropout_value = 0.01
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Input Block
        self.convblock1 = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=8, kernel_size=(3, 3), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(8),
            nn.Dropout(dropout_value)
        ) 

        # CONVOLUTION BLOCK 1
        self.convblock2 = nn.Sequential(
            nn.Conv2d(in_channels=8, out_channels=8, kernel_size=(3, 3), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(8),
            nn.Dropout(dropout_value)
        ) 

        self.convblock3 = nn.Sequential(
            nn.Conv2d(in_channels=8, out_channels=16, kernel_size=(3, 3), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Dropout(dropout_value)
        )

        self.pool1 = nn.MaxPool2d(2, 2) 

        self.convblock4 = nn.Sequential(
            nn.Conv2d(in_channels=16, out_channels=16, kernel_size=(1, 1), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Dropout(dropout_value)
        ) 

        # CONVOLUTION BLOCK 2
        self.convblock5 = nn.Sequential(
            nn.Conv2d(in_channels=16, out_channels=16, kernel_size=(3, 3), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Dropout(dropout_value)
        ) 
        
        self.convblock6 = nn.Sequential(
            nn.Conv2d(in_channels=16, out_channels=16, kernel_size=(3, 3), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Dropout(dropout_value)
        ) 

        self.convblock7 = nn.Sequential(
            nn.Conv2d(in_channels=16, out_channels=16, kernel_size=(3, 3), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Dropout(dropout_value)
        ) 

        self.convblock8 = nn.Sequential(
            nn.Conv2d(in_channels=16, out_channels=16, kernel_size=(1, 1), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Dropout(dropout_value)
        ) 

        # OUTPUT BLOCK
        self.gap = nn.Sequential(
            nn.AvgPool2d(kernel_size=5)
        ) 

        self.convblock9 = nn.Sequential(
            nn.Conv2d(in_channels=16, out_channels=10, kernel_size=(1, 1), padding=0, bias=False),
            # No ReLU after the final layer: its outputs go straight to log_softmax
        ) 

    def forward(self, x):
        x = self.convblock1(x)
        x = self.convblock2(x)
        x = self.convblock3(x)
        x = self.pool1(x)
        x = self.convblock4(x)
        x = self.convblock5(x)
        x = self.convblock6(x)
        x = self.convblock7(x)
        x = self.convblock8(x)
        x = self.gap(x)
        x = self.convblock9(x)

        x = x.view(-1, 10)
        return F.log_softmax(x, dim=-1)
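The `Net` above uses `nn.BatchNorm2d` in every block, and the code in this notebook does not define a separate ghost batch normalization (GBN) layer. For reference, here is a minimal sketch of what such a layer could look like, assuming the simplest formulation: during training the batch is split into smaller "ghost" batches and each one is normalized with its own statistics. The names `GhostBatchNorm2d` and `num_splits` are hypothetical, and sharing one `BatchNorm2d` across chunks is a simplification (its running statistics get updated once per chunk).

```python
import torch
import torch.nn as nn


class GhostBatchNorm2d(nn.Module):
    """Sketch of ghost batch norm: normalize each sub-batch separately in training."""

    def __init__(self, num_features, num_splits=4):
        super().__init__()
        self.num_splits = num_splits
        self.bn = nn.BatchNorm2d(num_features)

    def forward(self, x):
        # In eval mode, behave like ordinary batch norm using the running statistics.
        if not self.training or self.num_splits <= 1:
            return self.bn(x)
        # In training mode, split along the batch dimension and normalize each chunk
        # with that chunk's own mean and variance.
        chunks = torch.chunk(x, self.num_splits, dim=0)
        return torch.cat([self.bn(chunk) for chunk in chunks], dim=0)


gbn = GhostBatchNorm2d(num_features=3, num_splits=2)
gbn.train()
x = torch.randn(8, 3, 5, 5)
y = gbn(x)
print(y.shape)
```

To try it in the model, each `nn.BatchNorm2d(c)` in a block would be swapped for `GhostBatchNorm2d(c, num_splits=...)`; the larger batch size of 256 is what makes splitting into ghost batches meaningful.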

# Model Params
It is hard to overstate how important it is to view the model summary.
Unfortunately, there is no built-in model visualizer, so we rely on an external package.

In [6]:
!pip install torchsummary
from torchsummary import summary
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
print(device)
model = Net().to(device)
summary(model, input_size=(1, 28, 28))

cuda
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1            [-1, 8, 26, 26]              72
              ReLU-2            [-1, 8, 26, 26]               0
       BatchNorm2d-3            [-1, 8, 26, 26]              16
           Dropout-4            [-1, 8, 26, 26]               0
            Conv2d-5            [-1, 8, 24, 24]             576
              ReLU-6            [-1, 8, 24, 24]               0
       BatchNorm2d-7            [-1, 8, 24, 24]              16
           Dropout-8            [-1, 8, 24, 24]               0
            Conv2d-9           [-1, 16, 22, 22]           1,152
             ReLU-10           [-1, 16, 22, 22]               0
      BatchNorm2d-11           [-1, 16, 22, 22]              32
          Dropout-12           [-1, 16, 22, 22]               0
        MaxPool2d-13           [-1, 16, 11, 11]               0
           Conv2d-14           [-1
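The summary output is cut off in this copy, but the output sizes and the parameter total can be cross-checked by hand from the layer definitions. The sketch below redoes that arithmetic in plain Python; the layer table is transcribed from `Net`, while `conv_out` and `summarize` are hypothetical helpers written for this check.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, in_channels, out_channels, kernel, followed_by_batchnorm), from Net above
CONVS = [
    ("convblock1", 1, 8, 3, True),
    ("convblock2", 8, 8, 3, True),
    ("convblock3", 8, 16, 3, True),
    ("convblock4", 16, 16, 1, True),
    ("convblock5", 16, 16, 3, True),
    ("convblock6", 16, 16, 3, True),
    ("convblock7", 16, 16, 3, True),
    ("convblock8", 16, 16, 1, True),
    ("convblock9", 16, 10, 1, False),
]


def summarize(input_size=28):
    size, total, rows = input_size, 0, []
    for name, c_in, c_out, k, has_bn in CONVS:
        if name == "convblock4":
            size //= 2  # pool1: MaxPool2d(2, 2) runs before convblock4
        if name == "convblock9":
            size //= 5  # gap: AvgPool2d(kernel_size=5) runs before convblock9
        size = conv_out(size, k)
        params = c_in * c_out * k * k  # bias=False, so weights only
        if has_bn:
            params += 2 * c_out  # BatchNorm2d scale and shift per channel
        total += params
        rows.append((name, c_out, size, params))
    return rows, total


rows, total = summarize()
for name, channels, size, params in rows:
    print(f"{name}: [{channels}, {size}, {size}] params={params}")
print("total params:", total)
```

The total agrees with the 9,608 parameters reported in the results, and the spatial size shrinks from 28 to 5 before the average pooling reduces it to 1.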

# Training and Testing

Looking at logs can be boring, so we'll introduce a **tqdm** progress bar to get cooler logs.

Let's write train and test functions

In [7]:
from tqdm import tqdm

train_losses = []
test_losses = []
train_acc = []
test_acc = []

def train(model, device, train_loader, optimizer, epoch, iter):
  model.train()
  pbar = tqdm(train_loader)
  correct = 0
  processed = 0
  for batch_idx, (data, target) in enumerate(pbar):
    # get samples
    data, target = data.to(device), target.to(device)

    # Init
    optimizer.zero_grad()
    # PyTorch accumulates gradients across backward passes, so they must be zeroed before each backpropagation step.
    # Zeroing them at the start of every iteration ensures the parameter update uses only the current batch's gradients.

    # Predict
    y_pred = model(data)

    # Calculate loss
    loss = F.nll_loss(y_pred, target)

    # Calculating L1 Loss.
    if(iter == 1 or iter == 3 or iter == 5):
      lambda_l1 = 0.001
      l1 = 0
      for p in model.parameters():
        l1 = l1 + p.abs().sum()
      loss = loss + lambda_l1 * l1
    train_losses.append(loss.item())  # store a plain float so the autograd graph is not kept alive

    # Backpropagation
    loss.backward()
    optimizer.step()

    # Update pbar-tqdm
    
    pred = y_pred.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
    correct += pred.eq(target.view_as(pred)).sum().item()
    processed += len(data)

    pbar.set_description(desc= f'Loss={loss.item()} Batch_id={batch_idx} Accuracy={100*correct/processed:0.2f}')
    train_acc.append(100*correct/processed)

def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)
    test_losses.append(test_loss)

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.2f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
    
    test_acc.append(100. * correct / len(test_loader.dataset))
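The two regularizers enter training differently: the L1 term is added to the loss inside `train`, while L2 is applied through the `weight_decay` argument that this notebook passes to `optim.SGD`. A minimal sketch of both pieces of arithmetic on plain Python numbers follows; `l1_penalty` and `sgd_step` are hypothetical helpers, and momentum is left out to keep the update readable.

```python
def l1_penalty(weights, lambda_l1=0.001):
    """lambda * sum(|w|): the term added to the loss for the L1 runs."""
    return lambda_l1 * sum(abs(w) for w in weights)


def sgd_step(w, grad, lr=0.01, weight_decay=0.0):
    """One plain SGD update; weight decay adds weight_decay * w to the gradient."""
    return w - lr * (grad + weight_decay * w)


weights = [0.5, -0.25, 0.0, 2.0]
print("L1 penalty:", l1_penalty(weights))

# With a zero data gradient, weight decay alone shrinks the weight toward zero.
print("after one step:", sgd_step(1.0, grad=0.0, weight_decay=0.01))
```

Because the L1 term is part of `loss`, the loss values printed for L1 runs include the penalty and are not directly comparable with those of runs without it.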

# Let's Train and test our model

In [8]:
from torch.optim.lr_scheduler import StepLR

models = {}

for iter in range(1, 6): 
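  # L2 regularization, applied through the optimizer's weight_decay
  if(iter == 2 or iter == 3 or iter == 5):
    wd_val = 0.01
  else:
    wd_val = 0

  # Runs 4 and 5 (labelled GBN) use a smaller weight decay; for run 5 this overrides the 0.01 set above
  if(iter == 4 or iter == 5):
    learn_r = 1e-2  # not used below; the optimizer hard-codes the same value, lr=0.01
    wd_val = 5e-4

  model =  Net().to(device)
  optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=wd_val)
  # StepLR is created here, but scheduler.step() is never called in the epoch loop,
  # so the learning rate stays at 0.01 for all 25 epochs
  scheduler = StepLR(optimizer, step_size=10, gamma=0.1)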
  models[iter] = model

  EPOCHS = 25
  if(iter == 1):
    print("\n\nL1 + BN", end="\n\n")
  elif(iter == 2):
    print("L2 + BN", end="\n\n")
  elif(iter == 3):
    print("L1 and L2 with BN", end="\n\n")
  elif(iter == 4):
    print("GBN", end="\n\n")
  else:
    print("L1 and L2 with GBN", end="\n\n")
  for epoch in range(EPOCHS):
      print("EPOCH:", epoch)
      if(iter == 4 or iter == 5):
        train(model, device, g_train_loader, optimizer, epoch, iter)
        test(model, device, g_test_loader)
      else:
        train(model, device, train_loader, optimizer, epoch, iter)
        test(model, device, test_loader)

  0%|          | 0/938 [00:00<?, ?it/s]



L1 + BN

EPOCH: 0


Loss=0.5578035116195679 Batch_id=937 Accuracy=92.19: 100%|██████████| 938/938 [00:32<00:00, 28.62it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0878, Accuracy: 9768/10000 (97.68%)

EPOCH: 1


Loss=0.4370407164096832 Batch_id=937 Accuracy=97.09: 100%|██████████| 938/938 [00:32<00:00, 28.48it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0951, Accuracy: 9718/10000 (97.18%)

EPOCH: 2


Loss=0.4838656783103943 Batch_id=937 Accuracy=97.03: 100%|██████████| 938/938 [00:33<00:00, 28.36it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0596, Accuracy: 9818/10000 (98.18%)

EPOCH: 3


Loss=0.36408495903015137 Batch_id=937 Accuracy=97.15: 100%|██████████| 938/938 [00:32<00:00, 29.05it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0728, Accuracy: 9795/10000 (97.95%)

EPOCH: 4


Loss=0.45808809995651245 Batch_id=937 Accuracy=97.16: 100%|██████████| 938/938 [00:32<00:00, 28.93it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0686, Accuracy: 9801/10000 (98.01%)

EPOCH: 5


Loss=0.34309858083724976 Batch_id=937 Accuracy=97.23: 100%|██████████| 938/938 [00:32<00:00, 28.59it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0762, Accuracy: 9782/10000 (97.82%)

EPOCH: 6


Loss=0.27406546473503113 Batch_id=937 Accuracy=97.20: 100%|██████████| 938/938 [00:33<00:00, 28.19it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0630, Accuracy: 9824/10000 (98.24%)

EPOCH: 7


Loss=0.428606241941452 Batch_id=937 Accuracy=97.23: 100%|██████████| 938/938 [00:32<00:00, 28.48it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0812, Accuracy: 9757/10000 (97.57%)

EPOCH: 8


Loss=0.31011295318603516 Batch_id=937 Accuracy=97.20: 100%|██████████| 938/938 [00:33<00:00, 28.36it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0897, Accuracy: 9730/10000 (97.30%)

EPOCH: 9


Loss=0.27022063732147217 Batch_id=937 Accuracy=97.22: 100%|██████████| 938/938 [00:33<00:00, 27.90it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0815, Accuracy: 9757/10000 (97.57%)

EPOCH: 10


Loss=0.3342832624912262 Batch_id=937 Accuracy=97.23: 100%|██████████| 938/938 [00:32<00:00, 28.56it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0877, Accuracy: 9723/10000 (97.23%)

EPOCH: 11


Loss=0.40797364711761475 Batch_id=937 Accuracy=97.11: 100%|██████████| 938/938 [00:33<00:00, 28.28it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0608, Accuracy: 9817/10000 (98.17%)

EPOCH: 12


Loss=0.27329930663108826 Batch_id=937 Accuracy=97.22: 100%|██████████| 938/938 [00:31<00:00, 29.46it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0699, Accuracy: 9807/10000 (98.07%)

EPOCH: 13


Loss=0.3211140036582947 Batch_id=937 Accuracy=97.21: 100%|██████████| 938/938 [00:31<00:00, 29.64it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0961, Accuracy: 9722/10000 (97.22%)

EPOCH: 14


Loss=0.5632006525993347 Batch_id=937 Accuracy=97.19: 100%|██████████| 938/938 [00:31<00:00, 30.11it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1300, Accuracy: 9587/10000 (95.87%)

EPOCH: 15


Loss=0.3427913188934326 Batch_id=937 Accuracy=97.29: 100%|██████████| 938/938 [00:32<00:00, 29.22it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1559, Accuracy: 9554/10000 (95.54%)

EPOCH: 16


Loss=0.23742233216762543 Batch_id=937 Accuracy=97.23: 100%|██████████| 938/938 [00:32<00:00, 28.60it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0814, Accuracy: 9742/10000 (97.42%)

EPOCH: 17


Loss=0.3953821659088135 Batch_id=937 Accuracy=97.28: 100%|██████████| 938/938 [00:32<00:00, 28.83it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1267, Accuracy: 9588/10000 (95.88%)

EPOCH: 18


Loss=0.23908156156539917 Batch_id=937 Accuracy=97.34: 100%|██████████| 938/938 [00:31<00:00, 29.39it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0702, Accuracy: 9797/10000 (97.97%)

EPOCH: 19


Loss=0.25763440132141113 Batch_id=937 Accuracy=97.35: 100%|██████████| 938/938 [00:32<00:00, 28.88it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0857, Accuracy: 9723/10000 (97.23%)

EPOCH: 20


Loss=0.24203626811504364 Batch_id=937 Accuracy=97.28: 100%|██████████| 938/938 [00:32<00:00, 28.93it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0774, Accuracy: 9752/10000 (97.52%)

EPOCH: 21


Loss=0.3365561366081238 Batch_id=937 Accuracy=97.26: 100%|██████████| 938/938 [00:32<00:00, 29.06it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0622, Accuracy: 9804/10000 (98.04%)

EPOCH: 22


Loss=0.4518793821334839 Batch_id=937 Accuracy=97.23: 100%|██████████| 938/938 [00:31<00:00, 29.39it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0882, Accuracy: 9733/10000 (97.33%)

EPOCH: 23


Loss=0.29311978816986084 Batch_id=937 Accuracy=97.30: 100%|██████████| 938/938 [00:32<00:00, 28.63it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0711, Accuracy: 9774/10000 (97.74%)

EPOCH: 24


Loss=0.26778313517570496 Batch_id=937 Accuracy=97.31: 100%|██████████| 938/938 [00:31<00:00, 29.50it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0909, Accuracy: 9726/10000 (97.26%)

L2 + BN

EPOCH: 0


Loss=0.07399304211139679 Batch_id=937 Accuracy=93.14: 100%|██████████| 938/938 [00:27<00:00, 33.81it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0921, Accuracy: 9817/10000 (98.17%)

EPOCH: 1


Loss=0.16186930239200592 Batch_id=937 Accuracy=97.46: 100%|██████████| 938/938 [00:28<00:00, 32.98it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1042, Accuracy: 9789/10000 (97.89%)

EPOCH: 2


Loss=0.09638228267431259 Batch_id=937 Accuracy=97.68: 100%|██████████| 938/938 [00:28<00:00, 33.00it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1183, Accuracy: 9755/10000 (97.55%)

EPOCH: 3


Loss=0.0889323502779007 Batch_id=937 Accuracy=97.64: 100%|██████████| 938/938 [00:28<00:00, 32.40it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1645, Accuracy: 9635/10000 (96.35%)

EPOCH: 4


Loss=0.14495046436786652 Batch_id=937 Accuracy=97.75: 100%|██████████| 938/938 [00:27<00:00, 34.21it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0824, Accuracy: 9846/10000 (98.46%)

EPOCH: 5


Loss=0.07803767174482346 Batch_id=937 Accuracy=97.73: 100%|██████████| 938/938 [00:27<00:00, 33.75it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0900, Accuracy: 9832/10000 (98.32%)

EPOCH: 6


Loss=0.11087585240602493 Batch_id=937 Accuracy=97.85: 100%|██████████| 938/938 [00:28<00:00, 33.33it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1048, Accuracy: 9767/10000 (97.67%)

EPOCH: 7


Loss=0.03924611210823059 Batch_id=937 Accuracy=97.75: 100%|██████████| 938/938 [00:28<00:00, 33.28it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1117, Accuracy: 9755/10000 (97.55%)

EPOCH: 8


Loss=0.19390077888965607 Batch_id=937 Accuracy=97.81: 100%|██████████| 938/938 [00:27<00:00, 33.64it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1279, Accuracy: 9736/10000 (97.36%)

EPOCH: 9


Loss=0.10441046953201294 Batch_id=937 Accuracy=97.78: 100%|██████████| 938/938 [00:28<00:00, 32.83it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1281, Accuracy: 9725/10000 (97.25%)

EPOCH: 10


Loss=0.05887995660305023 Batch_id=937 Accuracy=97.86: 100%|██████████| 938/938 [00:27<00:00, 33.62it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1297, Accuracy: 9724/10000 (97.24%)

EPOCH: 11


Loss=0.09712930023670197 Batch_id=937 Accuracy=97.82: 100%|██████████| 938/938 [00:28<00:00, 33.12it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0797, Accuracy: 9826/10000 (98.26%)

EPOCH: 12


Loss=0.10346271842718124 Batch_id=937 Accuracy=97.88: 100%|██████████| 938/938 [00:28<00:00, 33.36it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0791, Accuracy: 9850/10000 (98.50%)

EPOCH: 13


Loss=0.2483828067779541 Batch_id=937 Accuracy=97.83: 100%|██████████| 938/938 [00:28<00:00, 32.79it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1048, Accuracy: 9777/10000 (97.77%)

EPOCH: 14


Loss=0.10067548602819443 Batch_id=937 Accuracy=97.92: 100%|██████████| 938/938 [00:28<00:00, 33.23it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1155, Accuracy: 9765/10000 (97.65%)

EPOCH: 15


Loss=0.09859055280685425 Batch_id=937 Accuracy=97.98: 100%|██████████| 938/938 [00:27<00:00, 33.79it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0837, Accuracy: 9872/10000 (98.72%)

EPOCH: 16


Loss=0.08702976256608963 Batch_id=937 Accuracy=97.88: 100%|██████████| 938/938 [00:27<00:00, 33.79it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.2074, Accuracy: 9487/10000 (94.87%)

EPOCH: 17


Loss=0.17856365442276 Batch_id=937 Accuracy=97.75: 100%|██████████| 938/938 [00:27<00:00, 34.04it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1239, Accuracy: 9736/10000 (97.36%)

EPOCH: 18


Loss=0.08406458795070648 Batch_id=937 Accuracy=97.73: 100%|██████████| 938/938 [00:28<00:00, 33.47it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0760, Accuracy: 9880/10000 (98.80%)

EPOCH: 19


Loss=0.1963195949792862 Batch_id=937 Accuracy=97.80: 100%|██████████| 938/938 [00:27<00:00, 33.71it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1042, Accuracy: 9791/10000 (97.91%)

EPOCH: 20


Loss=0.13514460623264313 Batch_id=937 Accuracy=97.78: 100%|██████████| 938/938 [00:27<00:00, 33.97it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0995, Accuracy: 9814/10000 (98.14%)

EPOCH: 21


Loss=0.058552712202072144 Batch_id=937 Accuracy=97.75: 100%|██████████| 938/938 [00:28<00:00, 33.21it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0797, Accuracy: 9846/10000 (98.46%)

EPOCH: 22


Loss=0.2236097902059555 Batch_id=937 Accuracy=97.86: 100%|██████████| 938/938 [00:28<00:00, 33.46it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1011, Accuracy: 9819/10000 (98.19%)

EPOCH: 23


Loss=0.1211937665939331 Batch_id=937 Accuracy=97.81: 100%|██████████| 938/938 [00:28<00:00, 32.90it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.0801, Accuracy: 9877/10000 (98.77%)

EPOCH: 24


Loss=0.1475110501050949 Batch_id=937 Accuracy=97.75: 100%|██████████| 938/938 [00:28<00:00, 32.82it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1238, Accuracy: 9732/10000 (97.32%)

L1 and L2 with BN

EPOCH: 0


Loss=0.39471694827079773 Batch_id=937 Accuracy=91.37: 100%|██████████| 938/938 [00:32<00:00, 28.82it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1612, Accuracy: 9724/10000 (97.24%)

EPOCH: 1


Loss=0.5107165575027466 Batch_id=937 Accuracy=96.41: 100%|██████████| 938/938 [00:32<00:00, 29.09it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1445, Accuracy: 9680/10000 (96.80%)

EPOCH: 2


Loss=0.39556998014450073 Batch_id=937 Accuracy=96.61: 100%|██████████| 938/938 [00:32<00:00, 29.31it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1503, Accuracy: 9729/10000 (97.29%)

EPOCH: 3


Loss=0.4657997488975525 Batch_id=937 Accuracy=96.53: 100%|██████████| 938/938 [00:33<00:00, 28.33it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1464, Accuracy: 9721/10000 (97.21%)

EPOCH: 4


Loss=0.3850330412387848 Batch_id=937 Accuracy=96.78: 100%|██████████| 938/938 [00:32<00:00, 28.51it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1404, Accuracy: 9718/10000 (97.18%)

EPOCH: 5


Loss=0.4406224489212036 Batch_id=937 Accuracy=96.65: 100%|██████████| 938/938 [00:32<00:00, 29.23it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1352, Accuracy: 9737/10000 (97.37%)

EPOCH: 6


Loss=0.34411853551864624 Batch_id=937 Accuracy=96.69: 100%|██████████| 938/938 [00:32<00:00, 28.84it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1438, Accuracy: 9749/10000 (97.49%)

EPOCH: 7


Loss=0.2984218895435333 Batch_id=937 Accuracy=96.58: 100%|██████████| 938/938 [00:32<00:00, 28.95it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1821, Accuracy: 9638/10000 (96.38%)

EPOCH: 8


Loss=0.4962604343891144 Batch_id=937 Accuracy=96.72: 100%|██████████| 938/938 [00:31<00:00, 29.64it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1940, Accuracy: 9536/10000 (95.36%)

EPOCH: 9


Loss=0.3458685278892517 Batch_id=937 Accuracy=96.62: 100%|██████████| 938/938 [00:32<00:00, 29.20it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1990, Accuracy: 9476/10000 (94.76%)

EPOCH: 10


Loss=0.3370135426521301 Batch_id=937 Accuracy=96.68: 100%|██████████| 938/938 [00:32<00:00, 29.06it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1663, Accuracy: 9704/10000 (97.04%)

EPOCH: 11


Loss=0.4302109479904175 Batch_id=937 Accuracy=96.52: 100%|██████████| 938/938 [00:31<00:00, 29.64it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.2591, Accuracy: 9397/10000 (93.97%)

EPOCH: 12


Loss=0.4735722243785858 Batch_id=937 Accuracy=96.58: 100%|██████████| 938/938 [00:32<00:00, 29.09it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1796, Accuracy: 9559/10000 (95.59%)

EPOCH: 13


Loss=0.3213300108909607 Batch_id=937 Accuracy=96.56: 100%|██████████| 938/938 [00:33<00:00, 28.41it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1678, Accuracy: 9646/10000 (96.46%)

EPOCH: 14


Loss=0.3434041142463684 Batch_id=937 Accuracy=96.53: 100%|██████████| 938/938 [00:31<00:00, 29.43it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.4033, Accuracy: 8787/10000 (87.87%)

EPOCH: 15


Loss=0.33322617411613464 Batch_id=937 Accuracy=96.50: 100%|██████████| 938/938 [00:32<00:00, 28.75it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1544, Accuracy: 9671/10000 (96.71%)

EPOCH: 16


Loss=0.4388670325279236 Batch_id=937 Accuracy=96.62: 100%|██████████| 938/938 [00:32<00:00, 29.11it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1678, Accuracy: 9557/10000 (95.57%)

EPOCH: 17


Loss=0.31960922479629517 Batch_id=937 Accuracy=96.56: 100%|██████████| 938/938 [00:32<00:00, 28.81it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1771, Accuracy: 9561/10000 (95.61%)

EPOCH: 18


Loss=0.3333653211593628 Batch_id=937 Accuracy=96.53: 100%|██████████| 938/938 [00:31<00:00, 29.35it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1208, Accuracy: 9793/10000 (97.93%)

EPOCH: 19


Loss=0.3176746964454651 Batch_id=937 Accuracy=96.63: 100%|██████████| 938/938 [00:31<00:00, 29.37it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.2389, Accuracy: 9420/10000 (94.20%)

EPOCH: 20


Loss=0.4686896800994873 Batch_id=937 Accuracy=96.66: 100%|██████████| 938/938 [00:32<00:00, 29.12it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.2103, Accuracy: 9394/10000 (93.94%)

EPOCH: 21


Loss=0.45313990116119385 Batch_id=937 Accuracy=96.66: 100%|██████████| 938/938 [00:32<00:00, 29.02it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1123, Accuracy: 9782/10000 (97.82%)

EPOCH: 22


Loss=0.4030514359474182 Batch_id=937 Accuracy=96.65: 100%|██████████| 938/938 [00:31<00:00, 29.57it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1460, Accuracy: 9738/10000 (97.38%)

EPOCH: 23


Loss=0.25276049971580505 Batch_id=937 Accuracy=96.59: 100%|██████████| 938/938 [00:32<00:00, 28.99it/s]
  0%|          | 0/938 [00:00<?, ?it/s]


Test set: Average loss: 0.1422, Accuracy: 9698/10000 (96.98%)

EPOCH: 24


Loss=0.35228562355041504 Batch_id=937 Accuracy=96.65: 100%|██████████| 938/938 [00:32<00:00, 28.52it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.1325, Accuracy: 9739/10000 (97.39%)

GBN

EPOCH: 0


Loss=0.17654041945934296 Batch_id=234 Accuracy=86.18: 100%|██████████| 235/235 [00:16<00:00, 13.87it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.1277, Accuracy: 9707/10000 (97.07%)

EPOCH: 1


Loss=0.09497051686048508 Batch_id=234 Accuracy=97.63: 100%|██████████| 235/235 [00:16<00:00, 14.39it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0610, Accuracy: 9849/10000 (98.49%)

EPOCH: 2


Loss=0.11442222446203232 Batch_id=234 Accuracy=98.41: 100%|██████████| 235/235 [00:16<00:00, 14.19it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0438, Accuracy: 9878/10000 (98.78%)

EPOCH: 3


Loss=0.030706753954291344 Batch_id=234 Accuracy=98.57: 100%|██████████| 235/235 [00:16<00:00, 14.19it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0459, Accuracy: 9865/10000 (98.65%)

EPOCH: 4


Loss=0.035772696137428284 Batch_id=234 Accuracy=98.78: 100%|██████████| 235/235 [00:16<00:00, 14.24it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0332, Accuracy: 9895/10000 (98.95%)

EPOCH: 5


Loss=0.09490583091974258 Batch_id=234 Accuracy=98.87: 100%|██████████| 235/235 [00:16<00:00, 14.13it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0350, Accuracy: 9899/10000 (98.99%)

EPOCH: 6


Loss=0.03382491692900658 Batch_id=234 Accuracy=98.93: 100%|██████████| 235/235 [00:16<00:00, 14.07it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0274, Accuracy: 9913/10000 (99.13%)

EPOCH: 7


Loss=0.041734885424375534 Batch_id=234 Accuracy=98.95: 100%|██████████| 235/235 [00:15<00:00, 14.69it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0289, Accuracy: 9919/10000 (99.19%)

EPOCH: 8


Loss=0.03748288378119469 Batch_id=234 Accuracy=99.08: 100%|██████████| 235/235 [00:16<00:00, 13.95it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0276, Accuracy: 9923/10000 (99.23%)

EPOCH: 9


Loss=0.015124752186238766 Batch_id=234 Accuracy=99.11: 100%|██████████| 235/235 [00:15<00:00, 14.70it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0270, Accuracy: 9927/10000 (99.27%)

EPOCH: 10


Loss=0.12474878877401352 Batch_id=234 Accuracy=99.09: 100%|██████████| 235/235 [00:17<00:00, 13.76it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0304, Accuracy: 9910/10000 (99.10%)

EPOCH: 11


Loss=0.07537544518709183 Batch_id=234 Accuracy=99.11: 100%|██████████| 235/235 [00:16<00:00, 14.55it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0245, Accuracy: 9922/10000 (99.22%)

EPOCH: 12


Loss=0.012820701114833355 Batch_id=234 Accuracy=99.17: 100%|██████████| 235/235 [00:16<00:00, 14.27it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0240, Accuracy: 9924/10000 (99.24%)

EPOCH: 13


Loss=0.006477653980255127 Batch_id=234 Accuracy=99.19: 100%|██████████| 235/235 [00:16<00:00, 13.99it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0233, Accuracy: 9934/10000 (99.34%)

EPOCH: 14


Loss=0.02071431651711464 Batch_id=234 Accuracy=99.25: 100%|██████████| 235/235 [00:16<00:00, 14.19it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0213, Accuracy: 9938/10000 (99.38%)

EPOCH: 15


Loss=0.08041749149560928 Batch_id=234 Accuracy=99.23: 100%|██████████| 235/235 [00:16<00:00, 14.47it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0253, Accuracy: 9927/10000 (99.27%)

EPOCH: 16


Loss=0.02119281142950058 Batch_id=234 Accuracy=99.23: 100%|██████████| 235/235 [00:16<00:00, 14.19it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0215, Accuracy: 9937/10000 (99.37%)

EPOCH: 17


Loss=0.05372453108429909 Batch_id=234 Accuracy=99.35: 100%|██████████| 235/235 [00:16<00:00, 13.95it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0230, Accuracy: 9927/10000 (99.27%)

EPOCH: 18


Loss=0.0254357922822237 Batch_id=234 Accuracy=99.31: 100%|██████████| 235/235 [00:16<00:00, 14.29it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0237, Accuracy: 9936/10000 (99.36%)

EPOCH: 19


Loss=0.017649056389927864 Batch_id=234 Accuracy=99.32: 100%|██████████| 235/235 [00:16<00:00, 14.55it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0230, Accuracy: 9935/10000 (99.35%)

EPOCH: 20


Loss=0.015163283795118332 Batch_id=234 Accuracy=99.33: 100%|██████████| 235/235 [00:15<00:00, 14.80it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0204, Accuracy: 9948/10000 (99.48%)

EPOCH: 21


Loss=0.008475442416965961 Batch_id=234 Accuracy=99.37: 100%|██████████| 235/235 [00:16<00:00, 14.45it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0211, Accuracy: 9934/10000 (99.34%)

EPOCH: 22


Loss=0.024662449955940247 Batch_id=234 Accuracy=99.34: 100%|██████████| 235/235 [00:16<00:00, 14.51it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0215, Accuracy: 9941/10000 (99.41%)

EPOCH: 23


Loss=0.014203175902366638 Batch_id=234 Accuracy=99.41: 100%|██████████| 235/235 [00:16<00:00, 14.14it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0224, Accuracy: 9933/10000 (99.33%)

EPOCH: 24


Loss=0.011715506203472614 Batch_id=234 Accuracy=99.36: 100%|██████████| 235/235 [00:16<00:00, 14.35it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0208, Accuracy: 9945/10000 (99.45%)

L1 and L2 with GBN

EPOCH: 0


Loss=0.6352581977844238 Batch_id=234 Accuracy=82.25: 100%|██████████| 235/235 [00:16<00:00, 13.96it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.1441, Accuracy: 9721/10000 (97.21%)

EPOCH: 1


Loss=0.5105639696121216 Batch_id=234 Accuracy=97.42: 100%|██████████| 235/235 [00:17<00:00, 13.22it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0865, Accuracy: 9812/10000 (98.12%)

EPOCH: 2


Loss=0.4415806531906128 Batch_id=234 Accuracy=97.66: 100%|██████████| 235/235 [00:17<00:00, 13.49it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0630, Accuracy: 9837/10000 (98.37%)

EPOCH: 3


Loss=0.3684942126274109 Batch_id=234 Accuracy=97.86: 100%|██████████| 235/235 [00:17<00:00, 13.19it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0706, Accuracy: 9802/10000 (98.02%)

EPOCH: 4


Loss=0.38264623284339905 Batch_id=234 Accuracy=98.16: 100%|██████████| 235/235 [00:17<00:00, 13.48it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0846, Accuracy: 9786/10000 (97.86%)

EPOCH: 5


Loss=0.3245072662830353 Batch_id=234 Accuracy=98.08: 100%|██████████| 235/235 [00:17<00:00, 13.28it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0527, Accuracy: 9860/10000 (98.60%)

EPOCH: 6


Loss=0.34456899762153625 Batch_id=234 Accuracy=98.20: 100%|██████████| 235/235 [00:17<00:00, 13.12it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0764, Accuracy: 9771/10000 (97.71%)

EPOCH: 7


Loss=0.2815341353416443 Batch_id=234 Accuracy=98.11: 100%|██████████| 235/235 [00:17<00:00, 13.49it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0571, Accuracy: 9842/10000 (98.42%)

EPOCH: 8


Loss=0.29801079630851746 Batch_id=234 Accuracy=98.19: 100%|██████████| 235/235 [00:17<00:00, 13.14it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0808, Accuracy: 9773/10000 (97.73%)

EPOCH: 9


Loss=0.30045610666275024 Batch_id=234 Accuracy=98.19: 100%|██████████| 235/235 [00:17<00:00, 13.50it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0758, Accuracy: 9787/10000 (97.87%)

EPOCH: 10


Loss=0.33587461709976196 Batch_id=234 Accuracy=98.17: 100%|██████████| 235/235 [00:17<00:00, 13.14it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0481, Accuracy: 9870/10000 (98.70%)

EPOCH: 11


Loss=0.2677241265773773 Batch_id=234 Accuracy=98.23: 100%|██████████| 235/235 [00:17<00:00, 13.27it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0756, Accuracy: 9781/10000 (97.81%)

EPOCH: 12


Loss=0.2912185490131378 Batch_id=234 Accuracy=98.23: 100%|██████████| 235/235 [00:17<00:00, 13.11it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0701, Accuracy: 9772/10000 (97.72%)

EPOCH: 13


Loss=0.2616482973098755 Batch_id=234 Accuracy=98.27: 100%|██████████| 235/235 [00:17<00:00, 13.51it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0826, Accuracy: 9765/10000 (97.65%)

EPOCH: 14


Loss=0.28624027967453003 Batch_id=234 Accuracy=98.30: 100%|██████████| 235/235 [00:17<00:00, 13.50it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0462, Accuracy: 9885/10000 (98.85%)

EPOCH: 15


Loss=0.2594011723995209 Batch_id=234 Accuracy=98.36: 100%|██████████| 235/235 [00:17<00:00, 13.69it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0541, Accuracy: 9848/10000 (98.48%)

EPOCH: 16


Loss=0.2621104121208191 Batch_id=234 Accuracy=98.29: 100%|██████████| 235/235 [00:18<00:00, 13.01it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0610, Accuracy: 9829/10000 (98.29%)

EPOCH: 17


Loss=0.2557208836078644 Batch_id=234 Accuracy=98.28: 100%|██████████| 235/235 [00:17<00:00, 13.24it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0514, Accuracy: 9860/10000 (98.60%)

EPOCH: 18


Loss=0.25353944301605225 Batch_id=234 Accuracy=98.26: 100%|██████████| 235/235 [00:17<00:00, 13.70it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0511, Accuracy: 9874/10000 (98.74%)

EPOCH: 19


Loss=0.29142463207244873 Batch_id=234 Accuracy=98.36: 100%|██████████| 235/235 [00:17<00:00, 13.22it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0577, Accuracy: 9837/10000 (98.37%)

EPOCH: 20


Loss=0.28625625371932983 Batch_id=234 Accuracy=98.37: 100%|██████████| 235/235 [00:17<00:00, 13.24it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0652, Accuracy: 9795/10000 (97.95%)

EPOCH: 21


Loss=0.24256888031959534 Batch_id=234 Accuracy=98.31: 100%|██████████| 235/235 [00:17<00:00, 13.33it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0561, Accuracy: 9856/10000 (98.56%)

EPOCH: 22


Loss=0.271170973777771 Batch_id=234 Accuracy=98.36: 100%|██████████| 235/235 [00:17<00:00, 13.22it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0522, Accuracy: 9849/10000 (98.49%)

EPOCH: 23


Loss=0.2025248110294342 Batch_id=234 Accuracy=98.34: 100%|██████████| 235/235 [00:17<00:00, 13.30it/s]
  0%|          | 0/235 [00:00<?, ?it/s]


Test set: Average loss: 0.0412, Accuracy: 9884/10000 (98.84%)

EPOCH: 24


Loss=0.2332032471895218 Batch_id=234 Accuracy=98.31: 100%|██████████| 235/235 [00:17<00:00, 13.56it/s]



Test set: Average loss: 0.0502, Accuracy: 9868/10000 (98.68%)
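
The per-epoch logs above are long, so the best epoch of a run is easy to miss. For the "L1 and L2 with GBN" run, the highest test accuracy printed is 98.85% at epoch 14. A minimal sketch of pulling that out of the printed text automatically is shown below. It assumes the log keeps the `EPOCH: n` and `Test set: Average loss: ..., Accuracy: a/b (p%)` line formats seen above. The names `best_test_epoch`, `EPOCH_LINE`, `TEST_LINE`, and `sample_log` are hypothetical helpers and not part of the original notebook.

```python
import re

# Line formats assumed from the output above, e.g.
#   "EPOCH: 14"
#   "Test set: Average loss: 0.0462, Accuracy: 9885/10000 (98.85%)"
EPOCH_LINE = re.compile(r"EPOCH: (\d+)")
TEST_LINE = re.compile(r"Test set: Average loss: ([\d.]+), Accuracy: (\d+)/(\d+)")


def best_test_epoch(log_text):
    """Return (epoch, accuracy_percent, avg_loss) for the highest test accuracy.

    Returns None if the text contains no test-set lines.
    """
    best = None
    epoch = None
    for line in log_text.splitlines():
        match = EPOCH_LINE.search(line)
        if match:
            epoch = int(match.group(1))
            continue
        match = TEST_LINE.search(line)
        if match:
            loss = float(match.group(1))
            accuracy = 100.0 * int(match.group(2)) / int(match.group(3))
            # Keep the first epoch that reaches the highest accuracy.
            if best is None or accuracy > best[1]:
                best = (epoch, accuracy, loss)
    return best


# Three epochs copied from the "L1 and L2 with GBN" run above.
sample_log = """
EPOCH: 13
Test set: Average loss: 0.0826, Accuracy: 9765/10000 (97.65%)
EPOCH: 14
Test set: Average loss: 0.0462, Accuracy: 9885/10000 (98.85%)
EPOCH: 15
Test set: Average loss: 0.0541, Accuracy: 9848/10000 (98.48%)
"""

print(best_test_epoch(sample_log))
```

In a notebook, the same function could be applied to text captured from the training cell. Note that the `Loss=` value on the tqdm line is the loss of the last batch only, so it is not used here.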



In [9]:
a = 1