
# Import Libraries

In [0]:
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms

## Data Transformations

We start by defining our data transformations. We need to think about what our data looks like and how we can augment it so that it correctly represents images the model might not otherwise see.

Here is the list of transformations that come pre-built with torchvision (`torchvision.transforms`):

1. Compose
2. ToTensor
3. ToPILImage
4. Normalize
5. Resize
6. Scale
7. CenterCrop
8. Pad
9. Lambda
10. RandomApply
11. RandomChoice
12. RandomOrder
13. RandomCrop
14. RandomHorizontalFlip
15. RandomVerticalFlip
16. RandomResizedCrop
17. RandomSizedCrop
18. FiveCrop
19. TenCrop
20. LinearTransformation
21. ColorJitter
22. RandomRotation
23. RandomAffine
24. Grayscale
25. RandomGrayscale
26. RandomPerspective
27. RandomErasing

You can read more about them [here](https://pytorch.org/docs/stable/_modules/torchvision/transforms/transforms.html)

In [0]:
# Train Phase transformations
train_transforms = transforms.Compose([
                                      #  transforms.Resize((28, 28)),
                                      #  transforms.ColorJitter(brightness=0.10, contrast=0.1, saturation=0.10, hue=0.1),
                                       transforms.RandomRotation((-20.0, 20.0), fill=(1,)),
                                       transforms.ToTensor(),
                                       transforms.Normalize((0.1307,), (0.3081,)) # The mean and std have to be sequences (e.g., tuples), therefore you should add a comma after the values. 
                                       # Note the difference between (0.1307) and (0.1307,)
                                       ])

# Test Phase transformations
test_transforms = transforms.Compose([
                                      #  transforms.Resize((28, 28)),
                                      #  transforms.ColorJitter(brightness=0.10, contrast=0.1, saturation=0.10, hue=0.1),
                                       transforms.ToTensor(),
                                       transforms.Normalize((0.1307,), (0.3081,))
                                       ])
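The trailing comma mentioned in the comments above is plain Python behavior and has nothing to do with torchvision itself. A minimal sketch (the variable names are illustrative only) showing why `(0.1307)` is not a sequence while `(0.1307,)` is:

```python
# Parentheses alone do not create a tuple; the trailing comma does.
not_a_tuple = (0.1307)
one_tuple = (0.1307,)

print(type(not_a_tuple).__name__)  # float
print(type(one_tuple).__name__)    # tuple
print(len(one_tuple))              # 1
```

MNIST images have a single channel, so `Normalize` receives one mean and one standard deviation, each wrapped in a one-element tuple.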


# Dataset and Creating Train/Test Split

In [3]:
train = datasets.MNIST('./data', train=True, download=True, transform=train_transforms)
test = datasets.MNIST('./data', train=False, download=True, transform=test_transforms)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz
Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw
Processing...
Done!


# Dataloader Arguments & Test/Train Dataloaders


In [4]:
SEED = 1

# CUDA?
cuda = torch.cuda.is_available()
print("CUDA Available?", cuda)

# For reproducibility
torch.manual_seed(SEED)

if cuda:
    torch.cuda.manual_seed(SEED)

# dataloader arguments - typically you would read these from the command line
dataloader_args = dict(shuffle=True, batch_size=128, num_workers=4, pin_memory=True) if cuda else dict(shuffle=True, batch_size=64)

# train dataloader
train_loader = torch.utils.data.DataLoader(train, **dataloader_args)

# test dataloader
test_loader = torch.utils.data.DataLoader(test, **dataloader_args)

CUDA Available? True
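The batch size determines how many iterations one epoch takes. The sketch below works this out for the CUDA settings above, assuming the default `drop_last=False` behavior of `DataLoader` (the last, smaller batch is kept). The variable names are illustrative.

```python
import math

train_size, test_size = 60000, 10000  # MNIST split sizes
batch_size = 128                      # value used when CUDA is available

# With drop_last=False the final partial batch is kept,
# so the number of batches is rounded up.
train_batches = math.ceil(train_size / batch_size)
test_batches = math.ceil(test_size / batch_size)
last_train_batch = train_size - (train_batches - 1) * batch_size

print(train_batches, test_batches, last_train_batch)
```

The training count works out to 469, which is the total shown by the progress bars in the training logs further down.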


# Data Statistics

It is important to know your data well. Let's check some statistics of our data and see what it actually looks like.

In [5]:
# train.data holds the raw images as a uint8 tensor (train.train_data is a deprecated alias).
# train.transform includes RandomRotation, which does not accept a NumPy array, so the scaling
# and normalization are applied by hand instead of calling train.transform(train.data.numpy()).
train_data = train.data.float() / 255.0
train_data = (train_data - 0.1307) / 0.3081

print('[Train]')
print(' - Numpy Shape:', train.data.cpu().numpy().shape)
print(' - Tensor Shape:', train.data.size())
print(' - min:', torch.min(train_data))
print(' - max:', torch.max(train_data))
print(' - mean:', torch.mean(train_data))
print(' - std:', torch.std(train_data))
print(' - var:', torch.var(train_data))

dataiter = iter(train_loader)
images, labels = next(dataiter)  # built-in next(); dataiter.next() is not available in recent PyTorch releases

print(images.shape)
print(labels.shape)

# Let's visualize some of the images
%matplotlib inline
import matplotlib.pyplot as plt

plt.imshow(images[0].numpy().squeeze(), cmap='gray_r')

## MORE

It is important to look at as many images as possible. This gives us a feel for which image augmentations make sense later on.

In [6]:
# Relies on `images` and `plt` from the previous cell; a NameError here means that cell did not finish.
figure = plt.figure()
num_of_images = 60
for index in range(1, num_of_images + 1):
    plt.subplot(6, 10, index)
    plt.axis('off')
    plt.imshow(images[index].numpy().squeeze(), cmap='gray_r')

# How did we get those mean and std values which we used above?

Let's run a small experiment

In [7]:
# simple transform
simple_transforms = transforms.Compose([
                                      #  transforms.Resize((28, 28)),
                                      #  transforms.ColorJitter(brightness=0.10, contrast=0.1, saturation=0.10, hue=0.1),
                                       transforms.ToTensor(),
                                      #  transforms.Normalize((0.1307,), (0.3081,)) # The mean and std have to be sequences (e.g., tuples), therefore you should add a comma after the values. 
                                       # Note the difference between (0.1307) and (0.1307,)
                                       ])
exp = datasets.MNIST('./data', train=True, download=True, transform=simple_transforms)
exp_data = exp.data  # raw uint8 images; exp.train_data is a deprecated alias for the same tensor
exp_data = exp.transform(exp_data.numpy())

print('[Train]')
print(' - Numpy Shape:', exp.data.cpu().numpy().shape)
print(' - Tensor Shape:', exp.data.size())
print(' - min:', torch.min(exp_data))
print(' - max:', torch.max(exp_data))
print(' - mean:', torch.mean(exp_data))
print(' - std:', torch.std(exp_data))
print(' - var:', torch.var(exp_data))



[Train]
 - Numpy Shape: (60000, 28, 28)
 - Tensor Shape: torch.Size([60000, 28, 28])
 - min: tensor(0.)
 - max: tensor(1.)
 - mean: tensor(0.1305)
 - std: tensor(0.3081)
 - var: tensor(0.0949)
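The statistics above are what `Normalize` consumes: it subtracts the mean and divides by the standard deviation. The variance is the square of the standard deviation (0.3081² ≈ 0.0949). Here is a minimal sketch of the same idea on random stand-in data; `fake_images` is an assumption made for illustration and is not MNIST.

```python
import torch

torch.manual_seed(0)

# Stand-in for a batch of raw 8-bit grayscale images (NOT real MNIST data).
fake_images = torch.randint(0, 256, (1000, 28, 28), dtype=torch.uint8)

# ToTensor scales uint8 pixels from [0, 255] to [0.0, 1.0].
scaled = fake_images.float() / 255.0
mean, std = scaled.mean(), scaled.std()

# Normalize subtracts the mean and divides by the standard deviation.
normalized = (scaled - mean) / std

print('before:', mean.item(), std.item())
print('after: ', normalized.mean().item(), normalized.std().item())
```

After normalization the data has a mean close to 0 and a standard deviation close to 1, which is the reason for measuring these two numbers on the training set first.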


# The model
Let's start with the model we first saw

In [0]:
'''Neural Network                      Input      Output     RF
  Input Block(Conv1) ->                 28          26        3     3
  Conv Block-1(Conv2) ->                26          24        5     5
  Transition Block-1(Conv3) ->          24          12        10    6
  Conv Block-2(Conv4 -> Conv5) ->       12          8         14    10
  Transition Block-2(Conv6) ->          8           4         28    14
  Output Block(Conv7 -> Conv8)          4           1         ??    
'''
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()

        # Input Block
        self.convblock1 = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=14, kernel_size=(3, 3), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(14)
        ) # output_size = 26

        # CONVOLUTION BLOCK 1
        self.convblock2 = nn.Sequential(
            nn.Conv2d(in_channels=14, out_channels=16, kernel_size=(3, 3), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(16)
        ) # output_size = 24

        # TRANSITION BLOCK 1
        self.pool1 = nn.MaxPool2d(2, 2) # output_size = 12
        self.convblock3 = nn.Sequential(
            nn.Conv2d(in_channels=16, out_channels=8, kernel_size=(1, 1), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(8)
        ) # output_size = 12

        # CONVOLUTION BLOCK 2
        self.convblock4 = nn.Sequential(
            nn.Conv2d(in_channels=8, out_channels=16, kernel_size=(3, 3), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(16)
        ) # output_size = 10
        self.convblock5 = nn.Sequential(
            nn.Conv2d(in_channels=16, out_channels=32, kernel_size=(3, 3), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(32)
        ) # output_size = 8

        # TRANSITION BLOCK 2
        self.pool2 = nn.MaxPool2d(2, 2) # output_size = 4
        self.convblock6 = nn.Sequential(
            nn.Conv2d(in_channels=32, out_channels=8, kernel_size=(1, 1), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(8)
        ) # output_size = 4

        # OUTPUT BLOCK
        self.convblock7 = nn.Sequential(
            nn.Conv2d(in_channels=8, out_channels=16, kernel_size=(3, 3), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(16)
        ) # output_size = 2

        self.convblock8 = nn.Sequential(
            nn.Conv2d(in_channels=16, out_channels=10, kernel_size=(1, 1), padding=0, bias=False),
            nn.ReLU(),
            nn.BatchNorm2d(10)
        ) # output_size = 2
        self.gap = nn.Sequential(
            nn.AvgPool2d(kernel_size=2)
        ) # output_size = 1

    def forward(self, x):
        # Input block
        x = self.convblock1(x)
        # Block-1
        x = self.convblock2(x)
        # Transition Block-1
        x = self.pool1(x)
        x = self.convblock3(x)
        # Block-2        
        x = self.convblock4(x)
        x = self.convblock5(x)
        # Transition Block-2
        x = self.pool2(x)
        x = self.convblock6(x) 
        # Output Block   
        x = self.convblock7(x)  
        x = self.convblock8(x)    
        x = self.gap(x)
        # Reshape
        x = x.view(-1, 10)
        return F.log_softmax(x, dim=-1) # Classification
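The output sizes noted in the comments of `Net` can be re-derived with a few lines of arithmetic. The sketch below uses the standard output-size formula and the usual receptive-field recurrence. The layer list is read off the definition above, while the helper `trace` is a hypothetical name introduced here. The receptive-field values it prints are computed by this sketch and are not taken from the docstring table.

```python
# (name, kernel, stride, padding) for each layer of Net, read off the definition above.
layers = [
    ('convblock1', 3, 1, 0),
    ('convblock2', 3, 1, 0),
    ('pool1',      2, 2, 0),
    ('convblock3', 1, 1, 0),
    ('convblock4', 3, 1, 0),
    ('convblock5', 3, 1, 0),
    ('pool2',      2, 2, 0),
    ('convblock6', 1, 1, 0),
    ('convblock7', 3, 1, 0),
    ('convblock8', 1, 1, 0),
    ('gap',        2, 2, 0),  # AvgPool2d(kernel_size=2): stride defaults to the kernel size
]

def trace(layers, size=28):
    """Return (name, output_size, receptive_field) after every layer."""
    rf, jump, rows = 1, 1, []
    for name, k, s, p in layers:
        size = (size + 2 * p - k) // s + 1   # output spatial size
        rf = rf + (k - 1) * jump             # receptive field grows by (k - 1) * current jump
        jump = jump * s                      # each stride multiplies the jump
        rows.append((name, size, rf))
    return rows

rows = trace(layers)
for name, size, rf in rows:
    print(f'{name:<11} out={size:<3} rf={rf}')
```

Under these assumptions the spatial size shrinks from 28 to 1, and the final receptive field comes out at 28, i.e. it covers the whole input image.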

# Model Params
It is hard to overstate how important it is to view the model summary.
Unfortunately, PyTorch has no built-in model visualizer, so we rely on an external package (`torchsummary`).

In [9]:
!pip install torchsummary
from torchsummary import summary
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
print(device)
model = Net().to(device)
summary(model, input_size=(1, 28, 28))

cuda
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 14, 26, 26]             126
              ReLU-2           [-1, 14, 26, 26]               0
       BatchNorm2d-3           [-1, 14, 26, 26]              28
            Conv2d-4           [-1, 16, 24, 24]           2,016
              ReLU-5           [-1, 16, 24, 24]               0
       BatchNorm2d-6           [-1, 16, 24, 24]              32
         MaxPool2d-7           [-1, 16, 12, 12]               0
            Conv2d-8            [-1, 8, 12, 12]             128
              ReLU-9            [-1, 8, 12, 12]               0
      BatchNorm2d-10            [-1, 8, 12, 12]              16
           Conv2d-11           [-1, 16, 10, 10]           1,152
             ReLU-12           [-1, 16, 10, 10]               0
      BatchNorm2d-13           [-1, 16, 10, 10]              32
           Conv2d-14             [

# Training and Testing

All right, so this model has about 9.8K parameters (9,838, summing the layers defined above). The purpose of this notebook is to set things up correctly for our future experiments.
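As a cross-check on the parameter count, the total can be summed by hand from the layer shapes in `Net`. This is a sketch that assumes every convolution has `bias=False` and every block ends in a `BatchNorm2d` with a learnable scale and shift, as in the definition above. The helper names are illustrative.

```python
# (in_channels, out_channels, kernel_size) for convblock1 ... convblock8
conv_specs = [
    (1, 14, 3), (14, 16, 3), (16, 8, 1), (8, 16, 3),
    (16, 32, 3), (32, 8, 1), (8, 16, 3), (16, 10, 1),
]

def conv_params(c_in, c_out, k):
    return c_in * c_out * k * k   # bias=False, so weights only

def bn_params(channels):
    return 2 * channels           # one learnable scale and one shift per channel

total = sum(conv_params(*spec) + bn_params(spec[1]) for spec in conv_specs)
print(total)
```

The per-layer values (126, 28, 2,016, 32, ...) agree with the rows visible in the `torchsummary` output above. With a model instance at hand, `sum(p.numel() for p in model.parameters())` should give the same total.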

Looking at logs can be boring, so we'll use the **tqdm** progress bar to get nicer logs.

Let's write train and test functions

In [0]:
from tqdm import tqdm

train_losses = []
test_losses = []
train_acc = []
test_acc = []

def train(model, device, train_loader, optimizer, epoch, l1):
  model.train()
  pbar = tqdm(train_loader)
  correct = 0
  processed = 0
  for batch_idx, (data, target) in enumerate(pbar):
    # get samples
    data, target = data.to(device), target.to(device)

    # Init
    optimizer.zero_grad()
    # PyTorch accumulates gradients across backward passes, so they must be reset to zero before each backpropagation step.
    # Otherwise the parameter update would mix gradients from earlier batches into the current one.

    # Predict
    y_pred = model(data)

    # Calculate loss
    loss = F.nll_loss(y_pred, target)
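    if l1 > 0:  # Apply L1 regularization: add l1 * sum(|w|) over all parameters
      l1_criteria = nn.L1Loss(reduction='sum')  # replaces the deprecated size_average=False
      regularizer_loss = 0
      for parameter in model.parameters():
        regularizer_loss += l1_criteria(parameter, torch.zeros_like(parameter))
      loss += l1 * regularizer_loss
    train_losses.append(loss.item())  # store a plain float rather than the graph-attached tensor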

    # Backpropagation
    loss.backward()
    optimizer.step()

    # Update pbar-tqdm
    
    pred = y_pred.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
    correct += pred.eq(target.view_as(pred)).sum().item()
    processed += len(data)

    pbar.set_description(desc= f'Loss={loss.item()} Batch_id={batch_idx} Accuracy={100*correct/processed:0.2f}')
    train_acc.append(100*correct/processed)

def test(model, device, test_loader, losses, accuracies, misclassified):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            img_batch = data  
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            result = pred.eq(target.view_as(pred))

            if len(misclassified) < 25:
                # Iterate over the actual batch length; the last batch can be smaller than batch_size
                for i in range(len(result)):
                    if not result[i]:
                        misclassified.append({
                            '[P]': pred[i],
                            '[A]': target.view_as(pred)[i],
                            'image': img_batch[i]
                        })

            correct += result.sum().item()

    test_loss /= len(test_loader.dataset)
    test_losses.append(test_loss)
    losses.append(test_loss)  # also record in the caller's list so that run() returns it

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.2f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
    
    test_acc.append(100. * correct / len(test_loader.dataset))
    accuracies.append(100. * correct / len(test_loader.dataset))
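The comment about `optimizer.zero_grad()` in `train` is easy to verify in isolation. A minimal sketch with a two-element tensor `w` (a toy stand-in for model parameters) shows gradients adding up across backward passes until they are zeroed:

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

# First backward pass: d/dw sum(w^2) = 2w
(w ** 2).sum().backward()
first = w.grad.clone()        # tensor([2., 4.])

# Second backward pass WITHOUT zeroing: gradients are added to the existing ones
(w ** 2).sum().backward()
accumulated = w.grad.clone()  # tensor([4., 8.])

# Zero the gradient, then run backward again: back to the single-pass value
w.grad.zero_()
(w ** 2).sum().backward()
fresh = w.grad.clone()        # tensor([2., 4.])

print(first, accumulated, fresh)
```

`optimizer.zero_grad()` performs the same reset for every parameter the optimizer manages.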

# Let's Train and test our model

In [0]:
from torch.optim.lr_scheduler import StepLR

def run(l1=0.0, l2=0.0):
    losses = []
    accuracies = []
    misclassified = []
    model =  Net().to(device)
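    optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=l2)  # weight_decay applies the L2 penalty
    # NOTE: scheduler.step() is never called in the loop below, so the learning rate stays at 0.01 for every epoch
    scheduler = StepLR(optimizer, step_size=6, gamma=0.1)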
    EPOCHS = 40 
    for epoch in range(EPOCHS):
        print("EPOCH:", epoch)
        train(model, device, train_loader, optimizer, epoch, l1)
        test(model, device, test_loader, losses, accuracies, misclassified)

    return losses, accuracies, misclassified
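A `StepLR` scheduler only changes the learning rate when `scheduler.step()` is called. In `run` above it is created but never stepped, so the rate stays at 0.01. The sketch below uses a dummy parameter `w` in place of the model (an assumption for illustration) and shows what the schedule would look like with one `scheduler.step()` per epoch:

```python
import torch
import torch.optim as optim
from torch.optim.lr_scheduler import StepLR

w = torch.zeros(1, requires_grad=True)   # dummy parameter, stands in for a model
optimizer = optim.SGD([w], lr=0.01, momentum=0.9)
scheduler = StepLR(optimizer, step_size=6, gamma=0.1)

lrs = []
for epoch in range(13):
    lrs.append(optimizer.param_groups[0]['lr'])  # learning rate used in this epoch
    (w ** 2).sum().backward()
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()   # without this call the learning rate never changes

print(lrs)
```

With `step_size=6` and `gamma=0.1`, the rate is multiplied by 0.1 after every 6 epochs. Over 40 epochs that would shrink it by six orders of magnitude, so whether to step the scheduler in `run` is a real design decision and not a cosmetic one.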

## **Without L1 and L2**

In [18]:
loss, accuracy, misclassified = run()

  0%|          | 0/469 [00:00<?, ?it/s]

EPOCH: 0


Loss=0.12423249334096909 Batch_id=468 Accuracy=91.79: 100%|██████████| 469/469 [00:12<00:00, 37.96it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0812, Accuracy: 9829/10000 (98.29%)

EPOCH: 1


Loss=0.10322093963623047 Batch_id=468 Accuracy=97.42: 100%|██████████| 469/469 [00:12<00:00, 36.52it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0493, Accuracy: 9879/10000 (98.79%)

EPOCH: 2


Loss=0.12232490628957748 Batch_id=468 Accuracy=97.92: 100%|██████████| 469/469 [00:12<00:00, 45.89it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0426, Accuracy: 9886/10000 (98.86%)

EPOCH: 3


Loss=0.09445225447416306 Batch_id=468 Accuracy=98.14: 100%|██████████| 469/469 [00:13<00:00, 35.75it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0369, Accuracy: 9911/10000 (99.11%)

EPOCH: 4


Loss=0.07303611934185028 Batch_id=468 Accuracy=98.27: 100%|██████████| 469/469 [00:12<00:00, 37.64it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0373, Accuracy: 9901/10000 (99.01%)

EPOCH: 5


Loss=0.039764728397130966 Batch_id=468 Accuracy=98.37: 100%|██████████| 469/469 [00:12<00:00, 36.53it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0326, Accuracy: 9919/10000 (99.19%)

EPOCH: 6


Loss=0.10401376336812973 Batch_id=468 Accuracy=98.43: 100%|██████████| 469/469 [00:12<00:00, 37.03it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0276, Accuracy: 9932/10000 (99.32%)

EPOCH: 7


Loss=0.015208925120532513 Batch_id=468 Accuracy=98.58: 100%|██████████| 469/469 [00:13<00:00, 35.87it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0284, Accuracy: 9926/10000 (99.26%)

EPOCH: 8


Loss=0.09035984426736832 Batch_id=468 Accuracy=98.50: 100%|██████████| 469/469 [00:12<00:00, 37.46it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0285, Accuracy: 9922/10000 (99.22%)

EPOCH: 9


Loss=0.07208465784788132 Batch_id=468 Accuracy=98.64: 100%|██████████| 469/469 [00:12<00:00, 42.29it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0260, Accuracy: 9926/10000 (99.26%)

EPOCH: 10


Loss=0.030765675008296967 Batch_id=468 Accuracy=98.64: 100%|██████████| 469/469 [00:12<00:00, 36.86it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0272, Accuracy: 9920/10000 (99.20%)

EPOCH: 11


Loss=0.0459224171936512 Batch_id=468 Accuracy=98.70: 100%|██████████| 469/469 [00:13<00:00, 35.36it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0279, Accuracy: 9921/10000 (99.21%)

EPOCH: 12


Loss=0.073151133954525 Batch_id=468 Accuracy=98.76: 100%|██████████| 469/469 [00:12<00:00, 37.44it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0223, Accuracy: 9938/10000 (99.38%)

EPOCH: 13


Loss=0.015373636968433857 Batch_id=468 Accuracy=98.72: 100%|██████████| 469/469 [00:12<00:00, 37.68it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0267, Accuracy: 9925/10000 (99.25%)

EPOCH: 14


Loss=0.02672460675239563 Batch_id=468 Accuracy=98.75: 100%|██████████| 469/469 [00:12<00:00, 36.85it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0234, Accuracy: 9931/10000 (99.31%)

EPOCH: 15


Loss=0.025027522817254066 Batch_id=468 Accuracy=98.88: 100%|██████████| 469/469 [00:12<00:00, 36.76it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0220, Accuracy: 9935/10000 (99.35%)

EPOCH: 16


Loss=0.0859452560544014 Batch_id=468 Accuracy=98.82: 100%|██████████| 469/469 [00:12<00:00, 36.35it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0230, Accuracy: 9931/10000 (99.31%)

EPOCH: 17


Loss=0.06828480213880539 Batch_id=468 Accuracy=98.87: 100%|██████████| 469/469 [00:12<00:00, 37.33it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0227, Accuracy: 9940/10000 (99.40%)

EPOCH: 18


Loss=0.013350551016628742 Batch_id=468 Accuracy=98.91: 100%|██████████| 469/469 [00:12<00:00, 36.94it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0215, Accuracy: 9938/10000 (99.38%)

EPOCH: 19


Loss=0.012302853167057037 Batch_id=468 Accuracy=98.96: 100%|██████████| 469/469 [00:12<00:00, 44.54it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0227, Accuracy: 9933/10000 (99.33%)

EPOCH: 20


Loss=0.03606025502085686 Batch_id=468 Accuracy=98.91: 100%|██████████| 469/469 [00:13<00:00, 35.66it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0214, Accuracy: 9933/10000 (99.33%)

EPOCH: 21


Loss=0.0489632673561573 Batch_id=468 Accuracy=98.97: 100%|██████████| 469/469 [00:12<00:00, 37.35it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0231, Accuracy: 9930/10000 (99.30%)

EPOCH: 22


Loss=0.010651159100234509 Batch_id=468 Accuracy=98.90: 100%|██████████| 469/469 [00:12<00:00, 36.42it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0213, Accuracy: 9936/10000 (99.36%)

EPOCH: 23


Loss=0.02916882373392582 Batch_id=468 Accuracy=99.03: 100%|██████████| 469/469 [00:12<00:00, 36.74it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0211, Accuracy: 9933/10000 (99.33%)

EPOCH: 24


Loss=0.0039624921046197414 Batch_id=468 Accuracy=98.95: 100%|██████████| 469/469 [00:13<00:00, 35.24it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0202, Accuracy: 9942/10000 (99.42%)

EPOCH: 25


Loss=0.02626786381006241 Batch_id=468 Accuracy=98.95: 100%|██████████| 469/469 [00:12<00:00, 37.16it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0204, Accuracy: 9942/10000 (99.42%)

EPOCH: 26


Loss=0.01328247506171465 Batch_id=468 Accuracy=98.98: 100%|██████████| 469/469 [00:12<00:00, 36.58it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0183, Accuracy: 9950/10000 (99.50%)

EPOCH: 27


Loss=0.05475961044430733 Batch_id=468 Accuracy=98.99: 100%|██████████| 469/469 [00:12<00:00, 36.83it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0195, Accuracy: 9947/10000 (99.47%)

EPOCH: 28


Loss=0.05022174119949341 Batch_id=468 Accuracy=99.07: 100%|██████████| 469/469 [00:13<00:00, 34.88it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0209, Accuracy: 9943/10000 (99.43%)

EPOCH: 29


Loss=0.06065211072564125 Batch_id=468 Accuracy=98.97: 100%|██████████| 469/469 [00:12<00:00, 37.08it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0198, Accuracy: 9941/10000 (99.41%)

EPOCH: 30


Loss=0.02082725055515766 Batch_id=468 Accuracy=99.05: 100%|██████████| 469/469 [00:12<00:00, 36.93it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0194, Accuracy: 9948/10000 (99.48%)

EPOCH: 31


Loss=0.05180234834551811 Batch_id=468 Accuracy=99.03: 100%|██████████| 469/469 [00:12<00:00, 36.76it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0189, Accuracy: 9946/10000 (99.46%)

EPOCH: 32


Loss=0.07328400760889053 Batch_id=468 Accuracy=99.14: 100%|██████████| 469/469 [00:13<00:00, 36.07it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0213, Accuracy: 9936/10000 (99.36%)

EPOCH: 33


Loss=0.0067439922131598 Batch_id=468 Accuracy=99.07: 100%|██████████| 469/469 [00:12<00:00, 36.77it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0188, Accuracy: 9951/10000 (99.51%)

EPOCH: 34


Loss=0.005664224270731211 Batch_id=468 Accuracy=99.11: 100%|██████████| 469/469 [00:13<00:00, 36.00it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0204, Accuracy: 9942/10000 (99.42%)

EPOCH: 35


Loss=0.07841864973306656 Batch_id=468 Accuracy=99.06: 100%|██████████| 469/469 [00:13<00:00, 34.17it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0183, Accuracy: 9944/10000 (99.44%)

EPOCH: 36


Loss=0.10423204302787781 Batch_id=468 Accuracy=99.11: 100%|██████████| 469/469 [00:13<00:00, 33.54it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0192, Accuracy: 9941/10000 (99.41%)

EPOCH: 37


Loss=0.009697210974991322 Batch_id=468 Accuracy=99.12: 100%|██████████| 469/469 [00:13<00:00, 34.80it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0211, Accuracy: 9934/10000 (99.34%)

EPOCH: 38


Loss=0.00925732683390379 Batch_id=468 Accuracy=99.16: 100%|██████████| 469/469 [00:13<00:00, 34.93it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0184, Accuracy: 9937/10000 (99.37%)

EPOCH: 39


Loss=0.0430166982114315 Batch_id=468 Accuracy=99.16: 100%|██████████| 469/469 [00:13<00:00, 34.25it/s]



Test set: Average loss: 0.0181, Accuracy: 9946/10000 (99.46%)
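Scanning 40 epochs of output by eye is error-prone. One possible way to pull out the best epoch is a small regular expression over the captured text. The sketch below uses three lines copied from the log above. The names `log` and `pattern` are illustrative, and in practice the string would hold the full output.

```python
import re

# A few lines copied from the log above; in practice this would be the full captured output.
log = """
Test set: Average loss: 0.0183, Accuracy: 9950/10000 (99.50%)
Test set: Average loss: 0.0188, Accuracy: 9951/10000 (99.51%)
Test set: Average loss: 0.0181, Accuracy: 9946/10000 (99.46%)
"""

pattern = re.compile(r'Average loss: ([\d.]+), Accuracy: (\d+)/(\d+)')
results = [(float(loss), int(correct) / int(total) * 100)
           for loss, correct, total in pattern.findall(log)]

best_loss, best_acc = max(results, key=lambda r: r[1])
print(f'best accuracy {best_acc:.2f}% (loss {best_loss})')
```

Note that the epoch with the lowest test loss is not necessarily the one with the highest accuracy, as these three lines already show.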



## **With L1 Only**

In [21]:
l1_loss, l1_accuracy, misclassified_pred_l1 = run(l1=0.01)

  0%|          | 0/469 [00:00<?, ?it/s]

EPOCH: 0


Loss=1.8362228870391846 Batch_id=468 Accuracy=90.22: 100%|██████████| 469/469 [00:15<00:00, 31.11it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.2950, Accuracy: 9418/10000 (94.18%)

EPOCH: 1


Loss=1.2245789766311646 Batch_id=468 Accuracy=93.31: 100%|██████████| 469/469 [00:16<00:00, 29.19it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4038, Accuracy: 8832/10000 (88.32%)

EPOCH: 2


Loss=0.9079774618148804 Batch_id=468 Accuracy=92.81: 100%|██████████| 469/469 [00:14<00:00, 32.58it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3046, Accuracy: 9163/10000 (91.63%)

EPOCH: 3


Loss=0.8908101320266724 Batch_id=468 Accuracy=92.86: 100%|██████████| 469/469 [00:14<00:00, 32.07it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.2770, Accuracy: 9346/10000 (93.46%)

EPOCH: 4


Loss=0.8875406384468079 Batch_id=468 Accuracy=92.74: 100%|██████████| 469/469 [00:14<00:00, 31.92it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4513, Accuracy: 8722/10000 (87.22%)

EPOCH: 5


Loss=1.0605084896087646 Batch_id=468 Accuracy=92.63: 100%|██████████| 469/469 [00:14<00:00, 32.97it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4934, Accuracy: 8694/10000 (86.94%)

EPOCH: 6


Loss=0.9159767031669617 Batch_id=468 Accuracy=92.79: 100%|██████████| 469/469 [00:14<00:00, 33.04it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.8842, Accuracy: 7217/10000 (72.17%)

EPOCH: 7


Loss=0.9918097257614136 Batch_id=468 Accuracy=92.62: 100%|██████████| 469/469 [00:14<00:00, 32.85it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3926, Accuracy: 8797/10000 (87.97%)

EPOCH: 8


Loss=0.8949739336967468 Batch_id=468 Accuracy=92.39: 100%|██████████| 469/469 [00:15<00:00, 31.13it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.2685, Accuracy: 9313/10000 (93.13%)

EPOCH: 9


Loss=0.9096891283988953 Batch_id=468 Accuracy=92.22: 100%|██████████| 469/469 [00:14<00:00, 32.92it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3961, Accuracy: 8865/10000 (88.65%)

EPOCH: 10


Loss=0.9802489280700684 Batch_id=468 Accuracy=92.19: 100%|██████████| 469/469 [00:14<00:00, 32.13it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3479, Accuracy: 9166/10000 (91.66%)

EPOCH: 11


Loss=1.0013489723205566 Batch_id=468 Accuracy=92.37: 100%|██████████| 469/469 [00:14<00:00, 32.79it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4659, Accuracy: 8598/10000 (85.98%)

EPOCH: 12


Loss=0.8720172047615051 Batch_id=468 Accuracy=92.27: 100%|██████████| 469/469 [00:14<00:00, 31.37it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.6143, Accuracy: 8284/10000 (82.84%)

EPOCH: 13


Loss=0.9149147272109985 Batch_id=468 Accuracy=92.33: 100%|██████████| 469/469 [00:14<00:00, 32.84it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.5989, Accuracy: 8172/10000 (81.72%)

EPOCH: 14


Loss=0.8723356127738953 Batch_id=468 Accuracy=92.25: 100%|██████████| 469/469 [00:14<00:00, 32.12it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.9247, Accuracy: 7034/10000 (70.34%)

EPOCH: 15


Loss=0.9015133380889893 Batch_id=468 Accuracy=92.22: 100%|██████████| 469/469 [00:14<00:00, 31.80it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4380, Accuracy: 8898/10000 (88.98%)

EPOCH: 16


Loss=1.014743685722351 Batch_id=468 Accuracy=92.08: 100%|██████████| 469/469 [00:14<00:00, 31.72it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3036, Accuracy: 9052/10000 (90.52%)

EPOCH: 17


Loss=0.8705806732177734 Batch_id=468 Accuracy=92.40: 100%|██████████| 469/469 [00:14<00:00, 31.86it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4002, Accuracy: 8881/10000 (88.81%)

EPOCH: 18


Loss=0.849733293056488 Batch_id=468 Accuracy=92.06: 100%|██████████| 469/469 [00:14<00:00, 32.55it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3054, Accuracy: 9266/10000 (92.66%)

EPOCH: 19


Loss=1.0912258625030518 Batch_id=468 Accuracy=92.17: 100%|██████████| 469/469 [00:15<00:00, 31.21it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.7280, Accuracy: 7697/10000 (76.97%)

EPOCH: 20


Loss=0.9703906178474426 Batch_id=468 Accuracy=92.57: 100%|██████████| 469/469 [00:14<00:00, 40.78it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3145, Accuracy: 9114/10000 (91.14%)

EPOCH: 21


Loss=1.0009448528289795 Batch_id=468 Accuracy=92.23: 100%|██████████| 469/469 [00:14<00:00, 30.86it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4130, Accuracy: 8985/10000 (89.85%)

EPOCH: 22


Loss=0.8795629143714905 Batch_id=468 Accuracy=92.25: 100%|██████████| 469/469 [00:14<00:00, 33.14it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.2496, Accuracy: 9345/10000 (93.45%)

EPOCH: 23


Loss=0.9173283576965332 Batch_id=468 Accuracy=92.18: 100%|██████████| 469/469 [00:14<00:00, 31.33it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 1.3893, Accuracy: 4734/10000 (47.34%)

EPOCH: 24


Loss=0.8916817903518677 Batch_id=468 Accuracy=92.12: 100%|██████████| 469/469 [00:14<00:00, 33.11it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4685, Accuracy: 8594/10000 (85.94%)

EPOCH: 25


Loss=0.8408294320106506 Batch_id=468 Accuracy=92.39: 100%|██████████| 469/469 [00:14<00:00, 32.09it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.5699, Accuracy: 8354/10000 (83.54%)

EPOCH: 26


Loss=0.852911114692688 Batch_id=468 Accuracy=92.23: 100%|██████████| 469/469 [00:14<00:00, 38.55it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4162, Accuracy: 9143/10000 (91.43%)

EPOCH: 27


Loss=0.8091352581977844 Batch_id=468 Accuracy=92.12: 100%|██████████| 469/469 [00:14<00:00, 31.73it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3762, Accuracy: 8901/10000 (89.01%)

EPOCH: 28


Loss=0.9052120447158813 Batch_id=468 Accuracy=92.18: 100%|██████████| 469/469 [00:14<00:00, 33.21it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.8149, Accuracy: 7265/10000 (72.65%)

EPOCH: 29


Loss=0.8758479356765747 Batch_id=468 Accuracy=92.27: 100%|██████████| 469/469 [00:14<00:00, 32.28it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 1.4089, Accuracy: 4293/10000 (42.93%)

EPOCH: 30


Loss=0.8942387104034424 Batch_id=468 Accuracy=92.06: 100%|██████████| 469/469 [00:14<00:00, 32.07it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.2941, Accuracy: 9281/10000 (92.81%)

EPOCH: 31


Loss=0.8396036624908447 Batch_id=468 Accuracy=92.20: 100%|██████████| 469/469 [00:14<00:00, 31.93it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3775, Accuracy: 8820/10000 (88.20%)

EPOCH: 32


Loss=0.886276364326477 Batch_id=468 Accuracy=92.19: 100%|██████████| 469/469 [00:14<00:00, 33.05it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4785, Accuracy: 8609/10000 (86.09%)

EPOCH: 33


Loss=0.9665266275405884 Batch_id=468 Accuracy=92.16: 100%|██████████| 469/469 [00:14<00:00, 32.04it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4919, Accuracy: 8629/10000 (86.29%)

EPOCH: 34


Loss=0.9045964479446411 Batch_id=468 Accuracy=92.33: 100%|██████████| 469/469 [00:14<00:00, 31.57it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3181, Accuracy: 9124/10000 (91.24%)

EPOCH: 35


Loss=0.7868598103523254 Batch_id=468 Accuracy=92.29: 100%|██████████| 469/469 [00:14<00:00, 32.59it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4190, Accuracy: 8920/10000 (89.20%)

EPOCH: 36


Loss=0.9481356143951416 Batch_id=468 Accuracy=92.31: 100%|██████████| 469/469 [00:14<00:00, 32.06it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.2220, Accuracy: 9370/10000 (93.70%)

EPOCH: 37


Loss=0.894740879535675 Batch_id=468 Accuracy=92.36: 100%|██████████| 469/469 [00:14<00:00, 33.13it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4736, Accuracy: 8546/10000 (85.46%)

EPOCH: 38


Loss=0.8910624980926514 Batch_id=468 Accuracy=92.32: 100%|██████████| 469/469 [00:14<00:00, 38.98it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3467, Accuracy: 9060/10000 (90.60%)

EPOCH: 39


Loss=0.8638330698013306 Batch_id=468 Accuracy=92.26: 100%|██████████| 469/469 [00:14<00:00, 32.81it/s]



Test set: Average loss: 0.7461, Accuracy: 7541/10000 (75.41%)
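
The test accuracy in the run above swings widely from one epoch to the next (for example 42.93% at epoch 29, then 92.81% at epoch 30), so the final-epoch figure alone says little. A small helper can summarize such a log. This is only a sketch: `parse_test_lines` and `sample_log` are hypothetical names that are not part of the notebook, and the sample lines are copied from the output above.

```python
import re

# Matches lines such as:
# "Test set: Average loss: 0.7461, Accuracy: 7541/10000 (75.41%)"
TEST_LINE = re.compile(r"Test set: Average loss: ([\d.]+), Accuracy: (\d+)/(\d+)")

def parse_test_lines(log_text):
    """Return a list of (average_loss, accuracy_percent), one per test line."""
    results = []
    for match in TEST_LINE.finditer(log_text):
        loss = float(match.group(1))
        correct, total = int(match.group(2)), int(match.group(3))
        results.append((loss, 100.0 * correct / total))
    return results

# Three test lines taken from the log above (epochs 29, 30 and 39).
sample_log = """
Test set: Average loss: 1.4089, Accuracy: 4293/10000 (42.93%)
Test set: Average loss: 0.2941, Accuracy: 9281/10000 (92.81%)
Test set: Average loss: 0.7461, Accuracy: 7541/10000 (75.41%)
"""

results = parse_test_lines(sample_log)
accuracies = [acc for _, acc in results]
print("best:", max(accuracies), "worst:", min(accuracies), "final:", accuracies[-1])
```

Reporting the best, worst and final accuracy together makes the instability of a run visible at a glance.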



## **With L2 Only**

In [22]:
l2_loss, l2_accuracy, misclassified_pred_l2 = run(l2=0.0001)

  0%|          | 0/469 [00:00<?, ?it/s]

EPOCH: 0


Loss=0.14748382568359375 Batch_id=468 Accuracy=92.05: 100%|██████████| 469/469 [00:13<00:00, 35.60it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0848, Accuracy: 9803/10000 (98.03%)

EPOCH: 1


Loss=0.073601134121418 Batch_id=468 Accuracy=97.19: 100%|██████████| 469/469 [00:13<00:00, 36.04it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0648, Accuracy: 9846/10000 (98.46%)

EPOCH: 2


Loss=0.1855216771364212 Batch_id=468 Accuracy=97.80: 100%|██████████| 469/469 [00:13<00:00, 35.00it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0488, Accuracy: 9879/10000 (98.79%)

EPOCH: 3


Loss=0.10547564178705215 Batch_id=468 Accuracy=97.87: 100%|██████████| 469/469 [00:12<00:00, 36.44it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0412, Accuracy: 9900/10000 (99.00%)

EPOCH: 4


Loss=0.058942168951034546 Batch_id=468 Accuracy=98.11: 100%|██████████| 469/469 [00:12<00:00, 36.38it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0371, Accuracy: 9903/10000 (99.03%)

EPOCH: 5


Loss=0.03049815632402897 Batch_id=468 Accuracy=98.27: 100%|██████████| 469/469 [00:12<00:00, 36.76it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0342, Accuracy: 9906/10000 (99.06%)

EPOCH: 6


Loss=0.03812265396118164 Batch_id=468 Accuracy=98.36: 100%|██████████| 469/469 [00:13<00:00, 34.62it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0325, Accuracy: 9912/10000 (99.12%)

EPOCH: 7


Loss=0.05218842998147011 Batch_id=468 Accuracy=98.44: 100%|██████████| 469/469 [00:13<00:00, 34.54it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0298, Accuracy: 9921/10000 (99.21%)

EPOCH: 8


Loss=0.07529573142528534 Batch_id=468 Accuracy=98.53: 100%|██████████| 469/469 [00:13<00:00, 35.30it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0275, Accuracy: 9921/10000 (99.21%)

EPOCH: 9


Loss=0.03361820429563522 Batch_id=468 Accuracy=98.61: 100%|██████████| 469/469 [00:13<00:00, 35.69it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0293, Accuracy: 9920/10000 (99.20%)

EPOCH: 10


Loss=0.014891778118908405 Batch_id=468 Accuracy=98.70: 100%|██████████| 469/469 [00:13<00:00, 35.24it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0285, Accuracy: 9917/10000 (99.17%)

EPOCH: 11


Loss=0.048964276909828186 Batch_id=468 Accuracy=98.74: 100%|██████████| 469/469 [00:12<00:00, 43.60it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0289, Accuracy: 9918/10000 (99.18%)

EPOCH: 12


Loss=0.029137810692191124 Batch_id=468 Accuracy=98.75: 100%|██████████| 469/469 [00:13<00:00, 35.43it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0250, Accuracy: 9928/10000 (99.28%)

EPOCH: 13


Loss=0.0204994585365057 Batch_id=468 Accuracy=98.83: 100%|██████████| 469/469 [00:12<00:00, 36.92it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0251, Accuracy: 9923/10000 (99.23%)

EPOCH: 14


Loss=0.058078449219465256 Batch_id=468 Accuracy=98.81: 100%|██████████| 469/469 [00:13<00:00, 34.76it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0269, Accuracy: 9927/10000 (99.27%)

EPOCH: 15


Loss=0.0680830106139183 Batch_id=468 Accuracy=98.82: 100%|██████████| 469/469 [00:13<00:00, 35.97it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0253, Accuracy: 9923/10000 (99.23%)

EPOCH: 16


Loss=0.025370245799422264 Batch_id=468 Accuracy=98.86: 100%|██████████| 469/469 [00:12<00:00, 36.10it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0267, Accuracy: 9924/10000 (99.24%)

EPOCH: 17


Loss=0.022043714299798012 Batch_id=468 Accuracy=98.79: 100%|██████████| 469/469 [00:12<00:00, 36.56it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0251, Accuracy: 9927/10000 (99.27%)

EPOCH: 18


Loss=0.0337248258292675 Batch_id=468 Accuracy=98.91: 100%|██████████| 469/469 [00:13<00:00, 35.34it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0201, Accuracy: 9948/10000 (99.48%)

EPOCH: 19


Loss=0.009085521101951599 Batch_id=468 Accuracy=98.91: 100%|██████████| 469/469 [00:12<00:00, 36.19it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0238, Accuracy: 9934/10000 (99.34%)

EPOCH: 20


Loss=0.08125833421945572 Batch_id=468 Accuracy=98.88: 100%|██████████| 469/469 [00:13<00:00, 35.23it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0241, Accuracy: 9925/10000 (99.25%)

EPOCH: 21


Loss=0.019840732216835022 Batch_id=468 Accuracy=98.93: 100%|██████████| 469/469 [00:12<00:00, 36.58it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0222, Accuracy: 9932/10000 (99.32%)

EPOCH: 22


Loss=0.01800696551799774 Batch_id=468 Accuracy=98.97: 100%|██████████| 469/469 [00:13<00:00, 40.72it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0221, Accuracy: 9929/10000 (99.29%)

EPOCH: 23


Loss=0.021687529981136322 Batch_id=468 Accuracy=98.93: 100%|██████████| 469/469 [00:13<00:00, 35.59it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0200, Accuracy: 9946/10000 (99.46%)

EPOCH: 24


Loss=0.0472661554813385 Batch_id=468 Accuracy=99.02: 100%|██████████| 469/469 [00:12<00:00, 36.63it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0213, Accuracy: 9940/10000 (99.40%)

EPOCH: 25


Loss=0.02454710565507412 Batch_id=468 Accuracy=98.97: 100%|██████████| 469/469 [00:13<00:00, 35.75it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0208, Accuracy: 9946/10000 (99.46%)

EPOCH: 26


Loss=0.005358214024454355 Batch_id=468 Accuracy=98.96: 100%|██████████| 469/469 [00:12<00:00, 32.92it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0208, Accuracy: 9930/10000 (99.30%)

EPOCH: 27


Loss=0.03341931477189064 Batch_id=468 Accuracy=98.97: 100%|██████████| 469/469 [00:13<00:00, 35.85it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0205, Accuracy: 9946/10000 (99.46%)

EPOCH: 28


Loss=0.0064318180084228516 Batch_id=468 Accuracy=99.05: 100%|██████████| 469/469 [00:12<00:00, 36.24it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0216, Accuracy: 9934/10000 (99.34%)

EPOCH: 29


Loss=0.025259843096137047 Batch_id=468 Accuracy=99.08: 100%|██████████| 469/469 [00:12<00:00, 36.21it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0194, Accuracy: 9938/10000 (99.38%)

EPOCH: 30


Loss=0.007158726453781128 Batch_id=468 Accuracy=99.03: 100%|██████████| 469/469 [00:12<00:00, 43.81it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0209, Accuracy: 9944/10000 (99.44%)

EPOCH: 31


Loss=0.06918210536241531 Batch_id=468 Accuracy=99.08: 100%|██████████| 469/469 [00:13<00:00, 35.62it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0222, Accuracy: 9937/10000 (99.37%)

EPOCH: 32


Loss=0.008598770014941692 Batch_id=468 Accuracy=99.10: 100%|██████████| 469/469 [00:12<00:00, 36.72it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0193, Accuracy: 9944/10000 (99.44%)

EPOCH: 33


Loss=0.041089463979005814 Batch_id=468 Accuracy=99.11: 100%|██████████| 469/469 [00:13<00:00, 35.51it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0226, Accuracy: 9937/10000 (99.37%)

EPOCH: 34


Loss=0.07031505554914474 Batch_id=468 Accuracy=99.02: 100%|██████████| 469/469 [00:12<00:00, 36.44it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0219, Accuracy: 9937/10000 (99.37%)

EPOCH: 35


Loss=0.050776585936546326 Batch_id=468 Accuracy=99.06: 100%|██████████| 469/469 [00:12<00:00, 41.89it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0196, Accuracy: 9936/10000 (99.36%)

EPOCH: 36


Loss=0.023206740617752075 Batch_id=468 Accuracy=99.09: 100%|██████████| 469/469 [00:12<00:00, 36.84it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0228, Accuracy: 9936/10000 (99.36%)

EPOCH: 37


Loss=0.04634742811322212 Batch_id=468 Accuracy=99.11: 100%|██████████| 469/469 [00:13<00:00, 35.82it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0226, Accuracy: 9937/10000 (99.37%)

EPOCH: 38


Loss=0.03461523726582527 Batch_id=468 Accuracy=99.14: 100%|██████████| 469/469 [00:13<00:00, 35.78it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.0195, Accuracy: 9938/10000 (99.38%)

EPOCH: 39


Loss=0.012321293354034424 Batch_id=468 Accuracy=99.12: 100%|██████████| 469/469 [00:13<00:00, 34.85it/s]



Test set: Average loss: 0.0219, Accuracy: 9943/10000 (99.43%)
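
The `run` helper is defined earlier in the notebook, so how it applies `l2=0.0001` is not visible here. A common approach in PyTorch is to pass the value as `weight_decay` to the optimizer, which adds `weight_decay * w` to each gradient before the update. The sketch below reproduces that update rule in plain Python, assuming a plain SGD step without momentum; `sgd_step_with_l2` is a hypothetical name, not a function from the notebook.

```python
def sgd_step_with_l2(weights, grads, lr, l2):
    """One plain SGD step where the L2 term enters as weight decay.

    Adding l2 * w to each gradient corresponds to the penalty
    (l2 / 2) * sum(w ** 2), the convention used by the
    weight_decay argument of torch.optim.SGD.
    """
    return [w - lr * (g + l2 * w) for w, g in zip(weights, grads)]

weights = [0.5, -1.0, 2.0]
grads = [0.0, 0.0, 0.0]  # zero data gradient isolates the decay effect
updated = sgd_step_with_l2(weights, grads, lr=0.1, l2=0.0001)

# Each weight shrinks by the factor (1 - lr * l2).
for before, after in zip(weights, updated):
    print(before, "->", after)
```

With a coefficient this small the per-step shrinkage is tiny, which fits the stable test accuracy seen in this run.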



## **With Both L1 and L2**

In [23]:
l1_l2_loss, l1_l2_accuracy, misclassified_pred_l1_l2 = run(l1=0.01, l2=0.0001)

  0%|          | 0/469 [00:00<?, ?it/s]

EPOCH: 0


Loss=1.7236238718032837 Batch_id=468 Accuracy=89.97: 100%|██████████| 469/469 [00:14<00:00, 32.50it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4382, Accuracy: 8904/10000 (89.04%)

EPOCH: 1


Loss=0.9092704653739929 Batch_id=468 Accuracy=93.03: 100%|██████████| 469/469 [00:15<00:00, 30.73it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.6758, Accuracy: 8226/10000 (82.26%)

EPOCH: 2


Loss=1.1240334510803223 Batch_id=468 Accuracy=92.60: 100%|██████████| 469/469 [00:14<00:00, 37.62it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 2.7030, Accuracy: 2538/10000 (25.38%)

EPOCH: 3


Loss=0.8200960755348206 Batch_id=468 Accuracy=92.45: 100%|██████████| 469/469 [00:15<00:00, 30.91it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4240, Accuracy: 8846/10000 (88.46%)

EPOCH: 4


Loss=0.9446072578430176 Batch_id=468 Accuracy=92.33: 100%|██████████| 469/469 [00:15<00:00, 30.80it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.8614, Accuracy: 6974/10000 (69.74%)

EPOCH: 5


Loss=0.9442675113677979 Batch_id=468 Accuracy=92.37: 100%|██████████| 469/469 [00:15<00:00, 30.87it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3664, Accuracy: 8970/10000 (89.70%)

EPOCH: 6


Loss=0.7987424731254578 Batch_id=468 Accuracy=92.55: 100%|██████████| 469/469 [00:15<00:00, 38.38it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4511, Accuracy: 8870/10000 (88.70%)

EPOCH: 7


Loss=0.8482500910758972 Batch_id=468 Accuracy=92.53: 100%|██████████| 469/469 [00:15<00:00, 30.07it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.2533, Accuracy: 9271/10000 (92.71%)

EPOCH: 8


Loss=0.8354435563087463 Batch_id=468 Accuracy=92.28: 100%|██████████| 469/469 [00:15<00:00, 29.80it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3499, Accuracy: 9089/10000 (90.89%)

EPOCH: 9


Loss=0.909902811050415 Batch_id=468 Accuracy=92.34: 100%|██████████| 469/469 [00:15<00:00, 30.08it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.7689, Accuracy: 7327/10000 (73.27%)

EPOCH: 10


Loss=1.0604593753814697 Batch_id=468 Accuracy=92.33: 100%|██████████| 469/469 [00:14<00:00, 31.41it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.5641, Accuracy: 8312/10000 (83.12%)

EPOCH: 11


Loss=1.0146172046661377 Batch_id=468 Accuracy=92.46: 100%|██████████| 469/469 [00:15<00:00, 31.18it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.2693, Accuracy: 9361/10000 (93.61%)

EPOCH: 12


Loss=0.9009400606155396 Batch_id=468 Accuracy=92.60: 100%|██████████| 469/469 [00:14<00:00, 31.53it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.6363, Accuracy: 8013/10000 (80.13%)

EPOCH: 13


Loss=0.9359372854232788 Batch_id=468 Accuracy=92.52: 100%|██████████| 469/469 [00:15<00:00, 30.32it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.5691, Accuracy: 8143/10000 (81.43%)

EPOCH: 14


Loss=0.9470521211624146 Batch_id=468 Accuracy=92.20: 100%|██████████| 469/469 [00:14<00:00, 31.44it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3115, Accuracy: 9161/10000 (91.61%)

EPOCH: 15


Loss=0.8851172924041748 Batch_id=468 Accuracy=92.51: 100%|██████████| 469/469 [00:14<00:00, 31.39it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.2423, Accuracy: 9384/10000 (93.84%)

EPOCH: 16


Loss=0.9492326974868774 Batch_id=468 Accuracy=92.46: 100%|██████████| 469/469 [00:15<00:00, 30.94it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3816, Accuracy: 8818/10000 (88.18%)

EPOCH: 17


Loss=0.8060396909713745 Batch_id=468 Accuracy=92.57: 100%|██████████| 469/469 [00:14<00:00, 31.57it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3445, Accuracy: 9003/10000 (90.03%)

EPOCH: 18


Loss=1.0917060375213623 Batch_id=468 Accuracy=92.66: 100%|██████████| 469/469 [00:14<00:00, 31.48it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 2.4090, Accuracy: 4651/10000 (46.51%)

EPOCH: 19


Loss=0.8468632698059082 Batch_id=468 Accuracy=92.59: 100%|██████████| 469/469 [00:15<00:00, 30.90it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.2514, Accuracy: 9353/10000 (93.53%)

EPOCH: 20


Loss=0.9918744564056396 Batch_id=468 Accuracy=92.44: 100%|██████████| 469/469 [00:15<00:00, 30.33it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3167, Accuracy: 9164/10000 (91.64%)

EPOCH: 21


Loss=0.9775040149688721 Batch_id=468 Accuracy=92.55: 100%|██████████| 469/469 [00:14<00:00, 31.28it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.6261, Accuracy: 8080/10000 (80.80%)

EPOCH: 22


Loss=0.9625180959701538 Batch_id=468 Accuracy=92.58: 100%|██████████| 469/469 [00:14<00:00, 31.56it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3892, Accuracy: 8786/10000 (87.86%)

EPOCH: 23


Loss=0.8371215462684631 Batch_id=468 Accuracy=92.33: 100%|██████████| 469/469 [00:14<00:00, 31.57it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4180, Accuracy: 8746/10000 (87.46%)

EPOCH: 24


Loss=1.024937629699707 Batch_id=468 Accuracy=92.38: 100%|██████████| 469/469 [00:15<00:00, 30.96it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3168, Accuracy: 9145/10000 (91.45%)

EPOCH: 25


Loss=0.8105062246322632 Batch_id=468 Accuracy=92.41: 100%|██████████| 469/469 [00:14<00:00, 32.46it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.6103, Accuracy: 8391/10000 (83.91%)

EPOCH: 26


Loss=0.9181852340698242 Batch_id=468 Accuracy=92.46: 100%|██████████| 469/469 [00:14<00:00, 31.52it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3437, Accuracy: 9290/10000 (92.90%)

EPOCH: 27


Loss=0.9254135489463806 Batch_id=468 Accuracy=92.39: 100%|██████████| 469/469 [00:15<00:00, 30.79it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.8337, Accuracy: 7360/10000 (73.60%)

EPOCH: 28


Loss=0.9960976839065552 Batch_id=468 Accuracy=92.39: 100%|██████████| 469/469 [00:14<00:00, 31.40it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3418, Accuracy: 9000/10000 (90.00%)

EPOCH: 29


Loss=0.9977209568023682 Batch_id=468 Accuracy=92.52: 100%|██████████| 469/469 [00:14<00:00, 31.63it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3278, Accuracy: 9157/10000 (91.57%)

EPOCH: 30


Loss=0.8191602826118469 Batch_id=468 Accuracy=92.49: 100%|██████████| 469/469 [00:14<00:00, 31.82it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4513, Accuracy: 8579/10000 (85.79%)

EPOCH: 31


Loss=0.8704522252082825 Batch_id=468 Accuracy=92.44: 100%|██████████| 469/469 [00:15<00:00, 37.74it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.4097, Accuracy: 8836/10000 (88.36%)

EPOCH: 32


Loss=0.9751906991004944 Batch_id=468 Accuracy=92.52: 100%|██████████| 469/469 [00:15<00:00, 30.36it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 1.2092, Accuracy: 6660/10000 (66.60%)

EPOCH: 33


Loss=0.94305419921875 Batch_id=468 Accuracy=92.46: 100%|██████████| 469/469 [00:15<00:00, 29.62it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 1.9455, Accuracy: 3834/10000 (38.34%)

EPOCH: 34


Loss=0.9410853981971741 Batch_id=468 Accuracy=92.54: 100%|██████████| 469/469 [00:16<00:00, 29.13it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 1.3212, Accuracy: 6215/10000 (62.15%)

EPOCH: 35


Loss=0.9486172199249268 Batch_id=468 Accuracy=92.45: 100%|██████████| 469/469 [00:15<00:00, 29.86it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.3719, Accuracy: 8995/10000 (89.95%)

EPOCH: 36


Loss=0.8099182844161987 Batch_id=468 Accuracy=92.66: 100%|██████████| 469/469 [00:15<00:00, 29.85it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.7470, Accuracy: 7827/10000 (78.27%)

EPOCH: 37


Loss=1.050452470779419 Batch_id=468 Accuracy=92.49: 100%|██████████| 469/469 [00:15<00:00, 30.41it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 0.5138, Accuracy: 8319/10000 (83.19%)

EPOCH: 38


Loss=0.8734185099601746 Batch_id=468 Accuracy=92.57: 100%|██████████| 469/469 [00:16<00:00, 28.85it/s]
  0%|          | 0/469 [00:00<?, ?it/s]


Test set: Average loss: 1.5773, Accuracy: 4608/10000 (46.08%)

EPOCH: 39


Loss=1.0680984258651733 Batch_id=468 Accuracy=92.39: 100%|██████████| 469/469 [00:15<00:00, 30.54it/s]



Test set: Average loss: 0.5957, Accuracy: 8099/10000 (80.99%)
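
For reference, a combined penalty with `l1=0.01` and `l2=0.0001` can be written as the base loss plus `l1 * sum(|w|)` plus `l2 * sum(w ** 2)`. The sketch below is only an illustration under that assumption: `penalized_loss` is a hypothetical name, the weights and base loss are made-up toy values, and the notebook's `run` may scale or apply the terms differently (for example, L2 through the optimizer's `weight_decay`).

```python
def penalized_loss(base_loss, weights, l1=0.0, l2=0.0):
    """Add L1 and L2 penalty terms to a base loss value.

    Assumed form: base + l1 * sum(|w|) + l2 * sum(w ** 2).
    Some implementations use l2 / 2 instead of l2.
    """
    l1_term = l1 * sum(abs(w) for w in weights)
    l2_term = l2 * sum(w * w for w in weights)
    return base_loss + l1_term + l2_term

toy_weights = [0.5, -1.0, 2.0]  # toy values, not taken from the model
toy_base = 0.05                 # toy base loss

total = penalized_loss(toy_base, toy_weights, l1=0.01, l2=0.0001)
l1_only = penalized_loss(0.0, toy_weights, l1=0.01)
l2_only = penalized_loss(0.0, toy_weights, l2=0.0001)
print("total:", total, "L1 part:", l1_only, "L2 part:", l2_only)
```

With these coefficients the L1 term is far larger than the L2 term. If the penalty is included in the reported training loss (an assumption, since `run` is defined elsewhere), that would be consistent with the training `Loss` values of roughly 0.8 to 1.1 after the first epoch here, against values mostly below 0.1 in the L2-only run.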

