<a href="https://www.kaggle.com/code/siddp6/mlp-mnist-2?scriptVersionId=138653910" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Multi-Layer Perceptron, MNIST
---
In this notebook, we will train an MLP to classify images from the [MNIST database](http://yann.lecun.com/exdb/mnist/) hand-written digit database.

The process will be broken down into the following steps:
>1. Load and visualize the data
2. Define a neural network
3. Train the model
4. Evaluate the performance of our trained model on a test dataset!

Before we begin, we have to import the necessary libraries for working with data and PyTorch, as well as a few more for convenience.

In [1]:
import torch
import multiprocessing
import numpy as np
from tqdm import tqdm


---
## Load and Visualize the [Data](http://pytorch.org/docs/stable/torchvision/datasets.html)

Downloading may take a few moments, and you should see your progress as the data is loading. You may also choose to change the `batch_size` if you want to load more data at a time.

This cell will create DataLoaders for each of our datasets.

The code snippet you provided is using the `urllib` module from the `six.moves` package, which is a compatibility module used to handle differences between Python 2 and Python 3. In this code, it sets a custom User-Agent header and installs an opener with the custom User-Agent for all future urllib requests.

Here's what each part of the code does:

1. `from six.moves import urllib`: Import the `urllib` module from the `six.moves` package. In Python 2, `urllib` was split into several modules, and `six.moves` provides a consistent way to access these modules across Python 2 and Python 3.

2. `opener = urllib.request.build_opener()`: This line creates an opener object using the `build_opener()` method from `urllib.request`. An opener is an object that manages the connection to a URL and can be used to open URLs with different settings.

3. `opener.addheaders = [('User-agent', 'Mozilla/5.0')]`: This line adds a custom header to the opener. The header being added is the "User-Agent" header, which is often used to identify the client (in this case, 'Mozilla/5.0' represents a common User-Agent for Mozilla Firefox). Setting a custom User-Agent can be useful in some cases when interacting with web servers that require specific user-agent information.

4. `urllib.request.install_opener(opener)`: This line installs the custom opener as the default opener for all future `urllib.request` calls. This means that any subsequent HTTP requests made using `urllib` will use this custom opener with the added User-Agent header.

After running this code, any subsequent HTTP requests made using `urllib` will include the specified User-Agent header. For example, if you use `urllib.request.urlopen()` to open a URL, the custom User-Agent header will be sent along with the request. This can be useful when you need to mimic a specific user agent or when some websites might block requests from specific User-Agents.

Remember that using a custom User-Agent header might violate some website's terms of service, so it's essential to ensure that you're using it responsibly and in compliance with the website's policies.


In [2]:
from six.moves import urllib

opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
urllib.request.install_opener(opener)

In [3]:
from torchvision import datasets
import torchvision.transforms as transforms

# number of sub-process to load the data
# set to the number of CPUs avaliable
num_workers = multiprocessing.cpu_count()

# number of sample per batch ot load
batch_size = int(48000 / 5)

transform = transforms.Compose([
    # input: np.array or PIL image with range 0-255
    # transform: float tensor with range 0.0 - 1.0
    transforms.ToTensor(),
    
    # transform: float tensor with range -1.0 - 1.0
    # (better for activation like Relu)
    transforms.Normalize((0.5), (0.5)),
])


In [4]:
train_val_data = datasets.MNIST(root = "data", train=True, download=True, transform=transform)

In the specific context of the code snippet you provided, `torch.utils.data.random_split()` uses the random number generator (`generator`) to perform the random split of the dataset (`train_val_data`). By setting the same fixed seed (42) each time the code runs, the split will be consistent and reproducible across different runs.

For example, if you run the script multiple times with the same seed (42), you will get the same training and validation subsets split from the `train_val_data`, making it easier to compare results and analyze model performance.


- 'shuffle=True': This shuffles the data before each epoch during training to introduce randomness and improve convergence.

- 'num_workers': Set the number of worker processes to load data in parallel. This can speed up data loading for large datasets.


In [5]:
# Split into tarin and Validation
train_len = int(len(train_val_data)*.80)
val_len = int(len(train_val_data)*.20)

train_subset, val_subset = torch.utils.data.random_split(
    train_val_data, [train_len, val_len], generator=torch.Generator().manual_seed(42)
)

train_loader = torch.utils.data.DataLoader(
dataset = train_subset, shuffle=True, batch_size=batch_size, num_workers=num_workers
)

val_loader = torch.utils.data.DataLoader(
dataset=val_subset, shuffle=False, batch_size=batch_size, num_workers=num_workers
)

In [6]:
# Get test data
test_data = datasets.MNIST(root="data", train=False, download=True, transform=transform)
test_loader = torch.utils.data.DataLoader(
    test_data, batch_size=batch_size, num_workers=num_workers
)

In [7]:
print(f"Data: {len(train_val_data)}")
print(f"Data for train: {len(train_subset)}")
print(f"Total batches in train loader: {len(train_loader)}")
print(f"Total images in one batch: {len(train_subset)/len(train_loader)}")
print("\n")

batch_1 = next(iter(train_loader))
print(f"Each batch has {len(batch_1)} component, image and label")
print("\n")

print(f"First component has {len(batch_1[0])} images")
print(f"Second component has {len(batch_1[1])} labels")
print(f"Shape of first component of each batch {batch_1[0].shape}")
print(f"Shape of second component of each batch {batch_1[1].shape}")

print("\n")


# print(f"First image from first batch is {batch_1[0][0]} and lable is {batch_1[1][0]}")

print(f"Shape of Image: {batch_1[0][0].shape}")
print(f"Shape of Label: {batch_1[1][0].shape}")
print(f"Shape of Image after view function {batch_1[0][0].view(-1, 28*28).shape}")
print("\n")


print(len(train_loader.dataset))

Data: 60000
Data for train: 48000
Total batches in train loader: 5
Total images in one batch: 9600.0


Each batch has 2 component, image and label


First component has 9600 images
Second component has 9600 labels
Shape of first component of each batch torch.Size([9600, 1, 28, 28])
Shape of second component of each batch torch.Size([9600])


Shape of Image: torch.Size([1, 28, 28])
Shape of Label: torch.Size([])
Shape of Image after view function torch.Size([1, 784])


48000


---
## Define the Network [Architecture](http://pytorch.org/docs/stable/nn.html)


The architecture will be responsible for seeing as input a 784-dim Tensor of pixel values for each image, and producing a Tensor of length 10 (our number of classes) that indicates the class scores for an input image. This particular example uses two hidden layers and dropout to avoid overfitting.

https://ashwinhprasad.medium.com/pytorch-for-deep-learning-nn-linear-and-nn-relu-explained-77f3e1007dbb

In [8]:
model_run_count = 0

In [9]:
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.layer_1 = nn.Linear(784, 128)
        self.layer_2 = nn.Linear(128, 64)
        self.layer_3 = nn.Linear(64, 10)
        
        self.Relu = nn.ReLU()

    def forward(self, X):
        global model_run_count
        model_run_count += 1
        X = X.view(-1, 28*28)
        X = self.Relu(self.layer_1(X))
        X = self.Relu(self.layer_2(X))
        X = self.layer_3(X)
        
        return X

In [10]:
model = Net()

## (Bonus: visualize the structure of your network)
You can visualize your achitecture by using netron.app. Just execute the following cell (which will save the network to a file called "mnist_network.pt" in this directory), then download the produced `mnist_network.pt` to your computer. Finally, go to [Netron.app](https://netron.app) and click on `Open Model`, and select the file you just downloaded.

In [11]:
# scripted = torch.jit.script(model)
# torch.jit.save(scripted, "Net.pt")

1. `criterion = nn.CrossEntropyLoss()`: 
   - The `CrossEntropyLoss` is a loss function used for multi-class classification tasks.
   - It combines the softmax activation and the negative log-likelihood loss.
   - It is suitable for problems where each input belongs to one class among multiple classes.

2. `optimizer = torch.optim.SGD(model.parameters(), lr=0.01)`: 
   - The `SGD` optimizer is used to update the model's parameters during training.
   - It performs stochastic gradient descent, which updates the parameters based on gradients of the loss.
   - `model.parameters()` provides the parameters (weights and biases) of the model to be optimized.
   - `lr=0.01` sets the learning rate, which controls the step size during parameter updates.

In [12]:
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=.01)

---
## Train the Network

The steps for training/learning from a batch of data are described in the comments below:
1. Clear the gradients of all optimized variables
2. Forward pass: compute predicted outputs by passing inputs to the model
3. Calculate the loss
4. Backward pass: compute gradient of the loss with respect to model parameters
5. Perform a single optimization step (parameter update)
6. Update average training loss

The following loop trains for 30 epochs; feel free to change this number. For now, we suggest somewhere between 20-50 epochs. As you train, take a look at how the values for the training loss decrease over time. We want it to decrease while also avoiding overfitting the training data. 

In [13]:
# number of epochs to train the model
n_epochs = 10  # suggest training between 20-50 epochs
model_run_count = 0
model.train() # prep model for training

for epoch in range(n_epochs):
    # monitor training loss
    train_loss = 0.0
    
    ###################
    # train the model #
    ###################
    for batch_idx, (data, target) in tqdm(
            enumerate(train_loader),
            desc="Training",
            total=len(train_loader),
            leave=True,
            ncols=80,
        ):
        
        # clear the gradients of all optimized variables
        optimizer.zero_grad()
        
        # forward pass: compute predicted outputs by passing inputs to the model
        output = model(data)
        
        # calculate the loss
        loss = criterion(output, target)
        
        # backward pass: compute gradient of the loss with respect to model parameters
        loss.backward()
        
        # perform a single optimization step (parameter update)
        optimizer.step()
        
        # update running training loss
        train_loss += loss.item()*data.size(0)
        
        print(f"Train Batch: {batch_idx + 1}/{len(train_loader)}")
        
        print(f"Total loss in this batch: {loss.item()*data.size(0)}")
        print(f"Number of samples in this batch: {data.size(0)}")
        print(f"Average loss per sample in this batch: {loss.item()}")
        print("\n")
        
    # print training statistics 
    # calculate average loss over an epoch
    avergae_train_loss = train_loss/len(train_loader.dataset)
    
    # Validate
    with torch.no_grad():

        # set the model to evaluation mode
        model.eval()

        valid_loss = 0.0
        for batch_idx, (data, target) in tqdm(
                enumerate(val_loader),
                desc="Validating",
                total=len(val_loader),
                leave=True,
                ncols=80,
            ):

            # 1. forward pass: compute predicted outputs by passing inputs to the model
            output = model(data)  # =
            # 2. calculate the loss
            loss = criterion(output, target)  # =
            
            # update running training loss
            valid_loss += loss.item()*data.size(0)
            
            print(f"Val Batch: {batch_idx + 1}/{len(val_loader)}")

            print(f"Total val-loss in this batch: {loss.item()*data.size(0)}")
            print(f"Number of samples in this val-batch: {data.size(0)}")
            print(f"Average val-loss per sample in this val-batch: {loss.item()}")
            print("\n")

        # Calculate average validation loss
        average_valid_loss = valid_loss/len(val_loader.dataset)
            
    
    print("\n######################################")
    print(f"Epoch: {epoch+1} / {n_epochs}") 
    print(f"Total loss in this epoch: {train_loss}")
    print(f"Number of samples in this train-epoch: {len(train_loader.dataset)}")
    print(f"Average loss in this epoch: {avergae_train_loss}")
    print("\n")
    print(f"Total val-loss in this epoch: {valid_loss}")
    print(f"Number of samples in this val-epoch: {len(train_loader.dataset)}")
    print(f"Average val-loss in this epoch: {average_valid_loss}")
    print("######################################")
    print("\n\n")

Training:  60%|█████████████████████              | 3/5 [00:03<00:01,  1.05it/s]

Train Batch: 1/5
Total loss in this batch: 22173.346710205078
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.3097236156463623


Train Batch: 2/5
Total loss in this batch: 22143.852996826172
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.3066513538360596


Train Batch: 3/5
Total loss in this batch: 22135.28594970703
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.3057589530944824


Train Batch: 4/5
Total loss in this batch: 22120.703887939453
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.3042399883270264




Training: 100%|███████████████████████████████████| 5/5 [00:05<00:00,  1.09s/it]

Train Batch: 5/5
Total loss in this batch: 22105.531311035156
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.302659511566162





Validating: 100%|█████████████████████████████████| 2/2 [00:02<00:00,  1.13s/it]

Val Batch: 1/2
Total val-loss in this batch: 22082.379913330078
Number of samples in this val-batch: 9600
Average val-loss per sample in this val-batch: 2.30024790763855


Val Batch: 2/2
Total val-loss in this batch: 5514.0363693237305
Number of samples in this val-batch: 2400
Average val-loss per sample in this val-batch: 2.2975151538848877



######################################
Epoch: 1 / 10
Total loss in this epoch: 110678.72085571289
Number of samples in this train-epoch: 48000
Average loss in this epoch: 2.3058066844940184


Total val-loss in this epoch: 27596.41628265381
Number of samples in this val-epoch: 48000
Average val-loss in this epoch: 2.2997013568878173
######################################






Training:  60%|█████████████████████              | 3/5 [00:04<00:02,  1.06s/it]

Train Batch: 1/5
Total loss in this batch: 22095.146942138672
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.3015778064727783


Train Batch: 2/5
Total loss in this batch: 22067.813873291016
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2987306118011475


Train Batch: 3/5
Total loss in this batch: 22055.982971191406
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2974982261657715


Train Batch: 4/5
Total loss in this batch: 22035.626220703125
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.295377731323242




Training: 100%|███████████████████████████████████| 5/5 [00:05<00:00,  1.18s/it]

Train Batch: 5/5
Total loss in this batch: 22025.592041015625
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.294332504272461





Validating: 100%|█████████████████████████████████| 2/2 [00:02<00:00,  1.10s/it]

Val Batch: 1/2
Total val-loss in this batch: 22006.640625
Number of samples in this val-batch: 9600
Average val-loss per sample in this val-batch: 2.2923583984375


Val Batch: 2/2
Total val-loss in this batch: 5495.083236694336
Number of samples in this val-batch: 2400
Average val-loss per sample in this val-batch: 2.2896180152893066



######################################
Epoch: 2 / 10
Total loss in this epoch: 110280.16204833984
Number of samples in this train-epoch: 48000
Average loss in this epoch: 2.29750337600708


Total val-loss in this epoch: 27501.723861694336
Number of samples in this val-epoch: 48000
Average val-loss in this epoch: 2.291810321807861
######################################






Training:  60%|█████████████████████              | 3/5 [00:03<00:01,  1.06it/s]

Train Batch: 1/5
Total loss in this batch: 22008.83560180664
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2925870418548584


Train Batch: 2/5
Total loss in this batch: 22009.72137451172
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2926793098449707


Train Batch: 3/5
Total loss in this batch: 21980.45196533203
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.28963041305542


Train Batch: 4/5
Total loss in this batch: 21972.676849365234
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.288820505142212




Training: 100%|███████████████████████████████████| 5/5 [00:05<00:00,  1.08s/it]

Train Batch: 5/5
Total loss in this batch: 21947.122192382812
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.286158561706543





Validating: 100%|█████████████████████████████████| 2/2 [00:02<00:00,  1.11s/it]

Val Batch: 1/2
Total val-loss in this batch: 21938.660430908203
Number of samples in this val-batch: 9600
Average val-loss per sample in this val-batch: 2.2852771282196045


Val Batch: 2/2
Total val-loss in this batch: 5477.994346618652
Number of samples in this val-batch: 2400
Average val-loss per sample in this val-batch: 2.2824976444244385



######################################
Epoch: 3 / 10
Total loss in this epoch: 109918.80798339844
Number of samples in this train-epoch: 48000
Average loss in this epoch: 2.289975166320801


Total val-loss in this epoch: 27416.654777526855
Number of samples in this val-epoch: 48000
Average val-loss in this epoch: 2.2847212314605714
######################################






Training:  60%|█████████████████████              | 3/5 [00:03<00:01,  1.04it/s]

Train Batch: 1/5
Total loss in this batch: 21944.583892822266
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2858941555023193


Train Batch: 2/5
Total loss in this batch: 21936.646270751953
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.285067319869995


Train Batch: 3/5
Total loss in this batch: 21917.935180664062
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.28311824798584


Train Batch: 4/5
Total loss in this batch: 21891.261291503906
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2803397178649902




Training: 100%|███████████████████████████████████| 5/5 [00:05<00:00,  1.09s/it]

Train Batch: 5/5
Total loss in this batch: 21898.418426513672
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.281085252761841





Validating: 100%|█████████████████████████████████| 2/2 [00:02<00:00,  1.12s/it]

Val Batch: 1/2
Total val-loss in this batch: 21874.793243408203
Number of samples in this val-batch: 9600
Average val-loss per sample in this val-batch: 2.2786242961883545


Val Batch: 2/2
Total val-loss in this batch: 5461.96231842041
Number of samples in this val-batch: 2400
Average val-loss per sample in this val-batch: 2.275817632675171



######################################
Epoch: 4 / 10
Total loss in this epoch: 109588.84506225586
Number of samples in this train-epoch: 48000
Average loss in this epoch: 2.283100938796997


Total val-loss in this epoch: 27336.755561828613
Number of samples in this val-epoch: 48000
Average val-loss in this epoch: 2.2780629634857177
######################################






Training:  60%|█████████████████████              | 3/5 [00:03<00:01,  1.04it/s]

Train Batch: 1/5
Total loss in this batch: 21878.64761352539
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2790257930755615


Train Batch: 2/5
Total loss in this batch: 21875.4638671875
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2786941528320312


Train Batch: 3/5
Total loss in this batch: 21861.2548828125
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2772140502929688


Train Batch: 4/5
Total loss in this batch: 21834.290313720703
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2744052410125732




Training: 100%|███████████████████████████████████| 5/5 [00:05<00:00,  1.10s/it]

Train Batch: 5/5
Total loss in this batch: 21821.575927734375
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.273080825805664





Validating: 100%|█████████████████████████████████| 2/2 [00:02<00:00,  1.10s/it]

Val Batch: 1/2
Total val-loss in this batch: 21811.81182861328
Number of samples in this val-batch: 9600
Average val-loss per sample in this val-batch: 2.272063732147217


Val Batch: 2/2
Total val-loss in this batch: 5445.990943908691
Number of samples in this val-batch: 2400
Average val-loss per sample in this val-batch: 2.269162893295288



######################################
Epoch: 5 / 10
Total loss in this epoch: 109271.23260498047
Number of samples in this train-epoch: 48000
Average loss in this epoch: 2.2764840126037598


Total val-loss in this epoch: 27257.802772521973
Number of samples in this val-epoch: 48000
Average val-loss in this epoch: 2.271483564376831
######################################






Training:  60%|█████████████████████              | 3/5 [00:04<00:02,  1.09s/it]

Train Batch: 1/5
Total loss in this batch: 21820.182037353516
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.272935628890991


Train Batch: 2/5
Total loss in this batch: 21803.814697265625
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.271230697631836


Train Batch: 3/5
Total loss in this batch: 21781.56509399414
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2689130306243896


Train Batch: 4/5
Total loss in this batch: 21782.276916503906
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2689871788024902




Training: 100%|███████████████████████████████████| 5/5 [00:06<00:00,  1.22s/it]

Train Batch: 5/5
Total loss in this batch: 21764.513397216797
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.267136812210083





Validating: 100%|█████████████████████████████████| 2/2 [00:02<00:00,  1.14s/it]

Val Batch: 1/2
Total val-loss in this batch: 21747.750091552734
Number of samples in this val-batch: 9600
Average val-loss per sample in this val-batch: 2.265390634536743


Val Batch: 2/2
Total val-loss in this batch: 5429.621887207031
Number of samples in this val-batch: 2400
Average val-loss per sample in this val-batch: 2.2623424530029297



######################################
Epoch: 6 / 10
Total loss in this epoch: 108952.35214233398
Number of samples in this train-epoch: 48000
Average loss in this epoch: 2.269840669631958


Total val-loss in this epoch: 27177.371978759766
Number of samples in this val-epoch: 48000
Average val-loss in this epoch: 2.2647809982299805
######################################






Training:  60%|█████████████████████              | 3/5 [00:03<00:01,  1.06it/s]

Train Batch: 1/5
Total loss in this batch: 21751.670837402344
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.265799045562744


Train Batch: 2/5
Total loss in this batch: 21745.667266845703
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2651736736297607


Train Batch: 3/5
Total loss in this batch: 21731.009674072266
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2636468410491943


Train Batch: 4/5
Total loss in this batch: 21704.470825195312
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2608823776245117




Training: 100%|███████████████████████████████████| 5/5 [00:05<00:00,  1.10s/it]

Train Batch: 5/5
Total loss in this batch: 21692.564392089844
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2596421241760254





Validating: 100%|█████████████████████████████████| 2/2 [00:02<00:00,  1.12s/it]

Val Batch: 1/2
Total val-loss in this batch: 21681.44989013672
Number of samples in this val-batch: 9600
Average val-loss per sample in this val-batch: 2.258484363555908


Val Batch: 2/2
Total val-loss in this batch: 5412.569618225098
Number of samples in this val-batch: 2400
Average val-loss per sample in this val-batch: 2.255237340927124



######################################
Epoch: 7 / 10
Total loss in this epoch: 108625.38299560547
Number of samples in this train-epoch: 48000
Average loss in this epoch: 2.2630288124084474


Total val-loss in this epoch: 27094.019508361816
Number of samples in this val-epoch: 48000
Average val-loss in this epoch: 2.257834959030151
######################################






Training:  60%|█████████████████████              | 3/5 [00:03<00:01,  1.03it/s]

Train Batch: 1/5
Total loss in this batch: 21688.357543945312
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2592039108276367


Train Batch: 2/5
Total loss in this batch: 21662.928771972656
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2565550804138184


Train Batch: 3/5
Total loss in this batch: 21657.614135742188
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2560014724731445


Train Batch: 4/5
Total loss in this batch: 21665.453338623047
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2568180561065674




Training: 100%|███████████████████████████████████| 5/5 [00:05<00:00,  1.09s/it]

Train Batch: 5/5
Total loss in this batch: 21610.80551147461
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2511255741119385





Validating: 100%|█████████████████████████████████| 2/2 [00:02<00:00,  1.10s/it]

Val Batch: 1/2
Total val-loss in this batch: 21612.30926513672
Number of samples in this val-batch: 9600
Average val-loss per sample in this val-batch: 2.251282215118408


Val Batch: 2/2
Total val-loss in this batch: 5394.72541809082
Number of samples in this val-batch: 2400
Average val-loss per sample in this val-batch: 2.247802257537842



######################################
Epoch: 8 / 10
Total loss in this epoch: 108285.15930175781
Number of samples in this train-epoch: 48000
Average loss in this epoch: 2.255940818786621


Total val-loss in this epoch: 27007.03468322754
Number of samples in this val-epoch: 48000
Average val-loss in this epoch: 2.250586223602295
######################################






Training:  60%|█████████████████████              | 3/5 [00:03<00:01,  1.05it/s]

Train Batch: 1/5
Total loss in this batch: 21619.608306884766
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.252042531967163


Train Batch: 2/5
Total loss in this batch: 21599.08218383789
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2499043941497803


Train Batch: 3/5
Total loss in this batch: 21581.332397460938
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2480554580688477


Train Batch: 4/5
Total loss in this batch: 21578.1005859375
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2477188110351562




Training: 100%|███████████████████████████████████| 5/5 [00:05<00:00,  1.10s/it]

Train Batch: 5/5
Total loss in this batch: 21551.962280273438
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2449960708618164





Validating: 100%|█████████████████████████████████| 2/2 [00:02<00:00,  1.12s/it]

Val Batch: 1/2
Total val-loss in this batch: 21540.12680053711
Number of samples in this val-batch: 9600
Average val-loss per sample in this val-batch: 2.2437632083892822


Val Batch: 2/2
Total val-loss in this batch: 5376.047515869141
Number of samples in this val-batch: 2400
Average val-loss per sample in this val-batch: 2.2400197982788086



######################################
Epoch: 9 / 10
Total loss in this epoch: 107930.08575439453
Number of samples in this train-epoch: 48000
Average loss in this epoch: 2.248543453216553


Total val-loss in this epoch: 26916.17431640625
Number of samples in this val-epoch: 48000
Average val-loss in this epoch: 2.2430145263671877
######################################






Training:  60%|█████████████████████              | 3/5 [00:04<00:02,  1.11s/it]

Train Batch: 1/5
Total loss in this batch: 21531.731414794922
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2428886890411377


Train Batch: 2/5
Total loss in this batch: 21541.65802001953
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.243922710418701


Train Batch: 3/5
Total loss in this batch: 21513.97933959961
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.241039514541626


Train Batch: 4/5
Total loss in this batch: 21492.597198486328
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.238812208175659




Training: 100%|███████████████████████████████████| 5/5 [00:05<00:00,  1.19s/it]

Train Batch: 5/5
Total loss in this batch: 21479.017639160156
Number of samples in this batch: 9600
Average loss per sample in this batch: 2.2373976707458496





Validating: 100%|█████████████████████████████████| 2/2 [00:02<00:00,  1.09s/it]

Val Batch: 1/2
Total val-loss in this batch: 21464.65301513672
Number of samples in this val-batch: 9600
Average val-loss per sample in this val-batch: 2.235901355743408


Val Batch: 2/2
Total val-loss in this batch: 5356.418609619141
Number of samples in this val-batch: 2400
Average val-loss per sample in this val-batch: 2.2318410873413086



######################################
Epoch: 10 / 10
Total loss in this epoch: 107558.98361206055
Number of samples in this train-epoch: 48000
Average loss in this epoch: 2.240812158584595


Total val-loss in this epoch: 26821.07162475586
Number of samples in this val-epoch: 48000
Average val-loss in this epoch: 2.2350893020629883
######################################








In [14]:
print(f"Total time model run in each epochs :{model_run_count}")
print(f"Total batches in train loader: {len(train_loader)}")
print(f"Total batches in train loader: {len(val_loader)}")
print(f"Total epochs: {n_epochs}")
print("\n")

Total time model run in each epochs :70
Total batches in train loader: 5
Total batches in train loader: 2
Total epochs: 10




---
## Test the Trained Network

Finally, we test our best model on previously unseen **test data** and evaluate it's performance. Testing on unseen data is a good way to check that our model generalizes well. It may also be useful to be granular in this analysis and take a look at how this model performs on each class as well as looking at its overall loss and accuracy.

#### `model.eval()`

`model.eval(`) will set all the layers in your model to evaluation mode. This affects layers like dropout layers that turn "off" nodes during training with some probability, but should allow every node to be "on" for evaluation!

In [15]:
test_loss = 0
class_correct = list(0.0 for i in range(10))
class_total = list(0.0 for i in range(10))

model.eval()

for batch_idx, (data, target) in tqdm(
                            enumerate(test_loader),
                            desc="Testing",
                            total=len(test_loader),
                            leave=True,
                            ncols=80):
    
    output = model(data)
    
    loss = criterion(output, target)    
    test_loss += loss.item()*data.size(0)
    
    _, pred = torch.max(output, 1)

    for i in range(target.shape[0]):
        label = target[i]
        class_correct[label] += (1 if pred[i].item() == label else 0)
        class_total[label] += 1
        
avergae_test_loss = test_loss / len(test_loader.dataset)
print(f"Average test-loss: {avergae_train_loss}")

Testing: 100%|████████████████████████████████████| 2/2 [00:02<00:00,  1.14s/it]

Average test-loss: 2.240812158584595





In [22]:
for i in range(10):
    if class_total[i] > 0:
        print(f"Accuracy of {i}: {100*class_correct[i]/class_total[i]} ({class_correct[i]}/{class_total[i]})")
    else:
        print(f"Accuracy of {i}: N/A (no training examples)")
        
print(f"\nOverall Accuracy: {100 * sum(class_correct)/ sum(class_total)}")

Accuracy of 0: 56.734693877551024 (556.0/980.0)
Accuracy of 1: 96.74008810572687 (1098.0/1135.0)
Accuracy of 2: 9.496124031007753 (98.0/1032.0)
Accuracy of 3: 55.742574257425744 (563.0/1010.0)
Accuracy of 4: 7.2301425661914465 (71.0/982.0)
Accuracy of 5: 0.0 (0.0/892.0)
Accuracy of 6: 0.0 (0.0/958.0)
Accuracy of 7: 75.68093385214007 (778.0/1028.0)
Accuracy of 8: 15.91375770020534 (155.0/974.0)
Accuracy of 9: 22.497522299306244 (227.0/1009.0)

Overall Accuracy: 35.46
