In [1]:
!pip install torchsummary

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [2]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torchsummary import summary


## Some Notes on our naive model

- We are going to write a network based on what we have learnt so far.
- The input image is 28x28x1. We add as many layers as required to reach a receptive field (RF) of at least 32.
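
As a rough check of the RF target, the sketch below tracks the receptive field layer by layer with the usual recurrence `rf += (kernel - 1) * jump`, where `jump` is the product of the strides seen so far. The layer list mirrors the FirstDNN class defined later in this notebook; the names `LAYERS` and `receptive_fields` are illustrative helpers and not part of the original code.

```python
# Each entry: (layer name, kernel size, stride), mirroring FirstDNN.
LAYERS = [
    ("conv1", 3, 1), ("conv2", 3, 1), ("pool1", 2, 2),
    ("conv3", 3, 1), ("conv4", 3, 1), ("pool2", 2, 2),
    ("conv5", 3, 1), ("conv6", 3, 1), ("conv7", 3, 1),
]


def receptive_fields(layers):
    """Return {layer name: receptive field in input pixels}."""
    rf, jump = 1, 1  # one input pixel sees itself; jump = product of strides so far
    result = {}
    for name, kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the view by (k - 1) jumps
        jump *= stride             # later layers step over more input pixels
        result[name] = rf
    return result


rf_by_layer = receptive_fields(LAYERS)
for name, rf in rf_by_layer.items():
    print(f"{name}: RF = {rf}")
```

Under these assumptions the receptive field reaches 32 at conv6 and 40 at conv7, so the "at least 32" requirement is met.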

## Neural Network Class architecture

**FirstDNN Class:**

The FirstDNN class is a neural network model implemented using the PyTorch library. It consists of several convolutional and pooling layers that perform image classification tasks. This documentation provides an overview of the class and its functionality.

**Class Overview:**

The FirstDNN class is derived from the nn.Module class, which is the base class for all neural network modules in PyTorch. By inheriting from nn.Module, FirstDNN gains access to convenient methods and functionalities for constructing and training neural networks.

**Class Constructor**:

- The constructor __init__ initializes the FirstDNN object. It sets up the architecture of the neural network by defining its layers and their configurations. Here are the layers included in the FirstDNN class:

- self.conv1: The first convolutional layer with 1 input channel, 32 output channels, a kernel size of 3x3, and a padding of 1.
- self.conv2: The second convolutional layer with 32 input channels, 64 output channels, a kernel size of 3x3, and a padding of 1.
- self.pool1: The first max pooling layer with a kernel size of 2x2 and a stride of 2.
- self.conv3: The third convolutional layer with 64 input channels, 128 output channels, a kernel size of 3x3, and a padding of 1.
- self.conv4: The fourth convolutional layer with 128 input channels, 256 output channels, a kernel size of 3x3, and a padding of 1.
- self.pool2: The second max pooling layer with a kernel size of 2x2 and a stride of 2.
- self.conv5: The fifth convolutional layer with 256 input channels, 512 output channels, and a kernel size of 3x3 (no padding).
- self.conv6: The sixth convolutional layer with 512 input channels, 1024 output channels, and a kernel size of 3x3 (no padding).
- self.conv7: The seventh convolutional layer with 1024 input channels, 10 output channels, and a kernel size of 3x3 (no padding).

**Forward Pass:**

- The forward method defines the forward pass of the FirstDNN network. It specifies how input data flows through the defined layers to produce an output. Here is the sequence of operations performed in the forward pass:

- The input x is passed through the first convolutional layer (conv1), followed by a ReLU activation function.
- The result is then passed through the second convolutional layer (conv2), followed by another ReLU activation function.
- The output is fed into the first max pooling layer (pool1).
- The output from the pooling layer is passed through the third and fourth convolutional layers (conv3 and conv4), each followed by a ReLU activation function.
- The output is fed into the second max pooling layer (pool2).
- The output from the pooling layer is passed through the fifth and sixth convolutional layers (conv5 and conv6), each followed by a ReLU activation function.
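- The output from the sixth convolutional layer is passed through the seventh convolutional layer (conv7), which in the code below is also followed by a ReLU activation function. Applying ReLU to the final layer clips negative class scores to zero before the log softmax, which is usually undesirable and may limit what this naive model can learn.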
- The output is then flattened using x.view(-1, 10) to reshape it into a 2D tensor.
- Finally, a log softmax activation function is applied to the flattened output, and the result is returned.
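
The flattening step only works because the spatial size has shrunk to 1x1 by the time the data leaves conv7. The sketch below traces the height/width through the layers with the standard output-size formula `(n + 2p - k) // s + 1`; it is plain Python, and the names `conv_out` and `SHAPE_LAYERS` are illustrative only.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output height/width of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding), mirroring the FirstDNN constructor.
SHAPE_LAYERS = [
    ("conv1", 3, 1, 1), ("conv2", 3, 1, 1), ("pool1", 2, 2, 0),
    ("conv3", 3, 1, 1), ("conv4", 3, 1, 1), ("pool2", 2, 2, 0),
    ("conv5", 3, 1, 0), ("conv6", 3, 1, 0), ("conv7", 3, 1, 0),
]

size = 28  # MNIST images are 28x28
sizes = {}
for name, kernel, stride, padding in SHAPE_LAYERS:
    size = conv_out(size, kernel, stride, padding)
    sizes[name] = size
    print(f"{name}: {size}x{size}")
```

The trace ends at 1x1 with 10 channels, which is why `x.view(-1, 10)` yields one row of 10 class scores per image.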

In [5]:
class FirstDNN(nn.Module):
    def __init__(self):
        super(FirstDNN, self).__init__()
        # nn.Conv2d args: in_channels, out_channels, kernel_size. Shapes below are H*W*C.
        self.conv1 = nn.Conv2d(1, 32, 3, padding=1)     # input 28*28*1,   kernel 3*3*1*32,     output 28*28*32
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)    # input 28*28*32,  kernel 3*3*32*64,    output 28*28*64
        self.pool1 = nn.MaxPool2d(2, 2)                 # input 28*28*64,  output 14*14*64
        self.conv3 = nn.Conv2d(64, 128, 3, padding=1)   # input 14*14*64,  kernel 3*3*64*128,   output 14*14*128
        self.conv4 = nn.Conv2d(128, 256, 3, padding=1)  # input 14*14*128, kernel 3*3*128*256,  output 14*14*256
        self.pool2 = nn.MaxPool2d(2, 2)                 # input 14*14*256, output 7*7*256
        self.conv5 = nn.Conv2d(256, 512, 3)             # input 7*7*256,   kernel 3*3*256*512,  output 5*5*512
        self.conv6 = nn.Conv2d(512, 1024, 3)            # input 5*5*512,   kernel 3*3*512*1024, output 3*3*1024
        self.conv7 = nn.Conv2d(1024, 10, 3)             # input 3*3*1024,  kernel 3*3*1024*10,  output 1*1*10

    def forward(self, x):
        x = self.pool1(F.relu(self.conv2(F.relu(self.conv1(x)))))
        x = self.pool2(F.relu(self.conv4(F.relu(self.conv3(x)))))
        x = F.relu(self.conv6(F.relu(self.conv5(x))))
        x = F.relu(self.conv7(x))  # NOTE: ReLU on the final layer clips negative scores before log_softmax
        x = x.view(-1, 10)  # flatten (N, 10, 1, 1) to (N, 10)
        return F.log_softmax(x, dim=1)  # explicit dim avoids the implicit-dim UserWarning

## Model Initialization and Summary
The following code demonstrates the initialization of a neural network model and the generation of its summary using the torchsummary library. This documentation will outline the purpose and functionality of the code.

- Checking for GPU Availability
  - The variable use_cuda is assigned the value of torch.cuda.is_available(). This function checks if a GPU is available for computation. If a GPU is present, use_cuda is set to True; otherwise, it is set to False. This information is used to determine the device on which the model will be trained and run.

- Device Selection
  - The device variable is initialized based on the value of use_cuda. If use_cuda is True, indicating the presence of a GPU, device is set to "cuda" to utilize the GPU for computation. If use_cuda is False, indicating no GPU availability, device is set to "cpu", indicating the CPU will be used for computation. The device selection ensures the model runs on the available hardware.

- Model Initialization
  - An instance of the FirstDNN class is created and assigned to the model variable. This class represents a neural network model implemented using the PyTorch library. The initialization of model does not require any input arguments. By default, the model will be initialized on the CPU.

To utilize the chosen device (GPU or CPU), the to(device) method is called on the model. This moves the model's parameters and buffers to the specified device, allowing for computations on that device. If a GPU is available (use_cuda is True), the model will be transferred to the GPU. Otherwise, it remains on the CPU.

**Model Summary Generation:**

- The summary function from the torchsummary library is used to generate a summary of the model's architecture and parameter information. The summary function takes two arguments: the model instance and the input_size tuple, representing the expected size of the input to the model.

- In this case, the input_size is set to (1, 28, 28), indicating that the model expects input tensors with a shape of (batch_size, channels, height, width), where batch_size is flexible, and channels, height, and width are fixed at 1, 28, and 28, respectively. Providing the input_size allows the summary function to calculate the number of parameters in the model and display a summary table with detailed information about each layer, including the output shape and number of parameters.
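
The per-layer parameter counts that the summary reports can be reproduced by hand: a Conv2d layer has `in_channels * out_channels * k * k` weights plus one bias per output channel. The sketch below is plain Python; `CONVS` and `conv2d_params` are illustrative names, and the 4-bytes-per-parameter figure assumes float32 weights.

```python
# (in_channels, out_channels) for conv1..conv7 in FirstDNN; all kernels are 3x3.
CONVS = [(1, 32), (32, 64), (64, 128), (128, 256),
         (256, 512), (512, 1024), (1024, 10)]


def conv2d_params(in_ch, out_ch, kernel=3):
    """Weights (in * out * k * k) plus one bias per output channel."""
    return in_ch * out_ch * kernel * kernel + out_ch


counts = [conv2d_params(i, o) for i, o in CONVS]
total = sum(counts)
size_mb = total * 4 / 2**20  # assumes 4 bytes (float32) per parameter

for idx, count in enumerate(counts, start=1):
    print(f"conv{idx}: {count:,}")
print(f"total: {total:,} params, about {size_mb:.2f} MB")
```

The max pooling layers have no learnable parameters, so they do not appear in the list.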

In [6]:
from torchsummary import summary
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
model = FirstDNN().to(device)
summary(model, input_size=(1, 28, 28))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 32, 28, 28]             320
            Conv2d-2           [-1, 64, 28, 28]          18,496
         MaxPool2d-3           [-1, 64, 14, 14]               0
            Conv2d-4          [-1, 128, 14, 14]          73,856
            Conv2d-5          [-1, 256, 14, 14]         295,168
         MaxPool2d-6            [-1, 256, 7, 7]               0
            Conv2d-7            [-1, 512, 5, 5]       1,180,160
            Conv2d-8           [-1, 1024, 3, 3]       4,719,616
            Conv2d-9             [-1, 10, 1, 1]          92,170
Total params: 6,379,786
Trainable params: 6,379,786
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 1.51
Params size (MB): 24.34
Estimated Total Size (MB): 25.85
-------------------------------------



In [7]:
torch.manual_seed(1)

batch_size = 128

kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,
                    transform=transforms.Compose([
                        transforms.ToTensor(),
                        transforms.Normalize((0.1307,), (0.3081,))
                    ])),
    batch_size=batch_size, shuffle=True, **kwargs)
test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=False, transform=transforms.Compose([
                        transforms.ToTensor(),
                        transforms.Normalize((0.1307,), (0.3081,))
                    ])),
    batch_size=batch_size, shuffle=True, **kwargs)
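
Two numbers in the cell above are worth unpacking. With batch_size = 128 and the default behaviour of keeping the last partial batch, the 60,000 training images give 469 batches per epoch, which is the count the progress bars later show. `transforms.Normalize((0.1307,), (0.3081,))` applies `(pixel - mean) / std` to every pixel. The sketch below is plain Python with illustrative helper names (`num_batches`, `normalize`); the dataset sizes are the standard MNIST split.

```python
import math

TRAIN_SAMPLES, TEST_SAMPLES = 60_000, 10_000  # standard MNIST split
BATCH_SIZE = 128
MEAN, STD = 0.1307, 0.3081  # values passed to transforms.Normalize


def num_batches(n_samples, batch_size):
    """Batches per epoch when the last partial batch is kept."""
    return math.ceil(n_samples / batch_size)


def normalize(pixel):
    """Per-pixel normalization: subtract the mean, divide by the std."""
    return (pixel - MEAN) / STD


train_batches = num_batches(TRAIN_SAMPLES, BATCH_SIZE)
test_batches = num_batches(TEST_SAMPLES, BATCH_SIZE)
black, white = normalize(0.0), normalize(1.0)

print(f"train batches: {train_batches}, test batches: {test_batches}")
print(f"black pixel -> {black:.4f}, white pixel -> {white:.4f}")
```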





## Training Function Documentation


### Parameters
*   model (nn.Module): The neural network model to be trained.
*   device (torch.device): The device (CPU or GPU) on which the training will be performed.
*   train_loader (torch.utils.data.DataLoader): The data loader object that provides the training dataset in batches.
*   optimizer (torch.optim.Optimizer): The optimizer used to update the model's parameters.
*   epoch (int): The current epoch number.

### Function Description
The train function performs the training process for a given neural network model. It iterates over the training data in batches and updates the model's parameters based on the calculated loss.

### Function Steps
*   Set the model in training mode using model.train(). This ensures that the model is prepared for training and enables features such as dropout.
*   Create a progress bar (pbar) using tqdm(train_loader) to track the training progress.
*   Iterate over the batches of the training data using enumerate(pbar), so that the progress bar advances with each batch.
*   Retrieve the input data (data) and corresponding target labels (target) from the current batch. Move both data and target to the specified device using data.to(device) and target.to(device).
*   Clear the gradients of the optimizer using optimizer.zero_grad() to prepare for a new gradient calculation.
*   Perform a forward pass of the input data through the model to obtain the predicted output using output = model(data).
*   Calculate the loss between the predicted output and the target labels using the negative log-likelihood loss (F.nll_loss) and assign it to loss.
*   Perform backpropagation by calling loss.backward() to compute the gradients of the model's parameters with respect to the loss.
*   Update the model's parameters using the optimizer by calling optimizer.step().
*   Update the progress bar's description with the current loss and batch index using pbar.set_description(desc=f'loss={loss.item()} batch_id={batch_idx}'). This provides real-time feedback on the training progress.
*   Repeat the per-batch steps (from moving the batch to the device through updating the progress bar) for each batch in the training data.

**The training function completes when all batches have been processed for the given epoch.**
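
To make the loss step concrete, here is a minimal plain-Python sketch of what the negative log-likelihood loss computes on log-softmax outputs: pick the log-probability of the correct class for each sample, negate it, and average over the batch (the default 'mean' reduction). The helper names and the toy logits are made up for illustration and are not taken from the notebook.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax of a list of scores."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def nll_loss(log_probs_batch, targets):
    """Mean negative log-probability of the target class."""
    picked = [-log_probs[t] for log_probs, t in zip(log_probs_batch, targets)]
    return sum(picked) / len(picked)


# Toy batch: two samples, three classes (hypothetical values).
batch_logits = [[2.0, 0.5, 0.1], [0.2, 0.1, 3.0]]
targets = [0, 2]

log_probs = [log_softmax(row) for row in batch_logits]
loss = nll_loss(log_probs, targets)
print(f"loss = {loss:.4f}")
```

The loss is small when the model assigns a high probability to the correct class and grows without bound as that probability approaches zero.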

In [8]:
from tqdm import tqdm
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    pbar = tqdm(train_loader)
    for batch_idx, (data, target) in enumerate(pbar):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        pbar.set_description(desc= f'loss={loss.item()} batch_id={batch_idx}')

## Testing Function Documentation
The provided code snippet presents a testing function for a trained neural network model. This documentation will describe the purpose and functionality of the function.

### Parameters
- model (nn.Module): The trained neural network model to be evaluated.
- device (torch.device): The device (CPU or GPU) on which the testing will be performed.
- test_loader (torch.utils.data.DataLoader): The data loader object that provides the test dataset in batches.

### Function Description
- The test function evaluates the performance of a trained neural network model by testing it on a separate test dataset. It calculates the average loss and accuracy of the model's predictions.

### Function Steps
- Set the model in evaluation mode using model.eval(). This ensures that the model is prepared for evaluation and disables features such as dropout.
- Initialize variables test_loss and correct to track the cumulative loss and the number of correctly predicted samples, respectively.
- Enter a context where gradients are not computed using torch.no_grad().
- Iterate over the batches of the test data using a for loop with data and target as loop variables.
- Move the input data (data) and target labels (target) to the specified device using data.to(device) and target.to(device).
- Perform a forward pass of the input data through the model to obtain the predicted output using output = model(data).
- Calculate the loss between the predicted output and the target labels using the negative log-likelihood loss (F.nll_loss) with the reduction set to 'sum', and add the batch loss to test_loss.
- Find the predicted class labels (pred) by taking the index of the maximum log-probability in each output using output.argmax(dim=1, keepdim=True).
- Compare the predicted class labels with the target labels to count the number of correct predictions. Increment correct by the sum of matches using pred.eq(target.view_as(pred)).sum().item().
- Repeat the per-batch steps (from moving the batch to the device through counting correct predictions) for each batch in the test data.
- Calculate the average test loss by dividing test_loss by the total number of samples in the test dataset: test_loss /= len(test_loader.dataset).
- Print the test results, including the average loss and accuracy, in a formatted message.
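
The bookkeeping in these steps can be sketched without PyTorch. Summing the per-sample losses and dividing once by the dataset size gives the true per-sample average even when the last batch is smaller than the others. The `evaluate` helper and the toy probabilities below are illustrative only.

```python
import math


def evaluate(batches):
    """batches: iterable of (log_probs_batch, targets).

    Returns (average loss per sample, number correct, number of samples).
    """
    total_loss, correct, total = 0.0, 0, 0
    for log_probs_batch, targets in batches:
        for log_probs, target in zip(log_probs_batch, targets):
            total_loss += -log_probs[target]  # summed, like reduction='sum'
            pred = max(range(len(log_probs)), key=log_probs.__getitem__)  # argmax
            correct += int(pred == target)
            total += 1
    return total_loss / total, correct, total


def logs(probs):
    return [math.log(p) for p in probs]


# Two batches of unequal size (hypothetical model outputs).
toy_batches = [
    ([logs([0.7, 0.2, 0.1]), logs([0.1, 0.8, 0.1])], [0, 2]),
    ([logs([0.3, 0.3, 0.4])], [2]),
]

avg_loss, correct, total = evaluate(toy_batches)
print(f"Average loss: {avg_loss:.4f}, Accuracy: {correct}/{total}")
```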

In [None]:
def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))

## Experimentation result

In [9]:
## Baseline with the given LR (0.01) and momentum 0.9: test accuracy reaches 59% after two epochs
model = FirstDNN().to(device)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

for epoch in range(1, 3):
    train(model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)

loss=1.9943186044692993 batch_id=468: 100%|██████████| 469/469 [00:21<00:00, 21.59it/s]



Test set: Average loss: 1.9707, Accuracy: 2790/10000 (28%)



loss=1.2586653232574463 batch_id=468: 100%|██████████| 469/469 [00:21<00:00, 21.44it/s]



Test set: Average loss: 1.1984, Accuracy: 5882/10000 (59%)



In [10]:
## Increased the LR 10x (to 0.1): the model is stuck at 10% accuracy, which means it is not able to learn
model = FirstDNN().to(device)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

for epoch in range(1, 3):
    train(model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)

loss=2.3025858402252197 batch_id=468: 100%|██████████| 469/469 [00:22<00:00, 20.85it/s]



Test set: Average loss: 2.3026, Accuracy: 980/10000 (10%)



loss=2.3025858402252197 batch_id=468: 100%|██████████| 469/469 [00:22<00:00, 20.53it/s]



Test set: Average loss: 2.3026, Accuracy: 980/10000 (10%)
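
The stuck value 2.3026 is not arbitrary: it equals ln(10), the loss of a model that assigns the same probability to all ten classes. One plausible reading, not verified against the actual activations, is that the large learning rate pushed every conv7 pre-activation below zero, so the final ReLU outputs all zeros and no gradient flows back. The sketch below only shows the arithmetic for that scenario; `dead_logits` is a hypothetical stand-in for such an output.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax of a list of scores."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


# Hypothetical output of the final ReLU when every class score was clipped to 0.
dead_logits = [0.0] * 10

log_probs = log_softmax(dead_logits)
loss = -log_probs[3]  # the target class does not matter: all entries are equal
print(f"loss = {loss:.6f}, ln(10) = {math.log(10):.6f}")
```

The accuracy of 980/10000 is consistent with this reading: when all scores tie, argmax returns index 0, and the MNIST test set contains 980 images of the digit 0.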



In [11]:
## Reduced the learning rate here (to 0.001, momentum kept) and accuracy seems to have increased
model = FirstDNN().to(device)
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

for epoch in range(1, 3):
    train(model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)

loss=0.9290955662727356 batch_id=468: 100%|██████████| 469/469 [00:22<00:00, 20.68it/s]



Test set: Average loss: 1.1030, Accuracy: 6136/10000 (61%)



loss=0.7543477416038513 batch_id=468: 100%|██████████| 469/469 [00:24<00:00, 18.87it/s]



Test set: Average loss: 0.8887, Accuracy: 6575/10000 (66%)



In [12]:
## Kept the reduced learning rate (0.001) and also removed momentum: learning is much slower
model = FirstDNN().to(device)
optimizer = optim.SGD(model.parameters(), lr=0.001)

for epoch in range(1, 3):
    train(model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)

loss=2.2996745109558105 batch_id=468: 100%|██████████| 469/469 [00:22<00:00, 21.06it/s]



Test set: Average loss: 2.2983, Accuracy: 3217/10000 (32%)



loss=2.2969436645507812 batch_id=468: 100%|██████████| 469/469 [00:22<00:00, 21.07it/s]



Test set: Average loss: 2.2929, Accuracy: 3855/10000 (39%)
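
A possible explanation for the slow progress without momentum: with momentum 0.9, SGD accumulates past gradients in a velocity buffer (`buf = momentum * buf + grad`, then `p -= lr * buf`), so a steady gradient direction yields an effective step close to `lr / (1 - momentum)`, about 10x larger. The sketch below illustrates this under the idealized assumption of a constant gradient, which real training does not satisfy; `sgd_displacement` is an illustrative helper.

```python
def sgd_displacement(grad, lr, momentum, steps):
    """Total parameter change after `steps` SGD updates with a constant gradient.

    Uses the momentum rule with no dampening:
    buf = momentum * buf + grad; p -= lr * buf.
    """
    buf, p = 0.0, 0.0
    for _ in range(steps):
        buf = momentum * buf + grad
        p -= lr * buf
    return p


STEPS = 469  # batches in one epoch at batch_size = 128
plain = sgd_displacement(grad=1.0, lr=0.001, momentum=0.0, steps=STEPS)
with_momentum = sgd_displacement(grad=1.0, lr=0.001, momentum=0.9, steps=STEPS)
ratio = with_momentum / plain

print(f"no momentum:   {plain:.3f}")
print(f"momentum 0.9:  {with_momentum:.3f}")
print(f"ratio:         {ratio:.2f}")
```

Under this idealization, lr = 0.001 without momentum moves the weights roughly a tenth as far per epoch, which fits the loss barely leaving 2.30 in the run above.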

