# Introduction

I am starting a series of posts on Medium covering most of the CNN architectures published so far, implemented in PyTorch and TensorFlow. I believe that once we have hands-on experience with the standard architectures, we will be ready to build our own custom CNN architectures for any task.

I am starting with one of the oldest CNN architectures, LeNet (1998). It was developed primarily for recognizing handwritten and machine-printed characters.

<img src="https://miro.medium.com/max/700/1*lvvWF48t7cyRWqct13eU0w.jpeg">

The picture above summarizes LeNet's architecture. Let's break it down layer by layer.


## LeNet Architecture
S.No | Layers | Output Shape (Height, Width, Channels)
--- | --- | ---
1 | Input Layer | 32 x 32 x 1
2 | Conv2d [6 Filters of size = 5x5, stride = 1, padding = 0 ] | 28 x 28 x 6
3 | Average Pooling [stride = 2, padding = 0] | 14 x 14 x 6
4 | Conv2d [16 Filters of size = 5x5, stride = 1, padding = 0 ] | 10 x 10 x 16
5 | Average Pooling [stride = 2, padding = 0] | 5 x 5 x 16
6 | Conv2d [120 Filters of size = 5x5, stride = 1, padding = 0 ] | 1 x 1 x 120
7 | Flatten | 120
8 | Linear1 Layer | 84
9 | Linear2 Layer (output) | 10



<img src="https://miro.medium.com/max/330/1*D47ER7IArwPv69k3O_1nqQ.png">

## Number of Learnable Parameters = [i x (f x f) x o] + b
i = Number of input channels in conv2d

f = Filter size

o = Number of filters (output channels)

b = Number of biases (one per filter, so b = o)
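The formula above can be checked with a few lines of Python. This is a minimal sketch that reads the formula as "weights plus one bias per filter"; the helper name `conv2d_param_count` and the `lenet_convs` dictionary are made up for illustration and are not part of the notebook.

```python
def conv2d_param_count(in_channels, filter_size, num_filters):
    """Learnable parameters of a square-kernel conv layer with one bias per filter."""
    weights = in_channels * filter_size * filter_size * num_filters
    biases = num_filters
    return weights + biases


# (input channels, filter size, number of filters) for LeNet's three conv layers
lenet_convs = {
    "conv1": (1, 5, 6),
    "conv2": (6, 5, 16),
    "conv3": (16, 5, 120),
}

conv_counts = {name: conv2d_param_count(*cfg) for name, cfg in lenet_convs.items()}
for name, count in conv_counts.items():
    print(name, count)
```

The three counts should agree with the per-layer values worked out by hand in the next section.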


## Output size calculation after applying convolution
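Stride and padding are constant across the convolutional layers (S = 1, P = 0); the average pooling layers use a 2x2 window with S = 2 and P = 0. For an input of size W and a filter of size F, the output size is ((W + 2P - F) / S) + 1.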

1. Input Layer shape = 32 x 32 x 1
2. After applying conv2d with 6 filters of (5x5),
  * Output shape = ((32 + 0 - 5) / 1) + 1 = 28
  * No of Learnable Parameters = ([1 x (5 x 5) x 1] + 1) x 6 filters = 156
3. After applying Average Pooling (2x2),
  * Output shape = ((28 + 0 - 2) / 2) + 1 = 14
  * No of Learnable Parameters = None (0)
4. After applying conv2d with 16 filters of (5x5),
  * Output shape = ((14 + 0 - 5) / 1) + 1 = 10
  * No of Learnable Parameters = ([6 x (5 x 5) x 1] + 1) x 16 filters = 2416
5. After applying Average Pooling (2x2),
  * Output shape = ((10 + 0 - 2) / 2) + 1 = 5
  * No of Learnable Parameters = None (0)
6. After applying conv2d with 120 filters of (5x5),
  * Output shape = ((5 + 0 - 5) / 1) + 1 = 1
  * No of Learnable Parameters = ([16 x (5 x 5) x 1] + 1) x 120 filters = 48120
7. Apply Linear Layer of 84 neurons,
  * No of Learnable Parameters = (120 x 84 + 84) = 10164
8. Apply Linear Layer of 10 neurons,
  * No of Learnable Parameters = (84 x 10 + 10) = 850
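The walk-through above can be reproduced in plain Python. This is a rough sketch rather than part of the original notebook: `output_size`, `linear_param_count` and the `stages` list are hypothetical names, and the three convolution counts in the total are copied from the list above.

```python
def output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer: ((W + 2P - F) / S) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def linear_param_count(in_features, out_features):
    """Weights plus one bias per output neuron."""
    return in_features * out_features + out_features


# (layer name, kernel size, stride); padding is 0 everywhere
stages = [
    ("conv1", 5, 1),
    ("avgpool1", 2, 2),
    ("conv2", 5, 1),
    ("avgpool2", 2, 2),
    ("conv3", 5, 1),
]

size = 32  # input height and width
sizes = []
for name, kernel, stride in stages:
    size = output_size(size, kernel, stride)
    sizes.append(size)
    print(f"{name}: {size} x {size}")

fc_counts = [linear_param_count(120, 84), linear_param_count(84, 10)]
# Convolution counts (156, 2416, 48120) are taken from the list above
total = 156 + 2416 + 48120 + sum(fc_counts)
print("linear layers:", fc_counts, "total:", total)
```

Summing all layers this way gives the size of the whole network, which is a handy sanity check against whatever a model-summary tool reports.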


In [1]:
!pip install softposit

Collecting softposit
  Downloading softposit-0.3.4.4.tar.gz (118 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: softposit
  Building wheel for softposit (setup.py) ... done
  Created wheel for softposit: filename=softposit-0.3.4.4-cp310-cp310-linux_x86_64.whl size=374168 sha256=7e0ca41531f4732baee700adcbf94cc9c1a83c9ebdfd82590826e01a32b62570
  Stored in directory: /root/.cache/pip/wheels/99/f1/20/d5f8be9cc554fe2ec37ec65ddc64002b5bee71f44899e3e33c
Successfully built softposit
Installing collected packages: softposit
Successfully installed softposit-0.3.4.4


In [2]:
# Importing necessary modules
import time
import torch
import torch.nn as nn
import torchvision.datasets as datasets
import torch.optim as optim
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torch.autograd import Variable
import softposit as sp

!pip install torchsummaryX --quiet
from torchsummaryX import summary as summaryX
from torchsummary import summary

from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter()

In [3]:
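def change_tensor_values(tensor):
    """Quantize every element of `tensor` to posit8 precision.

    Each element is converted to an 8-bit posit with softposit and then back
    to a Python float. The loop runs element by element in Python, so it is
    slow on large tensors.
    """
    shape = tensor.shape
    flat = tensor.flatten()  # Flatten the tensor to a 1D array
    for i in range(len(flat)):
        temp = sp.posit8(float(flat[i]))  # convert the value to posit8
        flat[i] = float(temp)             # write the quantized value back
    return flat.reshape(shape)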

class LeNet(nn.Module):
  def __init__(self):
    super(LeNet, self).__init__()

    self.conv1 = nn.Conv2d(in_channels = 1, out_channels = 6,
                           kernel_size = 5, stride = 1, padding = 0)
    self.conv2 = nn.Conv2d(in_channels = 6, out_channels = 16,
                           kernel_size = 5, stride = 1, padding = 0)
    self.conv3 = nn.Conv2d(in_channels = 16, out_channels = 120,
                           kernel_size = 5, stride = 1, padding = 0)
    self.linear1 = nn.Linear(120, 84)
    self.linear2 = nn.Linear(84, 10)
    self.tanh = nn.Tanh()
    self.avgpool = nn.AvgPool2d(kernel_size = 2, stride = 2)

  def forward(self, x):
    x = change_tensor_values(x)
    x = self.conv1(x)
    x = self.tanh(x)
    x = self.avgpool(x)
    x = change_tensor_values(x)
    x = self.conv2(x)
    x = self.tanh(x)
    x = self.avgpool(x)
    x = change_tensor_values(x)
    x = self.conv3(x)
    x = self.tanh(x)
    x = change_tensor_values(x)
    x = x.reshape(x.shape[0], -1)
    x = self.linear1(x)
    x = self.tanh(x)
    x = change_tensor_values(x)
    x = self.linear2(x)
    return x

model = LeNet()
model

LeNet(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (conv3): Conv2d(16, 120, kernel_size=(5, 5), stride=(1, 1))
  (linear1): Linear(in_features=120, out_features=84, bias=True)
  (linear2): Linear(in_features=84, out_features=10, bias=True)
  (tanh): Tanh()
  (avgpool): AvgPool2d(kernel_size=2, stride=2, padding=0)
)

In [4]:
# x = torch.randn(64,1,32,32).type(torch.cuda.FloatTensor)
# model = model().to(device)
# x = torch.randn(64,1,32,32)
# output = model(x)
# print(output.shape)
# summary(model, (1,32,32))

# Loading MNIST

In [5]:
# Hyperparameters
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
learning_rate = 0.01
num_epochs = 10

# MNIST images are 28 x 28; Pad(2) adds 2 pixels on each side to match LeNet's 32 x 32 input
train_dataset = datasets.MNIST(root='dataset/', train=True, transform=transforms.Compose([transforms.Pad(2), transforms.ToTensor()]), download=True)
test_dataset = datasets.MNIST(root='dataset/', train=False, transform=transforms.Compose([transforms.Pad(2), transforms.ToTensor()]), download=True)

train_loader = DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=64, shuffle=True)
dataset_sizes = {'train':len(train_dataset), 'test':len(test_dataset)}

model = LeNet().to(device)
criterion = nn.CrossEntropyLoss().to(device)
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to dataset/MNIST/raw/train-images-idx3-ubyte.gz
100%|██████████| 9912422/9912422 [00:00<00:00, 167604799.06it/s]
Extracting dataset/MNIST/raw/train-images-idx3-ubyte.gz to dataset/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to dataset/MNIST/raw/train-labels-idx1-ubyte.gz
100%|██████████| 28881/28881 [00:00<00:00, 24986735.52it/s]
Extracting dataset/MNIST/raw/train-labels-idx1-ubyte.gz to dataset/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to dataset/MNIST/raw/t10k-images-idx3-ubyte.gz
100%|██████████| 1648877/1648877 [00:00<00:00, 38545607.24it/s]
Extracting dataset/MNIST/raw/t10k-images-idx3-ubyte.gz to dataset/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to dataset/MNIST/raw/t10k-labels-idx1-ubyte.gz
100%|██████████| 4542/4542 [00:00<00:00, 16681723.96it/s]
Extracting dataset/MNIST/raw/t10k-labels-idx1-ubyte.gz to dataset/MNIST/raw


In [6]:
from IPython.display import HTML, display
class ProgressMonitor(object):
    """
    Custom IPython progress bar for training
    """

    tmpl = """
        <p>Loss: {loss:0.4f}   {value} / {length}</p>
        <progress value='{value}' max='{length}' style='width: 100%'>{value}</progress>
    """

    def __init__(self, length):
        self.length = length
        self.count = 0
        self.display = display(self.html(0, 0), display_id=True)

    def html(self, count, loss):
        return HTML(self.tmpl.format(length=self.length, value=count, loss=loss))

    def update(self, count, loss):
        self.count += count
        self.display.update(self.html(self.count, loss))

def train_new(model, criterion, optimizer, num_epochs, dataloaders, dataset_sizes, first_epoch=1):
  # dataloaders is [train_loader, test_loader]; dataset_sizes maps 'train' and 'test' to sample counts
  since = time.time()
  plot_train_loss = []
  plot_valid_loss = []


  for epoch in range(first_epoch, first_epoch + num_epochs):
      print()
      print('Epoch', epoch)
      running_loss = 0.0
      valid_loss = 0.0

      # train phase
      model.train()

      # create a progress bar
      progress = ProgressMonitor(length=dataset_sizes["train"])

      for data in dataloaders[0]:
          # Unpack the batch and move it to the selected device
          # (the deprecated Variable wrapper is not needed for autograd)
          inputs, labels  = data
          batch_size = inputs.shape[0]

          inputs = inputs.to(device)
          labels = labels.to(device)

          # clear previous gradient computation
          optimizer.zero_grad()
          outputs = model(inputs)
          loss = criterion(outputs, labels)

          loss.backward()
          optimizer.step()

          # accumulate the batch loss as a Python float, weighted by batch size
          running_loss += loss.item() * batch_size
          # update progress bar (shows the running sum of the loss)
          progress.update(batch_size, running_loss)

      epoch_loss = running_loss / dataset_sizes["train"]
      print('Training loss:', epoch_loss)
      writer.add_scalar('Training Loss', epoch_loss, epoch)
      plot_train_loss.append(epoch_loss)

      # validation phase
      model.eval()
      # We don't need gradients for validation, so wrap in
      # no_grad to save memory
      with torch.no_grad():
        for data in dataloaders[-1]:
            inputs, labels  = data
            batch_size = inputs.shape[0]

            inputs = inputs.to(device)
            labels = labels.to(device)
            outputs = model(inputs)

            # calculate the loss (no gradients are tracked here,
            # so there is nothing to zero on the optimizer)
            loss = criterion(outputs, labels)

            # update running loss value
            valid_loss += loss.item() * batch_size

      epoch_valid_loss = valid_loss / dataset_sizes["test"]
      print('Validation loss:', epoch_valid_loss)
      plot_valid_loss.append(epoch_valid_loss)
      writer.add_scalar('Validation Loss', epoch_valid_loss, epoch)

  time_elapsed = time.time() - since
  print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))

  return plot_train_loss, plot_valid_loss, model

if __name__=="__main__":
  train_losses, valid_losses, model = train_new(model = model ,criterion = criterion,optimizer = optimizer,
                                              num_epochs=3,dataloaders = [train_loader, test_loader],dataset_sizes = dataset_sizes)


Epoch 1


Training loss: 0.26723140478134155
Validation loss: 0.20391511917114258

Epoch 2


Training loss: 0.19443653523921967
Validation loss: 0.1866968870162964

Epoch 3


Training loss: 0.1816229224205017
Validation loss: 0.19128583371639252
Training complete in 296m 17s


In [7]:
def accuracy(loader, model, train=True):
    """Print how many samples from `loader` the model classifies correctly."""
    num_correct = num_samples = 0
    model.eval()
    with torch.no_grad():
      for inputs, labels in loader:
        inputs = inputs.to(device)
        labels = labels.to(device)

        outputs = model(inputs)
        _, preds = outputs.max(1)  # index of the largest logit is the predicted class
        num_correct += (preds == labels).sum().item()
        num_samples += preds.size(0)
    acc = (num_correct / num_samples) * 100
    split = "training" if train else "testing"
    print("Model Predicted {} correctly out of {} from {} dataset, Accuracy : {:.2f}".format(num_correct, num_samples, split, acc))
    model.train()

accuracy(train_loader, model)
accuracy(test_loader, model, train=False)

Model Predicted 56757 correctly out of 60000 from training dataset, Accuracy : 94.59
Model Predicted 9430 correctly out of 10000 from testing dataset, Accuracy : 94.30


In [8]:
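# add_graph needs an example input; use a dummy batch shaped like the padded MNIST images.
# Tracing may emit warnings because change_tensor_values converts tensors to Python floats.
x = torch.randn(1, 1, 32, 32).to(device)
writer.add_graph(model, x)
writer.close()

# Start tensorboard (optional)
%load_ext tensorboard
%tensorboard --logdir runs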