## Writing Classic CNN LeNet5 from Scratch in PyTorch

In this notebook, we will write LeNet5, one of the earliest convolutional neural networks, from scratch in PyTorch. You can read more about it here: http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf

### Importing the libraries

Let's start by importing the required libraries and defining the hyperparameters and the device we will train on.

In [1]:
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

batch_size = 64
num_classes = 10
learning_rate = 0.001
num_epochs = 10

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)
# device = 'cuda' if torch.cuda.is_available() else 'cpu'
# torch.manual_seed(777)
# if device == 'cuda':
#     torch.cuda.manual_seed_all(777)

cuda


### Downloading and Loading the Dataset

We will download the MNIST dataset through `torchvision` and wrap it in PyTorch data loaders. We also apply a few transformations: resizing the images to 32x32, converting them to tensors, and normalizing them with the dataset's mean and standard deviation.
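To make the normalization step concrete, the sketch below reproduces in plain Python the per-pixel arithmetic behind `transforms.Normalize`, namely `(x - mean) / std`, using the training-set statistics that appear in the loading cell. The helper name `normalize_pixel` is made up for illustration and is not part of torchvision.

```python
# Per-pixel arithmetic behind transforms.Normalize: (x - mean) / std.
# Values are the MNIST training statistics used in this notebook.
MNIST_MEAN = 0.1307
MNIST_STD = 0.3081

def normalize_pixel(x, mean=MNIST_MEAN, std=MNIST_STD):
    """Hypothetical helper: normalize one pixel that ToTensor already scaled to [0, 1]."""
    return (x - mean) / std

black = normalize_pixel(0.0)   # background pixel
white = normalize_pixel(1.0)   # brightest stroke pixel
print(f"black -> {black:.4f}, white -> {white:.4f}")
```

After this step the background sits slightly below zero and the strokes at a few units above it, so the inputs are roughly zero-centered with unit spread.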


In [2]:

# Load MNIST and preprocess: resize to 32x32, convert to tensors, normalize
train_transform = transforms.Compose([
    transforms.Resize(size=(32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.1307,), std=(0.3081,)),
])
test_transform = transforms.Compose([
    transforms.Resize(size=(32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.1325,), std=(0.3105,)),
])

# download=True fetches the files into ./data if they are not already there
train_dataset = torchvision.datasets.MNIST(root='./data', train=True,
                                           transform=train_transform, download=True)
test_dataset = torchvision.datasets.MNIST(root='./data', train=False,
                                          transform=test_transform, download=True)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

# Evaluation does not need a shuffled sample order
test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)

### LeNet5 From Scratch


In [3]:
import numpy as np

arr_ex = np.random.randint(low=-10, high=10, size=(10, 10))

print(arr_ex)
print(arr_ex.size)

[[ -9   4  -6  -8   3   5   3   2   4   7]
 [  3   5  -9  -4   2   3  -3  -5   8   4]
 [  6  -7   3  -3  -1  -3   2   9   5   3]
 [ -6  -8  -2   7   1  -3   6   6  -1  -9]
 [  3   7   2   6   5   0   8   2  -7  -6]
 [  0  -9   1   0  -8   0   8  -3   8   6]
 [ -1   5  -1  -3  -9  -3   2  -9   2  -4]
 [  9   3   4   9   9  -2  -9 -10  -2  -1]
 [  0  -9  -1   9  -5   6  -7  -3  -1  -5]
 [  0 -10   9   4   6  -4   9 -10  -8   5]]
100


In [4]:
class ConvNeuralNet(nn.Module):
    # LeNet5-style network. Unlike the original paper, this version uses
    # BatchNorm, ReLU and max pooling.
    def __init__(self, num_classes):
        super().__init__()
        # 1x32x32 -> conv 5x5 -> 6x28x28 -> pool 2x2 -> 6x14x14
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, stride=1, padding=0),
            nn.BatchNorm2d(6),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        # 6x14x14 -> conv 5x5 -> 16x10x10 -> pool 2x2 -> 16x5x5
        self.layer2 = nn.Sequential(
            nn.Conv2d(6, 16, kernel_size=5, stride=1, padding=0),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        # 16 * 5 * 5 = 400 flattened features
        self.fc = nn.Linear(400, 120)
        self.relu = nn.ReLU()
        self.fc1 = nn.Linear(120, 84)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(84, num_classes)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        # Flatten to (batch_size, 400) for the fully connected layers
        out = out.reshape(out.size(0), -1)
        out = self.fc(out)
        out = self.relu(out)
        out = self.fc1(out)
        out = self.relu1(out)
        # Raw class scores (logits); no softmax here
        out = self.fc2(out)
        return out
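The `400` passed to the first `nn.Linear` follows from the layer sizes above. As a quick check, the sketch below walks a 32x32 input through both conv/pool stages using the standard output-size formula `(size + 2 * padding - kernel) // stride + 1`. The helper `conv_out` is a name chosen here for illustration, not a PyTorch function.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32                                   # input resized to 32x32
size = conv_out(size, kernel=5)             # layer1 conv -> 28
size = conv_out(size, kernel=2, stride=2)   # layer1 pool -> 14
size = conv_out(size, kernel=5)             # layer2 conv -> 10
size = conv_out(size, kernel=2, stride=2)   # layer2 pool -> 5

flat_features = 16 * size * size            # 16 output channels from layer2
print(size, flat_features)                  # 5 400
```

If the input resolution or a kernel size changes, rerunning this arithmetic gives the new input size for `self.fc`.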

### Setting Hyperparameters

In [5]:

model = ConvNeuralNet(num_classes).to(device)

# Define the loss function and the optimizer
cost = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
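`nn.CrossEntropyLoss` takes raw class scores (logits), which is why the network ends in a plain linear layer with no softmax. The plain-Python sketch below shows the per-sample quantity it is based on, the negative log-softmax probability of the true class. The function `cross_entropy` is an illustrative re-implementation for a single sample, not the PyTorch code, which also averages over the batch by default.

```python
import math

def cross_entropy(logits, target):
    """Negative log-softmax probability of the target class for one sample."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# Ten equal scores: the model is guessing, so the loss is log(10)
uniform = cross_entropy([0.0] * 10, target=3)

# A large score on the correct class drives the loss toward zero
confident = cross_entropy([0.0, 0.0, 0.0, 8.0] + [0.0] * 6, target=3)

# The same large score on the wrong class is penalized heavily
wrong = cross_entropy([0.0, 0.0, 0.0, 8.0] + [0.0] * 6, target=5)

print(f"uniform={uniform:.4f} confident={confident:.4f} wrong={wrong:.4f}")
```

The uniform case gives a useful reference point: a 10-class loss near 2.3 means the model has learned nothing yet.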

### Training

In [6]:
total_step = len(train_loader)
model.train()  # BatchNorm layers use per-batch statistics while training
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        images = images.to(device)
        labels = labels.to(device)

        # Forward pass
        outputs = model(images)
        loss = cost(outputs, labels)

        # Backward pass and optimization step
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i + 1) % 400 == 0:
            print(f'Epoch [{epoch+1}/{num_epochs}], Step [{i+1}/{total_step}], Loss: {loss.item():.4f}')



Epoch [1/10], Step [400/938], Loss: 0.0708
Epoch [1/10], Step [800/938], Loss: 0.0323
Epoch [2/10], Step [400/938], Loss: 0.0495
Epoch [2/10], Step [800/938], Loss: 0.0176
Epoch [3/10], Step [400/938], Loss: 0.0137
Epoch [3/10], Step [800/938], Loss: 0.0092
Epoch [4/10], Step [400/938], Loss: 0.0621
Epoch [4/10], Step [800/938], Loss: 0.0036
Epoch [5/10], Step [400/938], Loss: 0.0042
Epoch [5/10], Step [800/938], Loss: 0.0337
Epoch [6/10], Step [400/938], Loss: 0.0038
Epoch [6/10], Step [800/938], Loss: 0.0048
Epoch [7/10], Step [400/938], Loss: 0.0046
Epoch [7/10], Step [800/938], Loss: 0.0066
Epoch [8/10], Step [400/938], Loss: 0.0058
Epoch [8/10], Step [800/938], Loss: 0.0149
Epoch [9/10], Step [400/938], Loss: 0.0116
Epoch [9/10], Step [800/938], Loss: 0.0150
Epoch [10/10], Step [400/938], Loss: 0.0262
Epoch [10/10], Step [800/938], Loss: 0.0004
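The `938` in the log is the number of batches per epoch. The sketch below derives it from the 60,000 MNIST training images and the batch size of 64, assuming the `DataLoader` default of keeping the final partial batch. It also shows why only steps 400 and 800 are printed.

```python
import math

train_size = 60000   # number of MNIST training images
batch_size = 64
log_every = 400      # the training loop prints every 400 steps

# The last, smaller batch is kept, so round up
steps_per_epoch = math.ceil(train_size / batch_size)

# Steps within one epoch at which the loop prints a loss value
logged_steps = [s for s in range(1, steps_per_epoch + 1) if s % log_every == 0]

# Size of the final batch in each epoch
last_batch = train_size - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch, logged_steps, last_batch)  # 938 [400, 800] 32
```

Each printed loss is computed on a single batch of 64 images, which explains why the values jump around from line to line rather than falling smoothly.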


### Testing

In [8]:
# Test the model
# eval() makes BatchNorm use its running statistics instead of per-batch statistics
model.eval()

# In the test phase, we don't need to compute gradients (saves memory)
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        # The predicted class is the index of the largest score
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Accuracy of the network on the 10000 test images: {} %'.format(100 * correct / total))


Accuracy of the network on the 10000 test images: 98.82 %
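The accuracy computation reduces to two steps: take the index of the largest score in each row, then count how often it matches the label. The plain-Python sketch below mirrors that logic on a tiny batch of made-up scores with three classes. The numbers are illustrative only and are not model outputs.

```python
def argmax(scores):
    """Index of the largest score, the per-row equivalent of torch.max(outputs, 1)."""
    return max(range(len(scores)), key=scores.__getitem__)

# Made-up scores for a batch of 3 samples and 3 classes
batch_outputs = [
    [0.1, 2.3, -1.0],
    [1.5, 0.2, 0.3],
    [-0.4, 0.0, 0.9],
]
batch_labels = [1, 0, 1]

predicted = [argmax(row) for row in batch_outputs]
correct = sum(p == y for p, y in zip(predicted, batch_labels))
accuracy = 100 * correct / len(batch_labels)

print(predicted, correct, f"{accuracy:.2f} %")
```

By the same arithmetic, 98.82 % on 10,000 test images corresponds to 9,882 correct predictions and 118 misclassified digits.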
