
# LeNet-5: A Pioneering Convolutional Neural Network Architecture

## Introduction

LeNet-5 is one of the earliest convolutional neural networks (CNNs), introduced by Yann LeCun and his collaborators in 1998. It was originally designed for handwritten and machine-printed character recognition. Despite its age, the architecture remains influential and forms the foundation for many modern CNN architectures.

## Historical Context

- Developed at Bell Labs by Yann LeCun et al. in the late 1990s
- Published in the paper "Gradient-Based Learning Applied to Document Recognition" in 1998
- Successfully applied to digit recognition tasks such as reading zip codes and bank checks
- One of the first applications of convolutional neural networks to practical problems

## Architecture Overview

LeNet-5 is a convolutional neural network with seven layers (not counting the input). It takes a 32×32 grayscale image as input and outputs one score per class. The architecture consists of:

1. **Input Layer**: 32×32 grayscale image
2. **C1**: Convolutional layer with 6 feature maps of size 28×28
3. **S2**: Subsampling (average pooling) layer with 6 feature maps of size 14×14
4. **C3**: Convolutional layer with 16 feature maps of size 10×10
5. **S4**: Subsampling (average pooling) layer with 16 feature maps of size 5×5
6. **C5**: Convolutional layer with 120 feature maps of size 1×1
7. **F6**: Fully connected layer with 84 units
8. **Output Layer**: Fully connected layer with 10 units (for digit classification)

![LeNet-5 Architecture](https://www.researchgate.net/profile/Vladimir-Golovko-3/publication/313808170/figure/fig3/AS:714087491317770@1547277416263/Architecture-of-LeNet-5_W640.jpg)
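
The feature-map sizes in the list above follow from the usual output-size rule for a sliding window, `out = (in + 2 * padding - kernel) // stride + 1`. The sketch below traces that rule through the spatial layers, assuming 5×5 convolution kernels with stride 1 and 2×2 pooling windows with stride 2 (the values used in the implementation later in this notebook). The helper name `conv_out` is made up for illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window slides over the input."""
    return (size + 2 * padding - kernel) // stride + 1


# (layer name, kernel size, stride) for the spatial layers of LeNet-5
layers = [
    ("C1", 5, 1),
    ("S2", 2, 2),
    ("C3", 5, 1),
    ("S4", 2, 2),
    ("C5", 5, 1),
]

size = 32  # the input image is 32x32
sizes = {}
for name, kernel, stride in layers:
    size = conv_out(size, kernel, stride)
    sizes[name] = size
    print(f"{name}: {size}x{size}")
# Prints 28x28, 14x14, 10x10, 5x5 and 1x1 for C1, S2, C3, S4 and C5
```

Because C5 sees a 5×5 input with a 5×5 kernel, each of its 120 feature maps collapses to a single value, which is why it can feed a fully connected layer directly.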

In [None]:
# Imports
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np

# Check if CUDA is available and set device accordingly
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

## Implementing LeNet-5 in PyTorch

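Below is a PyTorch implementation of the LeNet-5 architecture. It follows the original layer layout with a few common simplifications: ReLU activations replace the original scaled tanh, plain average pooling replaces the trainable subsampling layers, C3 is connected to all six S2 feature maps rather than to selected subsets, and the output is a linear layer instead of the original RBF units: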

In [None]:
class LeNet5(nn.Module):
    def __init__(self, num_classes=10):
        super(LeNet5, self).__init__()
        
        # Layer C1: Convolutional layer (6 feature maps of size 28x28)
        # No padding: a 5x5 kernel on the 32x32 input yields 28x28 maps
        self.conv1 = nn.Conv2d(1, 6, kernel_size=5, stride=1)
        # Layer S2: Average pooling layer (6 feature maps of size 14x14)
        self.avg_pool1 = nn.AvgPool2d(kernel_size=2, stride=2)
        
        # Layer C3: Convolutional layer (16 feature maps of size 10x10)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5, stride=1)
        # Layer S4: Average pooling layer (16 feature maps of size 5x5)
        self.avg_pool2 = nn.AvgPool2d(kernel_size=2, stride=2)
        
        # Layer C5: Convolutional layer (120 feature maps of size 1x1)
        self.conv3 = nn.Conv2d(16, 120, kernel_size=5, stride=1)
        
        # Layer F6: Fully connected layer with 84 units
        self.fc1 = nn.Linear(120, 84)
        
        # Output layer
        self.fc2 = nn.Linear(84, num_classes)
    
    def forward(self, x):
        # C1 -> S2
        x = self.avg_pool1(F.relu(self.conv1(x)))
        
        # C3 -> S4
        x = self.avg_pool2(F.relu(self.conv2(x)))
        
        # C5
        x = F.relu(self.conv3(x))
        # Flatten the feature maps to (batch_size, features)
        x = torch.flatten(x, 1)
        
        # F6
        x = F.relu(self.fc1(x))
        
        # Output layer
        x = self.fc2(x)
        return x
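
As a sanity check on the class above, the number of trainable parameters can be worked out by hand. The sketch below assumes the layer sizes used in this implementation (5×5 kernels, every C3 map connected to all six input maps, one bias per output channel or unit). The helper names are made up for illustration.

```python
def conv_params(in_channels, out_channels, kernel):
    """Weights of a square kernel plus one bias per output channel."""
    return (in_channels * kernel * kernel + 1) * out_channels


def linear_params(in_features, out_features):
    """Weight matrix plus one bias per output unit."""
    return (in_features + 1) * out_features


counts = {
    "conv1 (C1)": conv_params(1, 6, 5),
    "conv2 (C3)": conv_params(6, 16, 5),
    "conv3 (C5)": conv_params(16, 120, 5),
    "fc1 (F6)": linear_params(120, 84),
    "fc2 (output)": linear_params(84, 10),
}
total = sum(counts.values())

for name, count in counts.items():
    print(f"{name:>13}: {count:6d}")
print(f"{'total':>13}: {total:6d}")  # 61706
```

The average pooling layers contribute nothing because `nn.AvgPool2d` has no learnable parameters. The total should match `sum(p.numel() for p in model.parameters())` for an instance of `LeNet5`.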

## Data Preparation

Let's load and prepare the MNIST dataset for training our LeNet-5 model:

In [None]:
# Transform operations for the dataset
transform = transforms.Compose([
    transforms.Resize((32, 32)),  # LeNet requires 32x32 images
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))  # MNIST mean and std
])

# Download and load the MNIST training dataset
train_dataset = torchvision.datasets.MNIST(root='./data', train=True,
                                           download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64,
                                          shuffle=True, num_workers=2)

# Download and load the MNIST test dataset
test_dataset = torchvision.datasets.MNIST(root='./data', train=False,
                                         download=True, transform=transform)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=1000,
                                         shuffle=False, num_workers=2)
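
To make the `Normalize` step concrete: `ToTensor` scales pixel values to the range [0, 1], and `Normalize` then applies `(x - mean) / std` to every pixel. The sketch below reproduces that arithmetic on single values using the MNIST statistics from the transform above. The function names are made up for illustration.

```python
MNIST_MEAN = 0.1307
MNIST_STD = 0.3081


def normalize(pixel, mean=MNIST_MEAN, std=MNIST_STD):
    """Map a pixel in [0, 1] to the standardized value the model sees."""
    return (pixel - mean) / std


def denormalize(value, mean=MNIST_MEAN, std=MNIST_STD):
    """Invert normalize(), e.g. before displaying an image."""
    return value * std + mean


black = normalize(0.0)  # background pixels, roughly -0.424
white = normalize(1.0)  # brightest stroke pixels, roughly 2.821
print(f"black -> {black:.3f}, white -> {white:.3f}")
```

Normalized images therefore contain negative values and values above 1, which matters when plotting them: they should be rescaled or denormalized first.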

## Visualizing Some Samples

Let's visualize some samples from the MNIST dataset:

In [None]:
# Get some random training images
dataiter = iter(train_loader)
images, labels = next(dataiter)

# Create a grid of images for visualization
# normalize=True rescales the standardized pixels back into [0, 1] for display
img_grid = torchvision.utils.make_grid(images[:25], nrow=5, normalize=True)
npimg = img_grid.numpy()

# Plot the images
plt.figure(figsize=(10, 10))
plt.imshow(np.transpose(npimg, (1, 2, 0)))
plt.title('MNIST Training Images')
plt.axis('off')
plt.show()

# Print the labels
print('Labels:', ' '.join(f'{labels[i]}' for i in range(25)))

## Training the LeNet-5 Model
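
The training code uses `nn.CrossEntropyLoss`, which expects raw class scores (logits) and applies a log-softmax internally. This is why the model's `forward` method returns the output of the last linear layer without a softmax. The following is a minimal pure-Python sketch of the per-sample quantity being computed. It is an illustration of the formula, not PyTorch's implementation, and the function name is made up.

```python
import math


def cross_entropy(logits, target):
    """Negative log-probability of the target class under softmax(logits)."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# All ten classes equally likely: the loss equals ln(10), about 2.303
uniform_loss = cross_entropy([0.0] * 10, target=3)

# One class strongly preferred
logits = [0.0] * 10
logits[3] = 10.0
confident_loss = cross_entropy(logits, target=3)  # close to 0
wrong_loss = cross_entropy(logits, target=7)      # large penalty

print(f"{uniform_loss:.3f} {confident_loss:.4f} {wrong_loss:.3f}")
```

A freshly initialized 10-class network typically starts with a loss near ln(10), so values far above that in the first batches often point to a problem with the data or the learning rate.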

Now, let's train our LeNet-5 model on the MNIST dataset:

In [None]:
# Instantiate the model and move it to the device
model = LeNet5(num_classes=10).to(device)

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Number of epochs
num_epochs = 5

# Lists to store metrics
train_losses = []
train_accuracies = []

# Training loop
model.train()
for epoch in range(num_epochs):
    running_loss = 0.0    # loss over the current 100-batch window (for logging)
    epoch_loss_sum = 0.0  # loss accumulated over the whole epoch
    correct = 0
    total = 0

    for i, (inputs, labels) in enumerate(train_loader):
        # Move the batch to the same device as the model
        inputs, labels = inputs.to(device), labels.to(device)

        # Zero the parameter gradients
        optimizer.zero_grad()

        # Forward pass
        outputs = model(inputs)

        # Calculate loss
        loss = criterion(outputs, labels)

        # Backward pass and optimize
        loss.backward()
        optimizer.step()

        # Statistics
        running_loss += loss.item()
        epoch_loss_sum += loss.item()
        predicted = outputs.argmax(dim=1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

        # Print statistics every 100 mini-batches
        if i % 100 == 99:
            print(f'[Epoch {epoch + 1}, Batch {i + 1:5d}] Loss: {running_loss / 100:.3f}, Accuracy: {100 * correct / total:.2f}%')
            running_loss = 0.0

    # Calculate epoch statistics (mean batch loss over the whole epoch;
    # running_loss cannot be used here because it is reset every 100 batches)
    epoch_loss = epoch_loss_sum / len(train_loader)
    epoch_accuracy = 100 * correct / total
    train_losses.append(epoch_loss)
    train_accuracies.append(epoch_accuracy)
    
    print(f'Epoch {epoch + 1} completed. Loss: {epoch_loss:.3f}, Accuracy: {epoch_accuracy:.2f}%')

print('Finished Training')
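
For intuition about what `optimizer.step()` does in the loop above, here is a minimal sketch of SGD with momentum on a single scalar parameter. It follows the update documented for `optim.SGD` with the default `dampening=0` and `nesterov=False` (velocity accumulates gradients, then the parameter moves against the velocity). The function name and the toy objective `f(w) = w**2` are assumptions for illustration.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.001, momentum=0.9):
    """One SGD-with-momentum update for a single scalar parameter."""
    velocity = momentum * velocity + grad  # running accumulation of gradients
    param = param - lr * velocity          # step against the accumulated direction
    return param, velocity


# Minimize f(w) = w**2, whose gradient is 2 * w
w, v = 5.0, 0.0
for step in range(3):
    grad = 2 * w
    w, v = sgd_momentum_step(w, grad, v, lr=0.1)
    print(f"step {step + 1}: w = {w:.3f}, velocity = {v:.3f}")
# w moves 5.0 -> 4.0 -> 2.3 -> 0.31
```

The steps grow as the velocity builds up, which is the point of momentum: consecutive gradients that agree in direction accelerate the update.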

## Evaluating the Model

Let's evaluate our trained LeNet-5 model on the test dataset:

In [None]:
# Set the model to evaluation mode
model.eval()

# Test the model
correct = 0
total = 0
class_correct = [0] * 10
class_total = [0] * 10

with torch.no_grad():
    for data in test_loader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        
        # Calculate accuracy for each class
        # (no squeeze(): it would break indexing if a batch held a single sample)
        matches = predicted == labels
        for label, match in zip(labels.tolist(), matches.tolist()):
            class_correct[label] += int(match)
            class_total[label] += 1

print(f'Accuracy of the LeNet-5 model on the {total} test images: {100 * correct / total:.2f}%')

# Print accuracy for each class
for i in range(10):
    print(f'Accuracy of {i}: {100 * class_correct[i] / class_total[i]:.2f}%')
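
Per-class accuracy tells you which digits are misclassified but not what they are mistaken for. A confusion matrix adds that information. The sketch below builds one from plain Python lists. The function names and the toy labels are made up for illustration and are not results from the model.

```python
def confusion_matrix(labels, predictions, num_classes):
    """matrix[t][p] counts samples with true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(labels, predictions):
        matrix[t][p] += 1
    return matrix


def per_class_accuracy(matrix):
    """Diagonal count divided by row total; None for classes with no samples."""
    return [row[i] / sum(row) if sum(row) else None for i, row in enumerate(matrix)]


# Toy example with three classes (not real model output)
toy_labels = [0, 0, 1, 1, 2, 2]
toy_predictions = [0, 1, 1, 1, 2, 0]

matrix = confusion_matrix(toy_labels, toy_predictions, num_classes=3)
accuracy = per_class_accuracy(matrix)
print(matrix)    # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(accuracy)  # [0.5, 1.0, 0.5]
```

With the tensors from the evaluation loop, the same functions could be fed `labels.tolist()` and `predicted.tolist()` accumulated over all test batches.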

## Visualizing Predictions

Let's visualize some of the model's predictions on the test dataset:

In [None]:
# Get a batch of test images
dataiter = iter(test_loader)
images, labels = next(dataiter)
images = images[:25].to(device)
labels = labels[:25]

# Get model predictions
with torch.no_grad():
    outputs = model(images)
    _, predicted = torch.max(outputs, 1)

# Plot the images with their true and predicted labels
fig, axes = plt.subplots(5, 5, figsize=(12, 12))
fig.subplots_adjust(hspace=0.5)

for i, ax in enumerate(axes.flat):
    if i < len(images):
        ax.imshow(images[i].cpu().numpy().squeeze(), cmap='gray')
        ax.set_title(f'True: {labels[i].item()}, Pred: {predicted[i].item()}')
        ax.axis('off')

plt.show()

## Conclusion

LeNet-5 was a groundbreaking architecture in the field of deep learning, particularly for computer vision tasks. Despite its simplicity compared to modern architectures, it introduced several key concepts that are still used today:

1. **Local receptive fields**: Each neuron in a convolutional layer processes data only from a small region of the input volume.
2. **Shared weights**: The same set of weights is used across the entire input volume, reducing the number of parameters and making the network more efficient.
3. **Subsampling**: Pooling layers reduce the spatial dimensions of the feature maps, which lowers the computational cost of later layers and makes the network more robust to small shifts and distortions in the input.
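
The three ideas above can be seen in a few lines of plain Python. This is a minimal single-channel sketch, assuming a square input, no padding and stride 1 for the convolution (computed as cross-correlation, as deep learning libraries do), and non-overlapping 2×2 windows for pooling. The function names are made up for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide one k x k kernel over a square image without padding."""
    k = len(kernel)
    out_size = len(image) - k + 1
    output = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            # Local receptive field: only a k x k patch of the input is read.
            # Shared weights: the same kernel is reused at every position.
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(k) for j in range(k)))
        output.append(row)
    return output


def avg_pool_2x2(feature_map):
    """Subsampling: average each non-overlapping 2x2 block."""
    n = len(feature_map) // 2
    return [[(feature_map[2 * r][2 * c] + feature_map[2 * r][2 * c + 1]
              + feature_map[2 * r + 1][2 * c] + feature_map[2 * r + 1][2 * c + 1]) / 4
             for c in range(n)] for r in range(n)]


image = [[float(6 * r + c) for c in range(6)] for r in range(6)]  # 6x6 ramp
kernel = [[1 / 9] * 3 for _ in range(3)]                          # 3x3 mean filter

feature_map = conv2d_valid(image, kernel)  # 4x4
pooled = avg_pool_2x2(feature_map)         # 2x2
print(len(feature_map), len(pooled))       # 4 2
```

The kernel holds 9 weights no matter how large the image is, whereas a fully connected layer mapping the same 6×6 input to a 4×4 output would need 36 × 16 weights.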

Even with its relatively simple architecture, LeNet-5 reaches high accuracy on MNIST: the original paper reports a test error rate below 1%. Modern CNN architectures build on these foundations to handle larger images and more complex tasks.

## References

1. LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
2. LeCun, Y., Cortes, C., & Burges, C. J. (2010). MNIST handwritten digit database. http://yann.lecun.com/exdb/mnist/
3. PyTorch Documentation. https://pytorch.org/docs/stable/index.html