![lenet](https://media.geeksforgeeks.org/wp-content/uploads/20240524181157/lenet-min.PNG)

# LeNet-5 Architecture – Layer Comments

### C1: First Convolutional Layer
- 6 feature maps
- Kernel size: 5 × 5

### S2: First Subsampling (Pooling) Layer
- Uses **Average Pooling**
- Kernel size: 2 × 2
- Stride: 2

### C3: Second Convolutional Layer
- 16 feature maps
- Kernel size: 5 × 5

### S4: Second Subsampling (Pooling) Layer
- Uses **Average Pooling**
- Kernel size: 2 × 2
- Stride: 2

### C5: Third Convolutional Layer
- 120 feature maps
- Kernel size: 5 × 5
- Acts as a fully connected layer: each 5 × 5 kernel covers the entire **5 × 5** input, so every feature map is 1 × 1

### F6: Fully Connected Layer
- 84 units (neurons)

### Output Layer
- 10 output units
- Used for digit classification (0–9)
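
The spatial sizes implied by the layer list can be checked with the usual output-size formula for convolution and pooling, `out = (in + 2 * padding - kernel) // stride + 1`. The sketch below assumes a 32 × 32 input, and stride 1 with no padding for the convolutions (the list above does not state these). The helper name `out_size` is made up for illustration.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output width/height of a conv or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1

# (name, kernel, stride) per spatial layer; conv stride 1 is an assumption
layers = [("C1", 5, 1), ("S2", 2, 2), ("C3", 5, 1), ("S4", 2, 2), ("C5", 5, 1)]

size = 32  # assumed input size
sizes = []
for name, kernel, stride in layers:
    size = out_size(size, kernel, stride)
    sizes.append(size)
    print(f"{name}: {size} x {size}")
```

Under these assumptions C5 ends at 1 × 1, which is why it behaves like a fully connected layer.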


Import libraries

In [2]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
import torch.nn.functional as F
from torchsummary import summary
import matplotlib.pyplot as plt

LeNet-5 Architecture

In [13]:
class LeNet(nn.Module):
    def __init__(self):
        super().__init__()

        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5, stride=1, padding=0)
        self.pool1 = nn.AvgPool2d(kernel_size=2)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5, stride=1, padding=0)
        self.pool2 = nn.AvgPool2d(kernel_size=2)
        self.conv3 = nn.Conv2d(in_channels=16, out_channels=120, kernel_size=5, stride=1, padding=0)
        self.fc1 = nn.Linear(in_features=120, out_features=84)
        self.fc2 = nn.Linear(in_features=84, out_features=10)

    def forward(self, x):
        # Input: 32x32
        # C1: 32x32 -> 28x28 (conv 5x5)
        x = torch.tanh(self.conv1(x))

        # S2: 28x28 -> 14x14 (pool 2x2)
        x = self.pool1(x)

        # C3: 14x14 -> 10x10 (conv 5x5)
        x = torch.tanh(self.conv2(x))

        # S4: 10x10 -> 5x5 (pool 2x2)
        x = self.pool2(x)

        # C5: 5x5 -> 1x1 (conv 5x5)
        x = torch.tanh(self.conv3(x))

        # Flatten: (N, 120, 1, 1) -> (N, 120)
        x = x.view(-1, 120)

        # F6: Fully connected
        x = torch.tanh(self.fc1(x))

        # Output layer (no activation - will use CrossEntropyLoss)
        x = self.fc2(x)

        return x
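
As a rough cross-check of the layer definitions above, the number of trainable parameters can be worked out by hand. This sketch assumes every layer has a bias and that each convolution connects all input channels to all output channels, which is the `nn.Conv2d` default. The helper names are hypothetical.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus one bias per output channel, for a square kernel."""
    return (in_channels * kernel_size * kernel_size + 1) * out_channels

def linear_params(in_features, out_features):
    """Weights plus one bias per output unit."""
    return (in_features + 1) * out_features

param_counts = {
    "conv1": conv_params(1, 6, 5),
    "conv2": conv_params(6, 16, 5),
    "conv3": conv_params(16, 120, 5),
    "fc1": linear_params(120, 84),
    "fc2": linear_params(84, 10),
}
total_params = sum(param_counts.values())

for name, count in param_counts.items():
    print(f"{name}: {count}")
print(f"total: {total_params}")
```

If the arithmetic holds, the total should agree with `sum(p.numel() for p in model.parameters())` for the class above.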




Training function

In [11]:
# Training function
def train(model, device, train_loader, optimizer, criterion, epoch):
    model.train()
    total_loss = 0
    correct = 0

    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)

        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()

        total_loss += loss.item()
        pred = output.argmax(dim=1, keepdim=True)
        correct += pred.eq(target.view_as(pred)).sum().item()

        if batch_idx % 100 == 0:
            print(f'Epoch: {epoch} [{batch_idx * len(data)}/{len(train_loader.dataset)} '
                  f'({100. * batch_idx / len(train_loader):.0f}%)]\tLoss: {loss.item():.6f}')

    avg_loss = total_loss / len(train_loader)
    accuracy = 100. * correct / len(train_loader.dataset)
    return avg_loss, accuracy


Test function

In [5]:
def test(model, device, test_loader, criterion):
    model.eval()
    test_loss = 0
    correct = 0

    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += criterion(output, target).item()
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader)
    accuracy = 100. * correct / len(test_loader.dataset)

    print(f'\nTest set: Average loss: {test_loss:.4f}, '
          f'Accuracy: {correct}/{len(test_loader.dataset)} ({accuracy:.2f}%)\n')

    return test_loss, accuracy
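
Note that `test_loss /= len(test_loader)` averages the per-batch mean losses, so a short final batch counts as much as a full one. With 10,000 test images and the batch size of 64 set later in this notebook, the last batch holds only 16 images. The sketch below uses made-up per-batch losses to contrast that average with a per-sample average. The function names are illustrative, not part of the notebook.

```python
def mean_of_batch_means(batch_losses):
    """What the notebook computes: every batch gets equal weight."""
    return sum(batch_losses) / len(batch_losses)

def per_sample_mean(batch_losses, batch_sizes):
    """Alternative: weight each batch mean by the number of samples in it."""
    total = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)

num_samples, batch_size = 10_000, 64
full_batches, remainder = divmod(num_samples, batch_size)
batch_sizes = [batch_size] * full_batches + ([remainder] if remainder else [])

# Made-up losses: full batches average 0.05, the short last batch 0.50
batch_losses = [0.05] * full_batches + [0.50] * (len(batch_sizes) - full_batches)

batch_avg = mean_of_batch_means(batch_losses)
sample_avg = per_sample_mean(batch_losses, batch_sizes)
print(len(batch_sizes), batch_sizes[-1], batch_avg, sample_avg)
```

The gap is likely to be small in practice, so the simpler per-batch average is a reasonable choice here.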

In [6]:

# Hyperparameters (choices for this notebook, not the original paper's training setup)
batch_size = 64
epochs = 20
lr = 0.001

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using device: {device}')

# Data transformations
# LeNet-5 expects 32x32 input, MNIST is 28x28
transform = transforms.Compose([
    transforms.Resize(32),  # Resize to 32x32 as per paper
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load MNIST dataset
train_dataset = datasets.MNIST('./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST('./data', train=False, transform=transform)

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

Using device: cuda
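
`transforms.Normalize((0.1307,), (0.3081,))` applies `(x - mean) / std` to each pixel after `ToTensor()` has scaled it to [0, 1]. The values 0.1307 and 0.3081 are the commonly quoted mean and standard deviation of the MNIST training set. A plain-Python sketch of the same arithmetic, with an illustrative function name:

```python
MEAN, STD = 0.1307, 0.3081

def normalize(pixel, mean=MEAN, std=STD):
    """Same per-pixel arithmetic as transforms.Normalize for one channel."""
    return (pixel - mean) / std

black = normalize(0.0)  # background pixel after ToTensor()
white = normalize(1.0)  # brightest stroke pixel after ToTensor()
print(f"black -> {black:.4f}, white -> {white:.4f}")
```

The network therefore sees inputs roughly in the range -0.42 to 2.82 rather than 0 to 1.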




In [15]:
# Initialize model, loss, and optimizer
model = LeNet().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=lr)


# Training loop
train_losses, train_accs = [], []
test_losses, test_accs = [], []

for epoch in range(1, epochs + 1):
    train_loss, train_acc = train(model, device, train_loader, optimizer, criterion, epoch)
    test_loss, test_acc = test(model, device, test_loader, criterion)

    train_losses.append(train_loss)
    train_accs.append(train_acc)
    test_losses.append(test_loss)
    test_accs.append(test_acc)


Test set: Average loss: 0.0804, Accuracy: 9760/10000 (97.60%)


Test set: Average loss: 0.0581, Accuracy: 9824/10000 (98.24%)


Test set: Average loss: 0.0509, Accuracy: 9851/10000 (98.51%)


Test set: Average loss: 0.0490, Accuracy: 9848/10000 (98.48%)


Test set: Average loss: 0.0412, Accuracy: 9864/10000 (98.64%)


Test set: Average loss: 0.0400, Accuracy: 9876/10000 (98.76%)


Test set: Average loss: 0.0417, Accuracy: 9873/10000 (98.73%)


Test set: Average loss: 0.0426, Accuracy: 9877/10000 (98.77%)


Test set: Average loss: 0.0415, Accuracy: 9875/10000 (98.75%)


Test set: Average loss: 0.0462, Accuracy: 9868/10000 (98.68%)


Test set: Average loss: 0.0430, Accuracy: 9883/10000 (98.83%)


Test set: Average loss: 0.0381, Accuracy: 9886/10000 (98.86%)


Test set: Average loss: 0.0471, Accuracy: 9868/10000 (98.68%)


Test set: Average loss: 0.0462, Accuracy: 9880/10000 (98.80%)


Test set: Average loss: 0.0462, Accuracy: 9877/10000 (98.77%)


Test set: Average loss: 0.0503, Accurac