# LeNet-5

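## 1. LeNet-5 Network Architecture

According to the paper "Gradient-Based Learning Applied to Document Recognition", the LeNet-5 architecture is as follows:
<img src="../Image/LeNet5.png" width="100%">

Excluding the input layer, LeNet-5 consists of 7 layers, each of which contains trainable parameters (weights). The input is a $32\times32$ image.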

C1 convolutional layer:

- Input: $32\times32\times1$

- Number of filters: 6

- Filter size: $5\times5\times1$

- Output: $28\times28\times6$

- Number of parameters: $6\times(5\times5+1) = 156$ (6 feature maps, each with one $5\times5$ filter plus one bias)
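
The C1 figures above follow from the standard output-size and parameter-count formulas for a convolution. The sketch below is a minimal illustration in plain Python; the helper names `conv_output_size` and `conv_param_count` are hypothetical and not part of the notebook.

```python
def conv_output_size(in_size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial size of a convolution output along one dimension."""
    return (in_size + 2 * padding - kernel) // stride + 1


def conv_param_count(num_filters: int, kernel: int, in_channels: int) -> int:
    """Kernel weights plus one bias per filter."""
    return num_filters * (kernel * kernel * in_channels + 1)


# C1: 32x32x1 input, six 5x5 filters, stride 1, no padding
c1_out = conv_output_size(32, 5)
c1_params = conv_param_count(6, 5, 1)
print(c1_out, c1_params)  # 28 156
```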

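S2 subsampling layer (pooling layer):

- Subsampling method: the 4 values in each $2\times2$ region are summed, multiplied by a trainable coefficient, and added to a trainable bias; the result is passed through a sigmoid nonlinearity.

- Input: $28\times28\times6$

- Number of filters: 6

- Filter size: $2\times2$

- Output: $14\times14\times6$

- Number of parameters: $6\times(1 + 1) = 12$ (one subsampling coefficient and one bias per feature map)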
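
The subsampling rule described for S2 differs from the max pooling used in most modern implementations, so a small sketch may help. It is written in plain Python on a nested list for one feature map; the function name `subsample` and the sample values are illustrative assumptions, not taken from the notebook.

```python
import math


def subsample(feature_map, weight, bias):
    """Sum each non-overlapping 2x2 block, scale, shift, then apply a sigmoid."""
    rows, cols = len(feature_map), len(feature_map[0])
    out = []
    for r in range(0, rows, 2):
        out_row = []
        for c in range(0, cols, 2):
            block_sum = (feature_map[r][c] + feature_map[r][c + 1]
                         + feature_map[r + 1][c] + feature_map[r + 1][c + 1])
            out_row.append(1.0 / (1.0 + math.exp(-(weight * block_sum + bias))))
        out.append(out_row)
    return out


# A hypothetical 4x4 feature map; one weight and one bias are shared by the whole map
fmap = [[1.0, 2.0, 0.0, 0.0],
        [3.0, 4.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 1.0],
        [0.0, 0.0, 1.0, 1.0]]
pooled = subsample(fmap, weight=0.25, bias=0.0)
print(len(pooled), len(pooled[0]))  # 2 2
```

Because a single coefficient and a single bias are shared across all positions of a feature map, each map contributes exactly 2 parameters, which is where the count of 12 for six maps comes from.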

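C3 convolutional layer:

- Input: $14\times14\times6$

- Number of filters: 16

- Filter size: $5\times5$ spatially; each filter spans only the 3, 4, or 6 S2 feature maps it is connected to

- Output: $10\times10\times16$

- Connectivity: C3 is not fully connected to S2. The first 6 C3 feature maps each take 3 adjacent S2 feature maps as input; the next 6 take 4 adjacent S2 feature maps; the next 3 take 4 non-adjacent S2 feature maps; and the last one takes all 6 S2 feature maps, as shown below:

<img src="../Image/LeNet5_1.png" width=100%>

- Number of parameters:
$6\times(3\times5\times5+1)+6\times(4\times5\times5+1)+3\times(4\times5\times5+1)+1\times(6\times5\times5+1)=1516$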
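
The C3 parameter count can be checked by writing the connection scheme out explicitly. In the sketch below, the adjacent subsets are generated with wrap-around indexing; the three non-adjacent subsets are an assumption based on Table 1 of the paper and should be checked against the figure above. The names `contiguous` and `c3_inputs` are hypothetical.

```python
NUM_S2_MAPS = 6
KERNEL = 5


def contiguous(start, length):
    """Indices of `length` adjacent S2 feature maps, wrapping around after map 5."""
    return [(start + k) % NUM_S2_MAPS for k in range(length)]


c3_inputs = (
    [contiguous(s, 3) for s in range(6)]            # C3 maps 0-5: 3 adjacent S2 maps
    + [contiguous(s, 4) for s in range(6)]          # C3 maps 6-11: 4 adjacent S2 maps
    + [[0, 1, 3, 4], [1, 2, 4, 5], [0, 2, 3, 5]]    # C3 maps 12-14: 4 non-adjacent maps (assumed)
    + [list(range(NUM_S2_MAPS))]                    # C3 map 15: all 6 S2 maps
)

# Each C3 map has one 5x5 kernel per connected S2 map, plus one bias
c3_params = sum(len(maps) * KERNEL * KERNEL + 1 for maps in c3_inputs)
print(len(c3_inputs), c3_params)  # 16 1516
```

The total depends only on how many S2 maps each C3 map reads from, so it stays at 1516 whichever non-adjacent subsets are used.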

S4 subsampling layer (pooling layer):

- Input: $10\times10\times16$

- Filter size: $2\times2$

- Number of filters: 16

- Output: $5\times5\times16$

- Number of parameters: $16\times(1 + 1) = 32$

C5 convolutional layer:

- Input: $5\times5\times16$

- Filter size: $5\times5\times16$

- Number of filters: 120

- Output: $1\times1\times120$

- Number of parameters: $5\times5\times16\times120 + 120 = 48120$
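
Since the C5 kernel is exactly as large as its input, it fits at a single position, and each C5 unit is effectively fully connected to all 400 values of S4. The sketch below illustrates this for one filter in plain Python; the random input and filter values are placeholders, not data from the notebook.

```python
import random

random.seed(0)
H = W = 5   # spatial size of the S4 output
C = 16      # number of S4 feature maps

# Placeholder S4 output and a single C5 filter, both 5x5x16
x = [[[random.uniform(-1, 1) for _ in range(C)] for _ in range(W)] for _ in range(H)]
w = [[[random.uniform(-1, 1) for _ in range(C)] for _ in range(W)] for _ in range(H)]
bias = 0.1

# Convolution view: a 5x5 kernel on a 5x5 input has (5 - 5) + 1 = 1 valid position
positions = (H - 5) + 1
conv_out = bias + sum(x[i][j][c] * w[i][j][c]
                      for i in range(H) for j in range(W) for c in range(C))

# Fully connected view: flatten input and filter, then take one dot product
x_flat = [v for row in x for col in row for v in col]
w_flat = [v for row in w for col in row for v in col]
fc_out = bias + sum(a * b for a, b in zip(x_flat, w_flat))

# 120 such units, each with 400 weights and one bias
c5_params = 120 * (len(w_flat) + 1)
print(positions, c5_params)  # 1 48120
```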

F6 fully connected layer:

- Input: 120

- Output: 84

- Number of parameters: $84\times(120 + 1) = 10164$ (84 output units, each with 120 weights and one bias)

F7 fully connected layer:

- Input: 84

- Output: 10

- Number of parameters: $10\times(84 + 1) = 850$ (10 output units, each with 84 weights and one bias)
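
As a quick consistency check, the per-layer totals listed in this section can be recomputed and summed. This is only an arithmetic sketch: the dictionary name `layer_params` is illustrative, and each fully connected layer is counted as one weight per input plus one bias for every output unit.

```python
layer_params = {
    "C1": 6 * (5 * 5 * 1 + 1),
    "S2": 6 * (1 + 1),
    "C3": 6 * (3 * 25 + 1) + 6 * (4 * 25 + 1) + 3 * (4 * 25 + 1) + 1 * (6 * 25 + 1),
    "S4": 16 * (1 + 1),
    "C5": 120 * (5 * 5 * 16 + 1),
    "F6": 84 * (120 + 1),
    "F7": 10 * (84 + 1),
}

for name, count in layer_params.items():
    print(f"{name}: {count}")

total_params = sum(layer_params.values())
print("Total:", total_params)  # Total: 60850
```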

## 2. LeNet-5 Code

import torch
import torchvision
import torch.nn as nn
import torchvision.transforms as transforms
import torch.optim as optim

import numpy as np

# Device configuration
device =  torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hyper-parameters
input_size = 784     # 28*28 = 784
hidden_size = 500
num_classes = 10
num_epochs = 5
batch_size = 100
learning_rate = 0.001

# MNIST dataset
train_dataset = torchvision.datasets.MNIST(root="../data",
                                         train=True,
                                         transform=transforms.Compose([
                                             transforms.Resize((32, 32)),
                                             transforms.ToTensor()]),
                                         download=True)

test_dataset = torchvision.datasets.MNIST(root='../data',
                                        train=False,
                                        transform=transforms.Compose([
                                             transforms.Resize((32, 32)),
                                             transforms.ToTensor()]))

# Data Loader
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                          batch_size=batch_size,
                                          shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                         batch_size=batch_size,
                                         shuffle=False)

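class LeNet5(nn.Module):
    """
    Input - 1x32x32
    C1 - 6@28x28 (5x5 kernel)
    ReLU
    S2 - 6@14x14 (2x2 kernel, stride 2) Max pooling
    C3 - 16@10x10 (5x5 kernel)
    ReLU
    S4 - 16@5x5 (2x2 kernel, stride 2) Max pooling
    C5 - 120@1x1 (5x5 kernel)
    ReLU
    F6 - 84
    ReLU
    F7 - 10 (Output, log-softmax)

    Note: this implementation is a modernized variant. It uses ReLU and
    max pooling instead of the trainable subsampling with a sigmoid, and
    C3 is fully connected to S2.
    """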
    def __init__(self):
        super(LeNet5, self).__init__()
        
        self.convnet = nn.Sequential(
            nn.Conv2d(1, 6, 5),          # C1
            nn.ReLU(),
            nn.MaxPool2d(2, stride=2),   # S2
            nn.Conv2d(6, 16, 5),         # C3
            nn.ReLU(),
            nn.MaxPool2d(2, stride=2),   # S4
            nn.Conv2d(16, 120, 5),       # C5
            nn.ReLU())

        self.fc = nn.Sequential(
            nn.Linear(120, 84),          # F6
            nn.ReLU(),
            nn.Linear(84, 10),           # F7
            nn.LogSoftmax(dim=-1))       # log-probabilities over the 10 classes
        
    def forward(self, img):
        output = self.convnet(img)
        output = output.view(img.size(0), -1)
        output = self.fc(output)
        return output

# Create an LeNet5 instance
net = LeNet5().to(device)  # move the model to the same device as the input batches
# Loss function: the model already ends in LogSoftmax, so NLLLoss is the matching criterion
criterion = nn.NLLLoss()
# Optimizer
optimizer = optim.Adam(net.parameters(), lr=learning_rate)

# Train the model
def train(num_epoch):
    """
    Train the LeNet5 model.
    @param num_epoch: the number of epochs
    """
    net.train()  # make sure the model is in training mode
    total_step = len(train_loader)
    for epoch in range(num_epoch):
        for i, (images, labels) in enumerate(train_loader):
            images = images.to(device)
            labels = labels.to(device)

            # Forward pass
            output = net(images)
            loss = criterion(output, labels)

            # Backward and optimize
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            if (i + 1) % 100 == 0:
                print("Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}".format(epoch+1, num_epoch, i+1, total_step, loss.item()))
                
def test():
    '''
    Evaluate the model on the test set.
    '''
    net.eval()
    total_correct = 0
    total_loss = 0.0
    # Gradients are not needed for evaluation
    with torch.no_grad():
        for images, labels in test_loader:
            images = images.to(device)
            labels = labels.to(device)

            output = net(images)
            # criterion returns the batch mean, so weight it by the batch size
            total_loss += criterion(output, labels).item() * images.size(0)
            pred = torch.argmax(output, dim=1)
            total_correct += (pred == labels).sum().item()

    avg_loss = total_loss / len(test_dataset)
    print("Test Avg. Loss: {}, Accuracy: {}%".format(avg_loss, 100*total_correct/len(test_dataset)))

def main():
    # Train
    train(num_epochs)
    # Test
    test()
    # Save the model checkpoint
    torch.save(net.state_dict(), "LeNet5.ckpt")
    print("Saved model successfully!\n")
        

if __name__ == '__main__':
    main()

Epoch [1/5], Step [100/600], Loss: 0.2431
Epoch [1/5], Step [200/600], Loss: 0.2401
Epoch [1/5], Step [300/600], Loss: 0.2225
Epoch [1/5], Step [400/600], Loss: 0.1189
Epoch [1/5], Step [500/600], Loss: 0.1123
Epoch [1/5], Step [600/600], Loss: 0.1100
Epoch [2/5], Step [100/600], Loss: 0.1179
Epoch [2/5], Step [200/600], Loss: 0.1053
Epoch [2/5], Step [300/600], Loss: 0.1705
Epoch [2/5], Step [400/600], Loss: 0.0558
Epoch [2/5], Step [500/600], Loss: 0.0475
Epoch [2/5], Step [600/600], Loss: 0.0939
Epoch [3/5], Step [100/600], Loss: 0.0605
Epoch [3/5], Step [200/600], Loss: 0.0388
Epoch [3/5], Step [300/600], Loss: 0.1090
Epoch [3/5], Step [400/600], Loss: 0.0404
Epoch [3/5], Step [500/600], Loss: 0.1106
Epoch [3/5], Step [600/600], Loss: 0.0181
Epoch [4/5], Step [100/600], Loss: 0.0061
Epoch [4/5], Step [200/600], Loss: 0.0173
Epoch [4/5], Step [300/600], Loss: 0.0364
Epoch [4/5], Step [400/600], Loss: 0.0124
Epoch [4/5], Step [500/600], Loss: 0.0343
Epoch [4/5], Step [600/600], Loss: