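1. Pixels that are adjacent in the image (for example, neighboring pixels in the same column) can end up far apart in the flattened vector, so the patterns they form are hard for the model to recognize.
2. For large images, fully connected layers easily make the model too large: the number of weight parameters grows quickly and consumes a lot of resources.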
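
To make the second point concrete, here is a rough parameter count. The 1000 × 1000 image and the 256-unit hidden layer are illustrative assumptions and do not come from these notes; the 5 × 5 convolution with 6 output channels matches the first LeNet layer described below.

```python
def dense_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features

def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a square-kernel convolutional layer."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels

# Hypothetical sizes, chosen only for illustration
fc_small = dense_params(28 * 28, 256)      # 28x28 grayscale input
fc_large = dense_params(1000 * 1000, 256)  # 1-megapixel grayscale input
conv = conv_params(1, 6, 5)                # 5x5 convolution, 1 -> 6 channels

print(fc_small)  # 200960
print(fc_large)  # 256000256
print(conv)      # 156
```

The fully connected layer's size scales with the number of pixels, while the convolutional layer's size does not depend on the image size at all.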

# LeNet model

LeNet consists of two parts: a convolutional layer block and a fully connected layer block.

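## Convolutional layer block
The basic unit of the convolutional layer block is a **convolutional layer** followed by a **pooling layer**. The former recognizes spatial patterns in the image; the latter reduces the convolutional layer's sensitivity to position.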

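Convolutional layers: each uses a 5 × 5 window and applies a sigmoid activation to its output. The first layer has 6 output channels and the second increases this to 16. Because the second layer's input has a smaller height and width than the first layer's, the extra output channels keep the parameter sizes of the two layers similar.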

Pooling layers: max pooling with a 2 × 2 window and a stride of 2, so the regions covered by successive pooling windows do not overlap.
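
A small sketch of the non-overlapping behavior, using a made-up 4 × 4 input whose values are simply 0 to 15:

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2)

# One sample, one channel, 4x4 "image" holding the values 0..15 (illustrative data)
X = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)
Y = pool(X)

print(Y.shape)  # torch.Size([1, 1, 2, 2])
print(Y[0, 0])  # each entry is the maximum of one disjoint 2x2 block
```

With the stride equal to the window size, every input pixel falls in exactly one window, and the height and width are halved.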

Output shape of the convolutional layer block: (batch size, channels, height, width)
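
The height and width can be worked out by hand with the usual output-size formula. The sketch below assumes 28 × 28 inputs (the Fashion-MNIST size used later in these notes) and no padding; `out_size` is a helper name introduced here for illustration.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Output height/width of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1

size = 28  # assumed input height/width
trace = [size]
# (kernel, stride) for: conv 5x5, pool 2x2, conv 5x5, pool 2x2
for kernel, stride in [(5, 1), (2, 2), (5, 1), (2, 2)]:
    size = out_size(size, kernel, stride)
    trace.append(size)

batch_size, channels = 256, 16  # 16 = output channels of the second conv layer
block_output_shape = (batch_size, channels, size, size)

print(trace)               # [28, 24, 12, 8, 4]
print(block_output_shape)  # (256, 16, 4, 4)
```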


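## Fully connected layer block
When the output of the convolutional layer block is passed to the fully connected layer block, the latter flattens each sample in the mini-batch. The 4-D input becomes 2-D: the first dimension is the batch size and the second is channels × height × width.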
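
As a minimal sketch of the flattening step, assuming a block output of shape (256, 16, 4, 4), which corresponds to 28 × 28 inputs and a batch size of 256:

```python
import torch

# Stand-in for the convolutional block's output: (batch size, channels, height, width)
features = torch.zeros(256, 16, 4, 4)

flat_view = features.view(features.shape[0], -1)  # -1 infers 16 * 4 * 4 = 256
flat_fn = torch.flatten(features, start_dim=1)    # equivalent built-in function

print(flat_view.shape)  # torch.Size([256, 256])
print(flat_fn.shape)    # torch.Size([256, 256])
```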

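The fully connected layer block contains 3 fully connected layers with 120, 84, and 10 outputs respectively, where 10 is the number of output classes.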

In [1]:
import time
import torch
import torch.nn as nn
import torch.optim as optim

import d2lzh_pytorch as d2l

In [2]:
class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 6, 5), # in_channels, out_channels, kernel_size
            nn.Sigmoid(),
            nn.MaxPool2d(2, 2), # kernel_size, stride
            nn.Conv2d(6, 16, 5),
            nn.Sigmoid(),
            nn.MaxPool2d(2, 2)
        )
        self.fc = nn.Sequential(
            nn.Linear(16 * 4 * 4, 120), # 16 channels * 4 * 4 feature map (for 28x28 inputs)
            nn.Sigmoid(),
            nn.Linear(120, 84),
            nn.Sigmoid(),
            nn.Linear(84, 10)
        )
    def forward(self, img):
        feature = self.conv(img)
        # Keep the batch dimension and flatten everything else; -1 lets view() infer that size
        output = self.fc(feature.view(img.shape[0], -1))
        return output

In [3]:
net = LeNet()
print(net)

LeNet(
  (conv): Sequential(
    (0): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
    (1): Sigmoid()
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
    (4): Sigmoid()
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (fc): Sequential(
    (0): Linear(in_features=256, out_features=120, bias=True)
    (1): Sigmoid()
    (2): Linear(in_features=120, out_features=84, bias=True)
    (3): Sigmoid()
    (4): Linear(in_features=84, out_features=10, bias=True)
  )
)
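
To check that the layer sizes fit together, one option is to push a dummy batch through the network layer by layer and record each output shape. The sketch below rebuilds the same layers outside the class so that it runs on its own; the 28 × 28 input size and the batch of 2 are assumptions.

```python
import torch
import torch.nn as nn

# Same layers as the LeNet class above, rebuilt here so the snippet is self-contained
conv = nn.Sequential(
    nn.Conv2d(1, 6, 5), nn.Sigmoid(), nn.MaxPool2d(2, 2),
    nn.Conv2d(6, 16, 5), nn.Sigmoid(), nn.MaxPool2d(2, 2),
)
fc = nn.Sequential(
    nn.Linear(16 * 4 * 4, 120), nn.Sigmoid(),
    nn.Linear(120, 84), nn.Sigmoid(),
    nn.Linear(84, 10),
)

X = torch.rand(2, 1, 28, 28)  # dummy batch: 2 grayscale 28x28 images (assumed size)
shapes = []
for layer in conv:
    X = layer(X)
    shapes.append((layer.__class__.__name__, tuple(X.shape)))

X = X.view(X.shape[0], -1)  # flatten each sample before the fully connected block
shapes.append(("flatten", tuple(X.shape)))

for layer in fc:
    X = layer(X)
    shapes.append((layer.__class__.__name__, tuple(X.shape)))

for name, shape in shapes:
    print(f"{name:10s} -> {shape}")
```

The last convolutional-block shape should be (2, 16, 4, 4), which is why the first `nn.Linear` takes `16 * 4 * 4` inputs.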


In [4]:
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size=batch_size)

In [5]:
lr, num_epochs = 0.001, 5
device = 'cpu'
optimizer = torch.optim.Adam(net.parameters(), lr=lr)
d2l.train_cuda_cpu(net, train_iter, test_iter, batch_size, optimizer, device, num_epochs=num_epochs)

training on  cpu
epoch  1, loss  2.3478, train acc  0.094, test acc  0.100, time  2.3 sec
epoch  1, loss  2.3300, train acc  0.100, test acc  0.100, time  4.9 sec
epoch  1, loss  2.3212, train acc  0.096, test acc  0.100, time  7.3 sec
epoch  1, loss  2.3149, train acc  0.101, test acc  0.100, time  9.6 sec
epoch  1, loss  2.3120, train acc  0.095, test acc  0.100, time  12.2 sec
epoch  1, loss  2.3113, train acc  0.096, test acc  0.100, time  14.4 sec
epoch  1, loss  2.3098, train acc  0.098, test acc  0.100, time  16.3 sec
epoch  1, loss  2.3092, train acc  0.099, test acc  0.100, time  18.3 sec
epoch  1, loss  2.3093, train acc  0.097, test acc  0.100, time  20.5 sec
epoch  1, loss  2.3094, train acc  0.096, test acc  0.100, time  22.8 sec
epoch  1, loss  2.3097, train acc  0.096, test acc  0.100, time  24.9 sec
epoch  1, loss  2.3089, train acc  0.097, test acc  0.100, time  26.9 sec
epoch  1, loss  2.3084, train acc  0.099, test acc  0.100, time  28.8 sec
...

epoch  1, loss  2.2838, train acc  0.145, test acc  0.323, time  1510.1 sec
epoch  1, loss  2.2820, train acc  0.147, test acc  0.324, time  1512.1 sec
epoch  1, loss  2.2804, train acc  0.148, test acc  0.323, time  1514.3 sec
epoch  1, loss  2.2784, train acc  0.150, test acc  0.326, time  1516.5 sec
epoch  1, loss  2.2766, train acc  0.151, test acc  0.328, time  1518.4 sec
epoch  1, loss  2.2745, train acc  0.153, test acc  0.334, time  1520.6 sec
epoch  1, loss  2.2723, train acc  0.154, test acc  0.341, time  1522.7 sec
epoch  1, loss  2.2700, train acc  0.156, test acc  0.357, time  1524.7 sec
epoch  1, loss  2.2680, train acc  0.158, test acc  0.374, time  1526.7 sec
epoch  1, loss  2.2655, train acc  0.160, test acc  0.373, time  1528.8 sec
epoch  1, loss  2.2629, train acc  0.162, test acc  0.376, time  1531.1 sec
epoch  1, loss  2.2605, train acc  0.164, test acc  0.377, time  1533.2 sec
epoch  1, loss  2.2577, train acc  0.166, test acc  0.382, time  1535.3 sec
...

epoch  1, loss  1.9008, train acc  0.309, test acc  0.578, time  1727.5 sec
epoch  1, loss  1.8973, train acc  0.310, test acc  0.580, time  1729.3 sec
epoch  1, loss  1.8941, train acc  0.311, test acc  0.575, time  1731.1 sec
epoch  1, loss  1.8905, train acc  0.312, test acc  0.572, time  1732.9 sec
epoch  1, loss  1.8873, train acc  0.313, test acc  0.570, time  1734.7 sec
epoch  1, loss  1.8840, train acc  0.314, test acc  0.568, time  1736.5 sec
epoch  1, loss  1.8804, train acc  0.316, test acc  0.565, time  1738.3 sec
epoch  1, loss  1.8768, train acc  0.317, test acc  0.563, time  1740.3 sec
epoch  1, loss  1.8734, train acc  0.318, test acc  0.564, time  1742.1 sec
epoch  1, loss  1.8701, train acc  0.319, test acc  0.564, time  1744.1 sec
epoch  1, loss  1.8668, train acc  0.320, test acc  0.566, time  1746.2 sec
epoch  1, loss  1.8638, train acc  0.321, test acc  0.569, time  1748.4 sec
epoch  1, loss  1.8605, train acc  0.322, test acc  0.574, time  1750.4 sec
...

epoch  2, loss  0.2893, train acc  0.608, test acc  0.658, time  180.7 sec
epoch  2, loss  0.2913, train acc  0.608, test acc  0.653, time  182.7 sec
epoch  2, loss  0.2934, train acc  0.608, test acc  0.646, time  184.7 sec
epoch  2, loss  0.2954, train acc  0.609, test acc  0.641, time  186.9 sec
epoch  2, loss  0.2974, train acc  0.609, test acc  0.639, time  189.0 sec
epoch  2, loss  0.2990, train acc  0.610, test acc  0.638, time  191.1 sec
epoch  2, loss  0.3010, train acc  0.611, test acc  0.634, time  192.9 sec
epoch  2, loss  0.3032, train acc  0.611, test acc  0.633, time  194.8 sec
epoch  2, loss  0.3052, train acc  0.611, test acc  0.634, time  196.6 sec
epoch  2, loss  0.3069, train acc  0.611, test acc  0.636, time  198.6 sec
epoch  2, loss  0.3087, train acc  0.612, test acc  0.640, time  200.4 sec
epoch  2, loss  0.3106, train acc  0.612, test acc  0.643, time  202.2 sec
epoch  2, loss  0.3121, train acc  0.613, test acc  0.644, time  204.1 sec
...

epoch  2, loss  0.4381, train acc  0.645, test acc  0.701, time  396.5 sec
epoch  2, loss  0.4393, train acc  0.645, test acc  0.700, time  398.3 sec
epoch  2, loss  0.4400, train acc  0.645, test acc  0.699, time  400.1 sec
epoch  2, loss  0.4409, train acc  0.646, test acc  0.698, time  401.9 sec
epoch  2, loss  0.4420, train acc  0.646, test acc  0.697, time  403.8 sec
epoch  2, loss  0.4426, train acc  0.647, test acc  0.697, time  405.8 sec
epoch  2, loss  0.4435, train acc  0.647, test acc  0.694, time  407.6 sec
epoch  2, loss  0.4445, train acc  0.647, test acc  0.692, time  409.4 sec
epoch  2, loss  0.4452, train acc  0.647, test acc  0.689, time  411.2 sec
epoch  2, loss  0.4462, train acc  0.647, test acc  0.688, time  413.0 sec
epoch  2, loss  0.4472, train acc  0.647, test acc  0.689, time  414.9 sec
epoch  2, loss  0.4480, train acc  0.647, test acc  0.691, time  416.7 sec
epoch  2, loss  0.4486, train acc  0.648, test acc  0.692, time  418.9 sec
...

epoch  3, loss  0.1115, train acc  0.716, test acc  0.712, time  153.4 sec
epoch  3, loss  0.1127, train acc  0.717, test acc  0.713, time  155.3 sec
epoch  3, loss  0.1137, train acc  0.716, test acc  0.717, time  157.1 sec
epoch  3, loss  0.1148, train acc  0.716, test acc  0.719, time  159.1 sec
epoch  3, loss  0.1161, train acc  0.716, test acc  0.721, time  160.9 sec
epoch  3, loss  0.1173, train acc  0.716, test acc  0.720, time  162.9 sec
epoch  3, loss  0.1185, train acc  0.717, test acc  0.719, time  164.9 sec
epoch  3, loss  0.1196, train acc  0.717, test acc  0.719, time  166.9 sec
epoch  3, loss  0.1209, train acc  0.717, test acc  0.720, time  168.9 sec
epoch  3, loss  0.1219, train acc  0.717, test acc  0.719, time  170.9 sec
epoch  3, loss  0.1231, train acc  0.717, test acc  0.717, time  172.7 sec
epoch  3, loss  0.1240, train acc  0.718, test acc  0.714, time  174.6 sec
epoch  3, loss  0.1252, train acc  0.718, test acc  0.713, time  176.4 sec
...

KeyboardInterrupt: 
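
`d2l.train_cuda_cpu` comes from the local `d2lzh_pytorch` package and its source is not shown in these notes. The sketch below is not that function. It is a generic PyTorch loop of the kind such a helper might wrap, with the hypothetical names `train_epoch` and `evaluate_accuracy`, cross-entropy loss assumed, and random tensors standing in for Fashion-MNIST so that it runs without a download.

```python
import torch
import torch.nn as nn

def evaluate_accuracy(net, data_iter, device):
    """Fraction of correctly classified samples over data_iter."""
    net.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for X, y in data_iter:
            X, y = X.to(device), y.to(device)
            correct += (net(X).argmax(dim=1) == y).sum().item()
            total += y.shape[0]
    net.train()
    return correct / total

def train_epoch(net, train_iter, optimizer, device):
    """Hypothetical one-epoch loop; returns (mean loss, training accuracy)."""
    loss_fn = nn.CrossEntropyLoss()  # assumed loss for 10-class classification
    net.train()
    loss_sum, correct, total = 0.0, 0, 0
    for X, y in train_iter:
        X, y = X.to(device), y.to(device)
        y_hat = net(X)
        loss = loss_fn(y_hat, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        loss_sum += loss.item() * y.shape[0]
        correct += (y_hat.argmax(dim=1) == y).sum().item()
        total += y.shape[0]
    return loss_sum / total, correct / total

# Random stand-in data: 3 mini-batches of 8 "images" with labels in 0..9
torch.manual_seed(0)
data = [(torch.rand(8, 1, 28, 28), torch.randint(0, 10, (8,))) for _ in range(3)]
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # tiny placeholder model
opt = torch.optim.Adam(model.parameters(), lr=0.001)

mean_loss, train_acc = train_epoch(model, data, opt, "cpu")
test_acc = evaluate_accuracy(model, data, "cpu")
print(f"loss {mean_loss:.4f}, train acc {train_acc:.3f}, test acc {test_acc:.3f}")
```

In this sketch, evaluation runs once per epoch rather than inside the batch loop, which keeps the cost of measuring test accuracy small relative to training.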