# The LeNet Model
LeNet consists of two parts: a convolutional block and a fully connected block. Each is described below.

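The basic unit of the convolutional block is a convolutional layer followed by a max-pooling layer. The convolutional layer recognizes spatial patterns in the image, such as lines and parts of objects, and the max-pooling layer that follows reduces the convolutional layer's sensitivity to position. The convolutional block stacks two of these units. Each convolutional layer uses a $5\times 5$ window and applies a sigmoid activation to its output. The first convolutional layer has 6 output channels and the second has 16. The input to the second convolutional layer has a smaller height and width than the input to the first, so the number of output channels is increased to keep the parameter sizes of the two layers similar. Both max-pooling layers use a $2\times 2$ window with a stride of 2. Because the pooling window and the stride have the same shape, the regions covered by successive window positions do not overlap.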
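The spatial sizes implied by this description can be checked with a little arithmetic. The sketch below assumes no padding and a convolution stride of 1 (the Keras defaults, not stated in the paragraph above) and a 28×28 input such as a Fashion-MNIST image. The helper name `out_size` is hypothetical.

```python
def out_size(n, window, stride=1):
    """Output length along one axis for an unpadded sliding window."""
    return (n - window) // stride + 1

size = 28  # assumed input height and width
trace = [size]
for _ in range(2):  # two (convolution + max pooling) units
    size = out_size(size, window=5)            # 5x5 convolution, stride 1
    trace.append(size)
    size = out_size(size, window=2, stride=2)  # 2x2 max pooling, stride 2
    trace.append(size)

print(trace)  # [28, 24, 12, 8, 4]
```

With a stride equal to the window size, each pooling step halves the height and width, which is what non-overlapping windows imply.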

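With TensorFlow's default channels-last layout, the output of the convolutional block has shape (batch size, height, width, channels). When this output is passed to the fully connected block, each sample in the mini-batch is flattened. The input to the fully connected layers therefore becomes two-dimensional: the first dimension indexes the samples in the mini-batch, and the second is the flattened vector representation of each sample, whose length is the product of the height, the width, and the number of channels. The fully connected block contains 3 fully connected layers with 120, 84, and 10 outputs respectively, where 10 is the number of output classes.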
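As a small illustration of the flattening step, the sketch below computes the shape that results from collapsing every axis except the batch axis. The function name `flatten_shape` and the batch size of 32 are assumptions made for the example. The 4×4×16 feature-map size is what the two 5×5 convolutions and 2×2 poolings produce from a 28×28 input.

```python
from math import prod

def flatten_shape(shape):
    """Collapse every axis except the first (batch) axis into one."""
    batch, *rest = shape
    return (batch, prod(rest))

# Assumed conv-block output: 32 samples, 4x4 feature maps, 16 channels
print(flatten_shape((32, 4, 4, 16)))  # (32, 256)
```

The flattened length is the same whether the channel axis comes first or last, since it is just the product of the three sizes.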

Because computing power is limited, the Fashion-MNIST dataset is used here to train the LeNet model.



## Building the LeNet Model with Sequential

In [1]:
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

# Allocate GPU memory on demand instead of reserving it all up front
config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)


def lenet_model():
    # Reset Keras state so layer names start from "conv2d" on every call
    tf.keras.backend.clear_session()
    lenet = models.Sequential()
    # Convolutional block: two (5x5 convolution + 2x2 max pooling) units
    lenet.add(layers.Conv2D(filters=6, kernel_size=5, activation='sigmoid', input_shape=(28, 28, 1)))
    lenet.add(layers.MaxPool2D(pool_size=2, strides=2))
    lenet.add(layers.Conv2D(filters=16, kernel_size=5, activation='sigmoid'))
    lenet.add(layers.MaxPool2D(pool_size=2, strides=2))
    # Fully connected block: flatten, then 120 -> 84 -> 10 outputs
    lenet.add(layers.Flatten())
    lenet.add(layers.Dense(120, activation='sigmoid'))
    lenet.add(layers.Dense(84, activation='sigmoid'))
    # 10 output classes (softmax is the more common activation for this layer)
    lenet.add(layers.Dense(10, activation='sigmoid'))

    lenet.summary()
    return lenet

lenet = lenet_model()


Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 24, 24, 6)         156       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 12, 12, 6)         0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 8, 8, 16)          2416      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 4, 4, 16)          0         
_________________________________________________________________
flatten (Flatten)            (None, 256)               0         
_________________________________________________________________
dense (Dense)                (None, 120)               30840     
_________________________________________________________________
dense_1 (Dense)              (None, 84)                10164
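
The `Param #` column can be reproduced by hand: each layer has one weight per input connection plus one bias per output. The sketch below is a rough check under that assumption, with hypothetical helper names. The layer sizes are taken from the model definition above, including the final 10-unit layer.

```python
def conv2d_params(kernel, in_channels, out_channels):
    # One kernel x kernel x in_channels filter plus one bias per output channel
    return (kernel * kernel * in_channels + 1) * out_channels

def dense_params(in_features, out_features):
    # One weight per input plus one bias, for each output unit
    return (in_features + 1) * out_features

counts = {
    "conv2d": conv2d_params(5, 1, 6),
    "conv2d_1": conv2d_params(5, 6, 16),
    "dense": dense_params(256, 120),
    "dense_1": dense_params(120, 84),
    "dense_2": dense_params(84, 10),
}
for name, n in counts.items():
    print(f"{name:10s}{n:>8d}")
print("total", sum(counts.values()))
```

The first three values match the 156, 2416, and 30840 shown in the summary. Pooling and flatten layers have no trainable parameters.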

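## Building the Data Pipeline
Construct a single-channel data sample with a height and width of 28, then run a forward pass layer by layer to inspect the output shape of each layer.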

In [2]:
def data_pipeline(net):
    # Feed one random 28x28 single-channel sample through the network, layer by layer
    x = tf.random.uniform((1, 28, 28, 1))
    for layer in net.layers:
        x = layer(x)
        print(layer.name, 'output shape\t', x.shape)

data_pipeline(lenet)


conv2d output shape	 (1, 24, 24, 6)
max_pooling2d output shape	 (1, 12, 12, 6)
conv2d_1 output shape	 (1, 8, 8, 16)
max_pooling2d_1 output shape	 (1, 4, 4, 16)
flatten output shape	 (1, 256)
dense output shape	 (1, 120)
dense_1 output shape	 (1, 84)
dense_2 output shape	 (1, 10)


In [3]:
fashion_mnist = tf.keras.datasets.fashion_mnist

(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# Add a trailing channel axis: (N, 28, 28) -> (N, 28, 28, 1)
train_images = tf.reshape(train_images, (train_images.shape[0], train_images.shape[1], train_images.shape[2], 1))
print(train_images.shape)

test_images = tf.reshape(test_images, (test_images.shape[0], test_images.shape[1], test_images.shape[2], 1))


(60000, 28, 28, 1)


## Training the Model

In [5]:
# Plain stochastic gradient descent: learning rate 0.9, no momentum
optimizer = tf.keras.optimizers.SGD(learning_rate=0.9, momentum=0.0, nesterov=False)

lenet.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
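
The `sparse_categorical_crossentropy` loss takes integer class labels rather than one-hot vectors. As a rough illustration of what it measures, the sketch below computes the mean negative log-probability of the true class in plain Python. The helper name `sparse_cce` and the sample probabilities are made up for the example, and the Keras implementation differs in details such as clipping.

```python
import math

def sparse_cce(probs, labels):
    """Mean of -log(p[true class]) over a batch of probability vectors."""
    losses = [-math.log(p[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)

probs = [[0.7, 0.2, 0.1],   # hypothetical predictions for 3 classes
         [0.1, 0.8, 0.1]]
labels = [0, 1]             # integer class indices, not one-hot vectors
print(round(sparse_cce(probs, labels), 4))
```

The loss shrinks toward 0 as the probability assigned to the correct class approaches 1.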


In [6]:

with tf.device('/gpu:0'):
    lenet.fit(train_images, train_labels, epochs=5, validation_split=0.1)


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


## Evaluating the Model

In [8]:
lenet.evaluate(test_images, test_labels, verbose=2)

313/313 - 0s - loss: 0.5288 - accuracy: 0.7966


[0.5288030505180359, 0.7965999841690063]
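
The `313/313` in the log is the number of batches processed. The sketch below reproduces it, assuming the Keras default batch size of 32 (the notebook does not set one) and the 10,000-image Fashion-MNIST test set. The helper name `num_batches` is hypothetical.

```python
import math

def num_batches(num_samples, batch_size=32):
    """Batches needed to cover the data; the last one may be partial."""
    return math.ceil(num_samples / batch_size)

test_steps = num_batches(10_000)
# Training samples left after validation_split=0.1 holds out one tenth
train_samples = 60_000 - 60_000 // 10
train_steps = num_batches(train_samples)

print(test_steps)   # 313
print(train_steps)  # 1688
```

The returned list holds the loss followed by the metrics passed to `compile`, so the second value is the test accuracy of about 0.80.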