# Building the Original LeNet5 Network

The LeNet5 architecture consists of two sequences of convolutional and average pooling layers that perform image processing. The output of the last layer in these sequences is then flattened: each neuron in the resulting series of convolved 2-D arrays is copied into a single line of neurons. Two fully connected layers and a softmax classifier complete the network and provide the output in terms of probabilities.

In [1]:
import keras
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Conv2D, AveragePooling2D
from keras.layers import Dense, Flatten
from keras.losses import categorical_crossentropy

Using TensorFlow backend.


After importing the necessary tools, you need to collect the data.

In [2]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()

The downloaded data consists of single-channel 28 X 28-pixel images representing handwritten digits from zero to nine.

In [3]:
# transform targets into one-hot-encoded vectors
num_classes = len(np.unique(y_train))
print(y_train[0], end=' => ')
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
print(y_train[0])

5 => [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]


Notice that the output is 0-based and that the 1 appears at the position corresponding to the number 5. This setting is used because the neural network needs a response layer, which is a set of neurons, one per possible answer, in which the neuron matching the correct answer should become activated. In this case, you see ten neurons, and in the training phase, the code activates the correct answer (the value at the correct position is set to 1) and turns the others off (their values are 0). In the test phase, the neural network uses what it has learned from the training examples to turn the correct neuron on, or at least to activate it more strongly than the others.
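
The encoding and its reverse can be sketched in plain Python, without Keras. This is only an illustration of the idea: `one_hot` and `decode` are hypothetical helper names, and the `prediction` values are made up rather than taken from a trained network.

```python
def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 elsewhere."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


def decode(activations):
    """Return the index of the most active neuron (the predicted digit)."""
    return max(range(len(activations)), key=lambda i: activations[i])


target = one_hot(5)
print(target)  # [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]

# Hypothetical, made-up output of a network for the same image:
# no neuron is exactly 1, but the neuron at index 5 is the most active.
prediction = [0.01, 0.0, 0.02, 0.05, 0.0, 0.85, 0.01, 0.0, 0.05, 0.01]
print(decode(prediction))  # 5
```

Taking the most active neuron is what turns the ten activations back into a single predicted digit.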

In [4]:
# rescale 0-1 and cast training data as float32
X_train = X_train.astype(np.float32) / 255
X_test = X_test.astype(np.float32) / 255

# reshape data to have also the channel dimension
img_rows, img_cols = X_train.shape[1:]
X_train = X_train.reshape(len(X_train), img_rows, img_cols, 1)
X_test = X_test.reshape(len(X_test), img_rows, img_cols, 1)

# notice the input shape
input_shape = (img_rows, img_cols, 1)
print(input_shape)

(28, 28, 1)


The pixel values, which range from 0 to 255, are transformed into decimal values ranging from 0 to 1. The first two lines of code rescale the data so that the network does not have to work with large numbers, which could cause problems. The lines that follow reshape the images to have height, width, and channels.
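
One kind of problem that large inputs can cause is saturation of the tanh activation used later in this network. The sketch below is a toy illustration under made-up assumptions: a single neuron, a hypothetical weight of 0.1, and a 5 X 5 patch of equally bright pixels. It is not a measurement taken from LeNet5.

```python
import math

# Hypothetical setup: one neuron sums a 5 x 5 patch of equally bright pixels,
# each multiplied by the same made-up weight, then applies tanh.
weight = 0.1
patch_size = 5 * 5
raw_pixel = 200            # value on the original 0-255 scale
scaled_pixel = 200 / 255   # the same pixel after dividing by 255


def tanh_and_gradient(pixel):
    """Return the tanh activation and its derivative for the whole patch."""
    z = weight * patch_size * pixel
    activation = math.tanh(z)
    gradient = 1.0 - activation ** 2  # derivative of tanh at z
    return activation, gradient


raw_act, raw_grad = tanh_and_gradient(raw_pixel)
scaled_act, scaled_grad = tanh_and_gradient(scaled_pixel)

print(f"raw:    activation={raw_act:.3f}, gradient={raw_grad:.3f}")
print(f"scaled: activation={scaled_act:.3f}, gradient={scaled_grad:.3f}")
```

With the raw pixel value the neuron is pinned at 1 and its gradient vanishes, so training has nothing to adjust. With the rescaled value the gradient stays above zero.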

In [5]:
# Instantiate the Sequential class, which provides an empty model
lenet = Sequential()

# Convolutional Layer C1
lenet.add(Conv2D(6, kernel_size=(5, 5), activation='tanh', 
                 input_shape=input_shape, padding='same', name='C1'))

# Pooling Layer S2
lenet.add(AveragePooling2D(pool_size=(2, 2), name='S2'))

# Convolutional Layer C3
lenet.add(Conv2D(16, kernel_size=(5, 5), activation='tanh', name='C3'))

# Pooling Layer S4
lenet.add(AveragePooling2D(pool_size=(2, 2), name='S4'))

# Fully Connected Convolutional Layer C5
lenet.add(Conv2D(120, kernel_size=(5, 5), activation='tanh', name='C5'))
lenet.add(Flatten())

# Fully Connected Layer FC6
lenet.add(Dense(84, activation='tanh', name='FC6'))

# Output Layer (softmax activation)
lenet.add(Dense(10, activation='softmax', name='OUTPUT'))

The first layer added is a convolutional layer named C1. The convolution operates with 6 filters and a kernel size of 5 X 5 pixels. **The activation function for all the layers of the network but the last one is *tanh***, a nonlinear function that was state of the art for activation at the time Yann LeCun created LeNet5. It is considered outdated today, and modern networks usually use ReLU instead. Next comes a pooling layer, named S2, which uses a 2 X 2-pixel kernel.
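
To make the difference between the two activations concrete, here is a minimal comparison in plain Python. The helper names `tanh` and `relu` are defined here for illustration only and are not the Keras implementations.

```python
import math


def tanh(x):
    """Squash any input into the open interval (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Pass positive inputs through unchanged and clip negatives to 0."""
    return max(0.0, x)


for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"x={x:+.1f}  tanh={tanh(x):+.3f}  relu={relu(x):.1f}")
```

In the Keras code above, trying ReLU would mean passing `activation='relu'` instead of `activation='tanh'`. The network built in this section keeps tanh to stay faithful to the original design.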

The code then repeats the sequence of a convolution followed by a pooling layer (C3 and S4), this time using more filters (16 instead of 6).

LeNet5 closes with a convolution, C5, that uses 120 filters. This convolution is not followed by a pooling layer but by a flattening layer, which unrolls the output of the last convolutional layer into a single vector that the dense layers can process.
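
The reason C5 needs no pooling becomes clear when you trace the spatial size of the image through the layers. The sketch below assumes stride 1 for convolutions, `'valid'` padding unless `'same'` is requested, and non-overlapping pooling (the Keras defaults used in the code above). The helpers `conv_output_size` and `pool_output_size` are hypothetical names written for this illustration.

```python
def conv_output_size(size, kernel, padding="valid"):
    """Spatial output size of a stride-1 square convolution."""
    if padding == "same":
        return size               # zero padding preserves the input size
    return size - kernel + 1      # 'valid': the kernel must fit entirely


def pool_output_size(size, pool):
    """Spatial output size of non-overlapping pooling (stride == pool size)."""
    return size // pool


# (layer name, layer kind, kernel or pool size, padding)
layers = [
    ("C1", "conv", 5, "same"),
    ("S2", "pool", 2, None),
    ("C3", "conv", 5, "valid"),
    ("S4", "pool", 2, None),
    ("C5", "conv", 5, "valid"),
]

size = 28  # MNIST images are 28 x 28 pixels
sizes = {}
for name, kind, k, padding in layers:
    if kind == "conv":
        size = conv_output_size(size, k, padding)
    else:
        size = pool_output_size(size, k)
    sizes[name] = size
    print(f"{name}: {size} x {size}")
# C1: 28 x 28
# S2: 14 x 14
# C3: 10 x 10
# S4: 5 x 5
# C5: 1 x 1
```

By the time the data reaches C5, each 5 X 5 kernel covers the entire 5 X 5 input, so every filter produces a single value and flattening yields a vector of 120 numbers.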

The network ends with a sequence of two dense layers that process the convolutions' outputs using the tanh and softmax activations. These two layers form the final part of the network, where the output neurons activate to signal the predicted answer.
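
As a rough sketch of what the softmax output does, the plain-Python code below turns ten raw scores into probabilities and then scores them against a one-hot target with categorical cross-entropy, the loss passed to `compile` in the next cell. The `scores` are made up for illustration, and `softmax` and `cross_entropy` are hypothetical helpers, not the Keras functions.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, probs):
    """Categorical cross-entropy between a one-hot target and probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


# Hypothetical raw scores of the ten output neurons (made up for illustration).
scores = [0.1, -1.2, 0.3, 0.8, -0.5, 4.0, 0.0, -0.7, 1.1, 0.2]
probs = softmax(scores)

# One-hot target for the digit 5.
target = [0.0] * 10
target[5] = 1.0

print(f"sum of probabilities: {sum(probs):.6f}")
print(f"most probable digit:  {probs.index(max(probs))}")
print(f"loss for target 5:    {cross_entropy(target, probs):.4f}")
```

The loss shrinks toward 0 as the probability assigned to the correct digit approaches 1, which is what training pushes the network to do.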

In [6]:
# The network is now ready, so we get Keras to compile it.
lenet.compile(loss=categorical_crossentropy, optimizer='SGD', metrics=['accuracy'])
lenet.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
C1 (Conv2D)                  (None, 28, 28, 6)         156       
_________________________________________________________________
S2 (AveragePooling2D)        (None, 14, 14, 6)         0         
_________________________________________________________________
C3 (Conv2D)                  (None, 10, 10, 16)        2416      
_________________________________________________________________
S4 (AveragePooling2D)        (None, 5, 5, 16)          0         
_________________________________________________________________
C5 (Conv2D)                  (None, 1, 1, 120)         48120     
_________________________________________________________________
flatten_1 (Flatten)          (None, 120)               0         
_________________________________________________________________
FC6 (Dense)                  (None, 84)                10164     
__________
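
The `Param #` column can be checked by hand for the rows visible above. The sketch assumes the usual counting rule: one weight per kernel position and input channel, plus one bias per filter or unit. `conv_params` and `dense_params` are hypothetical helper names.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit for a Dense layer."""
    return (inputs + 1) * units


params = {
    "C1": conv_params(5, 1, 6),      # 1 input channel, 6 filters
    "C3": conv_params(5, 6, 16),     # 6 input channels, 16 filters
    "C5": conv_params(5, 16, 120),   # 16 input channels, 120 filters
    "FC6": dense_params(120, 84),    # 120 flattened inputs, 84 units
}
for name, count in params.items():
    print(f"{name}: {count}")
# C1: 156
# C3: 2416
# C5: 48120
# FC6: 10164
```

The pooling and flattening layers have no trainable weights, which is why the summary reports 0 parameters for them.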

We can now run the network!

Completing the run takes 50 epochs, with each epoch processing the data in batches of 64 images at a time (an epoch is one pass of the entire dataset through the neural network).
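
The arithmetic behind these settings is sketched below. It assumes the 60,000 MNIST training images and that the final, smaller batch of each epoch is still processed.

```python
import math

train_samples = 60000   # number of MNIST training images
batch_size = 64
epochs = 50

# The last batch of each epoch is smaller when the division is not exact.
batches_per_epoch = math.ceil(train_samples / batch_size)
last_batch_size = train_samples - (batches_per_epoch - 1) * batch_size
total_updates = batches_per_epoch * epochs

print(batches_per_epoch)  # 938
print(last_batch_size)    # 32
print(total_updates)      # 46900
```

Each batch triggers one weight update, so the full run performs tens of thousands of small adjustments to the network.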

The output shows a progress bar telling you the time needed to complete each epoch. You can also read the accuracy measures for both the training set (an estimate of the goodness of the model) and the test set (the more realistic view).

In [7]:
batch_size = 64
epochs = 50
history = lenet.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_data=(X_test, y_test))

Train on 60000 samples, validate on 10000 samples
Epoch 1/50
...
Epoch 50/50


The LeNet5 achieves an accuracy of **0.989**.