# Chapter 3: Modern Neural Networks - First CNN with TensorFlow

This notebook implements the LeNet-5 architecture [1] as presented in the book, and applies it to hand-written digit recognition (MNIST dataset).

In [1]:
import tensorflow as tf

## Preparing the Data

As presented in [Chapter 2](../ch2), we use TensorFlow and Keras helpers to load the commonly used [MNIST](http://yann.lecun.com/exdb/mnist) training and testing datasets. We also normalize the images (scaling the pixel values from `[0, 255]` to `[0, 1]`) and reshape them to add the single channel dimension that the convolutional layers expect (Keras returns the images as arrays of shape `(N, 28, 28)`, without a channel axis):

In [2]:
num_classes = 10
img_rows, img_cols, img_ch = 28, 28, 1
input_shape = (img_rows, img_cols, img_ch)

In [3]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

x_train = x_train.reshape(x_train.shape[0], *input_shape)
x_test = x_test.reshape(x_test.shape[0], *input_shape)
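
To make the effect of these two steps concrete, here is a minimal sketch applied to a small synthetic batch standing in for MNIST. The array `fake_images` and its size are made up for illustration, and only NumPy is assumed; the normalization and reshape lines mirror the ones above.

```python
import numpy as np

img_rows, img_cols, img_ch = 28, 28, 1
input_shape = (img_rows, img_cols, img_ch)

# Hypothetical stand-in for the MNIST images: 4 grayscale images, uint8 in [0, 255].
rng = np.random.default_rng(seed=0)
fake_images = rng.integers(0, 256, size=(4, img_rows, img_cols), dtype=np.uint8)

# Same normalization as above: uint8 values in [0, 255] become floats in [0, 1].
normalized = fake_images / 255.0

# Same reshape as above: append the channel axis expected by Conv2D.
reshaped = normalized.reshape(normalized.shape[0], *input_shape)

print(fake_images.shape, fake_images.dtype)  # shape and dtype before preparation
print(reshaped.shape, reshaped.dtype)        # shape and dtype after preparation
```

The number of pixel values is unchanged by the reshape; only the trailing axis of size 1 is added.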

## Implementing LeNet-5

We've demonstrated how CNNs can be implemented in different ways, depending on the level of parametrization vs. succinctness one needs. In this case, we will use the Keras API to showcase once again how straightforward it makes the network implementation and usage.

In [4]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D

LeNet-5 is a simple CNN composed of 7 layers (2 *conv*, 2 *max-pool*, 3 *FC* + 1 helper layer to flatten the feature maps before the *FC*). For more details, we invite our readers to go back to Chapter 3. Here is the model's implementation:


In [5]:
model = Sequential()
# 1st block:
model.add(Conv2D(6, kernel_size=(5, 5), padding='same', activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
# 2nd block:
model.add(Conv2D(16, kernel_size=(5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Dense layers:
model.add(Flatten())
model.add(Dense(120, activation='relu'))
model.add(Dense(84, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
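
As a sanity check on the architecture above, the following sketch traces how the spatial size of the feature maps evolves through the two blocks. It is a simplified calculation, assuming stride-1 convolutions and non-overlapping 2x2 pooling (the Keras defaults used above); the helper names `conv_output_size` and `pool_output_size` are ours, not part of any library.

```python
def conv_output_size(size, kernel, padding="valid", stride=1):
    """Output size of a convolution along one spatial axis."""
    if padding == "same":
        # 'same' padding preserves the size, up to the stride: ceil(size / stride).
        return -(-size // stride)
    # 'valid' padding: the kernel must fit entirely inside the input.
    return (size - kernel) // stride + 1


def pool_output_size(size, pool=2):
    """Output size of non-overlapping pooling (stride equal to the pool size)."""
    return (size - pool) // pool + 1


sizes = [28]                                                        # input: 28x28
sizes.append(conv_output_size(sizes[-1], kernel=5, padding="same"))   # 1st conv
sizes.append(pool_output_size(sizes[-1]))                             # 1st max-pool
sizes.append(conv_output_size(sizes[-1], kernel=5, padding="valid"))  # 2nd conv
sizes.append(pool_output_size(sizes[-1]))                             # 2nd max-pool

# The flatten layer unrolls the last feature maps (16 channels) into a vector.
flattened = sizes[-1] * sizes[-1] * 16

print(sizes)      # [28, 28, 14, 10, 5]
print(flattened)  # 400
```

The first convolution uses `padding='same'` so that the 28x28 input keeps its size, whereas the second one uses the default `'valid'` padding and shrinks the maps from 14x14 to 10x10.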

## Applying to MNIST


Now we can compile our model for digit classification. To train it for this task, we select the optimizer (a simple SGD one for this example) and define the loss (the categorical cross-entropy, here in its *sparse* variant since the labels are integer class indices rather than one-hot vectors):

In [6]:
model.compile(optimizer='sgd', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
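
The loss named `sparse_categorical_crossentropy` computes the same quantity as the categorical cross-entropy, but takes integer labels instead of one-hot vectors. The plain-Python sketch below illustrates this equivalence for a single sample; the two helper functions and the 4-class probability vector are illustrative assumptions, not the Keras implementation.

```python
import math


def categorical_crossentropy(one_hot, probs):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


def sparse_categorical_crossentropy(label, probs):
    """Same loss, with the target given as an integer class index."""
    return -math.log(probs[label])


# Hypothetical softmax output for one sample over 4 classes:
probs = [0.05, 0.05, 0.7, 0.2]
label = 2                        # integer label, as in MNIST's y_train
one_hot = [0.0, 0.0, 1.0, 0.0]   # the same target, one-hot encoded

loss_dense = categorical_crossentropy(one_hot, probs)
loss_sparse = sparse_categorical_crossentropy(label, probs)
print(loss_dense, loss_sparse)   # both equal -log(0.7)
```

Using the sparse variant spares us from one-hot encoding `y_train` and `y_test` beforehand.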


In [7]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 28, 28, 6)         156       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 6)         0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 10, 10, 16)        2416      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 16)          0         
_________________________________________________________________
flatten (Flatten)            (None, 400)               0         
_________________________________________________________________
dense (Dense)                (None, 120)               48120     
_________________________________________________________________
dense_1 (Dense)              (None, 84)                10164     
__________
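
The parameter counts listed in this summary can be reproduced by hand: each convolutional filter has one weight per kernel element and input channel plus one bias, and each dense unit has one weight per input plus one bias. The sketch below recomputes them under these assumptions; the helper names and the dictionary key `output_dense` (for the final `Dense(num_classes)` layer, whose row is cut off in the listing above) are ours.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units


params = {
    "conv2d": conv2d_params(5, 5, 1, 6),      # 1 input channel, 6 filters
    "conv2d_1": conv2d_params(5, 5, 6, 16),   # 6 input channels, 16 filters
    "dense": dense_params(5 * 5 * 16, 120),   # 400 flattened inputs
    "dense_1": dense_params(120, 84),
    "output_dense": dense_params(84, 10),     # final layer, num_classes = 10
}

for name, count in params.items():
    print(f"{name:<14}{count:>8}")
print(f"{'total':<14}{sum(params.values()):>8}")
```

Pooling and flatten layers have no trainable parameters, which is why they show `0` in the summary.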

Before launching the training, we also instantiate some Keras callbacks, i.e., utility functions automatically called at some points during training to monitor it:

In [8]:
callbacks = [
    # Callback to interrupt the training if the validation loss (`val_loss`) has not improved for 3 consecutive epochs:
    tf.keras.callbacks.EarlyStopping(patience=3, monitor='val_loss'),
    # Callback to log the graph, losses and metrics into TensorBoard (saving log files in `./logs` directory):
    tf.keras.callbacks.TensorBoard(log_dir='./logs', histogram_freq=1, write_graph=True)]

(The TensorBoard callback allows us to monitor the training from TensorBoard. For that, open a console and launch the program with the command "`tensorboard --logdir=./logs`". You can then access TensorBoard from a browser, via the URL "[`localhost:6006`](http://localhost:6006)".)
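
To clarify what the `patience` argument of the early-stopping callback means, here is a simplified plain-Python sketch of the underlying logic. It is not the Keras implementation (which supports further options such as a minimum improvement threshold), and the function name and the list of validation losses are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the monitored loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss   # new best value: reset the counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                return epoch
    return None           # patience never exhausted


# Hypothetical validation losses: the best value (0.15) is reached at epoch 3,
# then epochs 4, 5 and 6 fail to improve on it.
val_losses = [0.30, 0.20, 0.15, 0.16, 0.155, 0.17, 0.14]
stop_epoch = early_stopping_epoch(val_losses, patience=3)
print(stop_epoch)  # 6: the 7th epoch is never run
```

Note that the counter is reset only by a new best value, so the slightly lower loss at epoch 5 (still above 0.15) does not count as an improvement.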

We can now pass everything to our model to train it:


In [9]:
model.fit(x_train, y_train, 
          batch_size=32, epochs=30, validation_data=(x_test, y_test), 
          verbose=1, callbacks=callbacks)

Train on 60000 samples, validate on 10000 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30


<tensorflow.python.keras.callbacks.History at 0x7fa4da780908>

Given a machine with recent GPU(s), this training is quite fast (~0.16ms/step in our case, cf. logs). The final accuracy we obtain on the validation dataset (**~98.9%!**) is also much better than in our previous attempts with simpler networks. Indeed, the error rate has been divided by roughly 5 (from ~5% to ~1%), which is a significant improvement.
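
The figures quoted above can be checked with a few lines of arithmetic. The sketch below only uses numbers stated in this notebook (60,000 training samples, batches of 32, a log ending at epoch 20 of 30, ~5% previous error, ~98.9% accuracy); the variable names are ours.

```python
import math

# Number of optimization steps (batches) per epoch for the `fit` call above.
num_train_samples = 60000
batch_size = 32
steps_per_epoch = math.ceil(num_train_samples / batch_size)

# The training log ends at epoch 20 out of the 30 requested, which is
# consistent with the early-stopping callback interrupting the training.
epochs_run = 20
total_steps = steps_per_epoch * epochs_run

# Error rates: ~5% for the earlier, simpler networks vs. ~1.1% here.
previous_error = 0.05
lenet_error = 1.0 - 0.989
reduction_factor = previous_error / lenet_error

print(steps_per_epoch)             # 1875
print(total_steps)                 # 37500
print(round(reduction_factor, 1))  # 4.5
```

With the unrounded accuracy, the reduction factor comes out closer to 4.5 than to 5, which remains a substantial improvement.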

## References

1. LeCun, Yann. "*LeNet-5, convolutional neural networks.*" [http://yann.lecun.com/exdb/lenet](http://yann.lecun.com/exdb/lenet) (2015): 20.