**Basic Introduction**

LeNet-5, introduced by LeCun et al. in the 1998 paper "Gradient-Based Learning Applied to Document Recognition", is a compact convolutional neural network designed for handwritten character recognition.

**Implement LeNet-5 on MNIST**

MNIST is a dataset of 28×28 grayscale images of handwritten digits (0 to 9), split into 60,000 training images and 10,000 test images.

In [1]:
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical  # replaces keras.utils.np_utils
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

In [2]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [3]:
print(len(X_train))
print(len(X_test))

60000
10000


In [4]:
X_train = X_train.astype('float32') / 255  # scale pixel values from [0, 255] to [0, 1]
X_test = X_test.astype('float32') / 255

print('X_train shape:', X_train.shape)
print(X_train.shape[0], '=train samples')
print(X_test.shape[0], '=test samples')

X_train shape: (60000, 28, 28)
60000 =train samples
10000 =test samples
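
The scaling step can be checked on a small array without downloading MNIST. The sketch below uses a made-up 2×2 `uint8` array (`pixels` is a hypothetical stand-in for one image, not real data) and assumes only NumPy. It shows that casting to `float32` and dividing by 255 maps the raw 0 to 255 range onto 0 to 1.

```python
import numpy as np

# Hypothetical stand-in for a tiny grayscale image with raw 8-bit pixel values.
pixels = np.array([[0, 51], [102, 255]], dtype=np.uint8)

# Same operation as in the cell above: cast to float32, then divide by 255.
scaled = pixels.astype('float32') / 255

print(scaled.dtype)
print(scaled.min(), scaled.max())
```

Casting before dividing keeps the result in `float32`; dividing the `uint8` array directly would produce `float64` and double the memory used.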


In [5]:
from tensorflow.keras.utils import to_categorical

num_classes = 10
# print first ten (integer-valued) training labels
print('Integer-valued labels:')
print(y_train[:10])

# one-hot encode the labels: each integer class becomes a length-10 binary vector
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)

# print first ten (one-hot) training labels
print('One-hot labels:')
print(y_train[:10])

Integer-valued labels:
[5 0 4 1 9 2 1 3 1 4]
One-hot labels:
[[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]]
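
To make the encoding concrete, here is a minimal NumPy sketch of what one-hot encoding does. The helper `one_hot` is hypothetical and written for illustration; it is not the Keras implementation, only a function that should produce the same kind of matrix as the output above.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Row i gets a 1 in the column given by labels[i].
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

labels = [5, 0, 4]             # first three training labels printed above
encoded = one_hot(labels, 10)
print(encoded)

# argmax along each row recovers the original integer labels
decoded = encoded.argmax(axis=1)
print(decoded)
```

The `argmax` step is the usual way to turn one-hot targets, or softmax predictions, back into class indices.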


In [6]:
img_rows, img_cols = 28, 28

X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1)
X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

print('input_shape: ', input_shape)
print('x_train shape:', X_train.shape)

input_shape:  (28, 28, 1)
x_train shape: (60000, 28, 28, 1)
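
`Conv2D` expects a trailing channel axis, which is why the grayscale images are reshaped to `(28, 28, 1)`. As a small illustration, assuming only NumPy and a made-up batch of zeros (`batch` is hypothetical), the reshape used above is equivalent to adding a size-1 axis at the end:

```python
import numpy as np

# Hypothetical stand-in for a batch of 3 grayscale 28x28 images.
batch = np.zeros((3, 28, 28), dtype=np.float32)

reshaped = batch.reshape(batch.shape[0], 28, 28, 1)   # as in the cell above
expanded = np.expand_dims(batch, axis=-1)             # equivalent alternative

print(reshaped.shape, expanded.shape)
```

Neither call changes the pixel values or their count; only the shape metadata differs.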


In [7]:
# Note: this is a LeNet-style CNN, not a layer-for-layer copy of the original LeNet-5
# (which uses 5x5 convolutions with 6 and 16 feature maps and subsampling layers).
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))  # 26x26x32
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu'))  # 24x24x32
model.add(MaxPooling2D(pool_size=(2, 2)))  # 12x12x32
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))  # 10x10x64
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))  # 8x8x64
model.add(MaxPooling2D(pool_size=(2, 2)))  # 4x4x64
model.add(Flatten())  # 1024
model.add(Dense(64, activation='relu'))
model.add(Dense(10, activation='softmax'))  # one probability per digit class
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 32)        9248      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 12, 12, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 10, 10, 64)        18496     
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 8, 8, 64)          36928     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 4, 4, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 1024)              0
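
The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below is plain Python with hypothetical helper names (`conv_out`, `conv_params`, `dense_params`); it assumes stride-1 convolutions without padding and 2×2 pooling, as in the model definition. The two `Dense` counts are derived from the layer definitions in cell 7, since those rows do not appear in the summary output shown here.

```python
def conv_out(size, kernel):
    """Output width/height of an unpadded convolution with stride 1."""
    return size - kernel + 1

def conv_params(kernel, in_ch, out_ch):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units

size, channels = 28, 1
params = []
# (filters, whether a 2x2 max pooling layer follows this convolution)
for filters, pool_after in [(32, False), (32, True), (64, False), (64, True)]:
    params.append(conv_params(3, channels, filters))
    size, channels = conv_out(size, 3), filters
    if pool_after:
        size //= 2          # 2x2 max pooling halves each spatial dimension

flat = size * size * channels
params.append(dense_params(flat, 64))
params.append(dense_params(64, 10))

print(flat)          # 1024
print(params)        # [320, 9248, 18496, 36928, 65600, 650]
print(sum(params))   # 131242
```

The first four counts match the `Param #` column above, and `flat` matches the `Flatten` output size of 1024.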

In [8]:
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', 
              metrics=['accuracy'])
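
For intuition about the `categorical_crossentropy` loss, the following plain-Python sketch computes it for a single sample. The function and the probability vectors are hypothetical illustrations; the `eps` clipping is a simplification to avoid `log(0)`, and Keras additionally averages the per-sample loss over the batch.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy for one sample: -sum(t * log(p)) over classes."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0.0, 1.0, 0.0]          # one-hot target, class 1
confident = [0.05, 0.90, 0.05]    # hypothetical softmax outputs
unsure = [0.40, 0.30, 0.30]

loss_confident = categorical_crossentropy(y_true, confident)   # -ln(0.90)
loss_unsure = categorical_crossentropy(y_true, unsure)         # -ln(0.30)
print(round(loss_confident, 4), round(loss_unsure, 4))
```

With a one-hot target only the probability assigned to the true class contributes, so the loss shrinks toward zero as that probability approaches 1.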

In [9]:
# Train the model, saving the weights whenever val_loss reaches a new minimum.
# Note: the test set doubles as validation data here, so the checkpoint is selected
# on the same data that is later used to report test accuracy.
checkpointer = ModelCheckpoint(filepath='model.weights.best.hdf5', verbose=1,
                               save_best_only=True)
hist = model.fit(X_train, y_train, batch_size=32, epochs=50,
                 validation_data=(X_test, y_test), callbacks=[checkpointer],
                 verbose=2, shuffle=True)

Epoch 1/50
1875/1875 - 54s - loss: 0.1227 - accuracy: 0.9626 - val_loss: 0.0432 - val_accuracy: 0.9868

Epoch 00001: val_loss improved from inf to 0.04323, saving model to model.weights.best.hdf5
Epoch 2/50
1875/1875 - 25s - loss: 0.0409 - accuracy: 0.9880 - val_loss: 0.0350 - val_accuracy: 0.9892

Epoch 00002: val_loss improved from 0.04323 to 0.03496, saving model to model.weights.best.hdf5
Epoch 3/50
1875/1875 - 25s - loss: 0.0300 - accuracy: 0.9915 - val_loss: 0.0267 - val_accuracy: 0.9920

Epoch 00003: val_loss improved from 0.03496 to 0.02669, saving model to model.weights.best.hdf5
Epoch 4/50
1875/1875 - 25s - loss: 0.0237 - accuracy: 0.9930 - val_loss: 0.0276 - val_accuracy: 0.9930

Epoch 00004: val_loss did not improve from 0.02669
Epoch 5/50
1875/1875 - 25s - loss: 0.0213 - accuracy: 0.9940 - val_loss: 0.0213 - val_accuracy: 0.9929

Epoch 00005: val_loss improved from 0.02669 to 0.02125, saving model to model.weights.best.hdf5
Epoch 6/50
1875/1875 - 25s - loss: 0.0187 - accur
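
The "improved" and "did not improve" messages follow from a simple rule: with `save_best_only=True`, weights are written only when the monitored value (`val_loss` by default) drops below the best seen so far. The sketch below replays that rule in plain Python on the five `val_loss` values from the log above; it illustrates the bookkeeping and is not the callback's actual code.

```python
# val_loss values for epochs 1-5, copied from the log above
val_losses = [0.0432, 0.0350, 0.0267, 0.0276, 0.0213]

best = float('inf')
saved_epochs = []
for epoch, val_loss in enumerate(val_losses, start=1):
    if val_loss < best:           # improvement: this is when weights get written
        best = val_loss
        saved_epochs.append(epoch)

print(saved_epochs)   # [1, 2, 3, 5]
print(best)           # 0.0213
```

Epoch 4 is skipped because 0.0276 is higher than the 0.0267 reached in epoch 3, matching the log.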

In [10]:
model.load_weights('model.weights.best.hdf5')

In [11]:
score = model.evaluate(X_test, y_test, verbose=0)
accuracy = 100*score[1]

# print test accuracy
print('Test accuracy: %.4f%%' % accuracy)

Test accuracy: 99.2900%
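
An accuracy of 99.29% on 10,000 test images corresponds to 9,929 correct predictions. As a rough illustration of how the metric is computed, the NumPy sketch below compares the `argmax` of predicted probabilities with the `argmax` of one-hot targets. The arrays `probs` and `y_true` are made-up examples, not output from the model above.

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples and 3 classes (not real model output).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
# One-hot ground truth; the last sample is deliberately misclassified.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

# A prediction is correct when the most probable class equals the true class.
correct = probs.argmax(axis=1) == y_true.argmax(axis=1)
accuracy = 100 * correct.mean()
print('Accuracy: %.4f%%' % accuracy)   # Accuracy: 75.0000%
```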
