# Handwriting Recognition using CNN
Let's try to further optimize the solution by using a Convolutional Neural Network (CNN), which is better suited for image processing. CNNs are less sensitive to where in the image the pattern we are looking for appears.

With the MLP we achieved ~98% accuracy on the test data.
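
As a rough illustration of the position-insensitivity claim, the plain-Python sketch below slides one small filter over two images that contain the same pattern at different locations. The helper names (`conv2d_valid`, `place_pattern`, `strongest_response`) and the toy 2x2 pattern are made up for this example and are not part of the notebook. Like Keras's `Conv2D`, the sliding window computes a cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def place_pattern(size, pattern, top, left):
    """Return a size x size image of zeros with `pattern` pasted at (top, left)."""
    image = [[0] * size for _ in range(size)]
    for i, row in enumerate(pattern):
        for j, value in enumerate(row):
            image[top + i][left + j] = value
    return image


def strongest_response(response_map):
    """Largest value anywhere in the response map."""
    return max(max(row) for row in response_map)


pattern = [[1, 0], [0, 1]]  # hypothetical toy "stroke"
kernel = pattern            # a filter that matches the stroke exactly

top_left = conv2d_valid(place_pattern(6, pattern, 0, 0), kernel)
lower_right = conv2d_valid(place_pattern(6, pattern, 3, 2), kernel)

print(strongest_response(top_left), strongest_response(lower_right))  # 2 2
```

The peak response is the same in both cases; only its position in the response map moves with the pattern.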

In [0]:
import tensorflow
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Conv2D, MaxPooling2D, Flatten
from tensorflow.keras.optimizers import RMSprop, Adam
from tensorflow.keras import backend as K

Let's load the data.

In [5]:
(x_train, y_train),(x_test, y_test)=mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


We need to shape the data differently now. Since we have 2D images of 28x28 pixels, we need to add a color-channel dimension: the shape will be 28x28x1 or 1x28x28, depending on the backend's image data format, where 1 is the number of color channels because the images are grayscale. It would be 3 if we had RGB colors.

In [0]:
# Check whether the backend expects the color channel first or last, then reshape the data accordingly
if K.image_data_format() == 'channels_first':
  train_images = x_train.reshape(x_train.shape[0], 1, 28, 28)
  test_images = x_test.reshape(x_test.shape[0], 1, 28, 28)
  input_shape = (1, 28, 28)  # shape of a single input image
else:
  train_images = x_train.reshape(x_train.shape[0], 28, 28, 1)
  test_images = x_test.reshape(x_test.shape[0], 28, 28, 1)
  input_shape = (28, 28, 1)  # shape of a single input image

# Convert the 8-bit pixel values to float32 and scale them to the range [0, 1]
train_images = train_images.astype('float32')
test_images = test_images.astype('float32')
train_images /= 255
test_images /= 255
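
To make the two layouts concrete, here is a minimal plain-Python sketch of how the target shape depends on the data format. The helper `with_channel_axis` is hypothetical and only mirrors the branching in the cell above; it does not touch the actual arrays.

```python
def with_channel_axis(shape, data_format, channels=1):
    """Return the 4D batch shape for images given as (samples, height, width)."""
    samples, height, width = shape
    if data_format == "channels_first":
        return (samples, channels, height, width)
    if data_format == "channels_last":
        return (samples, height, width, channels)
    raise ValueError(f"unknown data format: {data_format!r}")


print(with_channel_axis((60000, 28, 28), "channels_last"))   # (60000, 28, 28, 1)
print(with_channel_axis((60000, 28, 28), "channels_first"))  # (60000, 1, 28, 28)
```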

We need to convert our labels to one-hot format. For example, the label 1 becomes [0,1,0,0,0,0,0,0,0,0].

In [0]:
train_labels = tensorflow.keras.utils.to_categorical(y_train, 10)
test_labels = tensorflow.keras.utils.to_categorical(y_test, 10)
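
For a single label, the one-hot conversion can be sketched in plain Python as follows. The function `one_hot` is an illustrative stand-in, not the Keras implementation; it only shows the shape of the result for one digit.

```python
def one_hot(label, num_classes=10):
    """Return a list of zeros with a single 1.0 at index `label`."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


print(one_hot(1))  # [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```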

Let's set up our CNN.
We will start with a 2D convolution of the image, using 32 filters (windows), each with a 3x3 kernel.

A second 2D convolution layer with 64 filters, also with a 3x3 kernel and relu activation, is added next.

Then we apply a MaxPooling2D layer that takes the max of each 2x2 block to distill the results down to a manageable size.

We apply dropout to manage overfitting.

Next we flatten the 2D feature maps into a 1D vector. At this point we have a traditional MLP: the vector is fed to a hidden layer of 128 neurons with relu activation.

Finally we feed this to our output layer of 10 neurons with softmax activation.
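
The 2x2 max-pooling step can be sketched in plain Python as shown here. The function `max_pool_2x2` and the 4x4 grid are made up for illustration; the sketch assumes non-overlapping 2x2 blocks and drops any leftover row or column, which matches the default pooling settings used in this notebook.

```python
def max_pool_2x2(grid):
    """Keep the maximum of each non-overlapping 2x2 block of a 2D list."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]

print(max_pool_2x2(feature_map))  # [[4, 2], [2, 8]]
```

Each side of the feature map is halved, so a 24x24 map becomes 12x12.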


In [8]:
model = Sequential()
# First convolution layer: 32 filters, 3x3 kernel, relu activation
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
# Second convolution layer: 64 filters with the same kernel size and activation
model.add(Conv2D(64, (3, 3), activation='relu'))
# Downsample by keeping the max of each 2x2 block
model.add(MaxPooling2D(pool_size=(2, 2)))
# Dropout to reduce overfitting
model.add(Dropout(0.25))
# Flatten the result to 1D to feed it into an MLP
model.add(Flatten())
# Hidden layer of 128 neurons with relu activation
model.add(Dense(128, activation='relu'))
# Another dropout
model.add(Dropout(0.5))
# Output layer: 10-class categorization with softmax activation
model.add(Dense(10, activation='softmax'))



Let's check how our model has been built.

In [9]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 64)        18496     
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 12, 12, 64)        0         
_________________________________________________________________
dropout (Dropout)            (None, 12, 12, 64)        0         
_________________________________________________________________
flatten (Flatten)            (None, 9216)              0         
_________________________________________________________________
dense (Dense)                (None, 128)               1179776   
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0
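
The output shapes and parameter counts in the summary can be checked by hand. The sketch below uses hypothetical helper functions and assumes what the model definition implies: no padding, stride 1, and one bias per filter or unit. The last line is computed from the `Dense(10)` layer in the model code.

```python
def conv_output_side(side, kernel):
    """Side length after a convolution with no padding and stride 1."""
    return side - kernel + 1


def conv_params(kernel, in_channels, filters):
    """Weights per filter (kernel * kernel * in_channels) plus one bias, times filters."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input plus one bias, for every unit."""
    return (inputs + 1) * units


side = conv_output_side(28, 3)        # 26
conv1 = conv_params(3, 1, 32)         # 320
side = conv_output_side(side, 3)      # 24
conv2 = conv_params(3, 32, 64)        # 18496
side = side // 2                      # 12 after 2x2 max pooling
flattened = side * side * 64          # 9216
hidden = dense_params(flattened, 128) # 1179776
output = dense_params(128, 10)        # 1290, from Dense(10) in the model code

print(conv1, conv2, flattened, hidden, output)
```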

Since we are dealing with multi-class data, we will use categorical_crossentropy as the loss function together with the RMSprop optimizer. (We can try Adam as well.)

In [0]:
model.compile(loss='categorical_crossentropy', optimizer='RMSprop', metrics=['accuracy'])
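
For intuition about the loss, here is a minimal plain-Python sketch of softmax followed by categorical cross-entropy for one sample. The function names and the example scores are made up for illustration, and the `epsilon` guard against log(0) is an assumption of this sketch rather than a description of Keras internals.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_label, probabilities, epsilon=1e-7):
    """Negative log of the probability assigned to the true class."""
    return -sum(
        t * math.log(max(p, epsilon))
        for t, p in zip(one_hot_label, probabilities)
    )


label = [0.0] * 10
label[3] = 1.0  # the true digit is 3

uniform = [0.1] * 10                                # the model has no idea
confident = softmax([0.0] * 3 + [5.0] + [0.0] * 6)  # the model favors 3

print(categorical_crossentropy(label, uniform))    # about 2.3026, i.e. ln(10)
print(categorical_crossentropy(label, confident))  # much smaller
```

The loss shrinks as the predicted probability of the correct digit approaches 1.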

Let's train our model. We will use a batch size of 32 and only 10 epochs, as training takes a long time to execute.

In [12]:
hist = model.fit(train_images, train_labels, batch_size=32, epochs=10, verbose=2, validation_data=(test_images, test_labels))

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
60000/60000 - 157s - loss: 0.1853 - acc: 0.9452 - val_loss: 0.0602 - val_acc: 0.9798
Epoch 2/10
60000/60000 - 158s - loss: 0.0875 - acc: 0.9741 - val_loss: 0.0527 - val_acc: 0.9826
Epoch 3/10
60000/60000 - 158s - loss: 0.0777 - acc: 0.9779 - val_loss: 0.0509 - val_acc: 0.9853
Epoch 4/10
60000/60000 - 158s - loss: 0.0783 - acc: 0.9782 - val_loss: 0.0591 - val_acc: 0.9823
Epoch 5/10
60000/60000 - 156s - loss: 0.0793 - acc: 0.9784 - val_loss: 0.0477 - val_acc: 0.9865
Epoch 6/10
60000/60000 - 156s - loss: 0.0845 - acc: 0.9766 - val_loss: 0.0551 - val_acc: 0.9827
Epoch 7/10
60000/60000 - 158s - loss: 0.0859 - acc: 0.9777 - val_loss: 0.0506 - val_acc: 0.9850
Epoch 8/10
60000/60000 - 157s - loss: 0.0913 - acc: 0.9757 - val_loss: 0.0567 - val_acc: 0.9824
Epoch 9/10
60000/60000 - 157s - loss: 0.0908 - acc: 0.9761 - val_loss: 0.1215 - val_acc: 0.9843
Epoch 10/10
60000/60000 - 157s - loss: 0.0960 - acc: 0.9754 - val_loss: 0.0693 - val_a

Let's evaluate the model on the test data.

In [13]:
score = model.evaluate(test_images, test_labels, verbose=0)
print('Test Loss: ', score[0])
print('Test Accuracy: ', score[1])

Test Loss:  0.06928741020411253
Test Accuracy:  0.9807
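
The accuracy figure is the fraction of samples whose highest-probability class matches the true class. A small plain-Python sketch of that calculation is given here; the helper names and the three-class toy data are invented for illustration and are not taken from the model's actual predictions.

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predicted_probabilities, one_hot_labels):
    """Fraction of samples where the predicted class equals the true class."""
    correct = sum(
        argmax(p) == argmax(t)
        for p, t in zip(predicted_probabilities, one_hot_labels)
    )
    return correct / len(one_hot_labels)


# Toy data: four samples, three classes
predictions = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
labels = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
]

print(accuracy(predictions, labels))  # 0.75
```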


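Let's check by using the Adam optimizer. Note that calling compile again does not reset the model's weights, so this run continues training from the weights learned with RMSprop rather than starting from scratch.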

In [0]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

In [17]:
hist = model.fit(train_images, train_labels, batch_size=32, epochs=10, verbose=2, validation_data=(test_images, test_labels))

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
60000/60000 - 152s - loss: 0.0880 - acc: 0.9762 - val_loss: 0.0428 - val_acc: 0.9882
Epoch 2/10
60000/60000 - 154s - loss: 0.0692 - acc: 0.9812 - val_loss: 0.0397 - val_acc: 0.9885
Epoch 3/10
60000/60000 - 154s - loss: 0.0608 - acc: 0.9828 - val_loss: 0.0394 - val_acc: 0.9888
Epoch 4/10
60000/60000 - 153s - loss: 0.0496 - acc: 0.9864 - val_loss: 0.0373 - val_acc: 0.9899
Epoch 5/10
60000/60000 - 155s - loss: 0.0457 - acc: 0.9869 - val_loss: 0.0382 - val_acc: 0.9901
Epoch 6/10
60000/60000 - 153s - loss: 0.0410 - acc: 0.9887 - val_loss: 0.0363 - val_acc: 0.9905
Epoch 7/10
60000/60000 - 154s - loss: 0.0366 - acc: 0.9894 - val_loss: 0.0359 - val_acc: 0.9906
Epoch 8/10
60000/60000 - 153s - loss: 0.0325 - acc: 0.9906 - val_loss: 0.0306 - val_acc: 0.9905
Epoch 9/10
60000/60000 - 153s - loss: 0.0329 - acc: 0.9902 - val_loss: 0.0319 - val_acc: 0.9897
Epoch 10/10
60000/60000 - 153s - loss: 0.0294 - acc: 0.9914 - val_loss: 0.0329 - val_a

In [18]:
score = model.evaluate(test_images, test_labels, verbose=0)
print('Test Loss: ', score[0])
print('Test Accuracy: ', score[1])

Test Loss:  0.03287565022604867
Test Accuracy:  0.9907
