# MNIST Digit Recognition

In this notebook we will look at the MNIST digits dataset and create a Convolutional Neural Network.

Each code block is preceded by a short description of its purpose.




The following block sets some environment variables to limit warnings: it hides any GPUs from TensorFlow and reduces the amount of TensorFlow log output.

In [None]:
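# %env does not strip quotes, so the values are written without them.
# -1 hides all GPUs; 3 is the highest documented TensorFlow log filter level.
%env CUDA_VISIBLE_DEVICES=-1
%env TF_CPP_MIN_LOG_LEVEL=3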

We import the Python libraries we need:
* NumPy
* Matplotlib
* TensorFlow
* Keras (bundled with TensorFlow)

In [None]:

import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical



We load the MNIST data, which is available through Keras and comes already split into training and test sets.

In [None]:

(x_train, y_train), (x_test, y_test) = mnist.load_data()

We enable matplotlib to show images in this notebook.

In [None]:
%matplotlib inline


We show the image at index 35 (the 36th image, since indexing starts at 0) together with its label.

In [None]:
image_index = 35
print(y_train[image_index])
plt.imshow(x_train[image_index], cmap='Greys')
plt.show()

We print the shapes of the training and test sets: the number of images and their height and width in pixels.

In [None]:
print(x_train.shape)
print(x_test.shape)

We show which digits are at the beginning of the training set, up to and including the digit at index 35.

In [None]:
print(y_train[:image_index + 1])

We store the image dimensions and then reshape the training and test images to add a channel dimension of size 1, since the convolutional layers expect input of shape (rows, columns, channels).

In [None]:
# save input image dimensions
img_rows, img_cols = 28, 28

x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
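
The reshape only appends a channel axis of length 1; the pixel values themselves are untouched. A minimal sketch with a small synthetic array (the `images` array is a hypothetical stand-in for `x_train`):

```python
import numpy as np

# Hypothetical stand-in for x_train: 5 blank 28x28 grayscale images.
images = np.zeros((5, 28, 28), dtype=np.uint8)

# Append a channel axis of length 1; the data itself is unchanged.
images_with_channel = images.reshape(images.shape[0], 28, 28, 1)

print(images.shape)               # (5, 28, 28)
print(images_with_channel.shape)  # (5, 28, 28, 1)
```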


As all values in images are between 0 and 255 and we want to work with numbers between 0 and 1, we scale all values by dividing by 255.

In [None]:

x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
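
To see what the scaling does, here is a small sketch on a few hand-picked pixel values (the `pixels` array is made up for illustration):

```python
import numpy as np

# Hypothetical pixel values covering the full uint8 range.
pixels = np.array([0, 51, 255], dtype=np.uint8)

# Convert to float first, then divide so the result lies in [0, 1].
scaled = pixels.astype('float32') / 255

print(scaled.dtype)  # float32
print(scaled)
```

Converting to float before dividing keeps the result as 32-bit floats, which is the type the network works with.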

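We convert the y values (the labels of the digits) to a one-hot encoded categorical representation: each label becomes a vector of 10 values with a 1 at the position of the digit.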

In [None]:
num_classes = 10

y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)
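
The encoding can be illustrated with plain NumPy. This is only a sketch of the result for integer labels, not how `to_categorical` is implemented, and the `labels` array is a hypothetical example:

```python
import numpy as np

num_classes = 10
labels = np.array([5, 0, 4])  # hypothetical example labels

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype='float32')[labels]

print(one_hot[0])              # 1.0 at index 5, zeros elsewhere
print(one_hot.argmax(axis=1))  # recovers the original labels
```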

We import the model class and the different kinds of layers we will use from Keras.

In [None]:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D


We create a sequential model (i.e. a linear stack of layers).
As the first layer we add a Conv2D layer with 32 filters of 3 by 3 pixels.
As activation we use 'relu' (rectified linear unit).

In [None]:

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
     activation='relu',
     input_shape=(img_rows, img_cols, 1)))
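
As an illustration of the activation, a rectified linear unit can be written in one line of NumPy. The `relu` helper and the sample values below are only for demonstration:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: keep positive values, replace negatives with 0."""
    return np.maximum(x, 0)

values = np.array([-2.0, -0.5, 0.0, 1.5])  # hypothetical layer outputs
print(relu(values))
```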

We add another Conv2D layer and a pooling layer:

In [None]:
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

We add a Dropout layer to randomly drop 25% of the units to prevent overfitting.

In [None]:
model.add(Dropout(0.25))
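
The idea behind dropout can be sketched in NumPy. This assumes the usual "inverted dropout" scheme, in which the units that survive are scaled up by 1 / (1 - rate); the `dropout` helper is hypothetical and only mimics what happens during training:

```python
import numpy as np

def dropout(x, rate, rng):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(seed=0)
activations = np.ones(10_000)  # hypothetical activations
dropped = dropout(activations, rate=0.25, rng=rng)

print((dropped == 0).mean())  # close to 0.25
print(dropped.mean())         # close to 1.0, because survivors are scaled up
```

At inference time no units are dropped, so the layer passes its input through unchanged.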

We convert the previous hidden layer into a 1D array using a Flatten layer.

In [None]:
model.add(Flatten())
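
The size of the flattened array follows from the layers added so far. The sketch below traces the image size by hand, assuming the Keras defaults (no padding, stride 1 for the convolutions, and non-overlapping pooling windows); the helper functions are hypothetical:

```python
def conv_output(size, kernel):
    """Output length of a stride-1 convolution without padding."""
    return size - kernel + 1

def pool_output(size, pool):
    """Output length of non-overlapping pooling."""
    return size // pool

size = 28
size = conv_output(size, 3)   # first Conv2D:  28 -> 26
size = conv_output(size, 3)   # second Conv2D: 26 -> 24
size = pool_output(size, 2)   # MaxPooling2D:  24 -> 12

flattened = size * size * 64  # 64 feature maps from the second Conv2D
print(size, flattened)        # 12 9216
```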

We create a Dense layer with 128 units. This layer is similar to a traditional neural network layer (fully connected: every input is connected to every unit).

We again add a Dropout layer, this time dropping 50% of the units.

And then another Dense layer which does the final classification into 10 classes.

In [None]:
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
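
The 'softmax' activation of the last layer turns the 10 raw scores into probabilities that sum to 1. A minimal NumPy sketch, with a made-up `scores` array and only three classes to keep it short:

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

scores = np.array([1.0, 2.0, 3.0])  # hypothetical raw outputs
probs = softmax(scores)
print(probs, probs.sum())
```

The largest score gets the largest probability, so the predicted class is the index of the maximum.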

Print a summary of the model (layers, output shapes, and parameter counts):

In [None]:
model.summary()
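
The parameter counts in the summary can be checked by hand. The sketch below assumes every layer has one bias per filter or unit; the helper functions and the dictionary keys are my own labels, not the layer names Keras prints:

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights per filter plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return (inputs + 1) * units

counts = {
    "first_conv": conv2d_params(3, 3, 1, 32),
    "second_conv": conv2d_params(3, 3, 32, 64),
    "hidden_dense": dense_params(12 * 12 * 64, 128),
    "output_dense": dense_params(128, 10),
}
total = sum(counts.values())
print(counts)
print(total)  # 1199882
```

Almost all parameters sit in the first Dense layer, because it connects all 9216 flattened values to each of its 128 units.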

Import a utility that draws the model as a diagram:

In [None]:
from tensorflow.keras.utils import plot_model


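Use the imported plot_model to draw the created model and save the diagram as model_plot.png. This requires the pydot package and Graphviz to be installed.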

In [None]:
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

With the library 'visualkeras' we can visualize the model:

In [None]:
import visualkeras
visualkeras.layered_view(model)

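We can also add a legend showing the types of layers, first with the default font and then with a selected font. Loading arial.ttf assumes that this font is available on the system.

In [None]:
from PIL import ImageFont

visualkeras.layered_view(model, legend=True)  # default font

# "arial.ttf" must be installed; substitute another TrueType font if it is not
font = ImageFont.truetype("arial.ttf", 12)
visualkeras.layered_view(model, legend=True, font=font)  # selected font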

Or as a flat view, without the 3D effect:

In [None]:
visualkeras.layered_view(model, legend=True, font=font, draw_volume=False)


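We compile the model with the categorical cross-entropy loss, the Adam optimizer, and accuracy as the metric. After this step the model is ready for training.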

In [None]:
model.compile(loss='categorical_crossentropy',
      optimizer='adam',
      metrics=['accuracy'])
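
The loss compares the predicted probabilities with the one-hot labels. A NumPy sketch of the formula, averaged over the samples; the `categorical_crossentropy` helper, the `eps` clipping value, and the example arrays are assumptions for illustration:

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -sum(y_true * log(y_pred)) over the samples."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Hypothetical one-hot labels and predicted probabilities: 2 samples, 3 classes.
y_true = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1], [0.5, 0.3, 0.2]])

loss = categorical_crossentropy(y_true, y_pred)
print(loss)
```

Only the probability assigned to the correct class contributes, so a confident correct prediction gives a loss close to 0.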

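We train the model for 10 epochs with batches of 128 images, using the test set as validation data. Afterwards we evaluate the model on the test set, print the loss and accuracy, and save the model as test_model.h5.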

In [None]:
batch_size = 128
epochs = 10

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
model.save("test_model.h5")
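
To get a feeling for these settings, the number of weight updates can be computed by hand. The sketch assumes the standard MNIST training set of 60,000 images:

```python
import math

num_samples = 60_000   # size of the MNIST training set
batch_size = 128
epochs = 10

steps_per_epoch = math.ceil(num_samples / batch_size)  # the last batch is smaller
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 469
print(total_updates)    # 4690
```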

Here we load an image of our own (digit.jpg) to test the model on.

In [None]:
import imageio
#import numpy as np
#from matplotlib import pyplot as plt

im = imageio.imread("digit.jpg")

#im = imageio.imread("https://i.imgur.com/a3Rql9C.png")

Scale the image to the right dimensions and show the scaled image.

In [None]:
from skimage.transform import resize

im = resize(im, (img_rows, img_cols))
plt.imshow(im)
plt.show()

Let's convert the image to grayscale using a weighted sum of the red, green and blue channels.

In [None]:
gray = np.dot(im[...,:3], [0.299, 0.587, 0.114])

plt.imshow(gray, cmap = plt.get_cmap('gray'))
plt.show()
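
The weighted sum can be checked on a tiny synthetic image. The `rgb` array below is a hypothetical 1 by 3 image with values in [0, 1]:

```python
import numpy as np

# Hypothetical 1x3 RGB image: pure red, pure green, white.
rgb = np.array([[[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [1.0, 1.0, 1.0]]])

weights = [0.299, 0.587, 0.114]  # same weights as in the cell above
gray_demo = np.dot(rgb[..., :3], weights)

print(gray_demo.shape)  # (1, 3)
print(gray_demo)
```

The weights sum to 1, so a white pixel stays at full brightness, while green contributes more to the perceived brightness than red or blue.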

Reshape the output to make sure it is the right shape to go into our model to classify.

In [None]:
# reshape the image to (batch, rows, cols, channels)
gray = gray.reshape(1, img_rows, img_cols, 1)

# No division by 255 is needed here: skimage's resize already returned floats in [0, 1]

Previously we saved the model as test_model.h5. We now read it back in and use it for inference.

This is done by calling the predict function of the model on our grayscale image.

In [None]:
# load the model
from tensorflow.keras.models import load_model
model = load_model("test_model.h5")

# predict digit
prediction = model.predict(gray)
print(prediction.argmax())
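
The prediction is an array with one probability per digit, and argmax picks the index of the largest one. A sketch with a made-up output (the `prediction_demo` values are hypothetical, not produced by the model):

```python
import numpy as np

# Hypothetical model output for one image: one probability per digit 0-9.
prediction_demo = np.array([[0.01, 0.02, 0.01, 0.05, 0.01,
                             0.02, 0.01, 0.80, 0.04, 0.03]])

digit = int(prediction_demo.argmax())      # index of the largest probability
confidence = float(prediction_demo.max())  # the probability itself

print(digit, confidence)  # 7 0.8
```

Printing the probability next to the digit makes it easier to spot predictions the model is unsure about.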

To see this in action live, go to [MNIST-Draw](https://mco-mnist-draw-rwpxka3zaa-ue.a.run.app/)