![Inspire logo](../images/inspire_logo.png "Inspire logo")

# Inspire | Summer 2021
## CNN for handwritten number recognition

Using this notebook, we will train a convolutional neural network (CNN) to recognise handwritten numbers.

# Get access to Python software packages
These are software packages that have already been installed on the computer. Here we import the packages so that we can use their functions in our code.

In [2]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split

from tensorflow.keras.datasets import mnist
from tensorflow import keras
from tensorflow.keras import layers

pd.set_option("display.max_columns", 28)  # full option name; the short form is ambiguous in newer pandas

# Functions
These functions are used later on in the code. You do not need to read or understand these functions.

In [3]:
def load_data():
    """Load the dataset and split it into training, validation and test datasets."""
    validation_split = 0.1
    (x_train_val, y_train_val), (x_test, y_test) = mnist.load_data()
    x_train, x_val, y_train, y_val = train_test_split(
        x_train_val, y_train_val, test_size=validation_split, 
        stratify=y_train_val, random_state=7)
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)

In [4]:
def show_image(x, y, i):
    """Show image i from a dataset of samples, x, and labels, y."""
    plt.subplots(figsize=(10, 10))
    plt.imshow(x[i], cmap=plt.get_cmap("binary"))
    plt.title(f"A handwritten number {y[i]}")
    plt.xticks(ticks=range(28))
    plt.yticks(ticks=range(28))
    
    
def show_multi_images(x, y, i):
    """Show 25 images from a dataset of samples, x, and labels, y,
    starting with image i."""
    n = 25
    assert i+n <= x.shape[0], f"i must be less than {x.shape[0]-n+1}."

    print(f"Data samples from {i} to {i+n-1}:")
    plt.subplots(figsize=(10, 10))
    for j in range(n):
        plt.subplot(5, 5, j+1)
        plt.imshow(x[i+j], cmap=plt.get_cmap("binary"))
        plt.title(f"Number {y[i+j]}")
        plt.xticks(ticks=[])
        plt.yticks(ticks=[])

In [5]:
def scale_and_label_data(x_train, y_train, x_val, y_val, x_test, y_test):
    """Return the data rescaled and with one-hot encoded labels. """
    # Rescale the matrices of numbers so that they are 0 to 1 instead of 0 to 255.
    x_train = x_train.astype("float32") / 255
    x_val = x_val.astype("float32") / 255
    x_test = x_test.astype("float32") / 255

    # Convert each label from a number from 0 to 9 to a 1x10 vector of 0s and 1s
    num_classes = 10
    y_train = keras.utils.to_categorical(y_train, num_classes)
    y_val = keras.utils.to_categorical(y_val, num_classes)
    y_test = keras.utils.to_categorical(y_test, num_classes)
    
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)


def reshape_data_for_cnn(x_train, x_val, x_test):
    """Return the data reshaped for input to a CNN model. """
    # Reshape the datasets from (n, 28, 28) to (n, 28, 28, 1)
    x_train = np.expand_dims(x_train, -1)
    x_val = np.expand_dims(x_val, -1)
    x_test = np.expand_dims(x_test, -1)
    return x_train, x_val, x_test


def prepare_data_for_cnn(x_train, y_train, x_val, y_val, x_test, y_test):
    """Return the data rescaled and reshaped ready for input to a CNN model. """
    # Rescale the matrices of numbers so that they are 0 to 1 instead of 0 to 255.
    # Convert each label from a number from 0 to 9 to a 1x10 vector of 0s and 1s
    (x_train, y_train), (x_val, y_val), (x_test, y_test) = scale_and_label_data(
        x_train, y_train, x_val, y_val, x_test, y_test)

    # Reshape the matrices from (n, 28, 28) to (n, 28, 28, 1)
    x_train, x_val, x_test = reshape_data_for_cnn(x_train, x_val, x_test)
    
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)

In [6]:
def plot_metric(hist):
    """Plot the metrics that were recorded in the log during model training """
    log = pd.DataFrame(hist.history) 
    ax = log.plot(title='Training')
    ax.set_xlabel("Model training epoch")

# Load the data
Load data samples, x, and their labels, y. The dataset is split into three smaller datasets, each used at a different stage of model development:
+ training (fitting the model): x_train, y_train
+ validation (testing the model during model development): x_val, y_val
+ testing (a final test once model development is complete): x_test, y_test

Each data sample is a different handwritten number. Each sample has a label, which tells us what that handwritten number is supposed to be. Each label will be an integer (whole number) between 0 and 9.
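
To illustrate the idea of splitting off a validation dataset, here is a minimal sketch on made-up labels. The helper `stratified_split` is hypothetical and is only a simplified NumPy stand-in for the `train_test_split(..., stratify=...)` call used in `load_data`; it assumes we want every digit to appear in the same proportion in both parts.

```python
import numpy as np


def stratified_split(labels, val_fraction=0.1, seed=7):
    """Return (train_idx, val_idx) with roughly the same class mix in both.

    Hypothetical helper for illustration only, not the notebook's method.
    """
    rng = np.random.default_rng(seed)
    train_idx, val_idx = [], []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)  # positions of this class
        rng.shuffle(idx)                     # shuffle them in place
        n_val = int(round(len(idx) * val_fraction))
        val_idx.extend(idx[:n_val])          # first part goes to validation
        train_idx.extend(idx[n_val:])        # the rest goes to training
    return np.array(train_idx), np.array(val_idx)


# Toy labels: 100 samples of each digit 0 to 9 (made-up data, not MNIST).
toy_labels = np.repeat(np.arange(10), 100)
train_idx, val_idx = stratified_split(toy_labels)

print(len(train_idx), len(val_idx))      # 900 100
print(np.bincount(toy_labels[val_idx]))  # 10 of each digit
```

With a validation fraction of 0.1, one tenth of each digit's samples ends up in the validation part, which matches the 54000 / 6000 split of the 60000 training images reported below.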

In [7]:
(x_train, y_train), (x_val, y_val), (x_test, y_test) = load_data()

In [8]:
print(f"There are {x_train.shape[0]} samples and {y_train.shape[0]} labels in the training dataset.")
print(f"Each data sample in the training dataset is an image that is {x_train.shape[1]} pixels by {x_train.shape[2]} pixels.")
print(f"There are {x_val.shape[0]} samples and labels in the validation dataset.")
print(f"There are {x_test.shape[0]} samples and labels in the test dataset.")

There are 54000 samples and 54000 labels in the training dataset.
Each data sample in the training dataset is an image that is 28 pixels by 28 pixels.
There are 6000 samples and labels in the validation dataset.
There are 10000 samples and labels in the test dataset.


# Build a machine learning model

In [9]:
num_classes = 10
input_shape = (28, 28, 1)

## Prepare the data
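This changes the data to the size and scale that the model requires: pixel values are rescaled from 0-255 to 0-1, each label becomes a vector of ten 0s and 1s, and each image gains a channel dimension so its shape is (28, 28, 1).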
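
As a rough illustration of what the preparation step does, the sketch below applies the same three operations to a tiny made-up array. The toy arrays `x` and `y` are assumptions for the example, and `np.eye(10)[y]` is used here as a NumPy stand-in for `keras.utils.to_categorical`.

```python
import numpy as np

# Toy stand-in for the images: 2 "images" of 28x28 pixels with values 0-255.
x = np.zeros((2, 28, 28), dtype="uint8")
x[0, 5, 5] = 255                            # one fully dark pixel
y = np.array([3, 7])                        # toy integer labels

x_scaled = x.astype("float32") / 255        # pixel values now between 0 and 1
x_cnn = np.expand_dims(x_scaled, -1)        # add a channel axis at the end
y_one_hot = np.eye(10, dtype="float32")[y]  # row 3 and row 7 of a 10x10 identity

print(x_cnn.shape)    # (2, 28, 28, 1)
print(x_cnn.max())    # 1.0
print(y_one_hot[0])   # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```

The label 3 becomes a vector with a single 1 in position 3, which is the form the model's ten outputs are compared against.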

In [10]:
(x_train, y_train), (x_val, y_val), (x_test, y_test) = prepare_data_for_cnn(
    x_train, y_train, x_val, y_val, x_test, y_test)

## Build the model

In [11]:
model = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
)

In [12]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 1600)              0         
_________________________________________________________________
dropout (Dropout)            (None, 1600)              0         
_________________________________________________________________
dense (Dense)                (None, 10)                16010     
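
The output shapes and parameter counts in the summary can be worked out by hand. The sketch below does the arithmetic, assuming the Keras defaults used by this model (no padding and a stride of 1 for `Conv2D`, and 2x2 pooling that rounds down). The helper names `conv_output_size` and `conv_params` are made up for this example.

```python
def conv_output_size(size, kernel):
    """Output width/height of a convolution with no padding and stride 1."""
    return size - kernel + 1


def conv_params(kernel, in_channels, filters):
    """Weights (kernel x kernel x in_channels per filter) plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


size = 28
size = conv_output_size(size, 3)  # 26 after the first Conv2D
size = size // 2                  # 13 after 2x2 max pooling
size = conv_output_size(size, 3)  # 11 after the second Conv2D
size = size // 2                  # 5 after 2x2 max pooling (11 / 2 rounds down)
flat = size * size * 64           # 1600 values after Flatten

params = {
    "conv2d": conv_params(3, 1, 32),     # 320
    "conv2d_1": conv_params(3, 32, 64),  # 18496
    "dense": flat * 10 + 10,             # 16010 (1600 weights per class + 10 biases)
}
print(flat, params, sum(params.values()))
```

The pooling, flatten and dropout layers have no parameters, so the three counts above add up to the 34826 trainable parameters of the whole model.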

## Train the model

In [13]:
batch_size = 128
epochs = 3

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

model.fit(x_train, y_train, batch_size=batch_size, 
          epochs=epochs, validation_data=(x_val, y_val))


Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x7f056c3c9100>
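
During training the model tries to make the `categorical_crossentropy` loss as small as possible. As a rough guide to what that number means, here is a minimal NumPy sketch of the calculation. The function below is a simplified stand-in written for this example, not the Keras implementation, and the predicted probabilities are made up.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid taking log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))


y_true = np.eye(10)[[3]]            # one sample whose correct label is 3

confident = np.full((1, 10), 0.01)  # 1% on each wrong digit...
confident[0, 3] = 0.91              # ...and 91% on the correct digit
unsure = np.full((1, 10), 0.1)      # 10% on every digit (a pure guess)

print(categorical_crossentropy(y_true, confident))  # about 0.094
print(categorical_crossentropy(y_true, unsure))     # about 2.303
```

A loss near 2.3 means the model is doing no better than guessing among ten digits, while a loss close to 0 means it puts almost all of its probability on the correct digit.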

# Validation: How good is the trained model?

In [14]:
score = model.evaluate(x_val, y_val, verbose=0)
print("Testing using the validation dataset:")
print("Loss:", score[0])
print("Accuracy:", score[1])

Testing using the validation dataset:
Loss: 0.05685357004404068
Accuracy: 0.981333315372467
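
Accuracy is the fraction of samples for which the model's most likely class matches the label. A minimal sketch of that calculation, using made-up probabilities for 4 samples and 3 classes rather than real model output:

```python
import numpy as np

# Made-up predicted probabilities: one row per sample, one column per class.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
])
y_true = np.eye(3)[[0, 1, 2, 2]]  # one-hot labels for classes 0, 1, 2, 2

predicted = probs.argmax(axis=1)  # most likely class per sample: [0 1 1 2]
actual = y_true.argmax(axis=1)    # class stored in each one-hot label: [0 1 2 2]
accuracy = float(np.mean(predicted == actual))
print(accuracy)                   # 0.75
```

Three of the four toy predictions are correct, giving 0.75. By the same reasoning, the validation accuracy of about 0.981 above corresponds to roughly 98 correct predictions in every 100 validation images.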


# Final test: How good is the final model?

In [15]:
do_final_test = False  # INSPIRE: when model development is complete, change this to True

if do_final_test:
    score = model.evaluate(x_test, y_test, verbose=0)
    print("Testing using the test dataset:")
    print("Test loss:", score[0])
    print("Test accuracy:", score[1])