In [1]:
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

In [2]:
num_classes = 10 # Ten output classes, one for each digit 0-9
input_shape = (28, 28, 1) # Each image is 28 x 28 pixels with a single grayscale channel

# Load the training and test data
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

Each image we load is grayscale.

Every pixel is an integer in the range 0 to 255, where 0 is black and 255 is white.

To make the values easier to work with, we divide each element by 255, which scales them down to [0, 1].

![Matrix visualised](./images/MNIST-Matrix.png "28x28 Matrix with scaled down integers")
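
As a small illustration of the scaling step, here is a sketch on a made-up 2 x 2 patch. The pixel values are hypothetical and are not taken from MNIST.

```python
import numpy as np

# Hypothetical 2 x 2 patch of raw 8-bit grayscale values (not real MNIST data)
patch = np.array([[0, 51], [204, 255]], dtype=np.uint8)

# Cast to float first, then divide, so the result lies in [0, 1]
scaled = patch.astype("float32") / 255
print(scaled)
```

Casting before dividing keeps the result in `float32`, which is the type the model works with during training.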

In [3]:
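# Scale pixel values from [0, 255] down to [0, 1], as described above
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255

# Add a trailing channel axis so each image has shape (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)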

print("x_train shape:", x_train.shape)
print(x_train.shape[0], "train samples")
print(x_test.shape[0], "test samples")

# Convert the integer labels (0-9) to one-hot vectors of length num_classes
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
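
The `to_categorical` call in the cell above turns each integer label into a one-hot vector. The sketch below reproduces the idea with plain NumPy. The three labels are hypothetical, and this is an illustration of the encoding rather than how Keras implements it.

```python
import numpy as np

num_classes = 10
labels = np.array([5, 0, 9])  # hypothetical digit labels

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype="float32")[labels]
print(one_hot[0])  # a single 1.0 at index 5, zeros elsewhere
```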


In [4]:
# Building the model
model = keras.Sequential(
    [
        keras.Input(shape=input_shape), # input shape is 28 x 28 x 1 (a 2D image with one channel)
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
)

Explaining the previous cell

First, we define the input shape of the model.

Next, each convolution layer sets the number of filters (32, then 64), the kernel size (3 x 3) and the activation function. ReLU is the most common choice, but you may also use tanh.

![RelU vs TanH](./images/relu.jpeg "RelU vs TanH")
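
To see the difference between the two activations numerically, here is a minimal NumPy sketch on a few hand-picked inputs (the input values are arbitrary).

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])  # arbitrary sample inputs

relu = np.maximum(x, 0.0)  # negative inputs become 0, positive inputs pass through
tanh = np.tanh(x)          # every input is squashed into the open interval (-1, 1)

print("relu:", relu)
print("tanh:", tanh)
```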

Then we define the pooling size, in our case 2 x 2. Max pooling keeps only the largest value in each 2 x 2 window, which halves the width and height of the feature map.
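
A minimal NumPy sketch of 2 x 2 max pooling on a single-channel image follows. The helper `max_pool_2x2` and the 4 x 4 input are made up for illustration, and this is not how Keras implements the layer internally.

```python
import numpy as np

def max_pool_2x2(image):
    """Keep the largest value in each non-overlapping 2 x 2 window."""
    h, w = image.shape
    h, w = h - h % 2, w - w % 2  # drop a trailing odd row or column
    blocks = image[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# Hypothetical 4 x 4 feature map
image = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 1, 8],
])
pooled = max_pool_2x2(image)
print(pooled)
```

Each 2 x 2 block of the input collapses to its maximum, so the 4 x 4 input becomes a 2 x 2 output.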

Then we repeat the convolution and pooling steps, since we have chosen two conv layers, and flatten the result into a single vector.
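
The sizes of the intermediate feature maps can be worked out by hand. The sketch below assumes the Keras defaults used in the model cell (no padding and stride 1 for the convolutions, non-overlapping windows for the pooling). The helper names are made up for illustration.

```python
def conv_out(size, kernel=3):
    # No padding, stride 1: the kernel fits in (size - kernel + 1) positions
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping windows: any leftover row or column is dropped
    return size // pool

size = 28
sizes = [size]
for step in (conv_out, pool_out, conv_out, pool_out):
    size = step(size)
    sizes.append(size)

flat = size * size * 64  # 64 filters in the second conv layer
print(sizes, flat)
```

The side length goes from 28 to 26, 13, 11 and finally 5, so the flattened vector has 5 * 5 * 64 = 1600 elements.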

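Then, to reduce the risk of overfitting, we set the dropout rate to 0.5, i.e. during training each input to the layer is randomly set to zero with probability 0.5, which adds noise. Dropout is switched off when the model is evaluated or used for prediction.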
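
Here is a minimal NumPy sketch of what dropout does to a vector during training. The function `dropout` is hypothetical and written only for illustration. It rescales the surviving values by 1 / (1 - rate), as the Keras layer does, so that the expected sum stays the same.

```python
import numpy as np

def dropout(x, rate, rng):
    """Zero each element with probability `rate` and rescale the survivors."""
    keep = rng.random(x.shape) >= rate
    return np.where(keep, x / (1.0 - rate), 0.0)

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
x = np.ones(8)
out = dropout(x, 0.5, rng)
print(out)  # every entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```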

Finally, a dense layer with softmax activation gives the output as an array of probabilities, one per class.
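
A minimal NumPy sketch of softmax on three made-up scores shows how raw outputs become a probability array. The scores are hypothetical, and the real model produces ten values rather than three.

```python
import numpy as np

def softmax(logits):
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

logits = np.array([1.0, 2.0, 0.5])  # hypothetical raw scores
probs = softmax(logits)
predicted = int(np.argmax(probs))   # index of the most likely class

print(probs, predicted)
```

The outputs are all positive and sum to 1, and the largest score gets the largest probability.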

In [5]:
# Gives a summary of the designed model
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 26, 26, 32)        320       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 13, 13, 32)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 11, 11, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 5, 5, 64)         0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 1600)              0         
                                                                 
 dropout (Dropout)           (None, 1600)              0
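
The parameter counts for the two conv layers in the summary can be checked by hand: each filter has one weight per kernel position per input channel, plus one bias. The helper `conv2d_params` is made up for this check.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # Weights per filter, plus one bias per filter
    return (kernel_h * kernel_w * in_channels + 1) * filters

first = conv2d_params(3, 3, 1, 32)    # conv2d: 1 input channel, 32 filters
second = conv2d_params(3, 3, 32, 64)  # conv2d_1: 32 input channels, 64 filters
print(first, second)
```

Pooling, flatten and dropout layers have no trainable weights, which is why their Param # is 0.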

In [6]:
# Now we train the model

batch_size = 128
epochs = 15

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<keras.callbacks.History at 0x23f3aedc2e0>
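
The loss named in `model.compile` is categorical cross-entropy. Below is a minimal NumPy sketch for a single sample, with made-up probability vectors. The function here is hypothetical and only illustrates the formula. Keras additionally averages the loss over the batch.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(-np.sum(y_true * np.log(y_pred)))

y_true = np.array([0.0, 1.0, 0.0])        # one-hot label: class 1
confident = np.array([0.05, 0.90, 0.05])  # hypothetical good prediction
unsure = np.array([0.40, 0.30, 0.30])     # hypothetical poor prediction

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

The loss is small when the model puts high probability on the correct class and grows as that probability shrinks.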

In [8]:
# Testing with test data
score = model.evaluate(x_test, y_test, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])


Test loss: 0.034038905054330826
Test accuracy: 0.9901000261306763
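
The accuracy metric is the fraction of samples whose most likely predicted class matches the label. A minimal NumPy sketch on made-up predictions follows. The arrays are hypothetical and use 3 classes instead of 10 to keep the example short.

```python
import numpy as np

# Hypothetical predicted probabilities for 4 samples over 3 classes
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
# Hypothetical one-hot labels for the same 4 samples
labels = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
])

predicted = np.argmax(probs, axis=1)  # most likely class per sample
actual = np.argmax(labels, axis=1)    # class index encoded by each label
accuracy = float(np.mean(predicted == actual))
print(accuracy)
```

Three of the four made-up predictions match their labels, so the accuracy here is 0.75.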
