In this notebook, we construct a traditional convolutional neural network to classify the MNIST data set.

In [None]:
%tensorflow_version 2.x

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D, Input
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
import numpy as np

The preprocessing steps below are the same as the ones we applied for the MLP network, with one addition: the `np.expand_dims` step at the end. `Conv2D` layers expect each input to be a tensor shaped as `(height, width, depth)`. Recall that depth can refer to the number of filters produced by a previous layer, or to the number of color channels of the input image. Even when working with grayscale images as we do here, we need to add an extra dimension of size one.
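
As a minimal sketch of what that extra step does, the snippet below applies `np.expand_dims` to a small array of zeros standing in for a batch of images. The names `batch`, `with_channel` and `same_shape` are illustrative only and do not appear elsewhere in the notebook.

```python
import numpy as np

# Hypothetical stand-in for a batch of five 28x28 grayscale images.
batch = np.zeros((5, 28, 28), dtype="float32")

# Append a trailing axis of size one: (5, 28, 28) -> (5, 28, 28, 1).
with_channel = np.expand_dims(batch, axis=3)

# Indexing with np.newaxis produces the same shape.
same_shape = batch[..., np.newaxis]

print(batch.shape, with_channel.shape, same_shape.shape)
```

Only the shape changes; the pixel values themselves are untouched.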

In [None]:
num_classes = 10

(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.astype('float32')
X_test  = X_test.astype('float32')
X_train /= 255
X_test  /= 255
y_train = to_categorical(y_train, num_classes)
y_test  = to_categorical(y_test, num_classes)

# These steps are new:
X_train = np.expand_dims(X_train, axis=3)
X_test  = np.expand_dims(X_test, axis=3)

print(X_train.shape)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
(60000, 28, 28, 1)


In [None]:
model = Sequential([
    Input(shape=(28, 28, 1)),
    Conv2D(16, (3, 3), padding='same', activation='relu'),
    Conv2D(16, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(num_classes, activation='softmax')
])

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 28, 28, 16)        160       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 26, 26, 16)        2320      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 16)        0         
_________________________________________________________________
flatten (Flatten)            (None, 2704)              0         
_________________________________________________________________
dense (Dense)                (None, 128)               346240    
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290      
=================================================================
Total params: 350,010
Trainable params: 350,010
Non-trainable params: 0
_________________________________________________________________
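
The numbers in the summary above can be checked by hand. The first convolution uses `padding='same'` and keeps the 28x28 size, the second uses the default padding and shrinks it to 26x26, and the 2x2 pooling halves that to 13x13, which flattens to 13 * 13 * 16 = 2704 values. The sketch below recomputes the parameter counts with two helper functions, `conv2d_params` and `dense_params`, which are hypothetical names written for this illustration and not part of Keras.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # Each filter has one weight per kernel position and input channel, plus one bias.
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(in_units, out_units):
    # Each output unit has one weight per input unit, plus one bias.
    return (in_units + 1) * out_units


flattened = 13 * 13 * 16  # size after max pooling and Flatten

counts = [
    conv2d_params(3, 3, 1, 16),    # conv2d
    conv2d_params(3, 3, 16, 16),   # conv2d_1
    dense_params(flattened, 128),  # dense
    dense_params(128, 10),         # dense_1
]

print(counts, sum(counts))
```

Almost all of the parameters sit in the first `Dense` layer, not in the convolutional layers.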

In [None]:
model.compile(loss='categorical_crossentropy', 
              optimizer='adam',
              metrics=['accuracy'])

We only need to train for two epochs with this CNN to get a good result.

In [None]:
batch_size = 128
epochs = 2

model.fit(X_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(X_test, y_test))

Epoch 1/2
Epoch 2/2


<tensorflow.python.keras.callbacks.History at 0x7f0d40c70208>

In [None]:
score = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:',     score[0])
print('Test accuracy:', score[1])

Test loss: 0.043157193809747696
Test accuracy: 0.9854999780654907


Compare this with the result of our MLP:

```
Test loss: 0.27593347430229187
Test accuracy: 0.9203000068664551
```

After only two epochs of training, the CNN reaches a test accuracy of about 98.5%, compared with about 92.0% for the MLP, along with a much lower test loss.
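
One way to read this comparison is in terms of error rates instead of accuracies. The short sketch below uses the two test accuracies reported above; the variable names are illustrative only.

```python
mlp_accuracy = 0.9203000068664551  # MLP test accuracy quoted above
cnn_accuracy = 0.9854999780654907  # CNN test accuracy from this notebook

# The error rate is the fraction of test images that are misclassified.
mlp_error = 1 - mlp_accuracy
cnn_error = 1 - cnn_accuracy

print(f"MLP error rate: {mlp_error:.2%}")
print(f"CNN error rate: {cnn_error:.2%}")
print(f"Ratio of MLP errors to CNN errors: {mlp_error / cnn_error:.1f}")
```

On these numbers, the MLP misclassifies roughly five to six times as many test images as the CNN.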