In [1]:
from tensorflow.keras.datasets import mnist

We will implement a neural-network (NN) digit classifier, the 'Hello World' of computer vision and deep learning. We start by loading the training and test images and labels from the MNIST dataset into two tuples.




In [2]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [3]:
train_images.shape, test_images.shape  # 60,000 training and 10,000 test images, each a 28 x 28 matrix of pixels

((60000, 28, 28), (10000, 28, 28))

In [4]:
len(train_labels), len(test_labels)

(60000, 10000)

Next we define the NN architecture with TensorFlow and Keras. The model has two dense (fully connected) layers: the first uses a rectified linear unit (ReLU) activation function, and the second, the classification layer, uses softmax to output a probability for each of the 10 digit classes. We then choose an optimizer (which updates the weights using the gradients computed by backpropagation), a loss function (which measures how far the model's predictions are from the correct labels during training), and the metrics used to evaluate the model.
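
To make the two layers concrete, here is a minimal NumPy sketch of the forward pass they compute. It is not how Keras implements it internally; the weights are random placeholders and the names `W1`, `b1`, `W2`, `b2` and `x` are made up for illustration. The input size of 784 assumes the images are flattened, as they are later in this notebook.

```python
import numpy as np

def relu(x):
    # Rectified linear unit: negative values become 0, positive values pass through.
    return np.maximum(x, 0.0)

def softmax(z):
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = z - z.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)

# Hypothetical, randomly initialised weights (illustration only).
W1 = rng.normal(scale=0.01, size=(784, 512))
b1 = np.zeros(512)
W2 = rng.normal(scale=0.01, size=(512, 10))
b2 = np.zeros(10)

x = rng.random((4, 784))  # stand-in batch of 4 flattened 28 x 28 images

hidden = relu(x @ W1 + b1)         # shape (4, 512)
probs = softmax(hidden @ W2 + b2)  # shape (4, 10); each row sums to 1
```

Each row of `probs` is a probability distribution over the 10 digit classes, which is what the softmax layer is for.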

In [5]:
from tensorflow import keras
from tensorflow.keras import layers

In [6]:
model = keras.Sequential([
    layers.Dense(512, activation = "relu"),
    layers.Dense(10, activation = "softmax")
])
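
As a rough check on the size of this model, the number of trainable parameters can be worked out by hand. This sketch assumes each input is a flattened vector of 28 * 28 = 784 values, as prepared later in the notebook; `dense_params` is a helper invented here, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    # One weight per (input, unit) pair plus one bias per unit.
    return n_inputs * n_units + n_units

hidden_params = dense_params(28 * 28, 512)  # first Dense layer
output_params = dense_params(512, 10)       # softmax classification layer
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 401920 5130 407050
```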

In [7]:
model.compile(optimizer = "rmsprop",
              loss = "sparse_categorical_crossentropy",
              metrics = ["accuracy"])
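
To show what the loss and metric measure, here is a small NumPy sketch of sparse categorical cross-entropy and accuracy on made-up probabilities. The function below is an illustration written for this note, not the Keras implementation, which also guards against numerical edge cases such as a probability of exactly 0.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    # Pick the predicted probability of the true class for each sample,
    # then average the negative log of those probabilities.
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

# Hypothetical predictions for 2 samples over 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])  # integer class labels, as in train_labels

loss = sparse_categorical_crossentropy(probs, labels)
accuracy = float((probs.argmax(axis=1) == labels).mean())
```

"Sparse" refers to the labels being plain integers (0 to 9 for MNIST) rather than one-hot vectors.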

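We need to reshape and scale the input data, not the model. Scaling the pixel values from [0, 255] to [0, 1] tends to make gradient-based training faster and more stable, and reshaping each 28 * 28 image into a vector of 784 values gives the dense layers the input shape they expect.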

In [8]:
train_images = train_images.reshape((60000, 28*28))
train_images = train_images.astype("float32") /255

test_images = test_images.reshape((10000, 28*28))
test_images = test_images.astype("float32") /255
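
The same two steps can be checked on a tiny stand-in array. The data here is random and the names `images`, `flat` and `scaled` are hypothetical; only the shapes and value ranges matter.

```python
import numpy as np

# Hypothetical stand-in for the image tensor: 3 "images" of 28 x 28 uint8 pixels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)

# Flatten each image to a vector of 784 values; -1 lets NumPy infer the first axis.
flat = images.reshape((-1, 28 * 28))

# Convert to float32 and scale from [0, 255] to [0, 1].
scaled = flat.astype("float32") / 255

print(scaled.shape, scaled.dtype)
```

Using `-1` for the first dimension is an optional alternative to hard-coding 60000 and 10000, and it keeps working if the number of images changes.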

The model is now ready to be trained with the Keras fit() method.

In [9]:
model.fit(train_images, train_labels, epochs = 5, batch_size = 128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7fd2106c6550>
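
For a sense of what `epochs = 5` and `batch_size = 128` mean in practice, the number of weight updates can be computed directly. This is plain arithmetic on the values used above and assumes one update per batch, with the final, smaller batch included.

```python
import math

n_samples = 60000
batch_size = 128
epochs = 5

steps_per_epoch = math.ceil(n_samples / batch_size)              # batches per pass over the data
last_batch = n_samples - (steps_per_epoch - 1) * batch_size      # size of the final partial batch
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch, total_updates)  # 469 96 2345
```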

The model reaches 98.82% accuracy on the training data. The next step is to use it to make predictions.

In [10]:
test_digits = test_images[0:10]

predictions = model.predict(test_digits)
predictions[0]

array([1.1273890e-07, 1.0296161e-09, 2.4231335e-06, 5.0567578e-05,
       7.7990614e-11, 3.4575454e-08, 4.6118911e-14, 9.9994326e-01,
       5.2128996e-07, 3.2327878e-06], dtype=float32)

In [11]:
predictions[0].argmax() #Model predicts that the first digit classified is a 7

7

In [12]:
test_labels[0] #Which was correct

7
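
The probability vector printed above can be inspected on its own. This sketch copies those ten values into a NumPy array (the name `first` is made up here) and reads off the predicted digit and the probability the model assigns to it.

```python
import numpy as np

# Values copied from the output of predictions[0] above.
first = np.array([1.1273890e-07, 1.0296161e-09, 2.4231335e-06, 5.0567578e-05,
                  7.7990614e-11, 3.4575454e-08, 4.6118911e-14, 9.9994326e-01,
                  5.2128996e-07, 3.2327878e-06], dtype=np.float32)

predicted_digit = int(first.argmax())        # index of the largest probability
confidence = float(first[predicted_digit])   # probability assigned to that digit
total = float(first.sum())                   # softmax outputs sum to approximately 1

print(predicted_digit, confidence)
```

To check all ten digits at once, something like `predictions.argmax(axis=1) == test_labels[0:10]` should give one boolean per image.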

In [14]:
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"test acc: {test_acc}")

test acc: 0.9778000116348267


The model reaches 97.8% accuracy on the test set. This is lower than the training accuracy, which indicates a degree of overfitting, but a gap of this size is to be expected.
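
To put the two numbers side by side, the gap and the approximate count of misclassified test images follow from simple arithmetic on the accuracies reported in this notebook.

```python
train_acc = 0.9882             # training accuracy reported after fit()
test_acc = 0.9778000116348267  # test accuracy from model.evaluate()
n_test = 10000                 # number of test images

gap = train_acc - test_acc                       # about 0.0104, i.e. roughly 1 percentage point
misclassified = round(n_test * (1 - test_acc))   # test images the model got wrong

print(f"gap: {gap:.4f}, misclassified: {misclassified}")
```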