In [13]:
import tensorflow as tf
from tensorflow.keras import models
from tensorflow.keras import layers
from tensorflow.keras.datasets import mnist

In [2]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [3]:
train_images.shape

(60000, 28, 28)

In [4]:
train_labels.shape

(60000,)

In [5]:
train_labels

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

In [6]:
test_images.shape

(10000, 28, 28)

In [7]:
test_labels

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)

#### Create a neural network and feed it the training data (train_images, train_labels). The network learns to associate images with labels. We then ask it to produce predictions for test_images and check whether they match test_labels.

In [8]:
model = models.Sequential([
  layers.Dense(512, activation='relu'),
  layers.Dense(10, activation='softmax')
])
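To make the two layers less opaque, here is a minimal NumPy sketch of the forward pass they perform. It is not how Keras implements `Dense`; the weights are random placeholders rather than trained values, the input width of 784 assumes the flattened 28 * 28 images used later in this notebook, and the helper names `dense`, `relu`, and `softmax` are introduced here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    # Affine transform: one row of x per sample.
    return x @ w + b

def relu(z):
    # Element-wise max(z, 0).
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row maximum for numerical stability before exponentiating.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical inputs and weights (random, untrained).
x = rng.random((4, 784)).astype("float32")           # 4 fake flattened images
w1 = rng.normal(scale=0.01, size=(784, 512)).astype("float32")
b1 = np.zeros(512, dtype="float32")
w2 = rng.normal(scale=0.01, size=(512, 10)).astype("float32")
b2 = np.zeros(10, dtype="float32")

hidden = relu(dense(x, w1, b1))          # shape (4, 512), all values >= 0
probs = softmax(dense(hidden, w2, b2))   # shape (4, 10), each row sums to 1

print(probs.shape)
print(probs.sum(axis=1))
```

Each row of `probs` is a probability distribution over the 10 digit classes, which is what the softmax output layer of the model produces.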

To make the model ready for training, we need to pick three more things, as part of the **compilation** step:

- An **optimizer**: the mechanism through which the model will update itself based on the training data it sees, so as to improve its performance.
- A **loss function**: how the model will be able to measure its performance on the training data, and thus how it will be able to steer itself in the right direction.
- **Metrics to monitor during training and testing**: here, we’ll only care about accuracy (the fraction of the images that were correctly classified).
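As a rough illustration of the loss and metric, the sketch below computes sparse categorical cross-entropy and accuracy by hand on a tiny made-up batch. The function name and the toy arrays are assumptions for this example; the Keras implementation additionally guards against taking the log of zero, which is omitted here.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    # For each sample, take the predicted probability of the true class
    # (labels are integer class indices, not one-hot vectors)
    # and return its negative log.
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(picked)

# Toy batch: 3 samples, 3 classes (made-up numbers).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8],
                  [0.3, 0.4, 0.3]])
labels = np.array([0, 2, 0])

losses = sparse_categorical_crossentropy(probs, labels)
mean_loss = losses.mean()

# Accuracy: fraction of samples whose most probable class equals the label.
accuracy = (probs.argmax(axis=1) == labels).mean()

print(losses, mean_loss, accuracy)
```

The third sample is misclassified (the model favors class 1 while the label is 0), so it contributes the largest loss and lowers the accuracy to two out of three.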

In [9]:
model.compile(optimizer='rmsprop',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

Before training, we’ll preprocess the data by reshaping it into the shape the model expects and scaling it so that all values are in the [0, 1] interval. Our training images, for instance, are stored in an array of shape (60000, 28, 28) of type uint8 with values in the [0, 255] interval. We transform it into a float32 array of shape (60000, 28 * 28) with values between 0 and 1.

In [10]:
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255

In [11]:
train_images.dtype

dtype('float32')
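The same transformation can be checked on a small stand-in array. This is only a sketch: `fake_images` is a hypothetical substitute for the MNIST data, and passing `-1` to `reshape` is a variation on the cell above that lets NumPy infer the flattened size instead of hard-coding the sample count.

```python
import numpy as np

# Hypothetical stand-in: 3 "images" of 28x28 uint8 pixels covering 0..255.
fake_images = (np.arange(3 * 28 * 28) % 256).astype(np.uint8).reshape((3, 28, 28))

# Flatten each image to a vector and rescale from [0, 255] to [0, 1].
flat = fake_images.reshape((len(fake_images), -1)).astype("float32") / 255

print(flat.shape, flat.dtype, flat.min(), flat.max())
```

Converting to float32 before dividing matters: the pixel values are stored as 8-bit integers, and the model expects floating-point inputs.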

We are now ready to train the model, which in Keras is done via a call to the model’s fit method: we fit the model to its training data.

In [12]:
model.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f21835aee80>
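The arithmetic behind `epochs=5, batch_size=128` can be worked out directly. The sketch below assumes the default behavior for in-memory arrays, where the final, smaller batch of each epoch is still used.

```python
import math

num_samples = 60000   # size of the training set
batch_size = 128
epochs = 5

# Number of batches (and therefore weight updates) per epoch.
steps_per_epoch = math.ceil(num_samples / batch_size)

# The last batch of each epoch holds whatever samples are left over.
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size

total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_updates)
```

So each epoch consists of 469 batches, the last of which holds 96 images, for 2345 weight updates over the whole run.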

We can now use the trained model to predict class probabilities for new digits: images that weren’t part of the training data, like those from the test set.

In [14]:
test_digits = test_images[:10]
predictions = model.predict(test_digits)

In [22]:
max_prediction_index = predictions[0].argmax()
max_prediction_index

7

In [23]:
predictions[0][max_prediction_index]

0.9999474

In [24]:
test_labels[0]

7
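The cells above inspect a single prediction. To compare a whole batch at once, `argmax` can be applied along the class axis. The arrays below are made-up placeholders, not actual model output, and `true_labels` is a hypothetical name used only in this sketch.

```python
import numpy as np

# Hypothetical probabilities for 3 samples over 10 classes (each row sums to 1).
predictions = np.full((3, 10), 0.01)
predictions[0, 7] = 0.91
predictions[1, 2] = 0.91
predictions[2, 1] = 0.91

true_labels = np.array([7, 2, 4])  # made-up labels; the third one disagrees

# Most probable class for every row at once.
predicted_classes = predictions.argmax(axis=1)

# Boolean mask of which predictions match their label.
matches = predicted_classes == true_labels

print(predicted_classes, matches)
```

With real data, `model.predict(test_digits).argmax(axis=1)` could be compared against `test_labels[:10]` in the same way.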

Finally, we compute the average accuracy over the entire test set.

In [26]:
test_loss, test_accuracy = model.evaluate(test_images, test_labels)



In [27]:
print("Test accuracy: ", test_accuracy)

Test accuracy:  0.9818000197410583
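To put that figure in concrete terms, the short calculation below converts the reported accuracy into a count of misclassified test images. It simply reuses the printed value and the test-set size of 10000 shown earlier.

```python
test_accuracy = 0.9818000197410583   # value printed above
num_test_images = 10000              # from test_images.shape

# Accuracy is the fraction of correctly classified images.
num_correct = round(test_accuracy * num_test_images)
num_wrong = num_test_images - num_correct

print(num_correct, num_wrong)
```

In other words, the model gets 182 of the 10000 test digits wrong.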
