In [3]:
import tensorflow as tf

Load and prepare the MNIST dataset. Convert the samples from integers to floating-point numbers:

In [5]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

Build the `tf.keras.Sequential` model by stacking layers (the optimizer and loss function for training are chosen in later cells):

In [10]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])
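
To make the stacked layers more concrete, the sketch below counts the trainable parameters implied by the layer sizes above. It is plain arithmetic rather than a TensorFlow call, and the helper `dense_params` is a hypothetical name introduced only for this illustration. It assumes each `Dense` layer holds one weight per input-unit pair plus one bias per unit.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units


flattened = 28 * 28                    # Flatten turns a 28x28 image into 784 values
hidden = dense_params(flattened, 128)  # Dense(128, activation='relu')
output = dense_params(128, 10)         # Dense(10): one logit per class
total = hidden + output                # Flatten and Dropout add no parameters

print(flattened, hidden, output, total)
```

Almost all of the parameters sit in the first `Dense` layer, because it connects every one of the 784 pixels to every hidden unit.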

For each example, the model returns a vector of [logits](https://developers.google.com/machine-learning/glossary#logits) or [log-odds](https://developers.google.com/machine-learning/glossary#log-odds) scores, one for each class.

In [11]:
predictions = model(x_train[:1]).numpy()
predictions

array([[-0.3471426 ,  0.65173316,  0.01225733,  0.3861053 , -0.4628726 ,
        -0.3014609 ,  0.45240572,  0.05118516,  0.17452842, -0.9740951 ]],
      dtype=float32)

The `tf.nn.softmax` function converts these logits to "probabilities" for each class: 

In [12]:
tf.nn.softmax(predictions).numpy()

array([[0.06622556, 0.17981745, 0.09486609, 0.13787043, 0.05898814,
        0.06932102, 0.14732112, 0.09863184, 0.11157951, 0.03537884]],
      dtype=float32)
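
As a rough illustration of what `tf.nn.softmax` computes, the following sketch applies the softmax formula to the logits printed above using only the Python standard library. The function `softmax` here is a hand-written stand-in, not the TensorFlow implementation, and subtracting the maximum logit is a common numerical-stability step assumed for this sketch.

```python
import math


def softmax(logits):
    """Exponentiate the shifted logits and normalize so they sum to 1."""
    m = max(logits)                           # shift by the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Logits copied from the model output shown above.
logits = [-0.3471426, 0.65173316, 0.01225733, 0.3861053, -0.4628726,
          -0.3014609, 0.45240572, 0.05118516, 0.17452842, -0.9740951]
probs = softmax(logits)

print([round(p, 4) for p in probs])
print(sum(probs))
```

The largest logit (index 1) receives the largest probability, and the values sum to 1 up to floating-point rounding.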

The `losses.SparseCategoricalCrossentropy` loss takes a vector of logits and the index of the true class, and returns a scalar loss for each example.

In [13]:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

This loss is equal to the negative log probability of the true class. It is zero if the model is certain of the correct class.

This untrained model gives probabilities close to random (1/10 for each class), so the initial loss should be close to `-tf.math.log(1/10) ~= 2.3`.

In [14]:
loss_fn(y_train[:1], predictions).numpy()

2.669007
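
The loss value above can be checked by hand. The sketch below assumes the true label of the first training image is class 5; that assumption is consistent with the output, since the softmax probability printed earlier for index 5 (0.06932102) reproduces the reported loss. It also computes the loss of a uniform guess over ten classes for comparison.

```python
import math

# Softmax probability printed earlier for class index 5.
# Assumption: 5 is the true label of the first training image.
p_true = 0.06932102

loss = -math.log(p_true)           # negative log probability of the true class
uniform_loss = -math.log(1 / 10)   # loss when every class gets probability 1/10

print(round(loss, 4), round(uniform_loss, 4))
```

The untrained model happens to assign the true class slightly less than 1/10, so its loss is a little above the uniform baseline of about 2.3.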

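Configure the model for training with `Model.compile`, passing the optimizer, the loss function, and the metrics to report:

In [15]: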
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

The `Model.fit` method adjusts the model parameters to minimize the loss: 

In [16]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7fe5768634a8>

The `Model.evaluate` method checks the model's performance, usually on a [validation set](https://developers.google.com/machine-learning/glossary#validation-set) or [test set](https://developers.google.com/machine-learning/glossary#test_set).

In [17]:
model.evaluate(x_test,  y_test, verbose=2)

313/313 - 0s - loss: 0.0726 - accuracy: 0.9765


[0.0725918784737587, 0.9764999747276306]
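
The second value returned by `Model.evaluate` here is the `accuracy` metric: the fraction of examples whose highest-scoring class matches the label. The sketch below shows that calculation in plain Python. The function `accuracy` and the toy logits and labels are hypothetical, made up for illustration, and are not MNIST data or the Keras implementation.

```python
def accuracy(logit_rows, labels):
    """Fraction of rows whose largest logit sits at the labelled index."""
    correct = 0
    for logits, label in zip(logit_rows, labels):
        predicted = max(range(len(logits)), key=logits.__getitem__)  # argmax
        correct += int(predicted == label)
    return correct / len(labels)


# Hypothetical 3-class logits and labels, invented for this example.
toy_logits = [[2.0, 0.1, -1.0],
              [0.3, 0.2, 1.5],
              [0.0, 0.9, 0.4],
              [1.2, 1.1, 0.0]]
toy_labels = [0, 2, 2, 0]
toy_accuracy = accuracy(toy_logits, toy_labels)

print(toy_accuracy)
```

Read the same way, an accuracy of 0.9765 on the 10,000-image MNIST test set corresponds to roughly 9,765 correctly classified images.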

The image classifier is now trained to ~98% accuracy on this dataset.