In [1]:
import tensorflow as tf


In [2]:
print("TensorFlow version:", tf.__version__)

TensorFlow version: 2.6.0


# Load a dataset

In [3]:
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Build a machine learning model

Build a `tf.keras.Sequential` model by stacking layers.

In [4]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

For each example, the model returns a vector of `logits` or `log-odds` scores, one for each class.

In [5]:
predictions = model(x_train[:1]).numpy()
predictions

array([[-0.40473342, -0.29072082, -0.3236487 , -0.7757125 ,  0.07120637,
        -0.02493709, -0.28406855,  0.29677203,  0.2936807 ,  0.19655187]],
      dtype=float32)

The `tf.nn.softmax` function converts these logits to *probabilities* for each class:

In [6]:
tf.nn.softmax(predictions).numpy()

array([[0.07170074, 0.08035977, 0.07775678, 0.04947769, 0.1154042 ,
        0.10482553, 0.08089612, 0.14460509, 0.14415875, 0.13081528]],
      dtype=float32)
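As a cross-check, the same conversion can be sketched in plain Python. This is a minimal illustration, not how TensorFlow implements it: the `softmax` helper below is a hypothetical name, and the logits are copied from the `predictions` output above.

```python
import math

def softmax(logits):
    """Turn a list of logits into probabilities that sum to 1."""
    # Subtracting the max does not change the result but avoids overflow in exp().
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Logits printed by the untrained model for the first training image.
logits = [-0.40473342, -0.29072082, -0.3236487, -0.7757125, 0.07120637,
          -0.02493709, -0.28406855, 0.29677203, 0.2936807, 0.19655187]

probs = softmax(logits)
print([round(p, 4) for p in probs])
print(sum(probs))  # should be very close to 1.0
```

Larger logits map to larger probabilities, and the values always sum to 1, which is why every class sits near 1/10 for an untrained model.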

Define a loss function for training using `losses.SparseCategoricalCrossentropy`, which takes a vector of logits and the integer index of the true class and returns a scalar loss for each example:

In [7]:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

This loss is equal to the negative log probability of the true class. It is zero if the model is certain of the correct class.

The untrained model gives probabilities close to random (1/10 for each class), so the initial loss should be close to `-tf.math.log(1/10) ~= 2.3`.

In [8]:
loss_fn(y_train[:1], predictions).numpy()

2.2554579
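The printed value can be reproduced by hand, which shows what the loss measures. The sketch below is plain Python rather than the Keras implementation, and it assumes the first training label is 5. The notebook does not print `y_train[0]`, but that label is the one consistent with the loss above. The probabilities are copied from the `tf.nn.softmax` output shown earlier.

```python
import math

# Softmax probabilities of the untrained model for the first training image
# (copied from the tf.nn.softmax output earlier in this notebook).
probs = [0.07170074, 0.08035977, 0.07775678, 0.04947769, 0.1154042,
         0.10482553, 0.08089612, 0.14460509, 0.14415875, 0.13081528]

true_class = 5  # assumption: label of the first training image

# Sparse categorical cross-entropy for one example:
# the negative log of the probability assigned to the true class.
loss = -math.log(probs[true_class])
print(loss)

# Loss of a model that assigns exactly 1/10 to every class, for comparison.
uniform_loss = -math.log(1 / 10)
print(uniform_loss)
```

Both values land near 2.3, as expected for a model that has not learned anything yet.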

Before you start training, configure and compile the model using Keras `Model.compile`. Set the `optimizer` class to `adam`, set the `loss` to the `loss_fn` function you defined earlier, and specify a metric to be evaluated for the model by setting the `metrics` parameter to `accuracy`.

In [9]:
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

# Train and evaluate your model

Use the `Model.fit` method to adjust your model parameters and minimize the loss:

In [10]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x168d7c42c40>

The `Model.evaluate` method checks the model's performance, usually on a validation set or test set:

In [11]:
model.evaluate(x_test, y_test, verbose=2)

313/313 - 2s - loss: 0.0724 - accuracy: 0.9780


[0.07241009920835495, 0.9779999852180481]

The image classifier now reaches ~98% accuracy on the test set.

If you want the model to return a probability, you can wrap the trained model and attach a softmax layer to it:

In [12]:
probability_model = tf.keras.Sequential([
  model,
  tf.keras.layers.Softmax()
])

In [13]:
probability_model(x_test[:5])

<tf.Tensor: shape=(5, 10), dtype=float32, numpy=
array([[9.4331071e-08, 1.8843404e-07, 2.2812039e-06, 4.9344566e-05,
        1.5255871e-12, 2.6980874e-08, 5.7849480e-12, 9.9994779e-01,
        9.0120253e-08, 3.1460758e-07],
       [2.2605376e-07, 3.3183198e-05, 9.9994576e-01, 1.6475213e-05,
        1.3125157e-14, 1.8247939e-06, 6.5421117e-09, 5.3642563e-14,
        2.6502428e-06, 2.1070907e-13],
       [1.3739691e-06, 9.9853778e-01, 1.7382318e-04, 2.0978567e-05,
        1.7076254e-05, 1.4287379e-05, 3.9031427e-05, 9.8268373e-04,
        2.1254388e-04, 3.2765226e-07],
       [9.9919587e-01, 1.0403673e-08, 6.4418832e-04, 1.2773200e-05,
        3.9448409e-07, 2.5724783e-05, 6.0944971e-05, 7.0159772e-06,
        5.8757755e-06, 4.7243338e-05],
       [6.7055344e-06, 2.2116629e-07, 1.7468301e-05, 5.2622630e-07,
        9.9526513e-01, 2.8908710e-06, 8.2595352e-06, 2.6998125e-04,
        1.4750537e-05, 4.4142026e-03]], dtype=float32)>
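To turn these probabilities into class predictions, take the index of the largest value in each row. The following is a minimal sketch in plain Python; the `argmax` helper is a hypothetical name, and the rows are copied from the output above.

```python
# Probabilities for the first five test images, copied from the output above.
probabilities = [
    [9.4331071e-08, 1.8843404e-07, 2.2812039e-06, 4.9344566e-05,
     1.5255871e-12, 2.6980874e-08, 5.7849480e-12, 9.9994779e-01,
     9.0120253e-08, 3.1460758e-07],
    [2.2605376e-07, 3.3183198e-05, 9.9994576e-01, 1.6475213e-05,
     1.3125157e-14, 1.8247939e-06, 6.5421117e-09, 5.3642563e-14,
     2.6502428e-06, 2.1070907e-13],
    [1.3739691e-06, 9.9853778e-01, 1.7382318e-04, 2.0978567e-05,
     1.7076254e-05, 1.4287379e-05, 3.9031427e-05, 9.8268373e-04,
     2.1254388e-04, 3.2765226e-07],
    [9.9919587e-01, 1.0403673e-08, 6.4418832e-04, 1.2773200e-05,
     3.9448409e-07, 2.5724783e-05, 6.0944971e-05, 7.0159772e-06,
     5.8757755e-06, 4.7243338e-05],
    [6.7055344e-06, 2.2116629e-07, 1.7468301e-05, 5.2622630e-07,
     9.9526513e-01, 2.8908710e-06, 8.2595352e-06, 2.6998125e-04,
     1.4750537e-05, 4.4142026e-03],
]

def argmax(row):
    """Return the index of the largest value in a list."""
    return max(range(len(row)), key=lambda i: row[i])

# One predicted digit per test image.
predicted_classes = [argmax(row) for row in probabilities]
print(predicted_classes)  # [7, 2, 1, 0, 4]
```

On TensorFlow tensors the same step is typically written as `tf.argmax(probability_model(x_test[:5]), axis=-1)`.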