In [1]:
import tensorflow as tf
print('TensorFlow version:', tf.__version__)

TensorFlow version: 2.16.1


# Load a Dataset

Load and prepare the MNIST dataset. The pixel values of the images range from 0 through 255. Scale these values to a range of 0 to 1 by dividing the values by 255.0. This also converts the sample data from integers to floating-point numbers:

In [2]:
mnist = tf.keras.datasets.mnist

# Load the train/test splits, then scale pixel values from [0, 255] to [0, 1].
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 1s 0us/step

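To make the effect of the division concrete, here is a minimal NumPy sketch. The 2x2 array is a hypothetical stand-in for an image; it only assumes that the raw pixels are stored as unsigned 8-bit integers, as they are in MNIST.

```python
import numpy as np

# Hypothetical 2x2 "image" with the same dtype as raw MNIST pixels (uint8, 0-255).
pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Dividing by the float 255.0 promotes the array to floating point
# and maps the values into the range [0, 1].
scaled = pixels / 255.0

print(pixels.dtype, "->", scaled.dtype)
print(scaled.min(), scaled.max())
```

Dividing by the integer 255 with `//` would instead keep integer values and collapse almost every pixel to 0, which is why the float literal matters.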

# Build a Machine Learning Model

Sequential is useful for stacking layers where each layer has one input tensor and one output tensor. Layers are functions with a known mathematical structure that can be reused and have trainable variables. Most TensorFlow models are composed of layers. This model uses the Flatten, Dense, and Dropout layers.

In [3]:
model = tf.keras.models.Sequential([
    # Declare the input shape with an Input object; Keras 3 warns when
    # input_shape is passed directly to a layer.
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])
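As a rough illustration of what these layers compute, the sketch below reimplements the forward pass of the model above in NumPy. The weights are random and the initialization scheme is an assumption chosen for illustration, not the Keras default; only the shapes mirror the model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomly initialized parameters with the same shapes as the model above.
# The initialization here is illustrative, not what Keras uses.
w1 = rng.normal(scale=0.05, size=(784, 128))
b1 = np.zeros(128)
w2 = rng.normal(scale=0.05, size=(128, 10))
b2 = np.zeros(10)

def forward(images, training=False, rate=0.2):
    """Flatten -> Dense(128, relu) -> Dropout(rate) -> Dense(10)."""
    x = images.reshape(len(images), -1)   # Flatten: (n, 28, 28) -> (n, 784)
    x = np.maximum(x @ w1 + b1, 0.0)      # Dense + ReLU: (n, 128)
    if training:
        # Inverted dropout: zero a fraction `rate` of the units and rescale
        # the survivors by 1 / (1 - rate) so the expected sum is unchanged.
        keep = rng.random(x.shape) >= rate
        x = x * keep / (1.0 - rate)
    return x @ w2 + b2                    # Dense: (n, 10) logits

batch = rng.random((3, 28, 28))           # stand-in for three scaled images
logits = forward(batch)
n_params = w1.size + b1.size + w2.size + b2.size
print(logits.shape, n_params)
```

Dropout is only active when `training=True`, which matches how the layer behaves during `Model.fit` versus inference.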


In [4]:
predictions = model(x_train[:1]).numpy()
predictions

array([[ 0.26431614,  0.33295038,  0.41134372, -0.69114345, -0.24677163,
        -0.29520088,  0.7586751 , -0.32582572,  0.08616189, -0.15356557]],
      dtype=float32)

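For each example, the model returns a vector of logits (unnormalized log-odds scores), one per class. The tf.nn.softmax function converts these logits to probabilities for each class: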

In [5]:
tf.nn.softmax(predictions).numpy()

array([[0.11800323, 0.12638673, 0.13669328, 0.04538821, 0.0707834 ,
        0.06743709, 0.19346006, 0.06540315, 0.09874669, 0.07769807]],
      dtype=float32)
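The same conversion can be reproduced by hand. The following sketch applies the softmax formula to the logits printed above, using only the standard library; the helper name `softmax` is defined here for illustration.

```python
import math

# Logits copied from the output of the untrained model above.
logits = [0.26431614, 0.33295038, 0.41134372, -0.69114345, -0.24677163,
          -0.29520088, 0.7586751, -0.32582572, 0.08616189, -0.15356557]

def softmax(values):
    # Subtracting the maximum does not change the result but avoids
    # overflow in exp() when logits are large.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
print([round(p, 4) for p in probs])
print(sum(probs))
```

The resulting values should agree with the tf.nn.softmax output to within float32 rounding, and they sum to 1.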

Define a loss function for training using losses.SparseCategoricalCrossentropy:

In [6]:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

The loss function takes a vector of ground truth values and a vector of logits and returns a scalar loss for each example. This loss is equal to the negative log probability of the true class, so it is zero when the model is certain of the correct class.

This untrained model gives probabilities close to random (1/10 for each class), so the initial loss should be close to -tf.math.log(1/10) ~= 2.3.

In [7]:
loss_fn(y_train[:1], predictions).numpy()

2.69656
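The reported value can be checked by hand from the logits shown earlier. This sketch assumes the first training label is 5 (an assumption that is consistent with the loss reported above) and computes the negative log probability of that class directly.

```python
import math

# Logits for the first training image, copied from the earlier output.
logits = [0.26431614, 0.33295038, 0.41134372, -0.69114345, -0.24677163,
          -0.29520088, 0.7586751, -0.32582572, 0.08616189, -0.15356557]
label = 5  # assumed value of y_train[0]

# Sparse categorical cross-entropy from logits for one example:
# loss = log(sum_j exp(z_j)) - z_label = -log(softmax(z)[label])
log_sum_exp = math.log(sum(math.exp(z) for z in logits))
loss = log_sum_exp - logits[label]

# Loss of a model that assigns probability 1/10 to every class.
uniform_baseline = -math.log(1 / 10)

print(round(loss, 4), round(uniform_baseline, 4))
```

The loss is somewhat above the uniform baseline because the untrained model happens to assign the true class a probability below 1/10.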

Before you start training, configure and compile the model using Keras Model.compile. Set the optimizer class to adam, set the loss to the loss_fn function you defined earlier, and specify a metric to be evaluated for the model by setting the metrics parameter to accuracy.

In [8]:
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

# Train and Evaluate Model

Use the Model.fit method to adjust your model parameters and minimize the loss:

In [9]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 3s 1ms/step - accuracy: 0.8560 - loss: 0.4842
Epoch 2/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 3s 1ms/step - accuracy: 0.9543 - loss: 0.1557
Epoch 3/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 3s 1ms/step - accuracy: 0.9664 - loss: 0.1122
Epoch 4/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 3s 1ms/step - accuracy: 0.9737 - loss: 0.0878
Epoch 5/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 3s 1ms/step - accuracy: 0.9775 - loss: 0.0735


<keras.src.callbacks.history.History at 0x16f30aaa9f0>

The Model.evaluate method checks the model's performance, usually on a validation set or test set.

In [10]:
model.evaluate(x_test, y_test, verbose=2)

313/313 - 0s - 1ms/step - accuracy: 0.9780 - loss: 0.0704


[0.07040254771709442, 0.9779999852180481]
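The step counts in the training and evaluation logs above (1875 and 313) follow from the batch size. Neither call passes `batch_size`, so this sketch assumes the Keras default of 32 and the standard MNIST split sizes of 60,000 training and 10,000 test images.

```python
import math

batch_size = 32           # Keras default when batch_size is not passed
train_examples = 60_000   # size of the MNIST training split
test_examples = 10_000    # size of the MNIST test split

# The last batch may be smaller than batch_size, hence the ceiling.
train_steps = math.ceil(train_examples / batch_size)
test_steps = math.ceil(test_examples / batch_size)

print(train_steps, test_steps)
```

Passing a larger `batch_size` to `Model.fit` would reduce the number of steps per epoch proportionally.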

The image classifier now reaches about 98% accuracy on the test set.

If you want your model to return probabilities instead of logits, wrap the trained model and attach a softmax layer to it:

In [11]:
probability_model = tf.keras.Sequential([
    model,
    tf.keras.layers.Softmax()
])

In [12]:
probability_model(x_test[:5])

<tf.Tensor: shape=(5, 10), dtype=float32, numpy=
array([[1.1386523e-09, 1.7705350e-10, 8.2766820e-07, 1.1284749e-04,
        1.3896496e-11, 3.4351203e-08, 5.2413230e-15, 9.9988127e-01,
        3.2867355e-08, 4.9934260e-06],
       [1.0978155e-06, 9.0788722e-05, 9.9989581e-01, 3.4949612e-06,
        9.7746477e-14, 8.5753676e-08, 2.7584715e-06, 3.9564359e-12,
        5.8948549e-06, 1.0351064e-12],
       [1.8823508e-07, 9.9822444e-01, 1.4132813e-04, 2.0742686e-05,
        2.2197829e-04, 3.6603512e-06, 1.8312921e-06, 1.0714470e-03,
        2.9886828e-04, 1.5474438e-05],
       [9.9987006e-01, 8.4762691e-10, 7.0242699e-05, 1.6933399e-07,
        6.1627287e-07, 3.2516530e-06, 6.2332947e-06, 1.0789309e-05,
        9.9716829e-07, 3.7593622e-05],
       [5.1177384e-08, 2.8408166e-08, 1.0576845e-06, 4.1902148e-08,
        9.9725109e-01, 1.5845837e-06, 1.8161886e-06, 7.7683280e-06,
        9.3477830e-08, 2.7364627e-03]], dtype=float32)>
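To turn these probabilities into class predictions, take the index of the largest value in each row. The sketch below does this with NumPy on the values copied from the output above; in the notebook itself the same result could be compared against `y_test[:5]`.

```python
import numpy as np

# Probabilities copied from the output above (first five test images).
probs = np.array([
    [1.1386523e-09, 1.7705350e-10, 8.2766820e-07, 1.1284749e-04, 1.3896496e-11,
     3.4351203e-08, 5.2413230e-15, 9.9988127e-01, 3.2867355e-08, 4.9934260e-06],
    [1.0978155e-06, 9.0788722e-05, 9.9989581e-01, 3.4949612e-06, 9.7746477e-14,
     8.5753676e-08, 2.7584715e-06, 3.9564359e-12, 5.8948549e-06, 1.0351064e-12],
    [1.8823508e-07, 9.9822444e-01, 1.4132813e-04, 2.0742686e-05, 2.2197829e-04,
     3.6603512e-06, 1.8312921e-06, 1.0714470e-03, 2.9886828e-04, 1.5474438e-05],
    [9.9987006e-01, 8.4762691e-10, 7.0242699e-05, 1.6933399e-07, 6.1627287e-07,
     3.2516530e-06, 6.2332947e-06, 1.0789309e-05, 9.9716829e-07, 3.7593622e-05],
    [5.1177384e-08, 2.8408166e-08, 1.0576845e-06, 4.1902148e-08, 9.9725109e-01,
     1.5845837e-06, 1.8161886e-06, 7.7683280e-06, 9.3477830e-08, 2.7364627e-03],
])

predicted = probs.argmax(axis=1)   # most likely digit for each image
confidence = probs.max(axis=1)     # probability assigned to that digit

print(predicted)
print(confidence.round(4))
```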