create : @tarickali 23/31/05 \
update : @tarickali 23/31/05 \
source : https://www.tensorflow.org/tutorials/quickstart/beginner

# TensorFlow 2 Quickstart for Beginners

Topics:
- Load a prebuilt dataset
- Build a neural network model that classifies images
- Train and evaluate a neural network model

In [1]:
# Import TensorFlow and check its version
import tensorflow as tf
tf.__version__

'2.12.0'

## Load dataset

To start, we will load our dataset (MNIST) and prepare the data for training.

In [3]:
# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Prepare data by scaling the pixel values of x_train and x_test from [0, 255] to [0, 1]
x_train, x_test = x_train / 255., x_test / 255.

# NOTE: each image has shape (28, 28), so x_train has shape (60000, 28, 28)
x_train.shape

(60000, 28, 28)


## Build a machine learning model

In [4]:
# Build a Sequential model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])

Metal device set to: Apple M1 Pro

systemMemory: 16.00 GB
maxCacheSize: 5.33 GB



The model we defined above is a `Sequential` model, which is composed of layers stacked on top of each other.
An input tensor flows through the model one layer at a time from the first layer to the last output layer.
The model uses three types of keras layers: `Flatten`, `Dense`, and `Dropout`. Note that there are many other layers available in `keras.layers`.
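
To make the layer-by-layer flow concrete, the following sketch tracks by hand the output shape and the number of trainable parameters of each layer. It is plain Python bookkeeping, not a call into Keras, and it assumes standard fully connected layers with one bias per unit. The helper `dense_params` is a hypothetical name used only for illustration.

```python
# Hand bookkeeping for the model above (no TensorFlow required).
# Assumption: each Dense layer has one weight per (input, unit) pair plus one bias per unit.

def dense_params(n_inputs, n_units):
    """Hypothetical helper: trainable parameters in a fully connected layer."""
    return n_inputs * n_units + n_units

flatten_out = 28 * 28                           # Flatten: (28, 28) -> (784,), no parameters
hidden_params = dense_params(flatten_out, 128)  # Dense(128): (784,) -> (128,)
dropout_params = 0                              # Dropout: shape unchanged, no parameters
output_params = dense_params(128, 10)           # Dense(10): (128,) -> (10,) logits

total_params = hidden_params + dropout_params + output_params
print(flatten_out, hidden_params, output_params, total_params)
```

If the bookkeeping is right, the total should agree with the parameter count that `model.summary()` reports for this architecture.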

The outputs of the model are logits, which can be interpreted as the unnormalized score of each class as predicted by the model.

In [13]:
logits = model(x_train[:1])
logits

<tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[ 0.07729129,  0.31014988, -0.00361805, -0.13739625, -0.03916094,
        -0.10212111, -0.49514657,  0.16622578, -0.31897706,  0.14071107]],
      dtype=float32)>

We can use `tf.nn.softmax` to convert the logits to probabilities of each class.

**Note:** The `softmax` function can also be found at `tf.math.softmax` and `tf.keras.activations.softmax` (requires logits to be a `Tensor`).

**Note:** It is possible to build the `softmax` into the model by setting `activation='softmax'` in the final layer. However, this is **not recommended**: when the model outputs probabilities instead of logits, the loss cannot always be computed in an exact and numerically stable way.

In [16]:
tf.nn.softmax(logits)

<tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[0.10973859, 0.13851237, 0.1012094 , 0.08853637, 0.0976753 ,
        0.09171525, 0.06190885, 0.11994527, 0.07383499, 0.11692361]],
      dtype=float32)>
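
To see what the conversion does, here is a minimal pure-Python sketch of softmax applied to the logits printed earlier. It is an illustration of the formula, not TensorFlow's implementation. The function name `softmax` and the max-subtraction step are choices made for this sketch.

```python
import math

# Logits copied from the model output shown earlier in this notebook.
logits = [0.07729129, 0.31014988, -0.00361805, -0.13739625, -0.03916094,
          -0.10212111, -0.49514657, 0.16622578, -0.31897706, 0.14071107]

def softmax(scores):
    """Exponentiate each score and normalize so the results sum to 1."""
    shift = max(scores)  # subtracting the max leaves the result unchanged but avoids overflow
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
print([round(p, 4) for p in probs])
print(sum(probs))  # should be 1 up to floating-point rounding
```

The largest logit (index 1) receives the largest probability, and the values should match the `tf.nn.softmax` output above to within rounding.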

We will now define a loss function for training.

In [19]:
# NOTE that we set from_logits to True
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

The loss function typically used for multi-class classification with integer labels is `SparseCategoricalCrossentropy`. It takes the ground-truth labels and a vector of logits for each example and returns the negative log probability of the true class, averaged over the batch.
The loss is 0 when the model is certain of the correct label.
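
One way to see why `from_logits=True` matters is to compare two routes to the same loss on deliberately extreme logits. The sketch below is pure Python and only illustrates the idea. It is not Keras's actual implementation, and the function names and the extreme values are assumptions chosen for the demonstration.

```python
import math

def softmax(scores):
    """Softmax with max subtraction, returning probabilities."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def loss_from_probs(probs, label):
    """Negative log probability; breaks if probs[label] has underflowed to 0."""
    return -math.log(probs[label])

def loss_from_logits(scores, label):
    """Same quantity computed as log-sum-exp(scores) - scores[label]."""
    shift = max(scores)
    log_sum_exp = shift + math.log(sum(math.exp(s - shift) for s in scores))
    return log_sum_exp - scores[label]

extreme_logits = [0.0, 800.0]  # hypothetical, very confident and wrong
label = 0

try:
    via_probs = loss_from_probs(softmax(extreme_logits), label)
except ValueError:
    # The true-class probability underflowed to exactly 0, so log(0) is undefined.
    via_probs = float("inf")

via_logits = loss_from_logits(extreme_logits, label)
print(via_probs, via_logits)
```

Going through probabilities loses the information needed to compute the loss, while working directly with logits still yields a finite value.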

Since the model is untrained, its predicted probabilities are close to uniform (about 1/10 for each class), so the output of `loss_fn` on `logits` should be close to $-\log(1/10) \approx 2.3$.

In [23]:
loss_fn(y_train[:1], logits).numpy()

2.3890667
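
The value above can be checked by hand. The sketch below computes the uniform baseline and the negative log of the probability that the untrained model assigned to class 5 in the softmax output shown earlier. It assumes that the label of the first training image is 5, which is consistent with the loss reported by `loss_fn`.

```python
import math

# Baseline: a model that assigns probability 1/10 to every class.
uniform_loss = -math.log(1 / 10)

# Probability of class 5 copied from the softmax output shown earlier.
# Assumption: the label of the first training image is 5.
p_true = 0.09171525
example_loss = -math.log(p_true)

print(round(uniform_loss, 4), round(example_loss, 4))
```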

Before we can train our model, we need to configure and compile it with an optimizer, loss function, and metrics.
We will use the Adam optimizer, the loss function `loss_fn` defined above, and the accuracy metric.

In [24]:
model.compile(
    optimizer='adam',
    loss=loss_fn,
    metrics=['accuracy']
)

## Train and evaluate your model

We are now ready to train our model using `model.fit` on `(x_train, y_train)`...

In [25]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x17ecf9050>

... and then evaluate the model using `model.evaluate` on `(x_test, y_test)`.

In [27]:
model.evaluate(x_test, y_test, verbose=2)

313/313 - 1s - loss: 0.3763 - accuracy: 0.9055 - 1s/epoch - 4ms/step


[0.37628376483917236, 0.905500054359436]

Just like that, we have an image classifier with an accuracy of about 90% on the test set!
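
The accuracy reported by `model.evaluate` is the fraction of examples whose highest-scoring class matches the label. A minimal sketch of that calculation follows, using a small batch of hypothetical logits and labels invented for illustration (they are not taken from the model above).

```python
# Hypothetical logits for a batch of four examples over three classes (illustrative values only).
batch_logits = [
    [2.0, 0.1, -1.0],
    [0.3, 1.5, 0.2],
    [-0.5, 0.0, 0.4],
    [1.2, 1.1, -0.3],
]
labels = [0, 1, 1, 0]  # hypothetical ground-truth class indices

def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

predictions = [argmax(row) for row in batch_logits]
correct = sum(int(p == y) for p, y in zip(predictions, labels))
accuracy = correct / len(labels)
print(predictions, accuracy)
```

Because softmax preserves the ordering of the scores, taking the argmax of the logits gives the same prediction as taking the argmax of the probabilities.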

Finally, if we want our model to return probabilities, we can attach a `Softmax` layer on top of our trained model.

In [38]:
probability_model = tf.keras.models.Sequential([
    model,
    tf.keras.layers.Softmax()
])

probability_model(x_train[:1])

<tf.Tensor: shape=(1, 10), dtype=float32, numpy=
array([[5.9971984e-07, 6.0447677e-05, 5.9034908e-05, 1.7659027e-04,
        1.8492098e-11, 9.9970174e-01, 6.5866220e-07, 4.7255162e-07,
        5.1461927e-07, 1.1878882e-09]], dtype=float32)>
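
To turn these probabilities into a prediction, we can pick the class with the highest probability. The sketch below does this in plain Python on the values printed above. The variable names are chosen only for this illustration.

```python
# Probabilities copied from the probability_model output shown above.
probs = [5.9971984e-07, 6.0447677e-05, 5.9034908e-05, 1.7659027e-04,
         1.8492098e-11, 9.9970174e-01, 6.5866220e-07, 4.7255162e-07,
         5.1461927e-07, 1.1878882e-09]

# The predicted class is the index of the largest probability.
predicted_class = max(range(len(probs)), key=lambda i: probs[i])
confidence = probs[predicted_class]
print(predicted_class, confidence)
```

After training, almost all of the probability mass sits on a single class, in contrast to the nearly uniform output of the untrained model.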

## Conclusion

We have just trained a machine learning model using Keras's `Sequential` API on a prebuilt dataset (MNIST).