A mini project on MNIST using TensorFlow

In [1]:
import tensorflow as tf

In [2]:
import tensorflow_datasets

In [3]:
(train, test), info = tensorflow_datasets.load('mnist', split=['train', 'test'], as_supervised=True, with_info=True)

Downloading and preparing dataset 11.06 MiB (download: 11.06 MiB, generated: 21.00 MiB, total: 32.06 MiB) to ~/tensorflow_datasets/mnist/3.0.1...


Dataset mnist downloaded and prepared to ~/tensorflow_datasets/mnist/3.0.1. Subsequent calls will reuse this data.


## **Train Part**

In [4]:
train

<PrefetchDataset element_spec=(TensorSpec(shape=(28, 28, 1), dtype=tf.uint8, name=None), TensorSpec(shape=(), dtype=tf.int64, name=None))>

According to the output above, each element of the dataset is a 28×28 image with a single channel, together with its label.

Each pixel is an integer between 0 and 255. The following function receives an image and a label (one element of the dataset) and normalizes the image so that its pixel values lie between 0 and 1.

To achieve this, we first cast each pixel from tf.uint8 to a floating-point type (either tf.float32 or tf.float64). I prefer tf.float32.

In [5]:
def normalize(image, label):
    # Cast uint8 pixels to float32 and scale them from [0, 255] to [0, 1].
    return tf.cast(image, tf.float32) / 255.0, label
train = train.map(normalize)
train

<MapDataset element_spec=(TensorSpec(shape=(28, 28, 1), dtype=tf.float32, name=None), TensorSpec(shape=(), dtype=tf.int64, name=None))>
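The cast-then-divide step can also be sketched in plain NumPy, outside of `tf.data`. The helper `scale_pixels` and its `divisor` argument are hypothetical names, not part of the notebook. The divisor defaults to 255 here on the assumption that the largest uint8 value should map to exactly 1.

```python
import numpy as np

def scale_pixels(image, divisor=255.0):
    """Cast a uint8 image to float32, then divide by `divisor` (hypothetical helper)."""
    # Casting first matters: the division must happen in floating point.
    return image.astype(np.float32) / divisor

# A made-up 2x2 "image" that covers both ends of the uint8 range.
image = np.array([[0, 64], [128, 255]], dtype=np.uint8)
scaled = scale_pixels(image)

print(scaled.dtype, scaled.min(), scaled.max())
```

The smallest pixel value stays at 0 and the largest becomes 1, so every pixel ends up in the range the network expects.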

In [6]:
train = train.cache()

In [7]:
info.splits['train']

<SplitInfo num_examples=60000, num_shards=1>

In [8]:
train = train.shuffle(info.splits['train'].num_examples)

In [9]:
train = train.batch(256)

In [12]:
train = train.prefetch(tf.data.experimental.AUTOTUNE)

In [13]:
train

<PrefetchDataset element_spec=(TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int64, name=None))>
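The leading `None` in the element spec above is the batch dimension. It is left unspecified because batches need not all have the same size: 60000 examples do not divide evenly into batches of 256. The NumPy sketch below imitates a full shuffle followed by batching on example indices only. It is an illustration of the arithmetic, not of how `tf.data` is implemented, and the fixed seed is an arbitrary choice.

```python
import math
import numpy as np

num_examples = 60000  # size of the MNIST training split reported above
batch_size = 256

# A shuffle buffer as large as the dataset amounts to a full random permutation.
rng = np.random.default_rng(0)
indices = rng.permutation(num_examples)

# Cut the shuffled indices into consecutive batches; the last one may be shorter.
batches = [indices[start:start + batch_size]
           for start in range(0, num_examples, batch_size)]

num_batches = len(batches)
last_batch_size = len(batches[-1])
print(num_batches, last_batch_size)
```

With these numbers there are 235 batches per epoch, and the final batch holds only 96 examples.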

## **Test Part**

In [14]:
test = test.map(normalize)
test

<MapDataset element_spec=(TensorSpec(shape=(28, 28, 1), dtype=tf.float32, name=None), TensorSpec(shape=(), dtype=tf.int64, name=None))>

In [15]:
test = test.batch(256)

In [16]:
test = test.cache()
test = test.prefetch(tf.data.experimental.AUTOTUNE)

## **Model**

To build the model, I first flatten each 28×28 image into a 784-dimensional vector, so the first layer of the model has 784 units.

I decided to use a single hidden layer of 100 neurons with the **relu** activation function. The output layer contains 10 neurons, one per digit from 0 to 9, and uses **softmax** as its activation function.


In [17]:
model = tf.keras.models.Sequential([
    # Each image in the dataset has shape (28, 28, 1), including the channel axis.
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
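To make the layer sizes concrete, here is a rough NumPy sketch of the same forward pass: flatten, a dense layer with relu, then a dense layer with softmax. The weights are random placeholders rather than trained values, and the batch of four images is made up, so only the shapes and the parameter count are meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch of 4 normalized images with shape (28, 28, 1).
batch = rng.random((4, 28, 28, 1), dtype=np.float32)

# Flatten: (4, 28, 28, 1) -> (4, 784)
flat = batch.reshape(len(batch), -1)

# Hidden layer: 784 -> 100 with relu (random placeholder weights).
W1 = rng.normal(scale=0.05, size=(784, 100))
b1 = np.zeros(100)
hidden = np.maximum(flat @ W1 + b1, 0.0)

# Output layer: 100 -> 10 with softmax.
W2 = rng.normal(scale=0.05, size=(100, 10))
b2 = np.zeros(10)
logits = hidden @ W2 + b2
exp = np.exp(logits - logits.max(axis=1, keepdims=True))  # subtract max for stability
probs = exp / exp.sum(axis=1, keepdims=True)

n_params = W1.size + b1.size + W2.size + b2.size
print(flat.shape, hidden.shape, probs.shape, n_params)
```

Each row of `probs` sums to 1, and the network has 784·100 + 100 + 100·10 + 10 = 79510 trainable parameters.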

* **Loss:**

Loss (or cost) functions are what the model optimizes during training. The objective is almost always to minimize the loss: the lower the loss, the better the model fits the training data. **Cross-entropy loss** is one of the most important cost functions for classification models, and it applies whenever there are two or more label classes. In this project, softmax outputs one probability per class. Cross-entropy takes these probabilities and measures how far they are from the true labels.
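As a small worked example, the per-example cross-entropy for integer labels is the negative log of the probability assigned to the true class. The function `cross_entropy_from_probs` is a hypothetical helper written for this illustration, not a Keras API, and the probabilities are made up.

```python
import numpy as np

def cross_entropy_from_probs(labels, probs, eps=1e-7):
    """Per-example cross-entropy for integer labels and predicted probabilities."""
    probs = np.clip(probs, eps, 1.0)                 # avoid log(0)
    picked = probs[np.arange(len(labels)), labels]   # probability of the true class
    return -np.log(picked)

# Made-up softmax outputs for three examples and three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8],
                  [0.4, 0.4, 0.2]])
labels = np.array([0, 2, 1])

losses = cross_entropy_from_probs(labels, probs)
print(losses, losses.mean())
```

The more probability the model puts on the correct class, the smaller the loss: the second example (0.8 on the true class) is penalized least and the third (0.4) most.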

* **Metrics:**

Keras accuracy metrics are functions that are used to evaluate the performance of your deep learning model. Keras provides a rich pool of inbuilt metrics. Depending on your problem, you’ll use different ones.

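**Categorical Accuracy:**
This metric is used for classification problems with more than two classes, such as MNIST, which has 10 classes.

Categorical accuracy calculates the percentage of predicted values (yPred) that match the actual values (yTrue) when the labels are one-hot encoded.

For each row it compares the index of the maximal true value with the index of the maximal predicted value. In other words, **“how often predictions have maximum in the same spot as true values”**. To achieve this, it first identifies the index at which the maximum value occurs using **argmax()**. If that index is the same for both yPred and yTrue, the prediction counts as accurate. Finally, it computes the **mean accuracy** across all predictions.

The labels in our dataset are plain integers (dtype tf.int64) rather than one-hot vectors, so the code below uses the sparse variant, **SparseCategoricalAccuracy**, which compares the integer label directly with the argmax of the prediction. For the same reason the loss is **SparseCategoricalCrossentropy**, with from_logits=False because the output layer already applies softmax.

In [23]:
model.compile(
    optimizer=tf.keras.optimizers.Adam(0.001),
    # The model outputs softmax probabilities, not raw logits.
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    # Labels are integers, so use the sparse accuracy metric.
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
)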
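The argmax comparison described above can be sketched in a few lines of NumPy. The helper names are hypothetical and the probabilities are made up. The example also shows that comparing against integer labels directly gives the same result as comparing against their one-hot encoding.

```python
import numpy as np

def accuracy_from_int_labels(labels, probs):
    """Fraction of rows whose argmax prediction equals the integer label."""
    return float(np.mean(np.argmax(probs, axis=1) == labels))

def accuracy_from_one_hot(one_hot, probs):
    """Fraction of rows where prediction and one-hot label peak at the same index."""
    return float(np.mean(np.argmax(probs, axis=1) == np.argmax(one_hot, axis=1)))

# Made-up predictions for four examples and three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8],
                  [0.5, 0.3, 0.2],   # predicted class 0, true class 1: a miss
                  [0.2, 0.6, 0.2]])
labels = np.array([0, 2, 1, 1])
one_hot = np.eye(3)[labels]          # one-hot version of the same labels

sparse_acc = accuracy_from_int_labels(labels, probs)
categorical_acc = accuracy_from_one_hot(one_hot, probs)
print(sparse_acc, categorical_acc)
```

Three of the four predictions peak at the correct class, so both versions report an accuracy of 0.75.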

In [24]:
model.fit(train, epochs=10, validation_data=test)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7ff7a6b8b150>

An example of prediction:

In [26]:
model.predict(test)[0]

array([1.9356178e-11, 1.8541074e-09, 9.9999356e-01, 7.4125523e-08,
       3.0247133e-08, 3.4786538e-09, 1.2138639e-06, 1.7588035e-11,
       5.1738139e-06, 2.4990656e-09], dtype=float32)

The array returned by the code above holds the outputs of the 10 neurons in the last layer of the model, that is, the predicted probability of each digit for the first test image. The argmax of this array gives the predicted label.

In [25]:
import numpy as np
np.argmax(model.predict(test)[0])

2
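The same idea extends to a whole batch of predictions by taking the argmax along the class axis. The sketch below reuses the probability vector printed above as its first row; the second row is a made-up example added only to show the batched call.

```python
import numpy as np

# First row: the prediction printed above. Second row: hypothetical values.
predictions = np.array([
    [1.9356178e-11, 1.8541074e-09, 9.9999356e-01, 7.4125523e-08,
     3.0247133e-08, 3.4786538e-09, 1.2138639e-06, 1.7588035e-11,
     5.1738139e-06, 2.4990656e-09],
    [0.01, 0.01, 0.02, 0.01, 0.01, 0.01, 0.01, 0.90, 0.01, 0.01],
], dtype=np.float32)

# One predicted label per row, plus the probability assigned to that label.
predicted_labels = np.argmax(predictions, axis=1)
confidences = predictions[np.arange(len(predictions)), predicted_labels]
print(predicted_labels, confidences)
```

The first row again yields the label 2, with almost all of the probability mass on that class.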