Introduction to MNIST

Objective: classify grayscale images of handwritten digits (28 × 28 pixels) into 10 categories (0 through 9).

In [None]:
# Loading MNIST dataset in Keras
from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [None]:
train_images.shape

(60000, 28, 28)

In [None]:
test_images.shape

(10000, 28, 28)

The workflow:

1. Feed the neural network the training data (`train_images` and `train_labels`).
2. The network then learns to associate images with labels.
3. Ask the network to produce predictions for `test_images`.
4. Verify whether these predictions match the labels in `test_labels`.



In [None]:
# The network architecture: a stack of two Dense (fully connected) layers
# 1. A 512-unit Dense layer with ReLU activation
# 2. A 10-way softmax classification layer: returns an array of 10 probability
#    scores (summing to 1)
from tensorflow import keras
from tensorflow.keras import layers
model = keras.Sequential([
 layers.Dense(512, activation="relu"),
 layers.Dense(10, activation="softmax")
])
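
To make the "10 probability scores summing to 1" concrete, here is a minimal pure-Python sketch of the softmax function. The `softmax` helper and the `example_logits` values are illustrative assumptions, not taken from the trained model; Keras applies its own implementation inside the final `Dense` layer.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the max before exponentiating avoids overflow
    # and does not change the result.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the 10 digit classes (not model output)
example_logits = [0.5, -1.2, 0.0, 2.0, -0.3, 0.1, -2.5, 4.0, 0.7, 1.1]
probabilities = softmax(example_logits)

print(sum(probabilities))                        # total probability mass
print(probabilities.index(max(probabilities)))   # index of the most likely class
```

The largest raw score always maps to the largest probability, so the predicted class is simply the index of the maximum.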

* The core building block of a neural network is the layer
  * Layer: some data goes in, and it comes out in a more useful form
  * Layers extract representations out of the data fed into them
* A deep learning model is like a sieve for data processing, made of a succession of increasingly refined data filters (the layers)
* To make a model ready for training, we need:
  * Optimizer - the mechanism through which the model updates itself based on the training data
  * Loss function - how the model measures its performance on the training data, and thus how it steers itself in the right direction
  * Metrics - what to monitor during training and testing; here, accuracy (the fraction of images that were correctly classified)
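
As a rough illustration of what the loss function and the accuracy metric compute, the sketch below evaluates both on a tiny made-up batch. The helper names and the 3-class `batch_probs` values are hypothetical; the Keras versions are vectorized and also clip probabilities to avoid `log(0)`, which this sketch does not do.

```python
import math

def sparse_categorical_crossentropy(probabilities, label):
    """Loss for one sample: negative log of the probability of the true class."""
    return -math.log(probabilities[label])

def accuracy(predicted_probabilities, labels):
    """Fraction of samples whose highest-probability class equals the label."""
    correct = sum(
        1
        for probs, label in zip(predicted_probabilities, labels)
        if probs.index(max(probs)) == label
    )
    return correct / len(labels)

# Hypothetical predictions for three samples and three classes
batch_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
]
batch_labels = [0, 1, 2]  # integer labels, hence "sparse"

losses = [
    sparse_categorical_crossentropy(p, y)
    for p, y in zip(batch_probs, batch_labels)
]
mean_loss = sum(losses) / len(losses)
batch_accuracy = accuracy(batch_probs, batch_labels)

print(f"mean loss: {mean_loss:.4f}, accuracy: {batch_accuracy:.4f}")
```

The loss shrinks as the model assigns more probability to the correct class, which is the signal the optimizer uses to update the weights; accuracy only counts whether the top class is right.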


In [None]:
# Compilation Step
model.compile(optimizer="rmsprop",
 loss="sparse_categorical_crossentropy",
 metrics=["accuracy"])

In [None]:
# Preparing the image data: (60000, 28, 28) uint8 in [0, 255] => (60000, 28 * 28) float32 in [0, 1]
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype("float32") / 255
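
The same preprocessing can be checked on a small synthetic array without downloading MNIST. This is only a sketch: `fake_images` is random data standing in for the real images, and `-1` is used so that NumPy infers the number of samples instead of hard-coding 60000 or 10000.

```python
import numpy as np

# Synthetic stand-in for MNIST: 4 random "images" of 28 x 28 uint8 pixels
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Flatten each image into a 784-element vector; -1 infers the sample count
flat = fake_images.reshape((-1, 28 * 28))

# Convert to float32 and rescale pixel values from [0, 255] to [0, 1]
scaled = flat.astype("float32") / 255

print(scaled.shape, scaled.dtype, scaled.min(), scaled.max())
```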

In [None]:
# Fitting the model
model.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f6a49ecf650>

* Two quantities are displayed during training:
  * The loss of the model over the training data
  * The accuracy of the model over the training data

In [None]:
# Prediction of the model
test_digits = test_images[0:10]
predictions = model.predict(test_digits)
predictions[0]

array([1.2306888e-08, 2.4567282e-10, 4.8447487e-06, 7.3256044e-05,
       2.9436949e-11, 1.6762909e-08, 1.0809133e-14, 9.9990594e-01,
       4.5081336e-08, 1.5757565e-05], dtype=float32)
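
To read this output, pick the index with the highest probability. The sketch below copies the ten values printed above into a plain list (the variable names are illustrative); in the notebook itself, `predictions[0].argmax()` does the same thing on the NumPy array, and the result can be compared with `test_labels[0]`.

```python
# Probabilities copied from the output of predictions[0] above
first_prediction = [
    1.2306888e-08, 2.4567282e-10, 4.8447487e-06, 7.3256044e-05,
    2.9436949e-11, 1.6762909e-08, 1.0809133e-14, 9.9990594e-01,
    4.5081336e-08, 1.5757565e-05,
]

# Index of the largest probability = predicted digit
predicted_digit = max(range(len(first_prediction)), key=first_prediction.__getitem__)
confidence = first_prediction[predicted_digit]

print(predicted_digit, confidence)
```

Here the model puts almost all of its probability mass on index 7, so it predicts that the first test image is a 7.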

In [None]:
# Evaluating the model on new data
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"test_acc: {test_acc}")

test_acc: 0.9805999994277954


* The test accuracy is 98.06% (slightly lower than the training accuracy of 98.87%)
* This gap between training accuracy and test accuracy is an example of overfitting: a model tends to perform worse on new data than on the data it was trained on