NEURAL NETWORK EXAMPLE

Using the Python library Keras to classify handwritten digits.

We use the MNIST dataset, which contains 60,000 training images and 10,000 test images. The goal is to classify grayscale images of handwritten digits (28 x 28 pixels) into 10 categories (0 to 9).

In [9]:
#loading dataset (preloaded in keras)

from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [10]:
train_images.shape

(60000, 28, 28)

In [11]:
len(train_labels)

60000

In [12]:
train_labels

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

In [13]:
test_images.shape

(10000, 28, 28)

In [14]:
len(test_labels)

10000

In [15]:
test_labels

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)

In [16]:
from tensorflow import keras
from tensorflow.keras import layers
model = keras.Sequential([
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax")
])
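
To get a feel for the size of this model, the number of trainable parameters can be worked out by hand. The sketch below assumes the 784-element flattened input (28 * 28) that the image-preparation step in this notebook produces; the helper name dense_params is made up for illustration and is not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer (hypothetical helper)."""
    return n_inputs * n_units + n_units

hidden = dense_params(28 * 28, 512)  # 784 inputs -> 512 hidden units
output = dense_params(512, 10)       # 512 inputs -> 10 digit classes
total = hidden + output
print(hidden, output, total)  # 401920 5130 407050
```

Almost all of the parameters sit in the first layer, because every one of the 784 input pixels is connected to every one of the 512 hidden units.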

In [17]:
# the compilation step needs three more things: an optimizer, a loss function, and metrics to monitor

model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
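
The "sparse" in the loss name means the labels are plain integers (0 to 9) rather than one-hot vectors. For a single sample, the loss is the negative log of the probability the model assigned to the true class. The following is a minimal pure-Python sketch of that idea, not the Keras implementation; the probability vector is made up, and Keras additionally averages the loss over each batch.

```python
import math

def sparse_categorical_crossentropy(probs, label):
    """Loss for one sample: -log of the probability given to the true class."""
    return -math.log(probs[label])

probs = [0.1, 0.7, 0.2]  # made-up softmax output over 3 classes

loss_confident = sparse_categorical_crossentropy(probs, 1)  # true class got 0.7
loss_wrong = sparse_categorical_crossentropy(probs, 0)      # true class got 0.1
print(loss_confident, loss_wrong)
```

The loss is small when the true class receives a high probability and grows quickly as that probability approaches zero.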

In [19]:
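# preparing the image data: flatten each 28 x 28 image into a 784-element vector
# and scale the pixel values from [0, 255] to [0, 1]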

train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype("float32") / 255
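
The effect of the reshape and the scaling can be checked on a tiny stand-in array. This sketch uses made-up data (three uniform "images") instead of MNIST so that it runs without downloading anything; the variable names are illustrative only.

```python
import numpy as np

# Made-up stand-in for the real data: 3 "images" of 28 x 28 uint8 pixels.
fake_images = np.full((3, 28, 28), 255, dtype="uint8")
fake_images[0] = 0  # first image is all black

# Same two steps as above: flatten each image, then scale to [0, 1].
flat = fake_images.reshape((3, 28 * 28)).astype("float32") / 255
print(flat.shape, flat.dtype, flat.min(), flat.max())
```

Each image becomes one row of 784 float32 values between 0 and 1, which is the shape the Dense layers expect.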


In [20]:
# fit the model to the training data

model.fit(train_images, train_labels, epochs=5, batch_size=128)

# after 5 epochs, the model reaches an accuracy of 98.91% on the training data

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f9a14401310>
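
With batch_size=128, each epoch splits the 60,000 training samples into mini-batches, and the model performs one weight update per batch. The arithmetic below is a small sketch of how many updates that implies for the fit call above.

```python
import math

n_samples, batch_size, epochs = 60000, 128, 5

steps_per_epoch = math.ceil(n_samples / batch_size)
last_batch = n_samples - (steps_per_epoch - 1) * batch_size  # leftover samples
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, last_batch, total_updates)  # 469 96 2345
```

The last batch of each epoch is smaller than the others because 60,000 is not a multiple of 128.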

In [21]:
# using the model to make predictions for the first 10 test images

test_digits = test_images[0:10]
predictions = model.predict(test_digits)
predictions[0]

array([4.6065693e-09, 9.0155808e-12, 5.9519078e-07, 1.8120321e-06,
       3.8137766e-12, 1.0422464e-09, 1.6675834e-13, 9.9999726e-01,
       1.8643746e-08, 3.1695953e-07], dtype=float32)

The highest probability score for the first test image (test_digits[0]) is 0.99999726, at index 7, so according to our model it must be a 7.

In [22]:
predictions[0].argmax()

7

In [23]:
predictions[0][7]

0.99999726

In [24]:
#checking if test label agrees

test_labels[0]

7
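
The same check can be reproduced without the trained model by copying the printed prediction vector into NumPy. This sketch only reuses the numbers shown above; it also confirms that the softmax output behaves like a probability distribution, summing to approximately 1.

```python
import numpy as np

# predictions[0], copied from the output printed above
first = np.array([4.6065693e-09, 9.0155808e-12, 5.9519078e-07, 1.8120321e-06,
                  3.8137766e-12, 1.0422464e-09, 1.6675834e-13, 9.9999726e-01,
                  1.8643746e-08, 3.1695953e-07], dtype="float32")

predicted = int(first.argmax())                     # index of the largest score
sums_to_one = bool(np.isclose(first.sum(), 1.0))    # softmax scores add up to 1
print(predicted, predicted == 7, sums_to_one)
```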

In [25]:
#evaluating model on new data

test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"test_acc: {test_acc}")

test_acc: 0.978600025177002


The test accuracy is 97.8%, which is lower than the training accuracy of 98.9%. This gap between training and test accuracy is an example of overfitting: the model performs better on the training data than on data it has never seen.
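
To put a number on that gap, the two accuracies reported in this notebook can be compared directly. This is plain arithmetic on the values printed above, not an additional experiment.

```python
train_acc = 0.9891            # training accuracy reported after the fit step
test_acc = 0.978600025177002  # value printed by model.evaluate

gap_points = (train_acc - test_acc) * 100       # difference in percentage points
error_ratio = (1 - test_acc) / (1 - train_acc)  # test error relative to training error
print(f"gap: {gap_points:.2f} percentage points")
print(f"test error / training error: {error_ratio:.2f}")
```

The accuracy gap is only about one percentage point, but in terms of error rate the model makes roughly twice as many mistakes on the test set as on the training set.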