In [1]:
# Import the MNIST dataset, a classic in the ML community.
# MNIST is a dataset of 60k training images and 10k test images assembled by the National Institute of Standards and Technology.
# Solving MNIST is the "hello world" of deep learning.
from keras.datasets import mnist

Using TensorFlow backend.


In [2]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [3]:
train_images.shape

(60000, 28, 28)

In [4]:
len(train_labels)

60000

In [5]:
test_images.shape

(10000, 28, 28)

In [6]:
len(test_labels)

10000

In [7]:
test_labels

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)

# WORKFLOW

First, we feed the neural network the training data, train_images and train_labels. The network then learns to associate images with labels. Finally, we ask the network to produce predictions for test_images and verify whether these predictions match the labels in test_labels.

A layer is the core building block of a neural network. It is a data processing module that can be thought of as a filter for data: some data goes in, and it comes out in a more useful form. Layers extract representations from the data fed into them. Deep learning consists of chaining together simple layers that implement a form of progressive data distillation.
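To make the idea of chaining layers concrete, here is a minimal sketch in plain Python (no Keras). The helper names `dense` and `relu` and the toy weights are hypothetical and chosen only for illustration; a real network learns its weights during training.

```python
def dense(inputs, weights, biases):
    """One dense layer: each output is a weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """ReLU activation: negative values are replaced by zero."""
    return [max(0.0, v) for v in values]


# Hypothetical toy input and hand-picked weights (not learned).
x = [1.0, 2.0]

# First layer: 2 inputs -> 2 outputs, followed by ReLU.
hidden = relu(dense(x, [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]))

# Second layer: consumes the output of the first layer.
output = dense(hidden, [[1.0, 1.0]], [0.0])

print(hidden)
print(output)
```

Each layer only sees the output of the layer before it, which is what "chaining" means here.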

In [17]:
from keras import models
from keras import layers

Here, the network consists of a sequence of two dense layers, also called fully connected layers. The second layer is a 10-way softmax layer, which returns an array of 10 probability scores (summing to 1). Each score is the probability that the current digit image belongs to one of the 10 digit classes.

In [9]:
network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(10, activation='softmax'))
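The 10-way softmax output described above can be illustrated with a small standalone sketch. This is the textbook softmax formula written in plain Python, not the Keras implementation, and the raw scores are made-up values.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for the 10 digit classes (index = digit).
raw_scores = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 3.0, 0.1, 0.1]

probs = softmax(raw_scores)
predicted_digit = probs.index(max(probs))  # class with the highest probability

print(sum(probs))
print(predicted_digit)
```

The predicted digit is simply the index of the largest probability.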

# Make the network ready for Training

To make the network ready for training, we need to pick three more things as part of the compilation step:

1. A loss function - How the network measures its performance on the training data, and thus how it steers itself in the right direction.

2. An optimizer - The mechanism through which the network will update itself based on the data it sees and its loss function.

3. Metrics to monitor during training and testing - Here, only accuracy (the fraction of images that were correctly classified).

In [18]:
network.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
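To give a feel for the loss and metric named in the compile step, the following sketch computes categorical cross-entropy for a single sample and accuracy over a handful of predictions. It uses the standard textbook definitions in plain Python. The function names and sample values are hypothetical, and Keras's internal implementation may differ in detail.

```python
import math


def categorical_crossentropy(one_hot_label, predicted_probs):
    """Loss for one sample: -sum(y_i * log(p_i)) over all classes."""
    eps = 1e-12  # guards against log(0)
    return -sum(
        y * math.log(max(p, eps))
        for y, p in zip(one_hot_label, predicted_probs)
    )


def accuracy(true_classes, predicted_classes):
    """Fraction of predictions that match the true class."""
    matches = sum(1 for t, p in zip(true_classes, predicted_classes) if t == p)
    return matches / len(true_classes)


# Hypothetical 3-class example: the true class is index 2.
label = [0, 0, 1]
confident = [0.05, 0.05, 0.90]  # puts most probability on the right class
unsure = [0.40, 0.30, 0.30]     # spreads probability almost evenly

loss_confident = categorical_crossentropy(label, confident)
loss_unsure = categorical_crossentropy(label, unsure)

# Three of four hypothetical predictions are correct.
acc = accuracy([7, 2, 1, 0], [7, 2, 1, 4])

print(loss_confident, loss_unsure, acc)
```

The loss is smaller when more probability is placed on the correct class, which is the signal the optimizer uses to update the weights.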

# Steps before Training the network

Before training, we will preprocess the data by:

1. Reshaping it into the shape the network expects.
2. Scaling it so that all values are in the [0, 1] interval.

Previously, the training images were stored in an array of shape (60000, 28, 28) of type uint8 with values in the [0, 255] interval. We transform it into a float32 array of shape (60000, 28 * 28) with values between 0 and 1.

In [19]:
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255

In [20]:
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255
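The reshape and scaling steps above can be traced by hand on a tiny image. The sketch below uses a hypothetical 2x2 image in plain Python lists instead of a 28x28 NumPy array. The helper names `flatten` and `scale` are illustrative only, and Python floats are 64-bit rather than float32.

```python
def flatten(image):
    """Flatten a 2D image (list of rows) into a single list of pixels."""
    return [pixel for row in image for pixel in row]


def scale(pixels, max_value=255):
    """Map integer pixel values in [0, max_value] to floats in [0, 1]."""
    return [p / max_value for p in pixels]


# Hypothetical 2x2 grayscale image with uint8-style values.
tiny_image = [[0, 255],
              [51, 102]]

flat = flatten(tiny_image)   # shape (2, 2) -> (4,), like (28, 28) -> (784,)
scaled = scale(flat)

print(flat)
print(scaled)
```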

In [21]:
from keras.utils import to_categorical

In [14]:
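# One-hot encode the integer labels so they match the 10-way softmax output
# and the categorical_crossentropy loss.
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)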
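The label conversion above is one-hot encoding: each integer label becomes a vector of length 10 with a 1 at the label's index and 0 elsewhere. A minimal plain-Python sketch follows. The `one_hot` helper is hypothetical and is not the Keras function itself. The sample labels are the first three entries of test_labels shown earlier.

```python
def one_hot(labels, num_classes):
    """Encode integer class labels as one-hot vectors."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0  # mark the position of the true class
        encoded.append(row)
    return encoded


sample_labels = [7, 2, 1]  # first three test labels from the output above
encoded = one_hot(sample_labels, num_classes=10)

for label, row in zip(sample_labels, encoded):
    print(label, row)
```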

# Training the network by fitting the model to its training data

In [22]:
network.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x244bc2b7198>
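As a quick sanity check on the training settings, the arithmetic below works out how many weight updates the call above performs. It is a sketch that assumes the final, smaller batch of each epoch is still used for an update.

```python
import math

num_samples = 60000  # training images
batch_size = 128
epochs = 5

# Number of mini-batches (weight updates) per epoch, counting a partial last batch.
steps_per_epoch = math.ceil(num_samples / batch_size)

# Size of the final batch in each epoch.
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size

total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_updates)
```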

# Checking how the model performs on the test data set

In [23]:
test_loss, test_acc = network.evaluate(test_images, test_labels)
print('test_acc:', test_acc)

test_acc: 0.9172


# Conclusion:

The test accuracy (test_acc) turns out to be lower than the training set accuracy. This gap between training accuracy and test accuracy is an example of overfitting: machine learning models tend to perform worse on new data than on their training data.