### In this notebook we classify grayscale images of handwritten digits (28 × 28 pixels) into 10 categories (0 through 9) using the MNIST dataset from the Keras package

In [18]:
# Import required libraries/packages
import numpy as np
from keras.datasets import mnist
from keras.utils import to_categorical
from keras import models
from keras import layers


In [2]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

* train_images and train_labels together form the training set, the data the model learns from. The model is then tested on the test set, test_images and test_labels. The images are encoded as NumPy arrays, and the labels are an array of digits ranging from 0 to 9.

In [8]:
print("Train data dimension =", train_images.shape)
print("Test data dimension =", test_images.shape)
print("Train labels =", train_labels.shape)
print("Test labels =", test_labels.shape)

Train data dimension = (60000, 28, 28)
Test data dimension = (10000, 28, 28)
Train labels = (60000,)
Test labels = (10000,)


### Data Pre-processing
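* Reshape the input data into the format the network expects: each 28 × 28 image becomes a flat vector of 784 values
* Scale the pixel values from the [0, 255] range to [0, 1]
* One-hot (categorical) encode the labels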

In [19]:
# Reshaping input image data 
train_shape = train_images.shape
test_shape = test_images.shape
train_images = train_images.reshape((train_shape[0], np.prod(train_shape[1:])))
test_images = test_images.reshape((test_shape[0], np.prod(test_shape[1:])))
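
To see what the reshape does without downloading MNIST, here is a minimal sketch on a made-up array. The name fake_images and the marked pixel are assumptions for illustration only. It flattens each 28 × 28 image into a 784-value row and shows where a given pixel ends up.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 4 blank "images" of 28 x 28 pixels.
fake_images = np.zeros((4, 28, 28), dtype=np.uint8)
fake_images[0, 1, 2] = 255  # mark one pixel so we can track where it lands

# Keep the sample axis and flatten everything else; -1 lets NumPy infer 784.
flat = fake_images.reshape((fake_images.shape[0], -1))

print(flat.shape)
# Row-major flattening puts pixel (row=1, col=2) at index 1 * 28 + 2 = 30.
print(flat[0, 30])
```

Passing -1 is equivalent to np.prod(shape[1:]) here; either form keeps one row per image.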

In [20]:
# Scaling input data
train_images = train_images.astype(np.float32)/255
test_images = test_images.astype(np.float32)/255
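
As a small illustration of the scaling step, the sketch below applies the same expression to three made-up pixel values (the array pixels is an assumption, not MNIST data). The cast to float32 comes first, so the division produces fractional values.

```python
import numpy as np

# Made-up pixel intensities covering the darkest and brightest possible values.
pixels = np.array([0, 51, 255], dtype=np.uint8)

# Cast to float32, then divide so every value lands in [0, 1].
scaled = pixels.astype(np.float32) / 255

print(scaled.dtype)
print(scaled.min(), scaled.max())
```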

In [21]:
# Categorical encoding of the labels
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
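
To make the encoding concrete, here is a NumPy-only sketch of one-hot encoding on three made-up labels. It is not how to_categorical is implemented, only an illustration of the output format: one row per label, with a 1 in the column of the digit and 0 elsewhere.

```python
import numpy as np

# Made-up digit labels; the real arrays hold 60000 and 10000 of these.
labels = np.array([5, 0, 9])

# Row i of the 10 x 10 identity matrix is the one-hot vector for digit i.
one_hot = np.eye(10, dtype=np.float32)[labels]

print(one_hot.shape)
print(one_hot[0])

# argmax along each row recovers the original digit.
recovered = one_hot.argmax(axis=1)
print(recovered)
```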

### Build the neural network 
* The core building block of a network is the "layer", a data-processing module. Layers extract representations from the data fed into them, ideally representations that are more meaningful for the problem at hand.
* Deep learning largely consists of chaining these simple layers together to form a progressive data distillation.
* The network below consists of 2 Dense (fully connected) layers. The last layer is a 10-way softmax layer, which returns an array of 10 probability scores summing to 1.
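
The two layers described above can be sketched as plain NumPy arithmetic. This is an illustration with random, untrained weights, not Keras's implementation. The names W1, b1, W2, b2 and batch are assumptions. It shows that each softmax row sums to 1 and counts the trainable parameters for a 784 → 512 → 10 network.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Zero out negative activations.
    return np.maximum(x, 0.0)

def softmax(x):
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = x - x.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Random (untrained) weights with the same shapes as the two Dense layers.
W1 = rng.normal(scale=0.05, size=(784, 512))
b1 = np.zeros(512)
W2 = rng.normal(scale=0.05, size=(512, 10))
b2 = np.zeros(10)

# A made-up batch of 3 flattened, scaled images.
batch = rng.random((3, 784))

hidden = relu(batch @ W1 + b1)     # shape (3, 512)
probs = softmax(hidden @ W2 + b2)  # shape (3, 10)

n_params = W1.size + b1.size + W2.size + b2.size
print(probs.shape, probs.sum(axis=1))
print(n_params)  # 784*512 + 512 + 512*10 + 10 = 407050
```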

In [11]:
network = models.Sequential()
network.add(layers.Dense(512, activation="relu", input_shape=(28*28, )))
network.add(layers.Dense(10, activation="softmax"))

### Network Compilation 
To make the network ready for training, 3 more things must be configured as part of the compilation step:
1. Loss function: measures the network's performance on the training data, so that it can steer itself in the right direction
2. Optimizer: the mechanism through which the network updates itself based on the input data and the loss function
3. Metrics to monitor during training and testing: these depend on the kind of problem, for example accuracy for classification and RMSE for regression
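
To give a feel for the loss function in item 1, here is a hedged NumPy sketch of categorical cross-entropy on two made-up predictions over 3 classes. The function name and the clipping constant eps are assumptions for illustration, and Keras's own implementation may differ in detail. The loss for each sample is the negative log of the probability assigned to the true class.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0); eps is an illustrative choice.
    y_pred = np.clip(y_pred, eps, 1.0)
    # Only the true class contributes, because y_true is one-hot.
    return -np.sum(y_true * np.log(y_pred), axis=1)

# Made-up one-hot targets and softmax outputs.
y_true = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1],   # confident and correct
                   [0.3, 0.6, 0.1]])  # wrong class is favoured

losses = categorical_crossentropy(y_true, y_pred)
print(losses)  # the second sample is penalised more heavily
```

A confident correct prediction gives a loss near 0, while a low probability on the true class gives a large loss. That difference is the signal the optimizer uses.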

In [24]:
network.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])

### Training the network
In Keras, training is done by calling the network's fit method, which fits the model to its training data.

In [26]:
network.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x2295febdb38>
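
As a rough guide to how much work the fit call above does, the sketch below works out the number of mini-batches, assuming one weight update per batch and that the final, smaller batch is kept rather than dropped.

```python
import math

n_samples = 60000   # training images
batch_size = 128
epochs = 5

# The last batch of each epoch is smaller when the sizes do not divide evenly.
batches_per_epoch = math.ceil(n_samples / batch_size)
last_batch_size = n_samples - (batches_per_epoch - 1) * batch_size
total_updates = batches_per_epoch * epochs

print(batches_per_epoch, last_batch_size, total_updates)
```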

### Evaluate model performance on test data

In [30]:
test_loss, test_accuracy = network.evaluate(test_images, test_labels)
print("Test Accuracy :", test_accuracy)

Test Accuracy : 0.9819
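
To show what the accuracy figure measures, here is a small sketch on made-up softmax outputs. The arrays probs and one_hot are assumptions, not the model's real predictions. The predicted class is the argmax of each row, and accuracy is the fraction of rows where it matches the label. The last lines convert the reported 0.9819 into a count of test images.

```python
import numpy as np

# Made-up softmax outputs for 4 images over 3 classes, with one-hot labels.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
one_hot = np.array([[1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1],
                    [0, 1, 0]])

predicted = probs.argmax(axis=1)
actual = one_hot.argmax(axis=1)
toy_accuracy = float(np.mean(predicted == actual))
print(toy_accuracy)

# Interpreting the reported test accuracy on the 10000 test images.
n_test = 10000
n_correct = round(0.9819 * n_test)
n_wrong = n_test - n_correct
print(n_correct, n_wrong)
```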
