# **Classify Handwritten Digits - MNIST Dataset**

The problem we are trying to solve here is to classify **grayscale images** of handwritten digits **(28 x 28 pixels)** into **10 categories** (0 to 9).

The dataset consists of **60,000 training** images and **10,000 test** images.
The dataset was assembled by the National Institute of Standards and Technology (NIST) in the 1980s.

## **Steps**

1) Importing the dataset

2) Building the Deep Learning Architecture

3) Compilation Step

4) Preparing the image data (independent variable / features)

5) Preparing the labels (target / dependent variable)

6) Training the Network

7) Testing the Network

### **1. Importing the Dataset**

In [None]:
from keras.datasets import mnist

train_images and train_labels form the training set, the data the model will learn from.

The model will be tested on test_images and test_labels.


In [None]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

The images are encoded as NumPy arrays, and the labels are an array of digits, ranging from 0 to 9.

The images and labels have a one-to-one correspondence.


In [None]:
import matplotlib.pyplot as plt

# Show the first four training images side by side.
for i in range(4):
    plt.subplot(1, 4, i + 1)
    plt.imshow(train_images[i], cmap=plt.cm.binary)

plt.show()

In [None]:
train_images[1].shape

In [None]:
#Training data
print("Train Image Shape: ",train_images.shape)
print("Train dataset Length: ",len(train_labels))
print("Train Labels: ",train_labels)

In [None]:
# Testing data
print("Test Image Shape: ", test_images.shape)
print("Test dataset Length: ", len(test_labels))
print("Test Labels: ", test_labels)

**Workflow**

1.   Feed the neural network with the training data: train_images and train_labels.
2.   Ask the neural network to produce predictions for test_images.
3.   Verify that the predictions match the labels from test_labels.
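
Steps 2 and 3 of the workflow can be sketched with NumPy alone. The probability values and `true_labels` below are made up for illustration; in the notebook they would come from something like `network.predict(test_images)` and from the integer test labels.

```python
import numpy as np

# Made-up softmax outputs for 3 images over 10 digit classes (illustrative only).
probs = np.full((3, 10), 0.01)
probs[0, 7] = 0.91  # image 0: class 7 is the most probable
probs[1, 2] = 0.91  # image 1: class 2 is the most probable
probs[2, 1] = 0.91  # image 2: class 1 is the most probable

# Hypothetical ground-truth digits for the same 3 images.
true_labels = np.array([7, 2, 4])

# Step 2: turn each row of probabilities into a single predicted digit.
predicted = probs.argmax(axis=1)

# Step 3: compare the predictions with the labels.
accuracy = (predicted == true_labels).mean()
print(predicted, accuracy)
```

Taking the `argmax` of each row is the usual way to turn a probability vector into a class decision; the fraction of matches is the accuracy metric used later in the notebook.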



### **2. The Network Architecture**

In [None]:
from keras import models
from keras import layers

In [None]:
network = models.Sequential()
network.add( layers.Dense(512, activation='relu', input_shape=(28 * 28,) ) )
network.add( layers.Dense(10, activation='softmax') )

**Layers**

---

The core building block of a neural network is the ***layer***. It can be thought of as a data-processing module that acts as a filter for the data.


Some data goes in, and it comes out in a more useful form. Specifically, layers extract ***representations*** out of the data fed into them,
hopefully representations that are more meaningful for the problem at hand.


Most of deep learning consists of chaining together simple layers that implement a form of progressive ***data distillation***.


A deep learning model is like a sieve for data processing, made of a succession of increasingly refined data filters: the layers.
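
To make the idea of a layer as a data filter concrete, here is a minimal NumPy sketch of what a single densely connected layer with a ReLU activation computes. The function name `dense_relu`, the random weights, and the input batch are assumptions for illustration only; Keras manages its own weights internally.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_relu(x, W, b):
    """One fully connected layer: affine transform followed by ReLU."""
    return np.maximum(0.0, x @ W + b)

# Two flattened 28x28 "images" with values in [0, 1) (synthetic data).
x = rng.random((2, 28 * 28))

# Hypothetical weights and biases for a layer with 512 units.
W = rng.normal(0.0, 0.05, (28 * 28, 512))
b = np.zeros(512)

h = dense_relu(x, W, b)
print(h.shape)
```

Each input row of 784 pixel values is turned into a new representation of 512 non-negative values, which is what the next layer receives.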




**Network Architecture Layers**

---

**1.**   Here our network consists of two ***Dense layers***, which are densely connected (also called fully connected) neural layers.


**2.**   The second (and last) layer is a **10-way softmax layer**, which means it will return an array of 10 probability scores (summing to 1). Each score will be the probability that the current digit image belongs to one of our 10 digit classes.
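
As a rough illustration of what the softmax layer does to its 10 raw scores, the sketch below implements softmax in NumPy. The `scores` values are arbitrary and the `softmax` helper is written here for illustration; it is not the Keras implementation.

```python
import numpy as np

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Arbitrary raw scores for the 10 digit classes (illustrative only).
scores = np.array([2.0, 1.0, 0.1, 0.0, -1.0, 0.5, 3.0, 0.2, -0.3, 1.5])

probs = softmax(scores)
print(probs.sum(), probs.argmax())
```

The largest raw score (index 6 here) receives the highest probability, and the 10 outputs always add up to 1.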

***Compilation Step***

---

To make the network ready for training, we need to pick three more things as part of the compilation step:


**1.**   **A loss function** - How the network measures its performance on the training data, and thus how it is able to steer itself in the right direction.

**2.**   **An optimizer** - The mechanism through which the network updates itself based on the data it sees and its loss function.

**3.**   **Metrics to monitor during the training and testing phases** - Here, we only care about accuracy (the fraction of the images that were correctly classified).
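
The loss and the metric can be sketched in a few lines of NumPy. This is a simplified illustration with 3 classes instead of 10 and hand-picked predictions; the helper names are hypothetical, and the actual Keras implementations differ in detail.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean negative log-probability assigned to the correct class."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(-(y_true * np.log(y_pred)).sum(axis=1).mean())

def accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class is the correct one."""
    return float((y_true.argmax(axis=1) == y_pred.argmax(axis=1)).mean())

# One-hot targets and made-up predicted probabilities for 2 samples.
y_true = np.array([[0, 0, 1], [1, 0, 0]], dtype=float)
y_pred = np.array([[0.1, 0.2, 0.7], [0.3, 0.6, 0.1]])

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(loss, acc)
```

The first sample is classified correctly and the second is not, so the accuracy is 0.5, while the loss also reflects how confident each prediction was.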

### **3. The Compilation Step**

In [None]:
network.compile( optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

In [None]:
network.summary()

### **4. Preparing the image data**

Before training, we will preprocess the data by reshaping it into the shape the network expects and scaling it so that all values are in the **[0, 1] interval**.

Previously, our training images, for instance, were stored in an array of shape **(60000, 28, 28)** of **type uint8** with values in the **[0, 255]** interval.

We transform it into a **float32 array** of shape **(60000, 28 * 28)** with values between **0 and 1**.
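
Before applying this to the real arrays, the transformation can be tried on a small synthetic batch. The `fake_images` array below is random data standing in for MNIST images, so only the shapes, dtype, and value range are meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)

# Five random 28x28 uint8 "images" with values in [0, 255] (not real MNIST data).
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Flatten each image into a vector of 784 values.
flat = fake_images.reshape((len(fake_images), 28 * 28))

# Convert to float32 and scale into the [0, 1] interval.
scaled = flat.astype("float32") / 255

print(scaled.shape, scaled.dtype, scaled.min(), scaled.max())
```

Dividing by 255 works because 255 is the largest value a uint8 pixel can hold.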

In [None]:
train_images = train_images.reshape((60000, 28*28))
train_images = train_images.astype('float32') / 255

In [None]:
test_images = test_images.reshape((10000, 28*28))
test_images = test_images.astype('float32') / 255

### **5. Preparing the labels**

We now need to categorically encode the labels.
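
Categorical (one-hot) encoding replaces each integer label with a vector of 10 values that is 1 at the label's index and 0 elsewhere. The sketch below shows the idea in NumPy; `one_hot` is a hypothetical helper and `sample` holds example labels, not the library function used in the notebook.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Encode integer labels as rows with a single 1 at the label index."""
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

# Example integer labels (illustrative only).
sample = np.array([5, 0, 4])

encoded = one_hot(sample)
print(encoded[0])
```

This format matches the 10-way softmax output, so the loss can compare the predicted probability vector with the target vector position by position.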


In [None]:
train_labels[0]

In [None]:
from keras.utils import to_categorical

In [None]:
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

In [None]:
train_labels[0]

### **6. Training the network**

Now we are ready to train the network, which is done by calling the **fit** method, where we fit the model to the training data.


In [None]:
network.fit(train_images, train_labels, epochs=10, batch_size=128)

Two quantities are displayed during training:



1.   **The loss of the network over the training data**.

2.   **The accuracy of the network over the training data**.

We quickly reach an **accuracy of 98.85%** on the training data.
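
To put the `epochs=10` and `batch_size=128` arguments of the `fit` call in perspective, the arithmetic below counts how many weight updates the training run performs. It assumes the final, smaller batch of each epoch is kept rather than dropped, which is how Keras is generally understood to treat NumPy inputs by default.

```python
import math

num_samples = 60_000  # training images
batch_size = 128
epochs = 10

# Number of batches (and therefore weight updates) per epoch.
steps_per_epoch = math.ceil(num_samples / batch_size)

# The last batch holds whatever is left after the full batches.
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size

total_updates = steps_per_epoch * epochs
print(steps_per_epoch, last_batch_size, total_updates)
```

Under that assumption, each epoch consists of 469 batches (the last one with 96 images), for 4,690 updates over the whole run.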


### **7. Testing the network**

Now let us check whether the model also performs well on the **test set**:



In [None]:
test_loss, test_acc = network.evaluate(test_images, test_labels)

In [None]:
print("Test accuracy: ",test_acc)
print("Test loss: ", test_loss)

The **test accuracy** turned out to be **98.07%**, which is a bit lower than the training set accuracy.

This **gap between training and test accuracy** is an example of **overfitting**: the fact that machine learning models tend to perform worse on new data than on their training data.
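
The size of this gap can be made concrete with a little arithmetic on the two accuracies reported in this notebook. The numbers are copied from the text; a different training run would give slightly different values.

```python
reported_train_acc = 0.9885  # training accuracy reported in this notebook
reported_test_acc = 0.9807   # test accuracy reported in this notebook
num_test_images = 10_000

# Gap between training and test accuracy, in percentage points.
gap_points = round((reported_train_acc - reported_test_acc) * 100, 2)

# Approximate number of test images the model got wrong.
misclassified = round(num_test_images * (1 - reported_test_acc))

print(gap_points, misclassified)
```

A gap of 0.78 percentage points corresponds to roughly 193 misclassified images out of the 10,000 in the test set.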

