## Classification of the MNIST data using Fully Connected Networks

The MNIST dataset consists of 28 x 28 grayscale images of handwritten digits from 0 to 9. It is one of the most common datasets for getting started with deep learning, and it ships with the Keras package.

This notebook walks you through developing a classification model for the dataset using fully connected networks.

### Importing required libraries

In [None]:
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report
from matplotlib import pyplot as plt

# Import only the names used in this notebook instead of wildcard imports
from keras.layers import Flatten, Dense, Dropout
from keras.optimizers import Adam
from keras.datasets import mnist
from keras.utils import to_categorical
from keras.models import Sequential

### Loading the dataset

MNIST ships as part of the Keras datasets. It contains 60,000 training images and 10,000 test images.

In [None]:
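(X_train, y_train), (X_test, y_test) = mnist.load_data()
# Scale pixel values from [0, 255] to [-1, 1]; the test set needs the same scaling
X_train = (X_train.astype(np.float32) - 127.5) / 127.5
X_test = (X_test.astype(np.float32) - 127.5) / 127.5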

In [None]:
X_train.shape, X_test.shape
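
As a quick sanity check on the scaling used above, the following minimal sketch applies the same `(x - 127.5) / 127.5` formula to a few hand-picked pixel values. The array `pixels` is made up for illustration; the point is that the raw range [0, 255] ends up in [-1, 1].

```python
import numpy as np

# Hypothetical pixel values: darkest, mid-grey and brightest
pixels = np.array([0, 127.5, 255], dtype=np.float32)

# Same formula as in the loading cell: centre on 127.5, then divide by 127.5
scaled = (pixels - 127.5) / 127.5

print(scaled.min(), scaled.max())
```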

### Data Preprocessing

The training labels are converted from integer class indices (0 to 9) to one-hot encoded vectors of length 10.

That is, if a given image corresponds to the digit 5, its encoding will be [0,0,0,0,0,1,0,0,0,0].

In [None]:
num_classes = 10

# convert class vectors to binary class matrices
y_train = to_categorical(y_train, num_classes)
# y_test is kept as integer labels because the sklearn metrics used later expect class indices
#y_test = to_categorical(y_test, num_classes)
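
To make the encoding concrete, here is a minimal NumPy sketch of what a one-hot conversion does. The helper `one_hot` is a hypothetical stand-in written for illustration, not the Keras implementation of `to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a 1 at each label index."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Row i gets a 1 in column labels[i]; everything else stays 0
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = one_hot([5, 0, 9], num_classes=10)
print(example[0])  # encoding of the digit 5
```

Taking `np.argmax` along axis 1 of `example` recovers the original labels, which is the same operation used later to turn predicted probabilities into class indices.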

### Creating a Keras Model

We will build the model using the Sequential API. Since we only want an MLP-based network, we will use Dense layers, which fully connect the neurons of one layer to those of the next. Layers are stacked one at a time with add(), so you can keep adding as many as you like.

In [None]:
model1 = Sequential()
model1.add(Flatten(input_shape=(28,28)))
model1.add(Dense(512,activation='relu'))
#model1.add(Dropout(0.2))
model1.add(Dense(256,activation='tanh'))
#model1.add(Dropout(0.2))
model1.add(Dense(num_classes,activation='softmax'))
model1.summary()
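
The parameter counts reported by the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The sketch below assumes the layer sizes used above (784 flattened inputs, then 512, 256 and 10 units); `dense_params` is a helper name introduced here for illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Flatten output size, followed by the three Dense layer widths
layer_sizes = [28 * 28, 512, 256, 10]

params_per_layer = [
    dense_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer, total_params)
```

The Flatten layer contributes no parameters, so the total should match the trainable parameter count at the bottom of the summary.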

The final model is compiled with an optimizer, a loss function and one or more metrics.
- The loss function measures how far the current model's predictions are from the true labels
- The optimizer is the method used to minimize the loss
- The metrics define how we want to measure the performance of the network

In [None]:
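# Categorical cross-entropy is the usual loss for a softmax output with one-hot labels
model1.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])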
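
To illustrate what "how far the model is from the ideal answer" means numerically, the sketch below computes two common losses by hand for a single one-hot label. The prediction vectors and both function bodies are illustrative NumPy re-implementations, not the Keras internals; either loss is lower for the prediction that puts more probability on the correct class.

```python
import numpy as np

# One-hot label for the digit 5
y_true = np.array([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.])

# Made-up predictions: one confident and correct, one uniform (uninformative)
y_good = np.full(10, 0.01)
y_good[5] = 0.91
y_uniform = np.full(10, 0.1)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Negative log-probability assigned to the true class."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(-np.sum(y_true * np.log(y_pred)))

def mean_squared_error(y_true, y_pred):
    """Average squared difference over the 10 outputs."""
    return float(np.mean((y_true - y_pred) ** 2))

for name, pred in [("good", y_good), ("uniform", y_uniform)]:
    print(name, categorical_crossentropy(y_true, pred), mean_squared_error(y_true, pred))
```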

### Training and testing

Training is done using the fit() function. We train our network for 5 epochs.


In [None]:
model1.fit(X_train,y_train, epochs=5)
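
One epoch is a full pass over the training set, split into mini-batches. The sketch below works out how many weight updates that implies, assuming the Keras default batch size of 32 (no `batch_size` is passed to fit() above); the variable names are introduced here for illustration.

```python
import math

num_samples = 60000  # size of the MNIST training set
batch_size = 32      # assumption: default used by fit() when none is given
epochs = 5

# Number of mini-batches (and therefore optimizer steps) per epoch
steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)
```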

In [None]:
# Predicted class probabilities, one row of 10 values per test image
y_check = model1.predict(X_test)
# Pick the most probable class for each image
y_pred = np.argmax(y_check, axis=1)
y_test
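
The step from predicted probabilities to class labels is an argmax over each row. A minimal sketch with made-up probabilities (three images, four classes) shows that a per-row loop and the vectorised `axis=1` form give the same labels:

```python
import numpy as np

# Made-up softmax outputs; each row sums to 1
probs = np.array([
    [0.10, 0.70, 0.15, 0.05],
    [0.60, 0.20, 0.10, 0.10],
    [0.05, 0.05, 0.10, 0.80],
])

# Per-row loop: index of the largest probability in each row
loop_version = np.array([np.argmax(probs[j]) for j in range(len(probs))])

# Vectorised equivalent: argmax along the class axis
vectorized = np.argmax(probs, axis=1)

print(loop_version, vectorized)
```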

The model can be evaluated with multiple metrics. Here we use the confusion matrix and classification report from scikit-learn and the evaluate() function from Keras.
- The confusion matrix has y_true on the vertical axis and y_pred on the horizontal axis; entry (i, j) counts how many samples of true class i were predicted as class j.
- The classification report is a collection of class-wise precision, recall and F1 scores.
- evaluate() returns the loss and the values of the compiled metrics (here, accuracy) for the model on the given data.
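
To make the first two metrics concrete, the sketch below builds a confusion matrix by hand for a made-up 3-class problem and derives precision, recall and accuracy from it. This is an illustrative NumPy re-implementation under those toy assumptions, not the scikit-learn code; the names `toy_true` and `toy_pred` are hypothetical.

```python
import numpy as np

# Made-up true and predicted labels for a 3-class problem
toy_true = np.array([0, 0, 1, 1, 2, 2])
toy_pred = np.array([0, 1, 1, 1, 2, 0])
n_classes = 3

# Rows index the true class, columns the predicted class
cm = np.zeros((n_classes, n_classes), dtype=int)
for t, p in zip(toy_true, toy_pred):
    cm[t, p] += 1

# Precision: correct predictions of a class / all predictions of that class (column sums)
precision = np.diag(cm) / cm.sum(axis=0)
# Recall: correct predictions of a class / all true samples of that class (row sums)
recall = np.diag(cm) / cm.sum(axis=1)
# Accuracy: all correct predictions / all samples
accuracy = np.trace(cm) / cm.sum()

print(cm)
print(precision, recall, accuracy)
```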

In [None]:
confusion_matrix(y_test, y_pred)

In [None]:
print(classification_report(y_test, y_pred))

In [None]:
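# evaluate() needs one-hot labels to match the model's 10-way softmax output
model1.evaluate(X_test, to_categorical(y_test, num_classes))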

## References:
1. [Keras' official example](https://github.com/keras-team/keras/blob/master/examples/mnist_mlp.py) on GitHub
2. [Documentation References](https://keras.io/) for more info about every function/layer