## Classification of the MNIST data using Fully Connected Network

The MNIST Dataset has 28 x 28 black and white images of digits from 0 to 9. It is one of the most common datasets for starting up with deep learning. It comes in built in the keras package.

This notebook will walk you through the developing a classification model for the dataset using Fully Connected Networks.

### Importing required libraries

In [40]:
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report
from matplotlib import pyplot as plt

from keras.layers import Input, Flatten, Dense, Dropout
from keras.layers.advanced_activations import LeakyReLU
from keras.activations import *
from keras.optimizers import *
from keras.datasets import mnist
from keras.utils import to_categorical
from keras.models import Model, Sequential

### Loading the dataset

MNIST comes as a part of the keras datasets. It contains 60,000 training images while 10,000 test images

In [21]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = (X_train.astype(np.float32) - 127.5)/127.5
#X_train = X_train.reshape(60000, 784)

In [22]:
X_train.shape, X_test.shape

((60000, 28, 28), (10000, 28, 28))

### Data Preprocessing

The labels are converted from singular to one-hot encoded values from 0 to 9. 

That is, if a given image corresponds to 5, its encoding will be [0,0,0,0,0,1,0,0,0,0]

In [23]:
num_classes = 10

# convert class vectors to binary class matrices
y_train = to_categorical(y_train, num_classes)
#y_test = to_categorical(y_test, num_classes)

### Creating a Keras Model

We will build the model using the Sequential API. Since we only wan't MLP based network, we will use Dense layers for fully connecting neurons.Simply go on adding a layer as it pleases. 

In [33]:
model1 = Sequential()
model1.add(Flatten(input_shape=(28,28)))
model1.add(Dense(512,activation='relu'))
#model1.add(Dropout(0.2))
model1.add(Dense(256,activation='tanh'))
#model1.add(Dropout(0.2))
model1.add(Dense(num_classes,activation='softmax'))
model1.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
flatten_4 (Flatten)          (None, 784)               0         
_________________________________________________________________
dense_10 (Dense)             (None, 512)               401920    
_________________________________________________________________
dense_11 (Dense)             (None, 256)               131328    
_________________________________________________________________
dense_12 (Dense)             (None, 10)                2570      
Total params: 535,818
Trainable params: 535,818
Non-trainable params: 0
_________________________________________________________________


The final model is compiled using an optimizer, a loss function and a metric for performance improvement. 
- The loss function is used to depict how far is the current model from the ideal answer
- The optimizer refers to the method that will be used to minimize the loss
- The metrics correspond to how we want to measure the performance of the network

In [34]:
model1.compile(loss='mse', optimizer=Adam(), metrics=['accuracy'])

### Training and testing

Training is done using the function fit(). We train out network for 5 epochs


In [35]:
model1.fit(X_train,y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7fa34cf78810>

In [36]:
y_check = model1.predict(X_test)
y_pred = np.array([np.argmax(y_check[j]) for j in range(len(y_check))])
y_test

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)

In [37]:
confusion_matrix(y_test, y_pred)

array([[ 941,    0,    0,   25,    0,    8,    3,    0,    2,    1],
       [   0, 1124,    0,    7,    0,    1,    1,    0,    2,    0],
       [  12,    9,  791,  189,    5,    2,    8,    8,    8,    0],
       [   0,    0,    0, 1005,    0,    0,    0,    3,    2,    0],
       [   8,    8,    1,    9,  897,    1,   10,    2,    7,   39],
       [   5,    2,    0,   91,    0,  774,    6,    0,   11,    3],
       [  11,    5,    0,   13,    1,   15,  902,    0,   10,    1],
       [   7,   16,    9,   59,    1,    2,    0,  858,    1,   75],
       [   0,    8,    1,  150,    3,    4,    2,    1,  801,    4],
       [   6,   16,    0,   56,    9,    2,    0,    5,    7,  908]])

In [38]:
print classification_report(y_test, y_pred)

             precision    recall  f1-score   support

          0       0.95      0.96      0.96       980
          1       0.95      0.99      0.97      1135
          2       0.99      0.77      0.86      1032
          3       0.63      1.00      0.77      1010
          4       0.98      0.91      0.95       982
          5       0.96      0.87      0.91       892
          6       0.97      0.94      0.95       958
          7       0.98      0.83      0.90      1028
          8       0.94      0.82      0.88       974
          9       0.88      0.90      0.89      1009

avg / total       0.92      0.90      0.90     10000



## References:
1. [Keras' official example](https://github.com/keras-team/keras/blob/master/examples/mnist_mlp.py) on Github
2. [Documentation References](https://keras.io/) for more info about every function/layer

In [None]:
# Keras' Example

# print(X_train.shape[0], 'train samples')
# print(X_test.shape[0], 'test samples')

# num_classes = 10

# # convert class vectors to binary class matrices
# y_train = to_categorical(y_train, num_classes)
# y_test = to_categorical(y_test, num_classes)

# model = Sequential()
# model.add(Flatten(input_shape=(28,28)))
# model.add(Dense(1024,activation='relu'))
# #model.add(Dense(512, activation='relu', input_shape=(784,)))
# model.add(Dropout(0.2))
# model.add(Dense(512, activation='relu'))
# model.add(Dropout(0.2))
# model.add(Dense(num_classes, activation='softmax'))

# model.summary()

# model.compile(loss='categorical_crossentropy',
#               optimizer=RMSprop(),
#               metrics=['accuracy'])

# history = model.fit(X_train, y_train,
#                     batch_size=128,
#                     epochs=5,
#                     verbose=1,
#                     validation_data=(X_test, y_test))
# score = model.evaluate(X_test, y_test, verbose=0)
# print('Test loss:', score[0])
# print('Test accuracy:', score[1])

In [None]:
model.evaluate(X_test,y_test)