
## Classification of the MNIST data using Fully Connected Network

The MNIST dataset contains 28 x 28 grayscale images of handwritten digits from 0 to 9. It is one of the most common starting points for deep learning, and it ships with the keras package.

This notebook walks you through developing a classification model for the dataset using a fully connected network.

### Importing required libraries

In [0]:
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report
from matplotlib import pyplot as plt

from keras.layers import Input, Flatten, Dense, Dropout
# Import only the optimizers used below instead of wildcard imports
from keras.optimizers import Adam, RMSprop
from keras.datasets import mnist
from keras.utils import to_categorical
from keras.models import Model, Sequential

### Loading the dataset

MNIST ships as part of the Keras datasets. It contains 60,000 training images and 10,000 test images.

In [0]:
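(X_train, y_train_arg), (X_test, y_test_arg) = mnist.load_data()
# Scale pixel values from [0, 255] to [-1, 1]
X_train = (X_train.astype(np.float32) - 127.5)/127.5
# Scale the test images the same way so the model sees the same input range
X_test = (X_test.astype(np.float32) - 127.5)/127.5
# X_train = X_train.reshape(60000, 784)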

In [47]:
X_train.shape, X_test.shape

((60000, 28, 28), (10000, 28, 28))
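
The scaling applied above maps pixel intensities from [0, 255] to [-1, 1]. A minimal sketch on a handful of made-up pixel values, assuming only NumPy; `scale_pixels` is a hypothetical helper name used here for illustration, not part of Keras:

```python
import numpy as np

def scale_pixels(images):
    """Map uint8 intensities in [0, 255] to float32 values in [-1, 1]."""
    return (images.astype(np.float32) - 127.5) / 127.5

# A tiny stand-in for image data (illustrative values, not taken from MNIST)
pixels = np.array([0, 64, 127, 128, 255], dtype=np.uint8)
scaled = scale_pixels(pixels)

print(scaled.min(), scaled.max())
```

Any array that is later passed to the model should go through the same transformation, since the network only ever sees inputs in this range during training.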

### Data Preprocessing

The integer labels (0 to 9) are converted to one-hot encoded vectors of length 10.

For example, if a given image corresponds to 5, its encoding will be [0,0,0,0,0,1,0,0,0,0].

In [0]:
num_classes = 10

# convert class vectors to binary class matrices
y_train = to_categorical(y_train_arg, num_classes)
# the test labels are needed in one-hot form for model1.evaluate() later on
y_test = to_categorical(y_test_arg, num_classes)

In [49]:
y_train.shape

(60000, 10)
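
To make the encoding concrete, here is a minimal NumPy sketch of one-hot encoding. The helper `one_hot` is hypothetical and only illustrates the idea; the notebook itself relies on `to_categorical`:

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) matrix with a single 1 per row."""
    labels = np.asarray(labels)
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype=np.float32)[labels]

encoded = one_hot([5, 0, 9], 10)
print(encoded[0])   # encoding of the digit 5
print(encoded.shape)
```

Taking `np.argmax` along each row recovers the original integer labels, which is how predictions are decoded later on.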

### Creating a Keras Model

We will build the model using the Sequential API. Since we only want an MLP-based network, we will use Dense layers to fully connect the neurons. Layers are simply added one after another.

In [50]:
model1 = Sequential()
model1.add(Flatten(input_shape=(28,28)))
model1.add(Dense(512,activation='relu'))
# model1.add(Dropout(0.2))
model1.add(Dense(256,activation='tanh'))
# model1.add(Dropout(0.2))
model1.add(Dense(num_classes,activation='softmax'))
model1.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
flatten_9 (Flatten)          (None, 784)               0         
_________________________________________________________________
dense_26 (Dense)             (None, 512)               401920    
_________________________________________________________________
dense_27 (Dense)             (None, 256)               131328    
_________________________________________________________________
dense_28 (Dense)             (None, 10)                2570      
Total params: 535,818
Trainable params: 535,818
Non-trainable params: 0
_________________________________________________________________
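
The parameter counts in the summary above can be checked by hand: a Dense layer has one weight for every input-unit pair plus one bias per unit, and Flatten has no parameters. A minimal sketch, where `dense_params` is a hypothetical helper introduced only for this check:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Flattened 28 x 28 input, followed by the three Dense layers of model1
layer_sizes = [28 * 28, 512, 256, 10]
counts = [dense_params(n_in, n_out)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(counts)

print(counts)
print(total)
```

The three values correspond to the `Param #` column, and their sum to the `Total params` line.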


The final model is compiled with an optimizer, a loss function, and a metric for tracking performance.
- The loss function measures how far the current model's predictions are from the correct answers
- The optimizer is the method used to minimize the loss
- The metrics define how we want to measure the performance of the network
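
To give a feel for what the loss measures, here is a rough NumPy sketch of softmax followed by categorical cross-entropy for a single example. It illustrates the formula only and is not Keras's implementation; the function names and the clipping constant `eps` are assumptions made for this sketch:

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Negative log-probability assigned to the true class."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return -np.sum(y_true * np.log(y_prob), axis=-1)

# One-hot label for the digit 5
y_true = np.zeros(10)
y_true[5] = 1.0

# A model that favours class 5 versus one that guesses uniformly
confident = softmax(np.where(y_true == 1.0, 5.0, 0.0))
uniform = softmax(np.zeros(10))

loss_confident = categorical_crossentropy(y_true, confident)
loss_uniform = categorical_crossentropy(y_true, uniform)
print(loss_confident, loss_uniform)
```

The uniform guess gives a loss of log(10), and the loss shrinks as more probability is placed on the correct class, which is the quantity the optimizer tries to minimize.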

In [51]:
model1.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])


### Training and testing

Training is done with the fit() function. We train our network for 5 epochs.


In [52]:
# model1.compile(optimizer='sgd', loss='mse')
model1.fit(X_train,y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f6f34b64e50>

In [53]:
y_check = model1.predict(X_test)
# Pick the class with the highest predicted probability for each image
y_pred = np.argmax(y_check, axis=1)
y_test_arg

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)
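
The prediction step turns each row of class probabilities into a single digit by taking the index of the largest value. A small sketch with made-up probabilities for three classes (the numbers are illustrative, not model output):

```python
import numpy as np

# Each row is one sample, each column the predicted probability of one class
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1],
                  [0.2, 0.2, 0.6]])

# Vectorised form: index of the maximum along the class axis
labels = np.argmax(probs, axis=1)

# Equivalent per-row loop, shown for comparison
labels_loop = np.array([np.argmax(probs[j]) for j in range(len(probs))])

print(labels)
```

Both forms give the same labels; the vectorised call is shorter and avoids a Python-level loop over all 10,000 test images.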

In [54]:
confusion_matrix(y_test_arg, y_pred)

array([[ 976,    1,    0,    1,    0,    0,    1,    0,    0,    1],
       [   1, 1131,    0,    1,    0,    0,    2,    0,    0,    0],
       [  39,   37,  854,   80,    5,    0,    8,    5,    4,    0],
       [   0,    5,    0, 1000,    0,    1,    0,    2,    1,    1],
       [  49,   82,    2,    2,  716,    0,    6,    9,    3,  113],
       [  31,    7,    0,   64,    0,  767,   16,    0,    5,    2],
       [  40,    7,    1,    7,    0,    1,  899,    1,    1,    1],
       [  21,   57,   17,   26,    0,    1,    2,  873,    3,   28],
       [  56,   41,    4,   85,    0,    6,    8,    1,  772,    1],
       [  36,   67,    0,   28,    2,    1,    1,   20,    8,  846]])

In [55]:
print(classification_report(y_test_arg, y_pred))

              precision    recall  f1-score   support

           0       0.78      1.00      0.88       980
           1       0.79      1.00      0.88      1135
           2       0.97      0.83      0.89      1032
           3       0.77      0.99      0.87      1010
           4       0.99      0.73      0.84       982
           5       0.99      0.86      0.92       892
           6       0.95      0.94      0.95       958
           7       0.96      0.85      0.90      1028
           8       0.97      0.79      0.87       974
           9       0.85      0.84      0.85      1009

   micro avg       0.88      0.88      0.88     10000
   macro avg       0.90      0.88      0.88     10000
weighted avg       0.90      0.88      0.88     10000
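
The precision and recall values in the report follow directly from the confusion matrix, where rows are true digits and columns are predicted digits. A minimal sketch that recomputes them with NumPy; the matrix is copied from the output above and the variable names are illustrative:

```python
import numpy as np

cm = np.array([
    [976,    1,   0,    1,   0,   0,   1,   0,   0,   1],
    [  1, 1131,   0,    1,   0,   0,   2,   0,   0,   0],
    [ 39,   37, 854,   80,   5,   0,   8,   5,   4,   0],
    [  0,    5,   0, 1000,   0,   1,   0,   2,   1,   1],
    [ 49,   82,   2,    2, 716,   0,   6,   9,   3, 113],
    [ 31,    7,   0,   64,   0, 767,  16,   0,   5,   2],
    [ 40,    7,   1,    7,   0,   1, 899,   1,   1,   1],
    [ 21,   57,  17,   26,   0,   1,   2, 873,   3,  28],
    [ 56,   41,   4,   85,   0,   6,   8,   1, 772,   1],
    [ 36,   67,   0,   28,   2,   1,   1,  20,   8, 846],
])

correct = np.diag(cm)          # correctly classified samples per digit
support = cm.sum(axis=1)       # true samples per digit (row sums)
predicted = cm.sum(axis=0)     # predictions per digit (column sums)

recall = correct / support
precision = correct / predicted
accuracy = correct.sum() / cm.sum()

print(np.round(precision, 2))
print(np.round(recall, 2))
print(accuracy)
```

Read this way, the low precision for 0, 1 and 3 means many other digits are being predicted as those classes, while the low recall for 4 shows up in its row, where 113 fours are predicted as 9.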



## References:
1. [Keras' official example](https://github.com/keras-team/keras/blob/master/examples/mnist_mlp.py) on GitHub
2. [Keras documentation](https://keras.io/) for more information about every function and layer

In [56]:
# Keras' Example

print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

num_classes = 10

# convert the integer class vectors to one-hot encoded binary class matrices
# (start from the integer labels; y_train from earlier is already one-hot)
y_train = to_categorical(y_train_arg, num_classes)
y_test = to_categorical(y_test_arg, num_classes)

model = Sequential()
model.add(Flatten(input_shape=(28,28)))
model.add(Dense(1024, activation='relu'))
model.add(Dense(512, activation='relu'))
# model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
# model.add(Dropout(0.2))
# the softmax layer must come last so the output shape matches the one-hot labels
model.add(Dense(num_classes, activation='softmax'))

model.summary()

model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

# validation and evaluation use the one-hot labels, just like training
history = model.fit(X_train, y_train, batch_size=128, epochs=5, verbose=1, validation_data=(X_test, y_test))
score = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

60000 train samples
10000 test samples
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
flatten_10 (Flatten)         (None, 784)               0         
_________________________________________________________________
dense_29 (Dense)             (None, 1024)              803840    
_________________________________________________________________
dense_30 (Dense)             (None, 512)               524800    
_________________________________________________________________
dense_31 (Dense)             (None, 512)               262656    
_________________________________________________________________
dense_32 (Dense)             (None, 10)                5130      
Total params: 1,596,426
Trainable params: 1,596,426
Non-trainable params: 0
_________________________________________________________________

In [0]:
model1.evaluate(X_test,y_test)