<a href="https://colab.research.google.com/github/tgi25/home/blob/master/mlp_mnist_v3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Keras - Classifying MNIST dataset with MLP (Version 3)

From Wikipedia, the free encyclopedia (https://en.wikipedia.org/wiki/MNIST_database)

The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems. The database is also widely used for training and testing in the field of machine learning. It was created by "re-mixing" the samples from NIST's original datasets. The creators felt that since NIST's training dataset was taken from American Census Bureau employees, while the testing dataset was taken from American high school students, it was not well-suited for machine learning experiments. Furthermore, the black and white images from NIST were normalized to fit into a 28x28 pixel bounding box and anti-aliased, which introduced grayscale levels.

The MNIST database contains 60,000 training images and 10,000 testing images. Half of the training set and half of the test set were taken from NIST's training dataset, while the other half of the training set and the other half of the test set were taken from NIST's testing dataset. There have been a number of scientific papers on attempts to achieve the lowest error rate; one paper, using a hierarchical system of convolutional neural networks, manages to get an error rate on the MNIST database of 0.23 percent. The original creators of the database keep a list of some of the methods tested on it. In their original paper, they use a support vector machine to get an error rate of 0.8 percent. An extended dataset similar to MNIST called EMNIST was published in 2017; it contains 240,000 training images and 40,000 testing images of handwritten digits.

THE MNIST DATABASE of handwritten digits - http://yann.lecun.com/exdb/mnist/

### Importing Keras

In [None]:
from tensorflow import keras

### Loading the MNIST dataset

In [None]:
from keras.datasets import mnist #importing the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print ("x_train shape = ", x_train.shape, "y_train shape = ", y_train.shape)
print ("x_test shape = ", x_test.shape, "y_test shape = ", y_test.shape)

### Plotting digits

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline
plt.imshow(x_train[0]) #plots the first digit stored in the training dataset

In [None]:
plt.imshow(x_test[0]) #plots the first digit stored in the testing dataset

### Casting inputs to float32

In [None]:
x_train = x_train.astype('float32')
x_test  = x_test.astype('float32')

### Reshaping inputs

In [None]:
INPUT_DIM = 784 #28 by 28

x_train_reshape = x_train.reshape(60000, INPUT_DIM)
x_test_reshape = x_test.reshape(10000, INPUT_DIM)
print (x_train.shape, "=>", x_train_reshape.shape)
print (x_test.shape,  "=>", x_test_reshape.shape)
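
The flattening step can be illustrated on a small synthetic array. This is a minimal sketch, assuming NumPy's default row-major layout; the array `images` is made-up stand-in data, not MNIST, and `-1` is used so that the sample count is inferred rather than hard-coded.

```python
import numpy as np

# Synthetic stand-in for a batch of images: 5 samples of 28x28 (not real MNIST data).
images = np.arange(5 * 28 * 28, dtype=np.float32).reshape(5, 28, 28)

# -1 lets NumPy infer the sample count, so the same line works for any batch size.
flat = images.reshape(-1, 28 * 28)

print(images.shape, "=>", flat.shape)

# Row-major flattening: the pixel at (row r, column c) lands at index r * 28 + c.
r, c = 3, 7
assert flat[0, r * 28 + c] == images[0, r, c]
```

Each row of `flat` is one image laid out row after row, which is the form a `Dense` input layer expects.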

### Normalizing the inputs

In [None]:
x_train_reshape /= 255
x_test_reshape  /= 255
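
The earlier cast to `float32` matters for this step. The sketch below, which assumes synthetic pixel values in place of the real dataset, suggests why: NumPy refuses to store the floating-point result of an in-place division in a `uint8` array, while the same division on a `float32` copy maps 0..255 onto 0..1.

```python
import numpy as np

# Synthetic uint8 pixels standing in for raw MNIST values (0..255).
raw = np.array([[0, 128, 255]], dtype=np.uint8)

# In-place true division cannot write float results into a uint8 array.
try:
    raw /= 255
    in_place_failed = False
except TypeError:
    in_place_failed = True

# Casting first makes the in-place division valid.
scaled = raw.astype('float32')
scaled /= 255

print(in_place_failed, scaled.min(), scaled.max())
```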

### Converting labels to one-hot vectors

In [None]:
from tensorflow.keras.utils import to_categorical #np_utils is not available in recent Keras releases

NB_CLASSES = 10 # (number of classes)
y_train_one_hot = to_categorical(y_train, NB_CLASSES) #one row of length NB_CLASSES per label
y_test_one_hot  = to_categorical(y_test, NB_CLASSES)
print ("Dimension of y_train_one_hot = ", y_train_one_hot.shape)
print ("Dimension of y_test_one_hot  = ", y_test_one_hot.shape)
print (y_train[0], "=>", y_train_one_hot[0])
print (y_test[0],  "=>", y_test_one_hot[0])
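
For intuition, one-hot encoding can be written directly in NumPy. This is only a sketch of the idea, not the Keras implementation; the helper `one_hot` and the sample labels are hypothetical.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Fancy indexing sets column labels[i] of row i to 1.
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

sample_labels = [5, 0, 4]  # illustrative labels, not read from the dataset
encoded = one_hot(sample_labels, 10)
print(encoded[0])

# argmax along the class axis recovers the original integer labels.
decoded = encoded.argmax(axis=1)
print(decoded)
```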

### Building the MLP model

In [None]:
from keras.models import Sequential
from keras.layers import Dense, Activation

#MLP - Two hidden layers (512 neurons in each layer) + Softmax layer (10 classes)
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(INPUT_DIM,)))
model.add(Dense(512, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.summary()
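
The parameter counts reported by `model.summary()` can be cross-checked by hand: a fully connected layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is a hypothetical name used only for this check.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

layer_sizes = [784, 512, 512, 10]  # input, two hidden layers, softmax output
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)

print(per_layer, total)
```

Most of the parameters sit in the first layer, because each of its 512 units is connected to all 784 input pixels.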

### Compiling the model

In [None]:
OPTIMIZER = keras.optimizers.SGD(learning_rate=0.1, momentum = 0.8)
model.compile(optimizer=OPTIMIZER, loss='categorical_crossentropy', metrics=['accuracy'])
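
The effect of `learning_rate` and `momentum` can be sketched on a toy problem. The update below follows the non-Nesterov rule given in the Keras SGD documentation (`velocity = momentum * velocity - learning_rate * g`, then `w = w + velocity`); the quadratic objective and the function name `sgd_momentum_step` are assumptions made for illustration.

```python
def sgd_momentum_step(w, velocity, grad, learning_rate=0.1, momentum=0.8):
    """One SGD-with-momentum update for a single scalar parameter."""
    velocity = momentum * velocity - learning_rate * grad
    return w + velocity, velocity

# Toy objective f(w) = w**2, whose gradient is 2*w and whose minimum is at w = 0.
w, v = 1.0, 0.0
trajectory = [w]
for _ in range(50):
    w, v = sgd_momentum_step(w, v, 2.0 * w)
    trajectory.append(w)

print(trajectory[:3], abs(trajectory[-1]))
```

The velocity term carries part of the previous step forward, so the iterate can overshoot the minimum and oscillate before settling.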

### Fitting the model with early stopping criteria

In [None]:
from keras.callbacks import EarlyStopping

BATCH_SIZE = 128
EPOCHES = 20
VERBOSE = 1
VALIDATION_SPLIT = 0.2

#Monitor the validation loss; if it does not improve by at least min_delta=0.001 for 2 consecutive epochs (patience=2), stop
#the training. mode = direction of improvement (increase/decrease); 'auto' lets Keras infer it from the monitored metric's name
ES = EarlyStopping(monitor='val_loss', min_delta=0.001, patience=2, verbose=VERBOSE, mode='auto')

model_history = model.fit(x=x_train_reshape, y=y_train_one_hot, batch_size=BATCH_SIZE, 
                          epochs=EPOCHES, verbose=VERBOSE, validation_split=VALIDATION_SPLIT, 
                          callbacks=[ES])
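
The stopping rule described in the comments above can be approximated in a few lines of plain Python. This is a simplified sketch of the `min_delta`/`patience` logic for a loss that should decrease, not the Keras source; the function name and the loss values are made up.

```python
def early_stopping_epoch(val_losses, min_delta=0.001, patience=2):
    """Return the 0-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:  # improvement larger than min_delta
            best = loss
            wait = 0
        else:
            wait += 1                # epoch without sufficient improvement
            if wait >= patience:
                return epoch
    return None

made_up_losses = [0.30, 0.20, 0.15, 0.1495, 0.1499, 0.14]
stopped = early_stopping_epoch(made_up_losses)
print(stopped)
```

In this made-up run the loss keeps changing after the third epoch, but by less than `min_delta`, so two such epochs in a row trigger the stop before the final value is ever reached.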

### Evaluating the model

In [None]:
score = model.evaluate(x_test_reshape, y_test_one_hot, verbose=2) 
print('Test score:', score[0]) 
print('Test accuracy:', score[1])
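
With this compile configuration, `score[0]` is the categorical cross-entropy loss and `score[1]` is the accuracy. Both can be sketched in NumPy as below; the clipping constant `eps` and the tiny 3-class example are assumptions chosen for illustration, not values taken from the model.

```python
import numpy as np

def categorical_crossentropy(y_true_one_hot, y_prob, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_prob)); clipping avoids log(0)."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return float(np.mean(-np.sum(y_true_one_hot * np.log(y_prob), axis=1)))

def accuracy(y_true_one_hot, y_prob):
    """Fraction of samples whose most probable class matches the true class."""
    return float(np.mean(y_true_one_hot.argmax(axis=1) == y_prob.argmax(axis=1)))

# Two made-up samples over three classes.
y_true = np.array([[1, 0, 0], [0, 1, 0]], dtype=np.float64)
y_prob = np.array([[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]], dtype=np.float64)

loss = categorical_crossentropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)
print(loss, acc)
```

Only the probability assigned to the true class contributes to the loss, so a confident wrong prediction is penalized more heavily than an uncertain one.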

### Predicting the class

In [None]:
y_probability = model.predict(x_test_reshape)
y_classes = y_probability.argmax(axis=-1)
print ("True class = ", y_test[0], "Predicted class = ", y_classes[0])
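
The softmax output layer produces one probability per class, and `argmax` along the last axis picks the most probable one. A minimal NumPy sketch, using made-up pre-activation values (`logits`) rather than real model outputs:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # avoids overflow in exp
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Two made-up samples over three classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 3.0, 0.5]])
probs = softmax(logits)
classes = probs.argmax(axis=-1)

print(probs.sum(axis=-1), classes)
```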

### Plotting the model performances

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

def plot_history(network_history):
    plt.figure()
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.plot(network_history.history['loss'])
    plt.plot(network_history.history['val_loss'])
    plt.legend(['Training', 'Validation'])

    plt.figure()
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.plot(network_history.history['accuracy'])
    plt.plot(network_history.history['val_accuracy'])
    plt.legend(['Training', 'Validation'], loc='lower right')
    plt.show()

plot_history(model_history)

### Saving the trained model and its weights

In [None]:
json_string = model.to_json() # architecture as JSON
with open('mlp_mnist_v2_model.json', 'w') as f: # the file is closed automatically
    f.write(json_string)
# save the weights in h5 format 
model.save_weights('mlp_mnist_v2_wts.h5')

### Retrieving a trained model and its weights

In [None]:
from keras.models import model_from_json
with open('mlp_mnist_v2_model.json') as f: # the file is closed automatically
    model1 = model_from_json(f.read())
model1.load_weights('mlp_mnist_v2_wts.h5')
model1.summary()
model1.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
score = model1.evaluate(x_test_reshape, y_test_one_hot, verbose=0) 
print('Test score:', score[0]) 
print('Test accuracy:', score[1])

y_probability = model1.predict(x_test_reshape) #predict with the reloaded model, not the original one
y_classes = y_probability.argmax(axis=-1)
print ("True class = ", y_test[0], "Predicted class = ", y_classes[0])