## MNIST dataset using Keras

In [9]:
from __future__ import print_function
import keras 
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPool2D
from keras import backend as K


In [10]:
batch_size = 128   # samples per gradient update
num_classes = 10   # digits 0-9
epochs = 12        # full passes over the training set

img_width = 28
img_height = 28

(x_train , y_train), (x_test, y_test) = mnist.load_data()

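We set the batch size to 128 samples per weight update and train for 12 epochs.

**Batch size** is the number of samples propagated through the network in one forward/backward pass, i.e. per weight update, not per epoch. An epoch is one full pass over the training set, made up of many such batches. Training on mini-batches requires less memory than processing the whole dataset at once, which is especially important if the dataset does not fit in memory.

Since MNIST handwritten digits are 28x28-pixel grayscale images, we set the image width and height to 28 each.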
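To make the relationship between batch size and epochs concrete, here is a small sketch of the arithmetic. It assumes the standard MNIST training set of 60,000 samples and that the final, smaller batch of each epoch is still used for an update.

```python
import math

num_train_samples = 60000  # size of the MNIST training set
batch_size = 128
epochs = 12

# The last batch is smaller when batch_size does not divide the dataset evenly.
steps_per_epoch = math.ceil(num_train_samples / batch_size)
last_batch_size = num_train_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 469
print(last_batch_size)  # 96
print(total_updates)    # 5628
```

So each epoch consists of 469 weight updates rather than a single pass over 128 samples.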

In [11]:
import requests
requests.packages.urllib3.disable_warnings()

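Reshape the input data to add an explicit channel dimension before passing it to the convolutional neural network. `Conv2D` expects 4D input, and the position of the channel axis depends on the backend's image data format (`channels_first` or `channels_last`). We also scale the pixel values from [0, 255] to [0, 1].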



In [12]:
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_width, img_height)
    x_test = x_test.reshape(x_test.shape[0], 1, img_width, img_height)
    input_shape = (1, img_width, img_height)
else:
    x_train = x_train.reshape(x_train.shape[0], img_width, img_height, 1)
    x_test = x_test.reshape(x_test.shape[0], img_width, img_height, 1)
    input_shape = (img_width, img_height, 1)

# Convert to float and scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')


x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
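The reshape and scaling steps can be checked in isolation with plain NumPy. This is only an illustrative sketch: `images` is a hypothetical stand-in for a few raw MNIST images, not the real dataset.

```python
import numpy as np

# Hypothetical stand-in for 5 raw grayscale images: (samples, rows, cols), values 0..255
images = np.full((5, 28, 28), 255, dtype='uint8')

# Add a channel axis at the end (channels_last) or after the sample axis (channels_first)
channels_last = images.reshape(images.shape[0], 28, 28, 1)
channels_first = images.reshape(images.shape[0], 1, 28, 28)

# Convert to float32 and scale to the [0, 1] range
scaled = channels_last.astype('float32') / 255

print(channels_last.shape)   # (5, 28, 28, 1)
print(channels_first.shape)  # (5, 1, 28, 28)
print(scaled.max())          # 1.0
```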


The `categorical_crossentropy` loss used to train this model expects each label as a one-hot vector, so we convert the integer class labels of both the training and test data into one-hot (binary class matrix) form.

In [13]:
# convert class vectors to one-hot (binary class) matrices
y_train = keras.utils.to_categorical(y_train,num_classes)
y_test = keras.utils.to_categorical(y_test,num_classes)
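To show what one-hot encoding looks like, here is a minimal NumPy sketch of the same idea. It is an illustration of the encoding rather than the Keras implementation, and the example `labels` are made up.

```python
import numpy as np

num_classes = 10
labels = np.array([5, 0, 4])  # example integer class labels

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype='float32')[labels]

print(one_hot.shape)  # (3, 10)
print(one_hot[0])     # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
```

Each row contains a single 1 at the index of its class and 0 everywhere else.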

#### Model and Design
* First we define the model as a sequential model.
* We stack a **Convolution Layer** and a **Pooling Layer**, along with **Dropout layers**.
* Dropout layers provide a simple way to reduce overfitting by randomly dropping units of the network during training.
* This forces more neurons at each layer to learn multiple characteristics of the data, instead of relying on a few specific neurons.
* We flatten the feature maps into a single one-dimensional vector before passing them to the dense layers.
* The last layer of the neural network has as many nodes as there are output classes, i.e. 10, and uses the “softmax” activation function.
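The two less obvious ingredients in this list, dropout and softmax, can be sketched in a few lines of NumPy. This is a simplified illustration, not the Keras code: it uses the common "inverted dropout" formulation, and the helper names `dropout` and `softmax` are defined here only for the example.

```python
import numpy as np

def dropout(x, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

def softmax(logits):
    """Turn a vector of scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

rng = np.random.default_rng(0)
activations = np.ones(8)
# With rate=0.5, every entry ends up as either 0.0 (dropped) or 2.0 (kept and rescaled)
dropped = dropout(activations, rate=0.5, rng=rng)

# The largest score receives the largest probability
probs = softmax(np.array([2.0, 1.0, 0.1]))
print(dropped)
print(probs)
```

Dropout is only applied during training; at evaluation time all units are kept.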


In [15]:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3),
                activation = 'relu',
                input_shape=input_shape))

model.add(MaxPool2D(pool_size=(2,2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes,activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

# Train the model, evaluating on the test set after each epoch
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))



Train on 60000 samples, validate on 10000 samples
Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12


<keras.callbacks.History at 0x7fc763f45128>
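For the layer stack defined above, the output shapes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults for this model: `valid` padding and stride 1 for the convolution, and a pooling stride equal to the pool size.

```python
# Input: 28x28 image with 1 channel
size, channels = 28, 1

# Conv2D(32, kernel_size=(3, 3)), valid padding: each spatial dimension shrinks by 2
filters, kernel = 32, 3
conv_size = size - kernel + 1                                  # 26
conv_params = kernel * kernel * channels * filters + filters   # weights + biases

# MaxPool2D(pool_size=(2, 2)) halves each spatial dimension and has no parameters
pool_size = conv_size // 2                                     # 13

# Flatten turns the 13x13x32 feature maps into one vector
flat_units = pool_size * pool_size * filters                   # 5408

# Dense(128) and Dense(10): inputs * units weights, plus one bias per unit
dense1_params = flat_units * 128 + 128
dense2_params = 128 * 10 + 10

total_params = conv_params + dense1_params + dense2_params
print(flat_units)    # 5408
print(total_params)  # 693962
```

Almost all of the parameters sit in the first dense layer, which is one reason dropout is applied around it.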

In [16]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss', score[0])
print('Test accuracy', score[1])

Test loss 0.0356181707877884
Test accuracy 0.9872
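As a rough illustration of what the accuracy metric measures, the sketch below compares the most probable predicted class with the true class for each sample. The `probs` and `y_true` arrays are made-up toy values, not outputs of the model above.

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples and 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.2, 0.2, 0.6]])
# One-hot true labels for the same 4 samples
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 0, 1]])

predicted = probs.argmax(axis=1)  # most probable class per sample
actual = y_true.argmax(axis=1)    # index of the 1 in each one-hot row
accuracy = np.mean(predicted == actual)
print(accuracy)  # 0.75

# A test accuracy of 0.9872 on 10,000 test images means this many errors:
misclassified = round(10000 * (1 - 0.9872))
print(misclassified)  # 128
```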
