# Digit Recognizer using Python Keras

Author: Enes Kemal Ergin

Code reference: [Keras Documentation](http://keras.io)

ConvNet Intro: [My Explanations](http://eneskemalergin.github.io/2016/04/02/ConvNets_Explained/)

In this notebook I will write a CNN model using Keras to build a digit recognizer. My primary data source is the MNIST dataset. Along the way I will add definitions of terms whose meaning I was not fully aware of before writing this notebook.

In [1]:
import numpy as np
import pandas as pd

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.utils import np_utils

np.random.seed(1337)  # for reproducibility



Using Theano backend.
Couldn't import dot_parser, loading of dot files will not be possible.


## Batch Size?

Batch size is the number of training examples processed together in one iteration, that is, in one forward pass and one backward pass followed by a weight update.

> An epoch is one forward pass and one backward pass over all the training examples. Be careful: an epoch and an iteration are not the same thing. We can work out the number of iterations with this example:

> If we have 1000 training examples and our batch size is 500, then it takes 2 iterations to complete 1 epoch. You can also set the number of epochs in the parameters of your model.

A larger batch size needs more memory during execution, so adjust these parameters carefully.
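
The relationship between examples, batch size and iterations can be checked with a few lines of Python. This is only a sketch: `iterations_per_epoch` is a hypothetical helper, not part of Keras, and it assumes that a final, smaller batch still counts as one iteration.

```python
import math

def iterations_per_epoch(n_samples, batch_size):
    """Number of weight updates needed to see every sample once."""
    # The last batch may be smaller than batch_size, hence the ceiling.
    return int(math.ceil(n_samples / float(batch_size)))

print(iterations_per_epoch(1000, 500))    # 2, the example from the text
print(iterations_per_epoch(60000, 256))   # 235 for the 60000 MNIST training images
```

With the `batch_size = 256` used below, one epoch over the 60000 training images would therefore take 235 iterations under this assumption.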

In [2]:
batch_size = 256
nb_classes = 10
nb_epoch = 12

# input image dimensions
img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
# convolution kernel size
nb_conv = 3

In [3]:
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

Downloading data from https://s3.amazonaws.com/img-datasets/mnist.pkl.gz


In [4]:
X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

('X_train shape:', (60000, 1, 28, 28))
(60000, 'train samples')
(10000, 'test samples')


In [5]:
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
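
To make "binary class matrices" concrete, here is a small NumPy sketch of what such a conversion does. The function `one_hot` is a hypothetical stand-in written for illustration; it is not the Keras implementation of `to_categorical`.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Turn integer labels into rows with a single 1 at the label's index."""
    labels = np.asarray(labels, dtype='int64')
    encoded = np.zeros((labels.shape[0], n_classes), dtype='float32')
    # Fancy indexing: row i gets a 1 in column labels[i].
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

example = one_hot([3, 0, 9], 10)
print(example.shape)           # (3, 10)
print(example.argmax(axis=1))  # recovers the original labels 3, 0, 9
```

Each digit label becomes a row of ten values, which matches the ten outputs of the final softmax layer.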

Now we will start building our model from Keras building blocks.

In [6]:
model = Sequential() # Our model will be sequential

This short section explains the layers of the model in more detail:

- Convolution2D:
    - Input: 4D tensor with shape (samples, channels, rows, columns)
    - nb_filters: number of convolutional filters
    - nb_conv: size of the convolution kernel, used for both its rows and its columns (3 gives a 3x3 kernel)
    - input_shape: the shape of one input sample; only needed in the first layer, since later layers infer their input shape
  
- Activation('relu'):
    - An activation function transforms a set of input signals into an output signal in a specified way
    - Here we use the rectified linear unit (ReLU) function
    - f(x) = max(0,x)
- MaxPooling2D:
    - Pooling is a way of sub-sampling (reducing the dimensions of the input)
    - Max pooling takes the maximum of each group; here each group is an nb_pool x nb_pool (2x2) window, which halves the number of rows and columns
- Dropout:
    - Dropout is a regularization technique
    - When random neurons are dropped out, the network is forced to learn several independent representations of the patterns with identical input and output.
    - 0.25 is the fraction of the input units to drop.
    - Dropout(p): p must be between 0 and 1
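- Flatten:
    - Flattens the multi-dimensional output of the previous layer into one 1D vector per sample, so that it can be fed to a Dense layer
- Dense:
    - A regular fully connected layer: every output unit is connected to every input unit
    - Dense(128) has 128 output units; Dense(nb_classes) has one output unit per digit class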
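
The ReLU, max pooling and dropout operations described in the list above can be sketched in plain NumPy. This is an illustration under stated assumptions, not how Keras implements them: `relu`, `max_pool_2d` and `dropout` are hypothetical helpers, the pooling works on a single 2D feature map with non-overlapping windows, and the dropout uses the "inverted" variant that rescales the surviving units.

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x), applied element-wise
    return np.maximum(0, x)

def max_pool_2d(image, pool=2):
    """Non-overlapping max pooling over a single 2D feature map."""
    rows, cols = image.shape
    # Trim edges that do not fill a whole pool x pool window.
    rows, cols = rows - rows % pool, cols - cols % pool
    blocks = image[:rows, :cols].reshape(rows // pool, pool, cols // pool, pool)
    return blocks.max(axis=(1, 3))

def dropout(x, p, rng):
    """Inverted dropout: zero roughly a fraction p of units, rescale the rest."""
    keep = rng.binomial(1, 1.0 - p, size=x.shape)
    return x * keep / (1.0 - p)

feature_map = np.array([[ 1., -2.,  3.,  0.],
                        [-1.,  5., -6.,  2.],
                        [ 0.,  1., -1., -1.],
                        [ 4., -3.,  2., -8.]])

activated = relu(feature_map)            # negatives become 0
pooled = max_pool_2d(activated, pool=2)  # 4x4 -> 2x2, rows [5, 3] and [4, 2]
flat = pooled.reshape(-1)                # what Flatten does for one sample
print(pooled)
print(flat.shape)                        # (4,)

rng = np.random.RandomState(1337)
print(dropout(np.ones(8), 0.25, rng))    # entries are either 0 or 1 / 0.75
```

Dropout is only active during training; at test time the layer passes its input through unchanged.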


In [7]:
model.add(Convolution2D(nb_filters, nb_conv, nb_conv, border_mode='valid', input_shape=(1, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
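
It can help to trace how the image size changes through this stack. The sketch below is plain arithmetic, not a Keras call. It assumes that both convolutions use 'valid' borders with stride 1 and that pooling uses non-overlapping 2x2 windows. `conv_valid` is a hypothetical helper.

```python
def conv_valid(size, kernel):
    # A 'valid' convolution with stride 1 shrinks each side by kernel - 1.
    return size - kernel + 1

nb_filters, nb_conv, nb_pool, nb_classes = 32, 3, 2, 10

side = conv_valid(28, nb_conv)      # 26 after the first convolution
side = conv_valid(side, nb_conv)    # 24 after the second convolution
side = side // nb_pool              # 12 after 2x2 max pooling
flat = nb_filters * side * side     # 4608 values per sample after Flatten

# Trainable parameters per layer: weights plus one bias per output unit.
params = [
    nb_filters * (1 * nb_conv * nb_conv) + nb_filters,           # conv 1: 320
    nb_filters * (nb_filters * nb_conv * nb_conv) + nb_filters,  # conv 2: 9248
    flat * 128 + 128,                                            # Dense(128): 589952
    128 * nb_classes + nb_classes,                               # Dense(10): 1290
]
total = sum(params)
print("side=%d flat=%d params=%d" % (side, flat, total))
```

Under these assumptions almost all of the 600810 parameters sit in the first Dense layer, which is one reason pooling before Flatten keeps the model small.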

In [8]:
model.compile(loss='categorical_crossentropy', optimizer='adadelta')
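
For intuition about what the softmax output and the `categorical_crossentropy` loss compute, here is a minimal NumPy sketch. The functions are hypothetical re-implementations of the standard formulas for illustration only. They are not the Keras or Theano code, and the clipping constant `eps` is an assumption.

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability; the result is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Mean over samples of -sum(y_true * log(y_pred)).
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

logits = np.array([[2.0, 1.0, 0.1],    # fairly confident in class 0
                   [0.5, 0.5, 0.5]])   # no preference at all
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs.sum(axis=1))  # each row sums to 1
print(loss)
```

The loss only looks at the probability assigned to the true class, so the undecided second row costs more than the confident and correct first row.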

In [None]:
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
          show_accuracy=True, verbose=1, validation_data=(X_test, Y_test))

Train on 60000 samples, validate on 10000 samples
Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12

In [None]:
score = model.evaluate(X_test, Y_test, show_accuracy=True, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
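
The accuracy reported above is the fraction of test images whose most probable predicted class matches the true label. A minimal sketch of that calculation is shown below. It uses a hypothetical `accuracy` helper and made-up probabilities rather than real model output.

```python
import numpy as np

def accuracy(y_true_one_hot, y_prob):
    # Predicted class = index of the largest probability in each row.
    predicted = y_prob.argmax(axis=1)
    actual = y_true_one_hot.argmax(axis=1)
    return float(np.mean(predicted == actual))

# Made-up example: 4 samples, 3 classes, the last prediction is wrong.
y_true_demo = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])
y_prob_demo = np.array([[0.8, 0.1, 0.1],
                        [0.2, 0.7, 0.1],
                        [0.1, 0.2, 0.7],
                        [0.3, 0.6, 0.1]])
print(accuracy(y_true_demo, y_prob_demo))  # 0.75
```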