# Kaggle Digits Recognizer

This notebook shows how to develop a digit recognizer for the Kaggle Digit Recognizer competition.

First, we import all the classes and functions we need.

In [1]:
import os
os.environ["THEANO_FLAGS"] = "mode=FAST_RUN,device=gpu,floatX=float32"
import theano
import numpy
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.utils import np_utils
from keras import backend as K
import pandas as pd
import random
K.set_image_dim_ordering('th')

Using gpu device 0: GeForce GT 755M (CNMeM is disabled, cuDNN not available)
Using Theano backend.


Next we set the random number seed to ensure reproducibility of our results.

In [2]:
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)

Next we load the required data from file.

The training data is located in "train.csv", while the test data is located in "test.csv". The training data includes labels, while the test data does not. The labels predicted for the test data are uploaded to Kaggle to be scored.

In [3]:
# load data
dataset = pd.read_csv("train.csv")
# the label is the first column; select it by position
y_train = dataset.iloc[:, 0].values
X_train = dataset.iloc[:,1:].values
X_test = pd.read_csv("test.csv").values
num_pixels = X_train.shape[1]

print("The shape of the three data sets is: ")
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)

The shape of the three data sets is: 
(42000, 784)
(42000,)
(28000, 784)


Next we reshape the features into flat vectors of pixels and cast them to float32.

In [4]:
# reshape to be [samples][pixels] and cast to float32
X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')
print("The shape of X_train is: " + str(X_train.shape))
print("The shape of X_test is: " + str(X_test.shape))

The shape of X_train is: (42000, 784)
The shape of X_test is: (28000, 784)


We now scale the features from the range 0-255 to the range 0-1 and one-hot encode the labels.

In [5]:
# normalize inputs from 0-255 to 0-1
X_train = X_train / 255
X_test = X_test / 255
# one hot encode outputs
y_train = np_utils.to_categorical(y_train)
num_classes = y_train.shape[1]
y_train.shape

(42000, 10)
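To make the two transformations above concrete, here is a minimal NumPy sketch of what they do on a toy input. The helper `one_hot` is a hypothetical stand-in written for illustration; it is not the implementation of `np_utils.to_categorical`, only an equivalent for integer labels under the assumption that every label lies in `0..num_classes-1`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Illustrative one-hot encoder: row i has a 1 in column labels[i]."""
    labels = np.asarray(labels, dtype=int)
    # Indexing the identity matrix by label picks out one unit row per sample.
    return np.eye(num_classes, dtype="float32")[labels]

# Toy pixel intensities in the original 0-255 range.
pixels = np.array([[0, 128, 255]], dtype="float32")
scaled = pixels / 255  # now in the range 0-1

# Three toy labels encoded over ten digit classes.
encoded = one_hot([3, 0, 9], num_classes=10)
print(scaled)
print(encoded.shape)
```

Each encoded row sums to 1, which matches the shape of the softmax output that the network is trained against.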

## Define a simple network with dropout

We add dropout with a rate of 0.3 after each hidden layer.

In [6]:
# define baseline model
def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, init='normal', activation='relu'))
    model.add(Dropout(0.3))
    model.add(Dense(num_pixels*2, init='normal', activation='relu'))
    model.add(Dropout(0.3))
    model.add(Dense(num_pixels*4, init='normal', activation='relu'))
    model.add(Dropout(0.3))
    model.add(Dense(num_pixels*2, init='normal', activation='relu'))
    model.add(Dropout(0.3))
    model.add(Dense(num_pixels, init='normal', activation='relu'))
    model.add(Dropout(0.3))
    model.add(Dense(num_classes, init='normal', activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
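To get a feel for the size of this network, the following sketch counts its trainable parameters by hand. It assumes `num_pixels = 784` and `num_classes = 10`, as printed earlier in the notebook, and uses the standard count for a fully connected layer (inputs times outputs, plus one bias per output). Dropout layers add no parameters. The helper name `dense_params` is hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

num_pixels = 784   # assumed from the data shapes printed above
num_classes = 10   # assumed from the one-hot encoded label shape

# Layer widths in the order the model stacks them, input first.
widths = [num_pixels, num_pixels, num_pixels * 2, num_pixels * 4,
          num_pixels * 2, num_pixels, num_classes]

per_layer = [dense_params(a, b) for a, b in zip(widths[:-1], widths[1:])]
total = sum(per_layer)
print(per_layer)
print(total)
```

Under these assumptions the total comes to roughly 12.9 million parameters, most of them in the two layers adjacent to the widest hidden layer.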

The model is fit over 20 epochs with a batch size of 100, holding out 10% of the training data for validation.

In [9]:
# build the model
model = baseline_model()
# Fit the model
model.fit(X_train, y_train, validation_split = 0.1, nb_epoch=20, batch_size=100, verbose=2)

Train on 37800 samples, validate on 4200 samples
Epoch 1/20
26s - loss: 0.6970 - acc: 0.8445 - val_loss: 0.1703 - val_acc: 0.9526
Epoch 2/20
26s - loss: 0.1954 - acc: 0.9430 - val_loss: 0.1373 - val_acc: 0.9574
Epoch 3/20
26s - loss: 0.1505 - acc: 0.9574 - val_loss: 0.1328 - val_acc: 0.9617
Epoch 4/20
26s - loss: 0.1329 - acc: 0.9624 - val_loss: 0.1410 - val_acc: 0.9638
Epoch 5/20
26s - loss: 0.1102 - acc: 0.9694 - val_loss: 0.1311 - val_acc: 0.9671
Epoch 6/20
27s - loss: 0.1015 - acc: 0.9721 - val_loss: 0.1043 - val_acc: 0.9740
Epoch 7/20
26s - loss: 0.0908 - acc: 0.9744 - val_loss: 0.1259 - val_acc: 0.9705
Epoch 8/20
26s - loss: 0.0898 - acc: 0.9754 - val_loss: 0.1250 - val_acc: 0.9698
Epoch 9/20
26s - loss: 0.0805 - acc: 0.9780 - val_loss: 0.1074 - val_acc: 0.9733
Epoch 10/20
26s - loss: 0.0815 - acc: 0.9785 - val_loss: 0.1103 - val_acc: 0.9745
Epoch 11/20
26s - loss: 0.0865 - acc: 0.9783 - val_loss: 0.1556 - val_acc: 0.9700
Epoch 12/20
26s - loss: 0.0869 - acc: 0.9792 - val_loss: 0

<keras.callbacks.History at 0x7f57b73e3650>
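The log above can be summarised with a few lines of plain Python. The values below are copied by hand from the eleven complete epochs shown; the output is cut off partway through epoch 12, so later epochs are not included. This is only a reading aid, not part of the original training code.

```python
# Validation metrics for epochs 1-11, copied from the training log above.
val_loss = [0.1703, 0.1373, 0.1328, 0.1410, 0.1311, 0.1043,
            0.1259, 0.1250, 0.1074, 0.1103, 0.1556]
val_acc = [0.9526, 0.9574, 0.9617, 0.9638, 0.9671, 0.9740,
           0.9705, 0.9698, 0.9733, 0.9745, 0.9700]

# Epochs are 1-indexed in the log, so add 1 to the list position.
best_loss_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_acc_epoch = max(range(len(val_acc)), key=val_acc.__getitem__) + 1

print("lowest val_loss at epoch", best_loss_epoch)
print("highest val_acc at epoch", best_acc_epoch)
```

In the epochs shown, the validation loss does not improve on its epoch 6 value, which may suggest that training for the full 20 epochs brings little further benefit on the held-out data.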

In [8]:
#y_test = model.predict(X_test)
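Once predictions are available, they need to be converted from class probabilities into digit labels and written to a file for upload. The sketch below shows one way to do this, assuming the submission file has the columns `ImageId` (starting at 1) and `Label`. The helper names, the file name `submission.csv`, and the stand-in array `fake_probs` are hypothetical; in the notebook, the array would come from `model.predict(X_test)`.

```python
import csv
import numpy as np

def predictions_to_rows(probabilities):
    """Turn an (n_samples, n_classes) probability array into (ImageId, Label) rows."""
    labels = np.argmax(probabilities, axis=1)  # most probable class per sample
    return [(i + 1, int(label)) for i, label in enumerate(labels)]

def write_submission(rows, path="submission.csv"):
    """Write the rows to a CSV file with the assumed header."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["ImageId", "Label"])
        writer.writerows(rows)

# Stand-in for model.predict(X_test): three samples, ten classes.
fake_probs = np.zeros((3, 10))
fake_probs[0, 7] = 1.0
fake_probs[1, 2] = 1.0
fake_probs[2, 0] = 1.0

rows = predictions_to_rows(fake_probs)
print(rows)
# write_submission(rows)  # uncomment to write the file to disk
```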