# Introduction

With deep learning all the rage, I wanted to build a neural network myself to get a feel for how they work.  Neural networks are often treated as black boxes, but experimenting with one is a good way to build intuition.

This notebook trains a simple neural network on the popular MNIST digit recognition dataset, using Kaggle's version of it.  The dataset is part of a <a href="https://www.kaggle.com/c/digit-recognizer">competition</a> that serves as a tutorial for more advanced machine learning algorithms.

In [1]:
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.utils import to_categorical  # keras.utils.np_utils is not available in newer releases
import matplotlib.pyplot as plt
%matplotlib inline

Using TensorFlow backend.


Kaggle's dataset is split into separate files called train.csv and test.csv.  We'll first be using the train.csv file to train our network.

In [2]:
dtrain = pd.read_csv("input/train.csv")
print(dtrain.shape)
dtrain.head()

(42000, 785)


Unnamed: 0,label,pixel0,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,...,pixel774,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783
0,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,4,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


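Here, the label column is our target y and the pixel0 to pixel783 columns are our input features x.  We'll need to split the two into separate variables before training, and one-hot encode the labels to match the network's 10 outputs.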

In [3]:
X_train = dtrain.iloc[:, 1:].astype('float32')
Y_train = to_categorical(dtrain.iloc[:, 0])
num_pixels = X_train.shape[1] # Number of input features (784 pixels per image)
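
To make the label encoding concrete, here is a minimal NumPy sketch of what one-hot encoding produces.  The helper `one_hot` is a hypothetical stand-in written for illustration, not Keras's own implementation; the sample labels are the first five from `dtrain.head()` above.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) float32 array with a 1 in each label's column."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[labels]

# First five labels shown in dtrain.head()
sample_labels = [1, 0, 1, 4, 0]
encoded = one_hot(sample_labels)
print(encoded.shape)  # (5, 10)
print(encoded[3])     # 1.0 at index 4, 0.0 elsewhere
```

Each row has exactly one non-zero entry, which is the shape of target that `categorical_crossentropy` expects.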

In [4]:
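def create_model():
    model = Sequential()
    # Hidden layer: as many units as there are input pixels
    model.add(Dense(num_pixels, input_dim=num_pixels, activation="relu"))
    # Output layer: one unit per digit class (softmax is the more common choice here)
    model.add(Dense(10, activation="sigmoid"))
    model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model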
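
As a rough sense of how large this network is, the sketch below counts its trainable parameters by hand.  It assumes 784 input pixels (785 columns in train.csv minus the label); `dense_params` is a hypothetical helper for illustration, and Keras's `model.summary()` should report the same totals.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units

num_pixels = 784  # assumption: 785 columns in train.csv minus the label column

hidden = dense_params(num_pixels, num_pixels)  # 784 inputs -> 784 hidden units
output = dense_params(num_pixels, 10)          # 784 hidden units -> 10 classes
print(hidden, output, hidden + output)  # 615440 7850 623290
```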

In [5]:
model = create_model()
model.fit(X_train.values, Y_train, epochs=10, batch_size=150)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x1219f4f98>

The model reached only about 20% accuracy during training, which already suggests that the data was not in a form the network could learn from well.  For demonstration purposes, let's evaluate the model's accuracy anyway.

Since the dataset is from a Kaggle competition, we don't have labels for the test set.  We'll just evaluate on the training set instead, keeping in mind that this measures how well the model fits data it has already seen.

In [6]:
print(model.metrics_names)
model.evaluate(X_train.values, Y_train)

['loss', 'acc']


[6.0152765448434016, 0.20780952380952381]
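
If a less optimistic estimate is wanted, one option is to hold back part of train.csv for validation.  The following is only a sketch of that idea, not something this notebook does: `holdout_split`, the 20% fraction, and the seed are all assumptions made for illustration.

```python
import numpy as np

def holdout_split(n_rows, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx): disjoint, shuffled row indices."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_rows)
    n_val = int(round(n_rows * val_fraction))
    return order[n_val:], order[:n_val]

# 42000 rows, matching the shape of train.csv printed earlier
train_idx, val_idx = holdout_split(42000)
print(len(train_idx), len(val_idx))  # 33600 8400
```

The indices could then be used as `X_train.values[train_idx]` and `Y_train[train_idx]` for fitting, with the `val_idx` rows reserved for `model.evaluate`.  Keras's `fit` also accepts a `validation_split` argument, which takes the held-out rows from the end of the data without shuffling.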

# Normalizing the dataset

To train the network properly, we'll need to do some data wrangling.

Each pixel is a grayscale intensity between 0 and 255, so we'll divide every column by 255 to scale the inputs into the range 0 to 1.

In [7]:
X_train = dtrain.iloc[:, 1:].astype('float32') / 255
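
The effect of the division is easy to check on a small array.  The values below are a hypothetical 2x3 patch of pixels, not data taken from train.csv.

```python
import numpy as np

# Hypothetical patch of raw grayscale intensities (0 = black, 255 = white)
raw = np.array([[0, 51, 102], [153, 204, 255]], dtype="uint8")

# Cast to float first so the division is not done in integer arithmetic
scaled = raw.astype("float32") / 255

print(scaled.min(), scaled.max())  # 0.0 1.0
```

Keeping the inputs in a small, consistent range generally makes gradient-based training better behaved than feeding in raw values up to 255.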

We'll now retrain our model using the normalized dataset. 

In [8]:
model = create_model()
model.fit(X_train.values, Y_train, epochs=10, batch_size=150)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x10f4f5518>

The accuracy looks much better after normalization.  Now let's run the model on the test set, applying the same scaling to it.

In [11]:
dtest = pd.read_csv("input/test.csv")
dtest /= 255
dtest.head()

Unnamed: 0,pixel0,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,pixel9,...,pixel774,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [14]:
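# predict() returns one score per class; argmax picks the most likely digit
# (predict_classes() is not available in newer Keras releases)
y_results = model.predict(dtest.values, batch_size=128).argmax(axis=1)
# Kaggle expects ImageId to start at 1
res = pd.DataFrame({"ImageId": list(range(1, len(dtest) + 1)), "Label": y_results})
res = res.set_index("ImageId")
res.to_csv("Results.csv")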
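
Turning the network's ten output scores into a single digit label amounts to taking the index of the largest score in each row.  A minimal sketch with made-up scores for three images (the numbers are hypothetical, not model output):

```python
import numpy as np

# Hypothetical output scores for three images over the ten digit classes
scores = np.zeros((3, 10), dtype="float32")
scores[0, 7] = 0.9
scores[1, 2] = 0.8
scores[2, 0] = 0.6

# Index of the highest score in each row is the predicted digit
labels = scores.argmax(axis=1)
print(labels)  # [7 2 0]
```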



After submitting the file to Kaggle, where it is scored against the hidden test labels, the model achieved an accuracy of 97-98%.