# Building a Simple Neural Network with Keras

## The Problem: MNIST Handwritten Digit Classification

The MNIST data set is a classic deep learning benchmark: a collection of images of handwritten digits from 0 to 9. Each image is 28x28 grayscale pixels, which flattens to 784 values per image.

Keras is a simple and powerful deep learning library for Python. You can learn more by reading the <a href='https://keras.io/getting_started/intro_to_keras_for_engineers/'>documentation</a>.

In [7]:
import numpy as np
import keras
import pandas as pd

import warnings
warnings.filterwarnings("ignore")

## Data set

First we load the data set. You can download it in CSV format from here: http://pjreddie.com/projects/mnist-in-csv/

In [25]:
# load the training data; each line of the CSV is one image (label, then 784 pixels)
with open('/Users/abdygaziev/Documents/FlatironMaterials/Projects/data/mnist/mnist_train.csv', 'r') as train_data_file:
    train_data_list = train_data_file.readlines()

# load the test data
with open('/Users/abdygaziev/Documents/FlatironMaterials/Projects/data/mnist/mnist_test.csv', 'r') as test_data_file:
    test_data_list = test_data_file.readlines()

In [26]:
print('Number of training examples: ',len(train_data_list))
print('Number of test examples: ',len(test_data_list))

Number of training examples:  60000
Number of test examples:  10000


## Data Preparation

Let's split labels and features into separate data sets.

In [27]:
# y - targets (the digit label, first column of each row)
# X - features (the 784 pixel values, remaining columns)
y_train = []
X_train = []

for line in train_data_list:
    values = line.strip().split(',')  # strip the trailing newline before splitting
    y_train.append(values[0])
    X_train.append(values[1:])

y_test = []
X_test = []

for line in test_data_list:
    values = line.strip().split(',')
    y_test.append(values[0])
    X_test.append(values[1:])
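
As a rough illustration of the split above, the same idea can be sketched with Python's standard `csv` module on a tiny in-memory sample. The helper name `split_labels_and_pixels` and the two sample rows are made up for this example (real MNIST rows have 784 pixel columns, not 4).

```python
import csv
import io


def split_labels_and_pixels(lines):
    """Split MNIST-style CSV rows into a list of labels and a list of pixel rows."""
    labels, pixels = [], []
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        labels.append(int(row[0]))                 # first column is the digit label
        pixels.append([int(v) for v in row[1:]])   # remaining columns are pixel values
    return labels, pixels


# Two hypothetical rows with only 4 pixels each, just to show the shape of the data.
sample = io.StringIO("5,0,128,255,0\n0,12,0,0,7\n")
labels, pixels = split_labels_and_pixels(sample)
print(labels)  # [5, 0]
print(pixels)  # [[0, 128, 255, 0], [12, 0, 0, 7]]
```

Using `csv.reader` takes care of the trailing newline on each line, which a plain `split(',')` leaves attached to the last value.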

In [28]:
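# converting to numpy arrays of floats
# (np.asfarray was removed in NumPy 2.0; np.asarray with dtype=float is equivalent)
y_train = np.asarray(y_train, dtype=float)
X_train = np.asarray(X_train, dtype=float)

y_test = np.asarray(y_test, dtype=float)
X_test = np.asarray(X_test, dtype=float)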

In [29]:
train_images = X_train.reshape((-1, 784))
test_images = X_test.reshape((-1, 784))

# check the shapes
print('y_train shape:',y_train.shape)
print('X_train shape: ',X_train.shape)

print('y_test shape:',y_test.shape)
print('X_test shape: ',X_test.shape)

y_train shape: (60000,)
X_train shape:  (60000, 784)
y_test shape: (10000,)
X_test shape:  (10000, 784)


Next we normalize the data. Instead of pixel values in the range [0, 255], we scale and shift them into the range [-0.5, 0.5]. Smaller, centered input values usually make a network easier to train.

In [30]:
# Normalize the images.
train_images = (train_images / 255) - 0.5
test_images = (test_images / 255) - 0.5
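
To see what this transformation does, here is a small check on three hand-picked pixel values (the array `pixels` is made up for the example): the darkest pixel maps to -0.5, the brightest to 0.5, and the midpoint to 0.

```python
import numpy as np

# Three illustrative pixel intensities: minimum, midpoint, maximum.
pixels = np.array([0.0, 127.5, 255.0])

# Same formula as in the notebook: scale to [0, 1], then shift to [-0.5, 0.5].
scaled = pixels / 255 - 0.5

print(scaled.tolist())  # [-0.5, 0.0, 0.5]
```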

## Building the Model

Keras lets you build models with either the **Sequential** or the **Functional** API. A Sequential model is the simplest kind: a linear stack of layers in which each layer feeds the next. Here every layer is fully connected (`Dense`). The Functional API is more customizable. Here we're going to build a Sequential model.

In [36]:
# instantiate model
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(64,activation='relu'),
    Dense(64,activation='relu'),
    Dense(10,activation='softmax')
])

The first and second layers each have 64 nodes with the <a href='https://en.wikipedia.org/wiki/Rectifier_(neural_networks)'>ReLU</a> activation function. The output layer has 10 nodes, one for each label, with a <a href='https://en.wikipedia.org/wiki/Softmax_function'>Softmax</a> activation function.
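
To get a feel for the size of this network, we can count its trainable parameters by hand. The sketch below assumes 784 input features (the flattened 28x28 image) and that each `Dense` layer has one weight per input-output pair plus one bias per output node; the helper `dense_param_count` is a name invented for this example.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers.

    layer_sizes lists the number of units per layer, starting with the input size.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total


# 784 inputs -> 64 -> 64 -> 10 outputs
print(dense_param_count([784, 64, 64, 10]))  # 55050
```

Most of the parameters (50,240 of them) sit in the first layer, because it connects all 784 pixels to 64 nodes.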

## Compile the Model

Before we start training, we need to compile the model. Compiling requires three key choices:
* Optimizer - the algorithm that updates the weights (a variant of gradient descent)
* Loss function
* Metric

Keras has many <a href='https://keras.io/api/optimizers/'>optimizers</a>. In our model we will use <a href='https://arxiv.org/abs/1412.6980'>**Adam**, a gradient-based optimizer</a>.
For the loss function we will use **cross-entropy loss** (`categorical_crossentropy`), which expects one-hot encoded labels. To learn more about loss functions, see the Keras documentation: <a href='https://keras.io/api/losses/'>Keras' loss functions</a>. As for the metric, we'll use **accuracy**.
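
To make the loss less abstract, here is a minimal pure-Python sketch of softmax followed by categorical cross-entropy for a single example with three classes. It illustrates the formula only and is not how Keras implements it; the function names and the input scores are made up for the example.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot, probs):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)


probs = softmax([2.0, 1.0, 0.1])  # the model favors class 0

loss_correct = categorical_crossentropy([1, 0, 0], probs)  # true class is 0
loss_wrong = categorical_crossentropy([0, 0, 1], probs)    # true class is 2

print(loss_correct < loss_wrong)  # True
```

The loss is small when the model puts high probability on the true class and grows as that probability shrinks.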


In [41]:
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

## Training the Model

In [42]:
from keras.utils import to_categorical

model.fit(
    x=train_images, #train data-set
    y=to_categorical(y_train), #labels
    epochs=5,
    batch_size=32
)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.callbacks.History at 0x7f7ebce536d0>

Great! After 5 epochs of training, the model reached 0.9790 accuracy on the training data. That looks promising, but training accuracy alone doesn't tell us how well the model generalizes. We need to test it on data it hasn't seen.
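
The `to_categorical` call in the training cell converts each digit label into a one-hot vector of length 10, which is the format `categorical_crossentropy` expects. A rough NumPy equivalent, written here only to show the idea (the helper name `one_hot` is an assumption, not a Keras function), might look like this:

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Return one row per label with a 1.0 in the label's column and 0.0 elsewhere."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes)[labels]  # pick rows of the identity matrix


encoded = one_hot([3, 0, 9])
print(encoded.shape)        # (3, 10)
print(encoded[0].tolist())  # 1.0 at index 3, 0.0 everywhere else
```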

## Testing the Model

In [43]:
model.evaluate(
  test_images,
  to_categorical(y_test)
)



[0.088274097120855, 0.9717000126838684]

`model.evaluate` returns the loss followed by the metric. On the test set, the model's loss is 0.088 and its accuracy is 0.9717. Not bad at all: the accuracy is only slightly lower than on the training data.
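
The accuracy metric itself is simple: take the class with the highest predicted probability for each example and count how often it matches the true label. Here is a small sketch with four made-up predictions over three classes (the function name and data are illustrative only):

```python
import numpy as np


def accuracy(probabilities, true_labels):
    """Fraction of rows whose highest-probability class equals the true label."""
    predicted = np.argmax(probabilities, axis=1)
    return float(np.mean(predicted == np.asarray(true_labels)))


# Hypothetical softmax outputs for 4 examples and 3 classes.
probs = np.array([
    [0.1, 0.7, 0.2],  # predicts class 1
    [0.8, 0.1, 0.1],  # predicts class 0
    [0.3, 0.3, 0.4],  # predicts class 2
    [0.2, 0.5, 0.3],  # predicts class 1
])
labels = [1, 0, 2, 2]  # the last prediction is wrong

print(accuracy(probs, labels))  # 0.75
```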

## Experiment with Model

Let's try out different parameters to compare the results.

### Number of epochs?

In [47]:
model = Sequential([
    Dense(64,activation='relu'),
    Dense(64,activation='relu'),
    Dense(10,activation='softmax')
])

model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)


from keras.utils import to_categorical

model.fit(
    x=train_images, #train data-set
    y=to_categorical(y_train), #labels
    epochs=10,
    batch_size=32
)

print('test accuracy: ')

model.evaluate(
  test_images,
  to_categorical(y_test)
)


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
test accuracy: 


[0.10760164717165753, 0.9682999849319458]

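It looks like the test accuracy dropped slightly with more epochs (0.9683 versus 0.9717 after 5 epochs). This may be a sign of overfitting, although a difference this small could also come from random weight initialization.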
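
One common way to check for overfitting is to hold out part of the training data for validation and watch the validation loss after each epoch. The sketch below shows the stopping logic only, in plain Python. The loss values are hypothetical and were not produced by the model above, and `best_epoch` is a name invented for this example.

```python
def best_epoch(val_losses, patience=2):
    """Return the 1-based epoch with the lowest validation loss seen before
    `patience` consecutive epochs fail to improve on it."""
    best_loss = float("inf")
    best_idx = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_idx, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # no improvement for `patience` epochs: stop here
    return best_idx


# Hypothetical validation losses, one per epoch (not real results).
hypothetical_val_losses = [0.30, 0.18, 0.12, 0.10, 0.11, 0.13, 0.09]
print(best_epoch(hypothetical_val_losses))  # 4
```

In Keras, the same idea is available through the `validation_split` argument of `model.fit` together with the `keras.callbacks.EarlyStopping` callback, which would be the more practical choice in a real run.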

### Network Depth?

In [44]:
# more layers
model = Sequential([
    Dense(64,activation='relu'),
    Dense(64,activation='relu'),
    Dense(64,activation='relu'),
    Dense(64,activation='relu'),
    Dense(10,activation='softmax')
])

model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)


from keras.utils import to_categorical

model.fit(
    x=train_images, #train data-set
    y=to_categorical(y_train), #labels
    epochs=5,
    batch_size=32
)


model.evaluate(
  test_images,
  to_categorical(y_test)
)


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[0.12213691611355171, 0.9623000025749207]

### Different Activation: Sigmoid?

In [45]:

model = Sequential([
    Dense(64,activation='sigmoid'),
    Dense(64,activation='sigmoid'),
    Dense(10,activation='softmax')
])

model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)


from keras.utils import to_categorical

model.fit(
    x=train_images, #train data-set
    y=to_categorical(y_train), #labels
    epochs=5,
    batch_size=32
)

print('test accuracy: ')

model.evaluate(
  test_images,
  to_categorical(y_test)
)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
test accuracy: 


[0.13002270174250008, 0.9621000289916992]
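
For reference, here is how the two activation functions behave on a few sample inputs. This is a plain-Python sketch with made-up inputs. It does not by itself explain the accuracy difference above, but it shows one commonly cited property: the sigmoid's gradient becomes very small for large positive or negative inputs, while ReLU passes positive inputs through unchanged.

```python
import math


def relu(x):
    """Rectified linear unit: 0 for negative inputs, identity otherwise."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid: squashes any input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; it peaks at 0.25 when x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


for x in (-4.0, 0.0, 4.0):
    print(f"x={x:+.1f}  relu={relu(x):.3f}  "
          f"sigmoid={sigmoid(x):.3f}  sigmoid_grad={sigmoid_grad(x):.3f}")
```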

## Conclusion

You can tune the hyper-parameters of your model to achieve the desired outcome. We implemented a 4-layer (input, 2 hidden and output) neural network using <a href='https://keras.io'>Keras</a> and achieved about 98% accuracy on the training data set and about 97% on the test data set.

As you can see above, we can try a model with different settings and compare the results. The results vary with each setting, so we should always test the model and try different parameters.
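
To compare the experiments side by side, the test results reported above can be collected in a small table. The numbers below are the loss and accuracy values printed by `model.evaluate` in each section, rounded to four decimals; the short configuration labels are written for this summary.

```python
# (configuration, test loss, test accuracy) as reported in the sections above
results = [
    ("baseline: 2x64 ReLU, 5 epochs", 0.0883, 0.9717),
    ("2x64 ReLU, 10 epochs", 0.1076, 0.9683),
    ("4x64 ReLU, 5 epochs", 0.1221, 0.9623),
    ("2x64 sigmoid, 5 epochs", 0.1300, 0.9621),
]

# Sort from highest to lowest test accuracy.
ranked = sorted(results, key=lambda r: r[2], reverse=True)
best = ranked[0]

for name, loss, acc in ranked:
    print(f"{name:32s} loss={loss:.4f}  accuracy={acc:.4f}")
```

In these runs the original configuration scored highest, but the gaps are small, so repeating each run a few times would give a more reliable comparison.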