## Introduction
In this kernel, we apply several neural network models to the MNIST data set and compare their performance.

## About MNIST Dataset
The dataset contains images of handwritten digits (0 to 9). The task is to build a model that identifies which digit a given image shows.

In [None]:
import pandas as pd
import numpy as np

Load the data set and separate the labels from the pixel values.

In [None]:
train = pd.read_csv('../input/train.csv')

# Separate the target column from the pixel columns.
labels = train['label'].values
images = train.drop(labels=['label'], axis=1).values
del train  # only the two arrays are needed from here on

## Data Preprocessing

In [None]:
images = images / 255                    # scale pixel intensities to [0, 1]
images = images.reshape(-1, 28, 28, 1)   # (samples, height, width, channels)
images.shape
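
To make the two preprocessing steps concrete, here is a minimal sketch on a tiny synthetic array. The array `fake_pixels` is made up for illustration and is not part of the dataset. Dividing by 255 maps 8-bit pixel intensities into the range [0, 1], and the reshape turns each flat row of 784 values into a 28 × 28 image with a single channel.

```python
import numpy as np

# Hypothetical stand-in for the pixel columns: 3 "images" of 784 values each.
fake_pixels = np.arange(3 * 784).reshape(3, 784) % 256

scaled = fake_pixels / 255                  # intensities now lie in [0, 1]
as_images = scaled.reshape(-1, 28, 28, 1)   # (samples, height, width, channels)

print(as_images.shape)
print(scaled.min(), scaled.max())
```

The reshape is row-major, so the first 28 values of a row become the first row of the image, the next 28 the second row, and so on.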

In [None]:
from sklearn.model_selection import train_test_split
# Hold out 20% of the labelled data to measure accuracy on unseen images.
train_images, test_images, train_labels, test_labels = train_test_split(
    images, labels, test_size=0.2, random_state=0)

In [None]:
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten

## 1. Single Layer Neural Network

In [None]:
model = Sequential()
model.add(Flatten(input_shape = (28,28,1)))
model.add(Dense(10,activation = 'softmax'))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

In [None]:
model.fit(train_images,train_labels,epochs = 10)

In [None]:
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('Test accuracy:', test_acc)
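
The model in this section is a single dense layer with a softmax activation, trained with sparse categorical cross-entropy. As a rough illustration of what those two pieces compute for one sample, here is a plain-Python sketch. It is not how Keras implements them internally, and the `logits` values are made up.

```python
import math

def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(probs, label):
    """Loss for one sample: negative log-probability of the true class index."""
    return -math.log(probs[label])

# Hypothetical scores for the 10 digit classes; class 3 has the largest score.
logits = [0.1, 0.2, 0.0, 2.5, 0.3, 0.1, 0.0, 0.4, 0.2, 0.1]
probs = softmax(logits)
predicted = probs.index(max(probs))

print(predicted)
print(sparse_categorical_crossentropy(probs, 3))
```

The "sparse" variant takes the label as an integer class index, which is why the notebook can pass `train_labels` directly without one-hot encoding.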

Next, we see how accuracy changes when we add a hidden layer.

## 2. Two Layer Neural Network

### 2.1 With 32 Neurons in hidden layer

In [None]:
model = Sequential()
model.add(Flatten(input_shape = (28,28,1)))
model.add(Dense(32,activation = 'relu'))
model.add(Dense(10,activation = 'softmax'))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

In [None]:
model.fit(train_images,train_labels,epochs = 10)

In [None]:
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('Test accuracy:', test_acc)

Accuracy appears to improve when a hidden layer is added. Next, we look at what happens when we increase the number of neurons in that layer, repeating the experiment with 64 and 128 neurons.

### 2.2 With 64 Neurons in hidden layer

In [None]:
model = Sequential()
model.add(Flatten(input_shape = (28,28,1)))
model.add(Dense(64,activation = 'relu'))
model.add(Dense(10,activation = 'softmax'))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

In [None]:
model.fit(train_images,train_labels,epochs = 10)

In [None]:
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('Test accuracy:', test_acc)

### 2.3 With 128 Neurons in hidden layer

In [None]:
model = Sequential()
model.add(Flatten(input_shape = (28,28,1)))
model.add(Dense(128,activation = 'relu'))
model.add(Dense(10,activation = 'softmax'))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

In [None]:
model.fit(train_images,train_labels,epochs = 10)

In [None]:
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('Test accuracy:', test_acc)

From our observations with 32, 64, and 128 neurons in the hidden layer, accuracy appears to increase with the number of neurons.
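
One way to see what changes between these three models is to count their trainable parameters. The helper below is a small sketch (the function name `dense_param_count` is ours, not part of Keras) that applies the usual rule for a dense layer: one weight per input-output pair plus one bias per output unit. The totals should agree with what `model.summary()` reports for each model.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the units per layer, starting with the input size.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total

# 28 * 28 = 784 inputs, one hidden layer, 10 output classes.
for hidden in (32, 64, 128):
    print(hidden, dense_param_count([784, hidden, 10]))
```

Doubling the hidden layer roughly doubles the parameter count, so the wider models have more capacity but also take more memory and computation per epoch.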

## 3. Multilayer Neural Network

### 3.1 Three hidden layers

In [None]:
model = Sequential()
model.add(Flatten(input_shape = (28,28,1)))
model.add(Dense(32,activation = 'relu'))
model.add(Dense(64,activation = 'relu'))
model.add(Dense(128,activation = 'relu'))
model.add(Dense(10,activation = 'softmax'))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

In [None]:
model.fit(train_images,train_labels,epochs = 10)

In [None]:
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('Test accuracy:', test_acc)

### 3.2 Four hidden layers

In [None]:
model = Sequential()
model.add(Flatten(input_shape = (28,28,1)))
model.add(Dense(32,activation = 'relu'))
model.add(Dense(64,activation = 'relu'))
model.add(Dense(128,activation = 'relu'))
model.add(Dense(256,activation = 'relu'))
model.add(Dense(10,activation = 'softmax'))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

In [None]:
model.fit(train_images,train_labels,epochs = 10)

In [None]:
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('Test accuracy:', test_acc)

Increasing the number of hidden layers to three and four gave slightly lower accuracy (0.9697 and 0.9709, respectively) than the two-layer neural network (0.972), so the two-layer network performed best here. This may be due to overfitting in the deeper multilayer perceptrons, although differences this small could also come from run-to-run variation in training.
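
The overfitting explanation can be checked by comparing accuracy on the training data with accuracy on held-out data after each epoch: a gap that keeps widening is the usual sign. The sketch below uses made-up numbers purely for illustration; they are not results from this notebook, and `generalization_gap` is a hypothetical helper.

```python
def generalization_gap(train_acc, val_acc):
    """Return the per-epoch difference between training and held-out accuracy."""
    return [t - v for t, v in zip(train_acc, val_acc)]

# Made-up per-epoch accuracies, for illustration only.
train_acc = [0.90, 0.95, 0.98, 0.99]
val_acc = [0.89, 0.94, 0.95, 0.95]

gaps = generalization_gap(train_acc, val_acc)
print([round(g, 2) for g in gaps])
```

With Keras, per-epoch values like these can be collected by passing `validation_data=(test_images, test_labels)` to `model.fit` and reading the `history` dictionary of the object it returns. The key names for accuracy differ between Keras versions, so they are best looked up with `history.history.keys()`.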

In [None]:
from keras.layers import Conv2D, MaxPool2D, Dropout

## 4. Convolutional Neural Network
We use a kernel size of 5 × 5 for the first convolution layers. In section 4.2, the later convolution layers use 3 × 3.
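
Kernel size and padding determine how the 28 × 28 input shrinks as it passes through the network. The sketch below traces the spatial size through the layer stacks defined in sections 4.1 and 4.2. It assumes the Keras defaults that those models rely on: a convolution stride of 1, `'valid'` padding unless stated otherwise, and a pooling stride equal to the pool size. The helper names `output_size` and `trace` are ours.

```python
def output_size(size, window, padding="valid", stride=1):
    """Output length along one spatial axis for a convolution or pooling window."""
    if padding == "same":
        return -(-size // stride)          # ceil(size / stride)
    return (size - window) // stride + 1   # 'valid': no padding

def trace(input_size, steps):
    """Apply (window, padding, stride) steps in order and return every size."""
    sizes = [input_size]
    for window, padding, stride in steps:
        sizes.append(output_size(sizes[-1], window, padding, stride))
    return sizes

# Section 4.1: one 5x5 'same' convolution, then 2x2 max pooling with stride 2.
one_conv = trace(28, [(5, "same", 1), (2, "valid", 2)])

# Section 4.2: 5x5 'same', 5x5 'valid', 2x2 pooling, then two 3x3 'valid' convolutions.
multi_conv = trace(28, [(5, "same", 1), (5, "valid", 1), (2, "valid", 2),
                        (3, "valid", 1), (3, "valid", 1)])

print(one_conv, one_conv[-1] ** 2 * 32)      # flattened length with 32 filters
print(multi_conv, multi_conv[-1] ** 2 * 64)  # flattened length with 64 filters
```

The last number on each line is the length of the vector that `Flatten()` hands to the 256-unit dense layer.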

### 4.1 One convolution layer

In [None]:
model = Sequential()

model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same',
                 activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPool2D(pool_size=(2,2)))

model.add(Flatten())
model.add(Dense(256, activation = "relu"))
model.add(Dropout(0.5))
model.add(Dense(10, activation = "softmax"))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

In [None]:
model.fit(train_images,train_labels,epochs = 10)

In [None]:
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('Test accuracy:', test_acc)

### 4.2 More than one convolution layer

In [None]:
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same',
                 activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(filters = 32,kernel_size = (5,5),activation = 'relu'))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Conv2D(filters = 64,kernel_size = (3,3),activation = 'relu'))
model.add(Conv2D(filters = 64,kernel_size = (3,3),activation = 'relu'))
model.add(Flatten())
model.add(Dense(256, activation = "relu"))
model.add(Dropout(0.5))
model.add(Dense(10, activation = "softmax"))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

In [None]:
model.fit(train_images,train_labels,epochs = 10)

In [None]:
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('Test accuracy:', test_acc)

## Result

The convolutional neural networks had the highest accuracy. This may be due to a CNN's ability to learn hidden features, such as local spatial patterns in the image, more effectively than fully connected layers.

