<img src="https://www.kaggle.com/static/images/site-logo.png" width="100px">

**MNIST with Artificial Neural Networks**

<img src="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQBeNgBeTlsnYyxHtEmAhENJwmspXjJbMuEq7uT3CvzZ7SWmh52zw" width="800px">

**Importing Basic Libraries**

In [1]:
# for basic operations
import numpy as np 
import pandas as pd 

# for providing the path
import os
print(os.listdir("../input"))


['mnist_train.csv', 'mnist_test.csv']


**Importing the Data**

In [2]:
# reading the data

train = pd.read_csv('../input/mnist_train.csv')
test = pd.read_csv('../input/mnist_test.csv')

print(train.shape)
print(test.shape)

(60000, 785)
(10000, 785)


In [3]:
# It has 10 digits ranging from 0-9

train.iloc[:,0].value_counts()

1    6742
7    6265
3    6131
2    5958
9    5949
0    5923
6    5918
8    5851
4    5842
5    5421
Name: label, dtype: int64

In [4]:
# splitting the data into dependent and independent variables

train_x = train.iloc[:,1:785]
train_y = train.iloc[:,0]

test_x = test.iloc[:,1:785]
test_y = test.iloc[:,0]

print(train_x.shape)
print(train_y.shape)

print(test_x.shape)
print(test_y.shape)


(60000, 784)
(60000,)
(10000, 784)
(10000,)


In [5]:
# splitting the training data into training and validation (cross-validation) sets

from sklearn.model_selection import train_test_split

x_train, x_cv, y_train, y_cv = train_test_split(train_x, train_y, test_size = 0.25, random_state = 35)

print(x_train.shape)
print(x_cv.shape)
print(y_train.shape)
print(y_cv.shape)

(45000, 784)
(15000, 784)
(45000,)
(15000,)


In [6]:
# converting the DataFrames to NumPy arrays
# (the shapes are already (n_samples, 784), so no reshape is needed)

x_train = np.asarray(x_train)
x_cv = np.asarray(x_cv)

test_x = np.asarray(test_x)

In [7]:
# feature normalization

x_train = x_train.astype('float32')
x_cv = x_cv.astype('float32')

test_x = test_x.astype('float32')

x_train = x_train/255
x_cv = x_cv/255
test_x = test_x/255

In [8]:
# converting the labels into one hot encoding

# importing keras
import keras

digits = 10
y_train = keras.utils.to_categorical(y_train, digits)
y_cv = keras.utils.to_categorical(y_cv, digits)



Using TensorFlow backend.


In [9]:
# viewing the labels after one hot encoding

print(y_train[0])  #6
print(y_train[3])  #1
print(y_train[4])  #3

[0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
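
To make the encoding above concrete, here is a minimal plain-Python sketch of one hot encoding. It is only an illustration of the idea, not how `keras.utils.to_categorical` is implemented; the helper names `one_hot` and `decode` are made up for this example.

```python
def one_hot(labels, num_classes):
    """Return one row per label, with a 1.0 at the label's index and 0.0 elsewhere."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


def decode(rows):
    """Recover each digit as the position of the largest value in its row."""
    return [row.index(max(row)) for row in rows]


encoded = one_hot([6, 1, 3], 10)
decoded = decode(encoded)

for row in encoded:
    print(row)
print(decoded)
```

Each row has exactly one non-zero entry, so the original digit can always be read back from the position of that entry.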


**Modelling with Artificial Neural Networks**

In [10]:
# ARTIFICIAL NEURAL NETWORKS

import keras 
from keras.layers import Dense 
from keras.models import Sequential
from keras import optimizers

<img src="https://icdn5.digitaltrends.com/image/artificial_neural_network_1-791x388.jpg" width="1000px">

In [11]:
# creating the model
model = Sequential()

# first hidden layer
# (Keras 2 argument names: units replaces output_dim, kernel_initializer replaces init)
model.add(Dense(units = 400, kernel_initializer = 'uniform', activation = 'relu', input_dim = 784))

# second hidden layer
model.add(Dense(units = 300, kernel_initializer = 'uniform', activation = 'relu'))

# third hidden layer
model.add(Dense(units = 300, kernel_initializer = 'uniform', activation = 'relu'))

# fourth hidden layer
model.add(Dense(units = 300, kernel_initializer = 'uniform', activation = 'relu'))

# fifth hidden layer
model.add(Dense(units = 100, kernel_initializer = 'uniform', activation = 'relu'))

# output layer
# units = no. of digits
# softmax activation function is used for multiple outputs
model.add(Dense(units = 10, kernel_initializer = 'uniform', activation = 'softmax'))


**Model Summary**

In [12]:
# looking at the model summary

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 400)               314000    
_________________________________________________________________
dense_2 (Dense)              (None, 300)               120300    
_________________________________________________________________
dense_3 (Dense)              (None, 300)               90300     
_________________________________________________________________
dense_4 (Dense)              (None, 300)               90300     
_________________________________________________________________
dense_5 (Dense)              (None, 100)               30100     
_________________________________________________________________
dense_6 (Dense)              (None, 10)                1010      
Total params: 646,010
Trainable params: 646,010
Non-trainable params: 0
_________________________________________________________________
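
The parameter counts in the summary above can be checked by hand: a dense layer has one weight per (input, unit) pair plus one bias per unit. The short sketch below redoes that arithmetic for the layer sizes used in this kernel; the helper name `dense_params` is made up for the illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


# input size followed by the size of every Dense layer in the model
layer_sizes = [784, 400, 300, 300, 300, 100, 10]

per_layer = [dense_params(n_in, n_out)
             for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [314000, 120300, 90300, 90300, 30100, 1010]
print(total)      # 646010
```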


In [13]:
# setting the learning rate
learning_rate = 0.01
sgd = optimizers.SGD(lr = learning_rate)


# compiling the model
# using the plain vanilla stochastic gradient descent as our optimizing technique
# using categorical cross entropy for multiple outputs
# passing the sgd object (not the string 'sgd') so that the learning rate set above is actually used
model.compile(optimizer = sgd, loss = 'categorical_crossentropy', metrics = ['accuracy'])

# feeding the training data to the model
model.fit(x_train, y_train, batch_size = 100, epochs = 60, verbose = 2, validation_data = (x_cv, y_cv))

Train on 45000 samples, validate on 15000 samples
Epoch 1/60
 - 7s - loss: 2.3020 - acc: 0.1097 - val_loss: 2.3014 - val_acc: 0.1129
Epoch 2/60
 - 7s - loss: 2.3012 - acc: 0.1122 - val_loss: 2.3008 - val_acc: 0.1129
Epoch 3/60
 - 7s - loss: 2.3008 - acc: 0.1122 - val_loss: 2.3005 - val_acc: 0.1129
Epoch 4/60
 - 7s - loss: 2.3005 - acc: 0.1122 - val_loss: 2.3002 - val_acc: 0.1129
Epoch 5/60
 - 7s - loss: 2.3001 - acc: 0.1122 - val_loss: 2.2998 - val_acc: 0.1129
Epoch 6/60
 - 7s - loss: 2.2997 - acc: 0.1122 - val_loss: 2.2993 - val_acc: 0.1129
Epoch 7/60
 - 7s - loss: 2.2992 - acc: 0.1122 - val_loss: 2.2987 - val_acc: 0.1129
Epoch 8/60
 - 7s - loss: 2.2984 - acc: 0.1122 - val_loss: 2.2977 - val_acc: 0.1129
Epoch 9/60
 - 7s - loss: 2.2972 - acc: 0.1122 - val_loss: 2.2962 - val_acc: 0.1129
Epoch 10/60
 - 7s - loss: 2.2952 - acc: 0.1122 - val_loss: 2.2936 - val_acc: 0.1129
Epoch 11/60
 - 7s - loss: 2.2914 - acc: 0.1123 - val_loss: 2.2880 - val

<keras.callbacks.History at 0x7ff57404f860>
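
The loss of about 2.30 in the early epochs above is worth a second look. It matches ln(10) ≈ 2.3026, which is the categorical cross entropy of a softmax output that gives every one of the ten digits the same probability. Together with an accuracy close to the share of the most frequent digit, this suggests the network was still guessing almost uniformly at that point. The plain-Python sketch below reproduces the number; the function names are chosen for the illustration and are not Keras APIs.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_target, probs):
    """Negative log of the probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs))


# equal scores for all ten digits give a uniform prediction
probs = softmax([0.0] * 10)

# one hot target for an arbitrary digit (6 here)
target = [0.0] * 10
target[6] = 1.0

uniform_loss = categorical_crossentropy(target, probs)
print(round(uniform_loss, 4))  # 2.3026
```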

In [15]:
# setting the learning rate
learning_rate = 0.01
adam = optimizers.Adam(lr = learning_rate)


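# compiling the model
# using the Adam optimizer this time (the model is not re-created, so training continues from the weights learned with SGD)
# note: the string 'adam' selects Keras' default Adam settings (learning rate 0.001); pass optimizer = adam to apply the learning_rate set above
# using categorical cross entropy for multiple outputs
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])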

# feeding the training data to the model
model.fit(x_train, y_train, batch_size = 100, epochs = 20, verbose = 2, validation_data = (x_cv, y_cv))

Train on 45000 samples, validate on 15000 samples
Epoch 1/20
 - 8s - loss: 0.1505 - acc: 0.9600 - val_loss: 0.1826 - val_acc: 0.9495
Epoch 2/20
 - 8s - loss: 0.0821 - acc: 0.9737 - val_loss: 0.1450 - val_acc: 0.9603
Epoch 3/20
 - 8s - loss: 0.0657 - acc: 0.9797 - val_loss: 0.1425 - val_acc: 0.9611
Epoch 4/20
 - 8s - loss: 0.0592 - acc: 0.9812 - val_loss: 0.1370 - val_acc: 0.9647
Epoch 5/20
 - 8s - loss: 0.0469 - acc: 0.9850 - val_loss: 0.1130 - val_acc: 0.9701
Epoch 6/20
 - 8s - loss: 0.0408 - acc: 0.9872 - val_loss: 0.1246 - val_acc: 0.9691
Epoch 7/20
 - 8s - loss: 0.0392 - acc: 0.9875 - val_loss: 0.1574 - val_acc: 0.9611
Epoch 8/20
 - 8s - loss: 0.0352 - acc: 0.9889 - val_loss: 0.1198 - val_acc: 0.9696
Epoch 9/20
 - 9s - loss: 0.0337 - acc: 0.9901 - val_loss: 0.1259 - val_acc: 0.9709
Epoch 10/20
 - 8s - loss: 0.0285 - acc: 0.9909 - val_loss: 0.1246 - val_acc: 0.9706
Epoch 11/20
 - 8s - loss: 0.0257 - acc: 0.9922 - val_loss: 0.1301 - val_acc: 0.9705
Epoch 12/20
 - 8s - loss: 0.0256 - 

<keras.callbacks.History at 0x7ff52036e358>
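
The kernel prepares `test_x` and `test_y` but never scores the model on them. One possible way to do so is to take the per-class probabilities returned by `model.predict(test_x)`, pick the most likely digit for each row, and compare with `test_y`. The sketch below shows only that comparison step in plain Python. The probabilities and labels in it are hypothetical toy values with three classes, not results from this model, and the helper name `accuracy_from_probabilities` is made up.

```python
def accuracy_from_probabilities(probabilities, true_labels):
    """Share of rows whose most probable class equals the true label."""
    predicted = [row.index(max(row)) for row in probabilities]
    correct = sum(1 for p, t in zip(predicted, true_labels) if p == t)
    return correct / len(true_labels)


# hypothetical stand-in for the rows that model.predict(test_x) would return
example_probabilities = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
# hypothetical stand-in for test_y
example_labels = [1, 0, 2, 1]

example_accuracy = accuracy_from_probabilities(example_probabilities, example_labels)
print(example_accuracy)  # 0.75
```

With the real model, the same function would take the output of `model.predict(test_x)` converted to a list of rows and `test_y` converted to a list of integers.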

**Thanks for reading the kernel. Please upvote if you like it.**