# MNIST Classification using fully connected Neural Networks
We will use keras a deep learning library with tensorflow as backend

1) First step is: the imports

In [1]:
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import SGD
from keras.utils import to_categorical 

Using TensorFlow backend.


2) Second step is: data collection, 

#with the folloing instruction, we will download the mnist data from the server

In [2]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()

#(x_train, y_train) is the training dataset, where (x_test, y_test) is the testing dataset

3) third step is: data preparation

#we will reshape the MNIST dataset to become 60,000 examples (ie. images) for the training dataset, and 10,000 examples (ie. images) for the testing dataet, and the input to the Network is 784 because each images has 28 x 28 pixels

In [3]:
x_train = x_train.reshape(60000, 784)
x_test  = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test  = x_test.astype('float32')

4) fourth step is data normalization:

#Artificial neural networks provide good performance when the data is normalized

In [4]:
x_train /= 255
x_test /= 255

5) fifth step is data tranformation:

#we will convert the vector of labels (desired vector) to binary class matrices, thus each row in this matrix will become as one-hot encoding

In [5]:
y_train = to_categorical(y_train, 10)
y_test =  to_categorical(y_test, 10)

6) sixth step is: to define the model

#we will create one input layer that has 787 neurons, and two hidden layers, each layer has 128 neurons, finally an output layer that has 10 neurons

#RELU(x) = max(0,x) is a non-linear function
#softmax is another non-linear function, we use this function to squash the outputs of the Artificial neural networks to be between 0 and 1

In [6]:
model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(784,)))
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.summary()

7) seventh step is: to compile the created model, thus this model will become as a computaional graph

In [7]:
model.compile(loss='categorical_crossentropy',optimizer=SGD(), metrics=['accuracy'])

8) eighth step: after doing all above steps, we can start the model training 

#batch_size the number of training images in one forward/backward pass, this technique is faster and it has demonstrated that is better than using all dataset simultaneously

#epochs: one forward pass and one backward pass of all the training images

In [8]:
history = model.fit(x_train, y_train, batch_size=64, epochs=10, validation_data=(x_test, y_test))

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


9) the last step is model evaluation

In [9]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 0.16808781869783998
Test accuracy: 0.9509


#After developing fully connected Artifial Neural Networks, we have obtained a test accuracy = 0.95, and a test loss = 0.16

# To improve the proposed fully connected model, there are many methods and steps

1) first step: adding more epochs, for examples epochs = 30

In [10]:
history = model.fit(x_train, y_train, batch_size=64, epochs=30, validation_data=(x_test, y_test))

Train on 60000 samples, validate on 10000 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


In [None]:
#with only epochs =30, we have obtianed 

In [11]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 0.08447478631325066
Test accuracy: 0.974


#with only epochs = 30, we have obtained a test accuracy = 0.974, and a test loss = 0.084, these results with adding more epochs we have obtained better results than the above results

2) second step: to obtain more accurate results, we adjust the batch size, for examples batch size = 128

In [12]:
history = model.fit(x_train, y_train, batch_size=64, epochs=30, validation_data=(x_test, y_test))

Train on 60000 samples, validate on 10000 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


In [13]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 0.07582007238190853
Test accuracy: 0.9769


#with batch size = 128, we have obtained a test accuracy = 0.976, and a test loss = 0.075, these results is betten than the above results

3) to avoid the problem of unbalanced data, we add a weighted cross entropy loss function instead of cross entropy loss function
there are several methods to get the best weight for each class, but here we will assign each weight to 0.1

In [15]:
classes= [0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1]
history = model.fit(x_train, y_train, batch_size=128, epochs=30, validation_data=(x_test, y_test), class_weight=classes)

Train on 60000 samples, validate on 10000 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


In [16]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 0.07325211380326654
Test accuracy: 0.9777


#with classes= [0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1], we have obtained a test accuracy = 0.977, and a test loss = 0.073, also these results is betten than the above results 

4) the fourth step: changing the architecture, we will add more neurons in each hidden layer

In [17]:
model = Sequential()
model.add(Dense(256, activation='relu', input_shape=(784,)))
model.add(Dense(256, activation='relu'))
model.add(Dense(10, activation='softmax'))

In [18]:
model.compile(loss='categorical_crossentropy',optimizer=SGD(), metrics=['accuracy'])

In [23]:
classes= [0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1]
history = model.fit(x_train, y_train, batch_size=40, epochs=20, validation_data=(x_test, y_test), class_weight=classes)

Train on 60000 samples, validate on 10000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [24]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 0.06823914499669335
Test accuracy: 0.9794


#with these changes of the Neural architecture, we have obtained a test accuracy = 0.979, and a test loss = 0.068, in which these results is better than the above results

5) the last step: we will add more layers, 3 hidden layers, each with 256 neurons

In [25]:
model = Sequential()
model.add(Dense(256, activation='relu', input_shape=(784,)))
model.add(Dense(256, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(10, activation='softmax'))

In [26]:
model.compile(loss='categorical_crossentropy',optimizer=SGD(), metrics=['accuracy'])

In [31]:
history = model.fit(x_train, y_train, batch_size=32, epochs=30, validation_data=(x_test, y_test))

Train on 60000 samples, validate on 10000 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


In [32]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 0.08305254170282296
Test accuracy: 0.9806


#we have added only one layer, and we have obtained a test accuracy = 0.98, and a test loss = 0.083.

In the next part, we will use Convolutional Neural Networks to improve the results on the testing dataset
