<a href="https://colab.research.google.com/github/kevinrosalesdev/AINN-MNIST/blob/master/mnist_neuralnet.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# MNIST Neural Network
## Neural Network to identify a number from a picture.

We are going to build a *Neural Network* that classifies handwritten digits from the [**MNIST Dataset**](https://en.wikipedia.org/wiki/MNIST_database).

**The MNIST Dataset contains 70,000 28x28 grayscale pictures of handwritten digits from 0 to 9.**

We will use the 60,000 training pictures to *train* our Neural Network, and then the remaining 10,000 pictures to *test* it.

## First, we load the data and set some parameters for our Model:
Each picture must be reshaped into a flat vector of 784 values and normalized before its pixel values can be fed to our Model.

In [0]:
from keras.datasets import mnist
import keras

batch_size = 128
num_classes = 10
epochs = 5

# the data, split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(60000, 784)  # 28x28=784
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255  # We normalize to have values between 0 and 1
X_test /= 255

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)  # onehot encoding
y_test = keras.utils.to_categorical(y_test, num_classes)
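
To make the preprocessing concrete, here is a minimal NumPy sketch of the same flatten, scale, and one-hot steps on a tiny made-up batch. The array `fake_images`, the labels, and the helper `one_hot` are hypothetical names used only for illustration; the notebook itself relies on `keras.utils.to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

# Hypothetical batch: 2 "pictures" of 28x28 pixels, every pixel at the maximum value 255
fake_images = np.full((2, 28, 28), 255, dtype="uint8")
fake_labels = np.array([3, 9])

# Flatten each picture to 784 values, then scale from 0..255 to 0..1
flat = fake_images.reshape(2, 784).astype("float32") / 255
targets = one_hot(fake_labels, 10)

print(flat.shape, flat.max())
print(targets[0])
```

Each row of `targets` holds a single 1 at the index of the digit it represents, which is the format the output layer is trained against.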

## We build the Neural Network Model:

*   We create a Model with an input layer that contains 784 inputs (28x28 pixels).
*   Then, we add three hidden layers (with 1000, 500 and 25 neurons). We use [*Relu*](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)) as activation function.
*   Finally, we use an output layer with 10 neurons, one for each digit (remember that we are classifying from 0 to 9!). We use [*SoftMax*](https://en.wikipedia.org/wiki/Softmax_function) as activation function because we have more than two classes and want the outputs to form a probability distribution over them.

We use *RMSProp* as **Optimizer** and *Categorical Crossentropy* as **Loss Function.**
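
As a rough illustration of what the output layer and the loss compute, the sketch below implements SoftMax and Categorical Crossentropy for a single sample in NumPy. The `logits` values and both helper functions are made up for this example; Keras uses its own internal implementations.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

def categorical_crossentropy(one_hot_target, probs, eps=1e-12):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return float(-np.sum(one_hot_target * np.log(probs + eps)))

# Hypothetical raw scores for the 10 digit classes; class 7 has the largest score
logits = np.array([0.1, 0.2, 0.0, 0.3, 0.1, 0.0, 0.2, 3.0, 0.1, 0.4])
probs = softmax(logits)

target = np.zeros(10)
target[7] = 1.0  # one-hot label for the digit 7

print(probs.argmax(), probs.sum())
print(categorical_crossentropy(target, probs))
```

The loss is small when the probability assigned to the true class is close to 1 and grows as that probability shrinks.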

In [18]:
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import SGD #Stochastic Gradient Descent Optimizer

inputs = Input(shape=(784,))  # We have 784 inputs
x = Dense(1000, activation='relu')(inputs)  # 1000 neurons and relu as activation function
y = Dense(500, activation='relu')(x) # 500 neurons and relu as activation function
z = Dense(25, activation='relu')(y)  # 25 neurons and relu as activation function
predictions = Dense(num_classes, activation='softmax')(z)  # Output layer with 10 neurons, one for each class

# This creates a model
model = Model(inputs=inputs, outputs=predictions)

# Instead of using MSE as Loss & SGD as Optimizer, I used Categorical Crossentropy & RMSProp to get adaptive LR and better results.
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',  # Categorical Crossentropy
              metrics=['accuracy'])

model.summary()  # summary() prints the table itself, so no print() is needed

_________________________________________________________________
Layer (type)                 Output Shape              Param #
input_5 (InputLayer)         (None, 784)               0
_________________________________________________________________
dense_17 (Dense)             (None, 1000)              785000
_________________________________________________________________
dense_18 (Dense)             (None, 500)               500500
_________________________________________________________________
dense_19 (Dense)             (None, 25)                12525
_________________________________________________________________
dense_20 (Dense)             (None, 10)                260
Total params: 1,298,285
Trainable params: 1,298,285
Non-trainable params: 0
_________________________________________________________________
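
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is a hypothetical name used only for this check.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

layer_sizes = [784, 1000, 500, 25, 10]  # input, three hidden layers, output
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(per_layer)       # [785000, 500500, 12525, 260]
print(sum(per_layer))  # 1298285
```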


## We train our Model:

In [19]:
model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs)  # starts training with 5 epochs

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7ff8b032d5f8>

## Some Tests:
We can look at some **Output Values** from our **Model** and count how many test samples have been classified **correctly**.

*Test* samples are kept separate from *Training* samples, so the accuracy on them shows whether the Model suffers from **Overfitting**.

Feel free to uncomment the print line if you want to see the **Full Output**.

In [20]:
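import numpy

predictions = model.predict(X_test)  # one row of 10 class probabilities per test picture
correct = 0
incorrect = 0
for p, l in zip(predictions, y_test):
    # predicted class = index of the highest probability,
    # true class = index of the 1 in the one-hot label
    if numpy.argmax(p) == numpy.argmax(l):
        correct += 1
    else:
        incorrect += 1
    #print(p,"->", l)
print("Correct: ", correct)
print("Incorrect: ", incorrect)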

Correct:  9817
Incorrect:  183
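
The same count can also be obtained without an explicit loop. The sketch below shows the idea on a tiny made-up example; `toy_predictions` and `toy_labels` are hypothetical arrays standing in for `predictions` and `y_test`.

```python
import numpy as np

# Hypothetical predictions for 4 samples over 3 classes (each row sums to 1)
toy_predictions = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
# Hypothetical one-hot labels: the true classes are 0, 1, 2, 1
toy_labels = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
])

# Compare the most probable class with the true class, row by row
predicted_classes = toy_predictions.argmax(axis=1)
true_classes = toy_labels.argmax(axis=1)

n_correct = int((predicted_classes == true_classes).sum())
n_incorrect = len(toy_labels) - n_correct
print("Correct: ", n_correct)
print("Incorrect: ", n_incorrect)
```

Here the last sample is predicted as class 0 while its label is class 1, so three samples count as correct and one as incorrect.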


## Stats

9817 out of 10000 test samples (98.17%) have been classified correctly. This is close to the roughly 99% accuracy reached during training, so **Overfitting is not a significant problem in this Model.**

Approximate Stats:

1.   Model Accuracy (on training samples): 99%
2.   Model Loss: 0.03
3.   Real Model Accuracy (on test samples): 98%
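
The test accuracy follows directly from the counts reported above; a quick check of the arithmetic:

```python
correct = 9817
incorrect = 183
total = correct + incorrect

test_accuracy = correct / total
error_rate = incorrect / total
print(f"Test accuracy: {test_accuracy:.2%}")  # Test accuracy: 98.17%
print(f"Error rate: {error_rate:.2%}")        # Error rate: 1.83%
```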
