# Initialization

## Importing libraries

As a first step, we load the libraries we are going to use. In this simple example we only need TensorFlow (Keras) and NumPy, plus scikit-learn for the evaluation metrics.

In [33]:
import tensorflow as tf
import numpy as np
from tensorflow.keras import backend as K
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers import Dense,Input
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.layers import BatchNormalization,Dropout,Flatten
from tensorflow.keras.layers import LeakyReLU,PReLU,ELU,ThresholdedReLU,ReLU
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix

## Data related parameters

We define a few parameters related to the data we are going to use

In [2]:
# Parameters of the data
num_classes = 10 # number of output classes (digits 0-9)
input_shape = (28, 28, 1) # shape of one input: a 28 x 28 pixel grayscale image with a single channel


## Loading Data

The next step is to load the MNIST database and apply some simple preprocessing so that the data can be fed to the neural network.


In [3]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data() # we load the data from the keras.datasets library

# This automatically generates the train & test set (otherwise we could have done it manually or through another library)

## Normalization ##
# Scale images to the [0, 1] range
x_train = x_train.astype("float32") / 255 # pixel values are integers in 0-255, so dividing by 255 maps them to [0, 1]
x_test = x_test.astype("float32") / 255

# Make sure images have shape (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
print("x_train shape:", x_train.shape)
print(x_train.shape[0], "train samples")
print(x_test.shape[0], "test samples")


# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
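
To make the preprocessing steps concrete, here is a minimal NumPy-only sketch of the same three operations (scaling, adding a channel axis, one-hot encoding). The random `images` array and the `labels` values are made-up stand-ins for MNIST data, and indexing `np.eye` is only an illustrative equivalent of `keras.utils.to_categorical`, not how Keras implements it.

```python
import numpy as np

# Hypothetical stand-in for a tiny batch of MNIST images: 3 images of 28 x 28 uint8 pixels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)
labels = np.array([5, 0, 9])  # made-up digit labels

# Scale pixel values from [0, 255] to [0, 1].
scaled = images.astype("float32") / 255

# Add a trailing channel axis: (3, 28, 28) -> (3, 28, 28, 1).
scaled = np.expand_dims(scaled, -1)

# One-hot encode: row i of the identity matrix is the encoding of class i.
num_classes = 10
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(scaled.shape)
print(one_hot)
```

Each row of `one_hot` contains a single 1 at the index of the digit, which is the format that the `categorical_crossentropy` loss expects.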


## Model related parameters

Here we define hyperparameters that are going to be used in the model and training below


In [4]:
n_epochs = 5 # number of epochs, where 1 epoch = 1 full pass over the training set
n_batch_size = 128 # number of samples used for each gradient update within an epoch
# note: smaller batches give more (and noisier) updates per epoch; larger batches usually process an epoch faster but can make good generalization harder to achieve

# number of neurons for the 3 hidden layers
neurons_l1 = 400
neurons_l2 = 200
neurons_l3 = 150
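
As a rough illustration of how the batch size relates to an epoch, the sketch below counts the gradient updates in one pass over the 60000 training samples and splits shuffled indices into mini-batches. It only mimics the bookkeeping; the shuffling and slicing shown here are assumptions for illustration, not the internals of `model.fit`.

```python
import math
import numpy as np

n_train = 60000       # size of the MNIST training set
n_batch_size = 128    # same value as in the cell above

# Number of gradient updates (steps) in one epoch; the last batch may be smaller.
steps_per_epoch = math.ceil(n_train / n_batch_size)
last_batch_size = n_train - (steps_per_epoch - 1) * n_batch_size
print(steps_per_epoch, last_batch_size)

# Split shuffled sample indices into mini-batches of at most n_batch_size.
rng = np.random.default_rng(0)
indices = rng.permutation(n_train)
batches = [indices[i:i + n_batch_size] for i in range(0, n_train, n_batch_size)]
print(len(batches), len(batches[-1]))
```

Every sample appears in exactly one batch, so one epoch corresponds to one use of the whole training set.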

## Generating & compiling the model

Below we create and compile the feedforward neural network that we will use for this task.

In [25]:

model = keras.Sequential()
model.add(Flatten()) # Since the image is 2D we need to "flatten" it into a single array in order to be used as input 
model.add(Dense(neurons_l1))
model.add(ReLU()) # one can use other activation functions for the same purpose (check a few if you want)
model.add(Dense(neurons_l2))
model.add(ReLU())
model.add(Dense(neurons_l3))
model.add(ReLU())
model.add(Dropout(0.5)) # this layer helps protect the model from overfitting to the training data. Can we replace it maybe?
model.add(Dense(num_classes,activation='softmax')) # multi-class classification problem : softmax activation
model.compile(
              optimizer="adam", # most common optimizers work on this problem, adam is a standard default choice
              loss="categorical_crossentropy", # loss function for classification = crossentropy
              metrics=["accuracy"])
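
To show what this stack of layers computes, here is a minimal NumPy sketch of the forward pass at inference time (where Dropout is inactive). The weights are randomly initialised placeholders, not trained values, and the `forward` helper is a hypothetical name introduced only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [28 * 28, 400, 200, 150, 10]  # flattened input, 3 hidden layers, output

# Randomly initialised weights and zero biases (illustrative only, not trained values).
weights = [rng.normal(0.0, 0.05, size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def forward(x):
    """Forward pass for a batch of images with shape (batch, 28, 28, 1)."""
    h = x.reshape(x.shape[0], -1)                  # Flatten: (batch, 784)
    for w, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ w + b, 0.0)             # Dense followed by ReLU
    logits = h @ weights[-1] + biases[-1]          # final Dense layer
    logits = logits - logits.max(axis=1, keepdims=True)  # stabilise the exponentials
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)    # softmax: each row sums to 1

x = rng.random((4, 28, 28, 1))                     # 4 made-up input images
probs = forward(x)
n_params = sum(w.size + b.size for w, b in zip(weights, biases))
print(probs.shape, n_params)
```

The parameter count is the sum of `n_in * n_out + n_out` over the four Dense layers, which can be compared against the total reported by `model.summary()` once the model has been built.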

## Example of Training procedure


Let's now train our model for a few epochs

In [35]:
history = model.fit(x_train, y_train, epochs=n_epochs, batch_size=n_batch_size)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


## Evaluation

To evaluate our model we need to see how well it performs on unseen data (the test set):

In [36]:
score = model.evaluate(x_test, y_test, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])

Test loss: 0.06685923784971237
Test accuracy: 0.9807000160217285
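
The two numbers reported by `model.evaluate` are the categorical cross-entropy loss and the accuracy. The sketch below computes both by hand in NumPy on a tiny made-up example (3 samples, 3 classes); the probabilities are invented for illustration, and the small `eps` clipping is an assumption added to avoid `log(0)`.

```python
import numpy as np

# Made-up predicted probabilities for 3 samples and 3 classes (each row sums to 1).
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.4, 0.3]])
# One-hot true labels: classes 0, 1 and 2.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1]])

# Categorical cross-entropy: mean of -log(probability assigned to the true class).
eps = 1e-7
p_true = np.sum(y_true * y_prob, axis=1)
loss = -np.mean(np.log(np.clip(p_true, eps, 1.0)))

# Accuracy: fraction of samples whose most probable class is the true class.
accuracy = np.mean(np.argmax(y_prob, axis=1) == np.argmax(y_true, axis=1))
print(loss, accuracy)
```

The third sample is misclassified, so it lowers the accuracy and also contributes the largest term to the loss.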


In [39]:
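# predict_classes was removed in recent TensorFlow releases, so we take the argmax of the predicted probabilities
predictions = np.argmax(model.predict(x_test), axis=1)
y_test_binary = np.argmax(y_test, axis=1) # convert the one-hot labels back to class indices
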
cr = classification_report(y_test_binary, predictions)
cm = confusion_matrix(y_test_binary, predictions)
print(cm)
print(cr)

[[ 973    0    1    2    0    0    1    0    2    1]
 [   0 1126    2    1    0    0    2    0    4    0]
 [   6    1 1013    1    2    0    2    5    2    0]
 [   0    0    3  986    0    1    0    6    6    8]
 [   2    0    0    0  969    0    4    2    0    5]
 [   3    0    0   12    4  864    3    0    3    3]
 [   4    2    3    1    8    5  934    0    1    0]
 [   0    1    6    0    2    0    0 1009    3    7]
 [   5    1    2    3    4    4    0    4  947    4]
 [   2    5    0    2    9    1    0    3    1  986]]
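
As a cross-check, the per-class precision, recall and support that `classification_report` prints can be recomputed from the confusion matrix above, where rows are true classes and columns are predicted classes. This is a minimal NumPy sketch of those formulas, not scikit-learn's implementation.

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true digit, columns: predicted digit).
cm = np.array([
    [973,    0,    1,   2,   0,   0,   1,    0,   2,   1],
    [  0, 1126,    2,   1,   0,   0,   2,    0,   4,   0],
    [  6,    1, 1013,   1,   2,   0,   2,    5,   2,   0],
    [  0,    0,    3, 986,   0,   1,   0,    6,   6,   8],
    [  2,    0,    0,   0, 969,   0,   4,    2,   0,   5],
    [  3,    0,    0,  12,   4, 864,   3,    0,   3,   3],
    [  4,    2,    3,   1,   8,   5, 934,    0,   1,   0],
    [  0,    1,    6,   0,   2,   0,   0, 1009,   3,   7],
    [  5,    1,    2,   3,   4,   4,   0,    4, 947,   4],
    [  2,    5,    0,   2,   9,   1,   0,    3,   1, 986],
])

true_positives = np.diag(cm)
precision = true_positives / cm.sum(axis=0)   # correct predictions / all predictions of that class
recall = true_positives / cm.sum(axis=1)      # correct predictions / all true samples of that class
f1 = 2 * precision * recall / (precision + recall)
support = cm.sum(axis=1)                      # number of true samples per class
accuracy = true_positives.sum() / cm.sum()

for digit in range(10):
    print(digit, round(precision[digit], 2), round(recall[digit], 2),
          round(f1[digit], 2), support[digit])
print("accuracy:", accuracy)
```

The overall accuracy is the sum of the diagonal divided by the total number of test samples, which agrees with the test accuracy reported by `model.evaluate`.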
              precision    recall  f1-score   support

           0       0.98      0.99      0.99       980
           1       0.99      0.99      0.99      1135
           2       0.98      0.98      0.98      1032
           3       0.98      0.98      0.98      1010
           4       0.97      0.99      0.98       982
           5       0.99      0.97      0.98       892
           6       0.99      0.97      0.98       958
           7       0.98      0.98   