# Deep Neural Network for MNIST Classification

We'll apply all the knowledge from the lectures in this section to write a deep neural network. The problem we've chosen is referred to as the "Hello World" of deep learning because for most students it is the first deep learning algorithm they see.

The dataset is called MNIST and consists of images of handwritten digits. You can find more about it on Yann LeCun's website (Director of AI Research, Facebook). He is one of the pioneers of what we've been talking about and of more complex approaches that are widely used today, such as convolutional neural networks (CNNs).

The dataset provides 70,000 images (28x28 pixels) of handwritten digits (1 digit per image). 

The goal is to write an algorithm that detects which digit is written. Since there are only 10 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), this is a classification problem with 10 classes. 
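
To make these numbers concrete, here is a minimal sketch in plain Python of what a single sample looks like to the network. The all-zero image and the label value are hypothetical placeholders, not real MNIST data.

```python
# Hypothetical stand-in for one MNIST sample: a 28x28 grid of pixel
# intensities (all zeros here, purely for illustration) and an integer label.
HEIGHT, WIDTH, NUM_CLASSES = 28, 28, 10

image = [[0 for _ in range(WIDTH)] for _ in range(HEIGHT)]
label = 7  # any integer from 0 to 9

# Flattening turns the grid into one long vector of inputs.
flat_image = [pixel for row in image for pixel in row]
num_inputs = len(flat_image)

print(num_inputs)   # 784
print(NUM_CLASSES)  # 10
```

So each image becomes 784 input values, and the network has to choose one of 10 possible outputs.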

Our goal is to build a neural network with three hidden layers.


## Import the relevant packages

In [1]:
import numpy as np
import tensorflow as tf

import tensorflow_datasets as tfds



## Data

This is where we load and preprocess our data.

In [2]:

mnist_dataset, mnist_info = tfds.load(name='mnist', with_info=True, as_supervised=True)
# with_info=True also returns a DatasetInfo object with the version, features, and number of samples
# we will use this information a bit below, so we store it in mnist_info
# as_supervised=True loads the dataset in a 2-tuple structure (input, target)

# once we have loaded the dataset, we can extract the training and testing datasets by their split names
mnist_train, mnist_test = mnist_dataset['train'], mnist_dataset['test']

# TFDS provides training and testing splits for MNIST, but no validation split
# thus we must split off a validation set on our own


num_validation_samples = 0.1 * mnist_info.splits['train'].num_examples
# casting this number to an integer, as a float may cause an error along the way
num_validation_samples = tf.cast(num_validation_samples, tf.int64)

# storing the number of test samples in a dedicated variable (instead of using the mnist_info one)
num_test_samples = mnist_info.splits['test'].num_examples
# casting it to an integer as well, for consistency with the validation count
num_test_samples = tf.cast(num_test_samples, tf.int64)



# define a function called scale that takes an MNIST image and its label, and scales the inputs to the range 0 to 1
def scale(image, label):
    image = tf.cast(image, tf.float32)
    # possible values for the inputs are 0 to 255 (256 different shades of grey)
    image /= 255.

    return image, label


# the method .map() allows us to apply a custom transformation to a given dataset

scaled_train_and_validation_data = mnist_train.map(scale)


test_data = mnist_test.map(scale)


BUFFER_SIZE = 1000
# this matters when dealing with enormous datasets
# we can't shuffle the whole dataset in one go if we can't fit it all in memory
# so instead TF only stores BUFFER_SIZE samples in memory at a time and shuffles within them
# if BUFFER_SIZE=1, no shuffling will actually happen
# if BUFFER_SIZE >= number of samples, the shuffling is uniform
# a BUFFER_SIZE in between is a computational optimization to approximate uniform shuffling


shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(BUFFER_SIZE)


# the validation data will be equal to 10% of the training set, as computed above
# we use the .take() method to take that many samples
# further below, we batch it with a batch size equal to the total number of validation samples
validation_data = shuffled_train_and_validation_data.take(num_validation_samples)

# the train_data is everything else, so we skip as many samples as there are in the validation dataset
train_data = shuffled_train_and_validation_data.skip(num_validation_samples)

# determine the batch size
BATCH_SIZE = 1000

# we can also take advantage of the occasion to batch the train data
# this would be very helpful when we train, as we would be able to iterate over the different batches
train_data = train_data.batch(BATCH_SIZE)

validation_data = validation_data.batch(num_validation_samples)

# batch the test data
test_data = test_data.batch(num_test_samples)


# take the next batch (it is the only batch)
# because as_supervised=True, we get a 2-tuple structure (inputs, targets)
validation_inputs, validation_targets = next(iter(validation_data))
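
The buffered shuffle and the take/skip split used above can be imitated in plain Python. This is a simplified sketch of the idea, not TensorFlow's actual implementation; `buffered_shuffle` and the toy range of 20 items are hypothetical names and data chosen for illustration.

```python
import random

def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in shuffled order while holding at most buffer_size of them."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)      # still filling the buffer
            continue
        index = rng.randrange(buffer_size)
        yield buffer[index]          # emit a random buffered item...
        buffer[index] = item         # ...and put the new item in its slot
    rng.shuffle(buffer)              # drain whatever is left in random order
    yield from buffer

samples = list(range(20))            # toy stand-in for a dataset

# buffer of 1: the order never changes, so no shuffling happens
unshuffled = list(buffered_shuffle(samples, buffer_size=1))

# buffer in between: every sample still appears exactly once, in a mixed order
shuffled = list(buffered_shuffle(samples, buffer_size=5))

# take the first 10% for validation and skip them for training
num_validation = int(0.1 * len(shuffled))
validation = shuffled[:num_validation]
train = shuffled[num_validation:]

print(unshuffled == samples)
print(len(validation), len(train))
```

With a small buffer, each sample can only move a limited distance from its original position, which is why a larger buffer gets closer to a uniform shuffle.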

## Model

### Outline the model
When thinking about a deep learning algorithm, we mostly imagine building the model. So, let's do it :)

In [3]:
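input_size = 784
output_size = 10
# use the same hidden layer size for all hidden layers
hidden_layer_size = 200

# define how the model will look
model = tf.keras.Sequential([
    # each observation is 28x28x1 pixels, so the input is a rank-3 tensor
    # Flatten turns it into a vector of 784 values that a Dense layer can use
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),

    # three hidden layers, each computing activation(dot(input, weights) + bias)
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),

    # the output layer uses softmax to produce a probability for each of the 10 digits
    tf.keras.layers.Dense(output_size, activation='softmax')
])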
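
As a rough sanity check on the size of this architecture, the number of trainable parameters can be worked out by hand. The sketch below assumes each `Dense` layer has one weight per input-output pair plus one bias per output; `dense_parameter_count` is a hypothetical helper name, not part of Keras.

```python
# Layer widths of the network above: 784 inputs, three hidden layers of 200, 10 outputs
layer_sizes = [28 * 28, 200, 200, 200, 10]

def dense_parameter_count(num_inputs, num_outputs):
    """One weight per input-output pair, plus one bias per output."""
    return num_inputs * num_outputs + num_outputs

per_layer = [
    dense_parameter_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_parameters = sum(per_layer)

print(per_layer)          # [157000, 40200, 40200, 2010]
print(total_parameters)   # 239410
```

Most of the parameters sit in the first hidden layer, because it connects all 784 inputs to 200 units.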

### Choose the optimizer and the loss function

In [4]:

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
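
The loss name `'sparse_categorical_crossentropy'` selects a cross-entropy loss that accepts the integer labels (0 to 9) directly, without one-hot encoding them first. The following is a minimal plain-Python sketch of the calculation for a single sample, not the Keras implementation (which also handles batching and numerical clipping); the all-zero scores and the label are made-up values.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(label, probabilities):
    """Negative log of the probability assigned to the true class."""
    return -math.log(probabilities[label])

# Hypothetical example: equal scores for all 10 digits, true digit is 3
probabilities = softmax([0.0] * 10)
loss = sparse_categorical_crossentropy(3, probabilities)

print(round(sum(probabilities), 6))  # 1.0
print(round(loss, 4))                # 2.3026
```

A model that guesses uniformly among 10 classes has a loss of about 2.30, so the training losses in the log further below (well under 1 after the first epoch) indicate the model is doing far better than chance.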

### Training
This is where we train the model we have built.

In [5]:
# determine the maximum number of epochs
NUM_EPOCHS = 7
NUM_STEPS = num_validation_samples/BATCH_SIZE

# we fit the model, specifying the
# training data
# the total number of epochs
# and the validation data we just created ourselves in the format: (inputs,targets)
model.fit(train_data, epochs=NUM_EPOCHS, validation_data=(validation_inputs, validation_targets), validation_steps=NUM_STEPS, verbose=2)

Epoch 1/7
54/54 - 55s - loss: 0.6385 - accuracy: 0.8347 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 2/7
54/54 - 52s - loss: 0.2076 - accuracy: 0.9391 - val_loss: 0.2060 - val_accuracy: 0.9437
Epoch 3/7
54/54 - 52s - loss: 0.1483 - accuracy: 0.9556 - val_loss: 0.1611 - val_accuracy: 0.9523
Epoch 4/7
54/54 - 52s - loss: 0.1148 - accuracy: 0.9657 - val_loss: 0.1372 - val_accuracy: 0.9593
Epoch 5/7
54/54 - 53s - loss: 0.0917 - accuracy: 0.9723 - val_loss: 0.1196 - val_accuracy: 0.9628
Epoch 6/7
54/54 - 53s - loss: 0.0736 - accuracy: 0.9782 - val_loss: 0.1095 - val_accuracy: 0.9673
Epoch 7/7
54/54 - 53s - loss: 0.0616 - accuracy: 0.9817 - val_loss: 0.1019 - val_accuracy: 0.9695


<tensorflow.python.keras.callbacks.History at 0x64906d5d0>
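
The `54/54` at the start of each log line is the number of batches processed per epoch. The short calculation below shows where it comes from, assuming the MNIST `train` split holds 60,000 images (a figure not stated in the text above, but consistent with the log).

```python
import math

num_train_examples = 60000   # assumed size of the MNIST 'train' split
validation_fraction = 0.1    # share held out for validation
batch_size = 1000            # same value as BATCH_SIZE above

num_validation = int(validation_fraction * num_train_examples)
num_training = num_train_examples - num_validation
steps_per_epoch = math.ceil(num_training / batch_size)

print(num_validation)   # 6000
print(num_training)     # 54000
print(steps_per_epoch)  # 54
```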

## Test the model



In [6]:
test_loss, test_accuracy = model.evaluate(test_data)



In [7]:
# We can apply some nice formatting if we want to
print('Test loss: {0:.2f}. Test accuracy: {1:.2f}%'.format(test_loss, test_accuracy*100.))

Test loss: 0.09. Test accuracy: 96.96%
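
For intuition, accuracy is the share of samples whose most probable predicted class matches the target. The sketch below illustrates this in plain Python with made-up softmax outputs for 4 samples and 3 classes; it is not how `model.evaluate` is implemented internally.

```python
# Hypothetical softmax outputs for 4 samples over 3 classes (made-up numbers)
predictions = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
]
targets = [0, 1, 2, 2]

# the predicted class is the index of the largest probability in each row
predicted_classes = [row.index(max(row)) for row in predictions]
num_correct = sum(p == t for p, t in zip(predicted_classes, targets))
accuracy = num_correct / len(targets)

print('Accuracy: {0:.2f}%'.format(accuracy * 100.))  # Accuracy: 75.00%
```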
