## Deep Neural Network for MNIST Classification (the "Hello World" of deep learning) ##

The dataset is called MNIST, a benchmark for handwritten digit recognition.
It provides 70,000 images (28x28 pixels) of handwritten digits, one digit per image.

The goal is to write an algorithm that detects which digit is written. Since there are only 10 digits, this is a classification problem with 10 classes (outputs).
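
To make the 10-class framing concrete, here is a minimal plain-Python sketch of how ten output scores could be turned into a predicted digit. The scores are made-up values for illustration, not outputs of the model built in this notebook.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the digits 0..9 (illustrative values only)
example_scores = [0.1, 0.3, 0.2, 2.5, 0.0, 0.4, 0.1, 0.9, 0.2, 0.3]

probabilities = softmax(example_scores)
# The predicted class is the digit with the highest probability
predicted_digit = max(range(10), key=lambda d: probabilities[d])
print(predicted_digit)
```

Each of the 10 outputs corresponds to one digit, and the classifier picks the one with the largest probability.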

The goal is to build a neural network with 2 hidden layers; the code below adds a third hidden layer as a homework variation.

## Import the relevant packages ##

In [1]:
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds # pip install tensorflow-datasets

## Data ##

In [2]:
# Downloads MNIST on first use into the default tensorflow_datasets directory
# (e.g. "C:\Users\MyUser\tensorflow_datasets")
mnist_dataset, mnist_info = tfds.load(name='mnist', with_info=True, as_supervised=True)
mnist_train, mnist_test = mnist_dataset['train'], mnist_dataset['test'] # train and test splits

num_validation_samples = 0.1 * mnist_info.splits['train'].num_examples # validation set is 10% of the training split
num_validation_samples = tf.cast(num_validation_samples, tf.int64)

num_test_samples = mnist_info.splits['test'].num_examples
num_test_samples = tf.cast(num_test_samples, tf.int64)


def scale(image, label):
    image = tf.cast(image, tf.float32)
    image /= 255. # pixel intensities range from 0 to 255, so this scales them to [0, 1]
    return image, label

scaled_train_and_validation_data = mnist_train.map(scale)
test_data = mnist_test.map(scale)

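BUFFER_SIZE = 10000
# reshuffle_each_iteration=False keeps one fixed shuffle order, so the take/skip
# split below stays disjoint across epochs (otherwise validation samples can end up in training)
shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(BUFFER_SIZE, reshuffle_each_iteration=False)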

validation_data = shuffled_train_and_validation_data.take(num_validation_samples)
train_data = shuffled_train_and_validation_data.skip(num_validation_samples)

BATCH_SIZE = 150

train_data = train_data.batch(BATCH_SIZE)
validation_data = validation_data.batch(num_validation_samples)
test_data = test_data.batch(num_test_samples) # whole test set in a single batch (homework)

validation_inputs, validation_targets = next(iter(validation_data))
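
The split sizes follow from simple arithmetic. The sketch below assumes the standard MNIST split of 60,000 training and 10,000 test images; the notebook itself reads these counts from `mnist_info` rather than hard-coding them.

```python
import math

# Assumed standard MNIST split sizes (the notebook reads them from mnist_info)
num_train_examples = 60_000
num_test_examples = 10_000

num_validation = num_train_examples // 10            # 10% held out for validation
num_training = num_train_examples - num_validation   # what remains for training
batch_size = 150

# Number of batches, and therefore weight updates, per epoch
steps_per_epoch = math.ceil(num_training / batch_size)

print(num_validation, num_training, steps_per_epoch)  # 6000 54000 360
```

The 360 steps agree with the `360/360` shown in the training log.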

## Model ##

### Outline the model ###

In [3]:
input_size = 784
output_size = 10
hidden_layer_size = 5000 # previously tried: 200, 100

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28,28,1)), # Input layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # First hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # Second hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # Third hidden layer (added as homework)
    tf.keras.layers.Dense(output_size, activation='softmax') # Output layer
])
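
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand. A dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The helper `dense_params` is a hypothetical name used only for this illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

input_size = 28 * 28        # Flatten turns each 28x28x1 image into 784 values
hidden_layer_size = 5000
output_size = 10

# Input -> three hidden layers -> output, matching the model defined above
layer_sizes = [input_size, hidden_layer_size, hidden_layer_size, hidden_layer_size, output_size]
total_params = sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))
print(f"{total_params:,}")  # 53,985,010
```

With roughly 54 million parameters the model is much larger than MNIST typically needs, which likely contributes to the roughly 64 s per epoch in the training log. Calling `model.summary()` should report the same total.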


## Choose the optimizer and the loss function ##

In [4]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
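
`sparse_categorical_crossentropy` expects integer labels (0 to 9) rather than one-hot vectors, which matches the labels loaded above. For a single sample, the loss is the negative log of the probability the model assigns to the true class. The following is a plain-Python sketch with made-up probabilities; `sample_cross_entropy` is a hypothetical helper, not a Keras function.

```python
import math

def sample_cross_entropy(true_label, probabilities):
    """Loss for one sample: -log(probability assigned to the true class)."""
    return -math.log(probabilities[true_label])

# Hypothetical softmax output for one image (illustrative values that sum to 1)
probabilities = [0.01, 0.01, 0.02, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]

loss_if_label_is_3 = sample_cross_entropy(3, probabilities)  # model is confident and right
loss_if_label_is_7 = sample_cross_entropy(7, probabilities)  # model is confident and wrong
print(round(loss_if_label_is_3, 3), round(loss_if_label_is_7, 3))  # 0.105 4.605
```

A confident correct prediction gives a loss near 0, while a confident wrong one is penalized heavily.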

## Training ##

In [5]:
NUM_EPOCHS = 10

model.fit(train_data, epochs=NUM_EPOCHS, validation_data=(validation_inputs, validation_targets), verbose=2)

"""
What happens inside an epoch?
1) At the beginning of each epoch, the training loss will be set to 0
2) The algorithm will iterate over a preset number of batches, all from train_data
3) The weights and biases will be updated as many times as there are batches
4) We will get a value for the loss function, indicating how the training is going
5) We will also see a training accuracy
6) At the end of the epoch, the algorithm will forward propagate the whole validation set

*When we reach the maximum number of epochs the training will be over
"""

Epoch 1/10
360/360 - 66s - 183ms/step - accuracy: 0.9308 - loss: 0.2423 - val_accuracy: 0.9647 - val_loss: 0.1194
Epoch 2/10
360/360 - 64s - 179ms/step - accuracy: 0.9719 - loss: 0.0925 - val_accuracy: 0.9768 - val_loss: 0.0787
Epoch 3/10
360/360 - 64s - 178ms/step - accuracy: 0.9806 - loss: 0.0630 - val_accuracy: 0.9828 - val_loss: 0.0562
Epoch 4/10
360/360 - 64s - 177ms/step - accuracy: 0.9836 - loss: 0.0545 - val_accuracy: 0.9825 - val_loss: 0.0624
Epoch 5/10
360/360 - 64s - 178ms/step - accuracy: 0.9873 - loss: 0.0457 - val_accuracy: 0.9870 - val_loss: 0.0406
Epoch 6/10
360/360 - 65s - 181ms/step - accuracy: 0.9888 - loss: 0.0382 - val_accuracy: 0.9902 - val_loss: 0.0354
Epoch 7/10
360/360 - 65s - 180ms/step - accuracy: 0.9900 - loss: 0.0326 - val_accuracy: 0.9883 - val_loss: 0.0387
Epoch 8/10
360/360 - 64s - 179ms/step - accuracy: 0.9921 - loss: 0.0273 - val_accuracy: 0.9880 - val_loss: 0.0385
Epoch 9/10
360/360 - 65s - 180ms/step - accuracy: 0.9920 - loss: 0.0252 - val_accuracy: 
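
The epoch steps listed in the training cell can be sketched as a plain-Python loop. This is a toy illustration, not what `model.fit` does internally: it fits a single weight `w` to the relation `y = 2x` with mini-batch gradient descent, and every name and value is made up for the example. Because the toy problem is a regression, the accuracy step is left out.

```python
# Toy data following y = 2x (illustrative only)
train = [(float(x), 2.0 * x) for x in range(1, 9)]
validation = [(9.0, 18.0), (10.0, 20.0)]

w = 0.0               # the single trainable weight
learning_rate = 0.01
batch_size = 4
num_epochs = 5
history = []

for epoch in range(num_epochs):
    epoch_loss = 0.0                                  # 1) loss starts at 0 each epoch
    num_batches = 0
    for start in range(0, len(train), batch_size):    # 2) iterate over the batches
        batch = train[start:start + batch_size]
        # Mean squared error on the batch and its gradient with respect to w
        batch_loss = sum((w * x - y) ** 2 for x, y in batch) / len(batch)
        gradient = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
        w -= learning_rate * gradient                 # 3) one update per batch
        epoch_loss += batch_loss
        num_batches += 1
    train_loss = epoch_loss / num_batches             # 4) average training loss
    # 6) forward pass over the whole validation set, with no weight update
    val_loss = sum((w * x - y) ** 2 for x, y in validation) / len(validation)
    history.append((train_loss, val_loss))
    print(f"Epoch {epoch + 1}/{num_epochs} - loss: {train_loss:.4f} - val_loss: {val_loss:.4f}")
```

As in the real training log, the validation loss is computed once per epoch, after all batch updates for that epoch are done.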

## Test the model ##

In [6]:
test_loss, test_accuracy = model.evaluate(test_data)

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 674ms/step - accuracy: 0.9782 - loss: 0.0942


In [7]:
print(f"Test loss: {test_loss:.2f}. Test accuracy: {test_accuracy*100:.2f}")

Test loss: 0.09. Test accuracy: 97.82
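
To put the accuracy in concrete terms, it can be converted into a count of misclassified images. The sketch assumes the standard MNIST test set size of 10,000 images; the notebook reads the actual count from `mnist_info`.

```python
num_test_examples = 10_000   # assumed standard MNIST test set size
test_accuracy = 0.9782       # value reported above

num_correct = round(test_accuracy * num_test_examples)
num_wrong = num_test_examples - num_correct
print(num_correct, num_wrong)  # 9782 218
```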
