# Exercises

### 1. The *width* (the hidden layer size) of the algorithm. Try a hidden layer size of 200. How does the validation accuracy of the model change? What about the time it took the algorithm to train? Can you find a hidden layer size that does better?

**Solution**

Find the variable `hidden_layer_size` and change it to 200.

The **validation accuracy is significantly higher** (the model with 50 hidden units was too simple): in the runs listed below, it rises from 0.9698 to 0.9860 after 5 epochs.

Naturally, it **takes the algorithm much longer to train** (**unless early stopping** is triggered too soon).
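
One way to see why a wider network is slower to train is to count its trainable parameters. The sketch below assumes the architecture used in this notebook (784 inputs, two hidden layers of equal width, 10 outputs); the helper name `count_parameters` is made up for illustration.

```python
def count_parameters(hidden_layer_size, input_size=784, output_size=10):
    """Weights plus biases of a fully connected input -> h -> h -> output network."""
    first_hidden = input_size * hidden_layer_size + hidden_layer_size
    second_hidden = hidden_layer_size * hidden_layer_size + hidden_layer_size
    output = hidden_layer_size * output_size + output_size
    return first_hidden + second_hidden + output


for width in (50, 200, 500):
    print(width, count_parameters(width))
```

Under these assumptions, a width of 50 gives 42,310 parameters and a width of 200 gives 199,210, so every training step has several times more weights to update.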

A hidden layer size of 500 (among others) works even better. Increasing it further to 600 brings no additional gain: in the runs below, the final validation accuracy is 0.9888 at size 500 and 0.9843 at size 600.



1.   Hidden layer size = 50  
    Epoch 1/5 - 9s - loss: 0.4140 - accuracy: 0.8855 - val_accuracy: 0.9377  
    Epoch 5/5 - 5s - loss: 0.0974 - accuracy: 0.9710 - val_accuracy: 0.9698
2.   Hidden layer size = 200  
    Epoch 1/5 - 15s - loss: 0.2747 - accuracy: 0.9208 - val_accuracy: 0.9622  
    Epoch 5/5 - 7s - loss: 0.0400 - accuracy: 0.9872 - val_accuracy: 0.9860  
3.   Hidden layer size = 300  
    Epoch 1/5 - 15s - loss: 0.2411 - accuracy: 0.9296 - val_accuracy: 0.9628  
    Epoch 5/5 - 6s - loss: 0.0339 - accuracy: 0.9886 - val_accuracy: 0.9875  
4.   Hidden layer size = 400  
    Epoch 1/5 - 17s - loss: 0.2304 - accuracy: 0.9319 - val_accuracy: 0.9683  
    Epoch 5/5 - 8s - loss: 0.0333 - accuracy: 0.9893 - val_accuracy: 0.9888  
5.   Hidden layer size = 500  
    Epoch 1/5 - 18s - loss: 0.2175 - accuracy: 0.9355 - val_accuracy: 0.9645  
    Epoch 5/5 - 9s - loss: 0.0291 - accuracy: 0.9905 - val_accuracy: 0.9888
6.   Hidden layer size = 600  
    Epoch 1/5 - 24s - loss: 0.2138 - accuracy: 0.9357 - val_accuracy: 0.9687  
    Epoch 5/5 - 12s - loss: 0.0299 - accuracy: 0.9903 - val_accuracy: 0.9843  
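
To compare the runs above at a glance, the numbers can be put into a small script. This is only a sketch: the values are copied from epoch 5 of each run in the list above, and each width was trained once, so small differences may be within run-to-run noise.

```python
# (hidden layer size, seconds for epoch 5, validation accuracy after epoch 5),
# copied from the runs listed above.
runs = [
    (50, 5, 0.9698),
    (200, 7, 0.9860),
    (300, 6, 0.9875),
    (400, 8, 0.9888),
    (500, 9, 0.9888),
    (600, 12, 0.9843),
]

best_accuracy = max(acc for _, _, acc in runs)
best_sizes = [size for size, _, acc in runs if acc == best_accuracy]
print("best validation accuracy:", best_accuracy, "at sizes", best_sizes)

# Change in validation accuracy from each run to the next wider one.
for (prev_size, _, prev_acc), (size, _, acc) in zip(runs, runs[1:]):
    print(f"{prev_size} -> {size}: {acc - prev_acc:+.4f}")
```

The largest jump is from 50 to 200 units; after that the gains shrink, and sizes 400 and 500 tie for the best final validation accuracy.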









# Deep Neural Network for MNIST Classification

The dataset is called MNIST and consists of handwritten digits to be recognized. You can find more about it on the website of Yann LeCun (Director of AI Research, Facebook). He is one of the pioneers of what we've been talking about and of more complex approaches that are widely used today, such as convolutional neural networks (CNNs).

The dataset provides 70,000 images (28x28 pixels) of handwritten digits (1 digit per image).

The goal is to write an algorithm that detects which digit is written. Since there are only 10 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), this is a classification problem with 10 classes.

Our goal is to build a neural network with 2 hidden layers.

## Import the relevant packages

In [None]:
import numpy as np
import tensorflow as tf

import tensorflow_datasets as tfds


## Data

This is where we load and preprocess our data.

In [None]:

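# Load MNIST together with its metadata; as_supervised=True yields (image, label) pairs
mnist_dataset, mnist_info = tfds.load(name='mnist', with_info=True, as_supervised=True)

mnist_train, mnist_test = mnist_dataset['train'], mnist_dataset['test']

# Hold out 10% of the training split for validation
num_validation_samples = 0.1 * mnist_info.splits['train'].num_examples
num_validation_samples = tf.cast(num_validation_samples, tf.int64)

num_test_samples = mnist_info.splits['test'].num_examples
num_test_samples = tf.cast(num_test_samples, tf.int64)


def scale(image, label):
    # Convert pixel values from integers in 0..255 to floats in 0..1
    image = tf.cast(image, tf.float32)
    image /= 255.

    return image, label

scaled_train_and_validation_data = mnist_train.map(scale)
test_data = mnist_test.map(scale)


# Number of samples held in the shuffle buffer at a time
BUFFER_SIZE = 10000

# Note: shuffle() reshuffles each time the dataset is iterated (reshuffle_each_iteration
# defaults to True), so take() and skip() below are not guaranteed to stay disjoint across
# epochs; passing reshuffle_each_iteration=False is one way to keep the split fixed.
shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(BUFFER_SIZE)
validation_data = shuffled_train_and_validation_data.take(num_validation_samples)
train_data = shuffled_train_and_validation_data.skip(num_validation_samples)


BATCH_SIZE = 100

# Train in mini-batches; validation and test each fit in a single batch
train_data = train_data.batch(BATCH_SIZE)
validation_data = validation_data.batch(num_validation_samples)
test_data = test_data.batch(num_test_samples)

# Pull the single validation batch out as tensors
validation_inputs, validation_targets = next(iter(validation_data))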
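
The arithmetic behind the split and the batching can be checked without TensorFlow. The following is a plain-Python sketch, not the `tf.data` pipeline itself: it assumes the MNIST training split holds 60,000 images, and the small index list stands in for the real dataset.

```python
import random

NUM_TRAIN_EXAMPLES = 60000   # assumed size of the MNIST 'train' split
BATCH_SIZE = 100

num_validation_samples = int(0.1 * NUM_TRAIN_EXAMPLES)
num_training_samples = NUM_TRAIN_EXAMPLES - num_validation_samples
steps_per_epoch = num_training_samples // BATCH_SIZE
print(num_validation_samples, num_training_samples, steps_per_epoch)

# The same shuffle / take / skip idea on a small list of indices.
indices = list(range(20))
random.Random(0).shuffle(indices)
validation_part = indices[:2]   # plays the role of .take(2)
training_part = indices[2:]     # plays the role of .skip(2)

# Scaling maps raw pixel values 0..255 into the range 0..1.
scaled = [pixel / 255. for pixel in (0, 128, 255)]
print(scaled)
```

With 54,000 training samples and batches of 100, one epoch takes 540 steps, which matches the `540/540` shown in the training log.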

## Model

### Outline the model
When thinking about a deep learning algorithm, we mostly imagine building the model. So, let's do it :)

In [None]:
input_size = 784
output_size = 10

hidden_layer_size = 600

model = tf.keras.Sequential([

    tf.keras.layers.Flatten(input_shape=(28, 28, 1)), # input layer

    # tf.keras.layers.Dense is basically implementing: output = activation(dot(input, weight) + bias)
    # it takes several arguments, but the most important ones for us are the hidden_layer_size and the activation function
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 1st hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 2nd hidden layer

    # the final layer is no different, we just make sure to activate it with softmax
    tf.keras.layers.Dense(output_size, activation='softmax') # output layer
])
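
The comment in the cell above describes a dense layer as `output = activation(dot(input, weight) + bias)`. Here is a minimal plain-Python sketch of that formula for a single sample. It is not how Keras implements it; the helper names and all weights are made up, and each inner list of `weights` holds the incoming weights of one output unit.

```python
import math


def dense(inputs, weights, biases, activation):
    """Compute activation(dot(inputs, weights) + biases) for one sample."""
    pre_activations = [
        sum(x * w for x, w in zip(inputs, unit_weights)) + b
        for unit_weights, b in zip(weights, biases)
    ]
    return activation(pre_activations)


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    # Subtracting the maximum keeps exp() from overflowing.
    largest = max(values)
    exponentials = [math.exp(v - largest) for v in values]
    total = sum(exponentials)
    return [e / total for e in exponentials]


# Toy sizes: 3 inputs -> 2 hidden units -> 2 outputs.
x = [1.0, 2.0, 3.0]
hidden = dense(x, weights=[[0.5, 0.5, 0.0], [-1.0, 0.0, -1.0]],
               biases=[0.0, 1.0], activation=relu)
probabilities = dense(hidden, weights=[[1.0, 0.0], [-1.0, 0.0]],
                      biases=[0.0, 0.0], activation=softmax)
print(hidden, probabilities)
```

ReLU zeroes out the negative pre-activation of the second hidden unit, and softmax turns the final scores into values that sum to 1, which is why it is used on the output layer.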

### Choose the optimizer and the loss function

In [None]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
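
For intuition about the loss chosen above: with integer class labels ("sparse" targets, as opposed to one-hot vectors), the cross-entropy of one sample is minus the log of the probability the model assigns to the true class. The sketch below uses made-up probabilities and a hypothetical helper that shares only its name with the Keras loss.

```python
import math


def sparse_categorical_crossentropy(target, probabilities):
    """Loss for one sample: minus the log of the probability of the true class."""
    return -math.log(probabilities[target])


# Made-up softmax output over three classes.
probabilities = [0.1, 0.2, 0.7]
loss_if_true_class_is_2 = sparse_categorical_crossentropy(2, probabilities)
loss_if_true_class_is_0 = sparse_categorical_crossentropy(0, probabilities)
print(round(loss_if_true_class_is_2, 4), round(loss_if_true_class_is_0, 4))
```

The loss is small when the true class receives a high probability and grows quickly as that probability approaches zero.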

### Training
This is where we train the model we have built.

In [None]:
NUM_EPOCHS = 5

model.fit(train_data, epochs=NUM_EPOCHS, validation_data=(validation_inputs, validation_targets), verbose=2)

Epoch 1/5
540/540 - 24s - loss: 0.2138 - accuracy: 0.9357 - val_loss: 0.1006 - val_accuracy: 0.9687 - 24s/epoch - 45ms/step
Epoch 2/5
540/540 - 17s - loss: 0.0807 - accuracy: 0.9748 - val_loss: 0.0869 - val_accuracy: 0.9717 - 17s/epoch - 31ms/step
Epoch 3/5
540/540 - 15s - loss: 0.0522 - accuracy: 0.9834 - val_loss: 0.0616 - val_accuracy: 0.9795 - 15s/epoch - 28ms/step
Epoch 4/5
540/540 - 11s - loss: 0.0378 - accuracy: 0.9880 - val_loss: 0.0434 - val_accuracy: 0.9848 - 11s/epoch - 20ms/step
Epoch 5/5
540/540 - 12s - loss: 0.0299 - accuracy: 0.9903 - val_loss: 0.0498 - val_accuracy: 0.9843 - 12s/epoch - 23ms/step
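
The validation loss in the log above falls until epoch 4 and rises slightly in epoch 5, which is the kind of signal early stopping reacts to. The sketch below applies a simplified patience rule to the logged values. It is an illustration only, not the Keras `EarlyStopping` callback; `stopping_epoch` and the `patience` values are assumptions.

```python
# Validation loss per epoch, copied from the training log above.
val_loss = [0.1006, 0.0869, 0.0616, 0.0434, 0.0498]


def stopping_epoch(losses, patience=1):
    """Return the epoch at which training would stop, or None if it never would."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
print("lowest validation loss at epoch", best_epoch)
print("stop with patience=1:", stopping_epoch(val_loss, patience=1))
print("stop with patience=2:", stopping_epoch(val_loss, patience=2))
```

With only 5 epochs the rule has little room to act: a patience of 1 triggers at the last epoch, and a patience of 2 never triggers.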


<keras.src.callbacks.History at 0x7b4c089cf2e0>

## Test the model

In [None]:
test_loss, test_accuracy = model.evaluate(test_data)
print('Test loss: {0:.2f}. Test accuracy: {1:.2f}%'.format(test_loss, test_accuracy*100.))

Test loss: 0.08. Test accuracy: 97.91%
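
The accuracy reported above is the share of test images whose highest-probability class matches the label. A minimal plain-Python sketch of that computation follows; the predictions and targets are made up, and `accuracy` is a hypothetical helper rather than the Keras metric.

```python
def accuracy(predicted_probabilities, targets):
    """Share of samples whose highest-probability class equals the target."""
    correct = 0
    for probabilities, target in zip(predicted_probabilities, targets):
        predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
        correct += predicted_class == target
    return correct / len(targets)


# Made-up predictions over three classes for four samples.
predictions = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
]
targets = [0, 1, 2, 1]
print(accuracy(predictions, targets))
```

Three of the four made-up samples are classified correctly, giving 0.75. On the 10,000 images of the MNIST test split, 97.91% corresponds to 9,791 correct predictions.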









