# Exercises
### 3. The *width and depth* of the algorithm. Add as many additional layers as you need to reach 5 hidden layers. Moreover, adjust the width of the algorithm as you find suitable. How does the validation accuracy change? What about the time it took the algorithm to train?

**Solution**

This exercise is pretty much the same as the previous one. However, it will get us to a much deeper net. As we noted in the previous exercise, a deeper net may need to be wider to produce better results.

We tried with 1000 hidden units in each layer and 5 hidden layers.
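To get a feel for how much larger this network is than the smaller ones, one can count its trainable parameters. The sketch below is only an illustration and not part of the original solution: `count_parameters` is a hypothetical helper, and it assumes 784 inputs (28x28 pixels) and 10 outputs, as used in this notebook.

```python
def count_parameters(input_size, hidden_size, num_hidden_layers, output_size):
    """Count weights and biases of a fully connected network (hypothetical helper)."""
    layer_sizes = [input_size] + [hidden_size] * num_hidden_layers + [output_size]
    total = 0
    # Each dense layer has fan_in * fan_out weights plus fan_out biases.
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out
    return total


small = count_parameters(784, 150, 3, 10)    # 3 hidden layers, width 150
large = count_parameters(784, 1000, 5, 10)   # 5 hidden layers, width 1000
print(f"3 x 150:  {small:,} parameters")
print(f"5 x 1000: {large:,} parameters")
```

The wider, deeper net has roughly 29 times as many parameters, which is one plausible reason it trains more slowly and overfits more readily on a dataset of this size.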

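The result (as you can see below) is that our **model's training was going very well,** until it *overfit*, and by quite a lot: in run 4 of the list below, validation accuracy falls from 0.9815 to 0.9745 in the last epoch while training accuracy keeps rising.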

It took my personal computer around 5-6 minutes to train the model.

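What if you train for more epochs? For reference, the runs below compare several depth and width combinations, each trained for 5 epochs.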

1.   3 hidden layers, 150 units per layer  
    Epoch 1/5 - 12s - loss: 0.2760 - accuracy: 0.9177 - val_accuracy: 0.9622  
    Epoch 5/5 - 3s - loss: 0.0438 - accuracy: 0.9864 - val_accuracy: 0.9838
2.   4 hidden layers, 400 units per layer  
    Epoch 1/5 - 13s - loss: 0.2255 - accuracy: 0.9320 - val_accuracy: 0.9650  
    Epoch 5/5 - 3s - loss: 0.0413 - accuracy: 0.9867 - val_accuracy: 0.9860  
3.   5 hidden layers, 700 units per layer (overfitting occurs)  
    Epoch 1/5 - 13s - loss: 0.2351 - accuracy: 0.9289 - val_accuracy: 0.9597  
    Epoch 3/5 - 5s - loss: 0.0800 - accuracy: 0.9776 - val_accuracy: 0.9805  
    Epoch 4/5 - 4s - loss: 0.0587 - accuracy: ***0.9830*** - val_accuracy: ***0.9842***  
    Epoch 5/5 - 6s - loss: 0.0518 - accuracy: ***0.9848*** - val_accuracy: ***0.9793***  
4.   5 hidden layers, 1000 units per layer (overfitting occurs)  
    Epoch 1/5 - 13s - loss: 0.2351 - accuracy: 0.9300 - val_accuracy: 0.9645  
    Epoch 2/5 - 4s - loss: 0.1082 - accuracy: 0.9700 - val_accuracy: 0.9750  
    Epoch 3/5 - 7s - loss: 0.0788 - accuracy: 0.9773 - val_accuracy: 0.9765  
    Epoch 4/5 - 5s - loss: 0.0643 - accuracy: 0.9821 - val_accuracy: 0.9815  
    Epoch 5/5 - 4s - loss: 0.0595 - accuracy: 0.9841 - val_accuracy: 0.9745  
5.   ----  
    Epoch 1/5 - 14s - loss: 0.2348 - accuracy: 0.9277 - val_accuracy: 0.9657  
    Epoch 5/5 - 4s - loss: 0.0493 - accuracy: 0.9856 - val_accuracy: 0.9832
6.   ----  
    Epoch 1/5 - 24s - loss: 0.2138 - accuracy: 0.9357 - val_accuracy: 0.9687  
    Epoch 5/5 - 12s - loss: 0.0299 - accuracy: 0.9903 - val_accuracy: 0.9843  
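One simple way to spot the overfitting in these logs is to find the epoch with the highest validation accuracy and check whether training continued past it. The snippet below is a minimal sketch of that idea in plain Python; `best_epoch` is a hypothetical helper, and the numbers are the validation accuracies of run 4 copied from the list above.

```python
def best_epoch(val_accuracies):
    """Return (1-based epoch, value) of the highest validation accuracy."""
    best_index = max(range(len(val_accuracies)), key=lambda i: val_accuracies[i])
    return best_index + 1, val_accuracies[best_index]


# Validation accuracy per epoch for run 4 (5 hidden layers, 1000 units each).
run_4 = [0.9645, 0.9750, 0.9765, 0.9815, 0.9745]

epoch, value = best_epoch(run_4)
# If the best epoch is not the last one, validation accuracy dropped afterwards.
dropped_after_peak = epoch < len(run_4)
print(f"best epoch: {epoch}, val_accuracy: {value}, dropped afterwards: {dropped_after_peak}")
```

In a real training loop, the same idea is usually handled by early stopping, for example with Keras's `tf.keras.callbacks.EarlyStopping`, rather than by inspecting logs by hand.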









# Deep Neural Network for MNIST Classification

The dataset is called MNIST and refers to handwritten digit recognition. You can find more about it on Yann LeCun's website (Director of AI Research, Facebook). He is one of the pioneers of what we've been talking about and of more complex approaches that are widely used today, such as convolutional neural networks (CNNs).

The dataset provides 70,000 images (28x28 pixels) of handwritten digits (1 digit per image).

The goal is to write an algorithm that detects which digit is written. Since there are only 10 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), this is a classification problem with 10 classes.

The starting point is a neural network with 2 hidden layers; for this exercise, we extend it to 5 hidden layers.

## Import the relevant packages

In [1]:
import numpy as np
import tensorflow as tf

import tensorflow_datasets as tfds



## Data

That's where we load and preprocess our data.

In [2]:

mnist_dataset, mnist_info = tfds.load(name='mnist', with_info=True, as_supervised=True)

mnist_train, mnist_test = mnist_dataset['train'], mnist_dataset['test']

num_validation_samples = 0.1 * mnist_info.splits['train'].num_examples
num_validation_samples = tf.cast(num_validation_samples, tf.int64)

num_test_samples = mnist_info.splits['test'].num_examples
num_test_samples = tf.cast(num_test_samples, tf.int64)


def scale(image, label):
    image = tf.cast(image, tf.float32)
    image /= 255.

    return image, label

scaled_train_and_validation_data = mnist_train.map(scale)
test_data = mnist_test.map(scale)


BUFFER_SIZE = 10000

shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(BUFFER_SIZE)
validation_data = shuffled_train_and_validation_data.take(num_validation_samples)
train_data = shuffled_train_and_validation_data.skip(num_validation_samples)


BATCH_SIZE = 100

train_data = train_data.batch(BATCH_SIZE)
validation_data = validation_data.batch(num_validation_samples)
test_data = test_data.batch(num_test_samples)

validation_inputs, validation_targets = next(iter(validation_data))

Downloading and preparing dataset 11.06 MiB (download: 11.06 MiB, generated: 21.00 MiB, total: 32.06 MiB) to /root/tensorflow_datasets/mnist/3.0.1...


Dataset mnist downloaded and prepared to /root/tensorflow_datasets/mnist/3.0.1. Subsequent calls will reuse this data.
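The split sizes implied by the preprocessing code can be checked with a little arithmetic. The sketch below assumes the MNIST `train` split holds 60,000 examples (the value `mnist_info.splits['train'].num_examples` is expected to return); the variable names are illustrative only.

```python
import math

num_train_examples = 60_000   # assumed size of the MNIST 'train' split
validation_fraction = 0.1     # same 10% used for num_validation_samples
batch_size = 100              # same value as BATCH_SIZE

num_validation = round(validation_fraction * num_train_examples)
num_training = num_train_examples - num_validation
# Number of mini-batches the model sees in one epoch.
steps_per_epoch = math.ceil(num_training / batch_size)

print(num_validation, num_training, steps_per_epoch)
```

Under that assumption, 6,000 samples go to validation and 54,000 to training, which gives 540 batches per epoch and matches the `540/540` step counter in the training log.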


## Model

### Outline the model
When thinking about a deep learning algorithm, we mostly imagine building the model. So, let's do it :)

In [3]:
input_size = 784
output_size = 10

hidden_layer_size = 1000

model = tf.keras.Sequential([

    tf.keras.layers.Flatten(input_shape=(28, 28, 1)), # input layer

    # tf.keras.layers.Dense is basically implementing: output = activation(dot(input, weight) + bias)
    # it takes several arguments, but the most important ones for us are the hidden_layer_size and the activation function
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 1st hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 2nd hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 3rd hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 4th hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 5th hidden layer

    # the final layer is no different, we just make sure to activate it with softmax
    tf.keras.layers.Dense(output_size, activation='softmax') # output layer
])
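The comment in the cell above describes a dense layer as `output = activation(dot(input, weight) + bias)`. As a rough illustration of that formula, and not of how Keras implements it internally, here is a small NumPy sketch. The helper names, the random weights, and the hidden width of 16 are all assumptions made for the example.

```python
import numpy as np


def relu(z):
    # Element-wise max(z, 0).
    return np.maximum(z, 0.0)


def softmax(z):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = z - z.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def dense(inputs, weights, bias, activation):
    # output = activation(dot(input, weight) + bias)
    return activation(inputs @ weights + bias)


rng = np.random.default_rng(0)
batch = rng.random((2, 784))                      # two fake flattened 28x28 images
w_hidden = rng.normal(scale=0.01, size=(784, 16))  # illustrative hidden width of 16
b_hidden = np.zeros(16)
w_out = rng.normal(scale=0.01, size=(16, 10))
b_out = np.zeros(10)

hidden = dense(batch, w_hidden, b_hidden, relu)
probs = dense(hidden, w_out, b_out, softmax)
print(probs.shape)        # one row of 10 class probabilities per image
print(probs.sum(axis=1))  # each row sums to 1 (up to rounding)
```

Stacking more `dense` calls with `relu` between the input and the softmax layer is what adding hidden layers to the `Sequential` model amounts to.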

### Choose the optimizer and the loss function

In [4]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
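The `sparse_categorical_crossentropy` loss expects integer class labels (here, the digit 0 to 9) rather than one-hot vectors. For a single sample it reduces to the negative log of the probability the model assigns to the correct class. The following plain-Python sketch illustrates that formula for one sample; it is not the Keras implementation, and the probability vectors are made up for the example.

```python
import math


def sparse_categorical_crossentropy(probabilities, target):
    """Loss for one sample: -log of the probability given to the true class."""
    return -math.log(probabilities[target])


target_digit = 3

# A model that is unsure: equal probability for each of the 10 digits.
uniform = [0.1] * 10
# A model that is fairly confident in the correct digit.
confident = [0.01] * 10
confident[target_digit] = 0.91

loss_uniform = sparse_categorical_crossentropy(uniform, target_digit)
loss_confident = sparse_categorical_crossentropy(confident, target_digit)
print(f"uniform guess:   {loss_uniform:.4f}")
print(f"confident guess: {loss_confident:.4f}")
```

The uniform guess gives a loss of ln(10), about 2.30, which is a useful baseline: a 10-class model whose loss sits near that value has learned essentially nothing yet.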

### Training
That's where we train the model we have built.

In [5]:
NUM_EPOCHS = 5

model.fit(train_data, epochs=NUM_EPOCHS, validation_data=(validation_inputs, validation_targets), verbose=2)

Epoch 1/5
540/540 - 14s - loss: 0.2391 - accuracy: 0.9280 - val_loss: 0.1276 - val_accuracy: 0.9675 - 14s/epoch - 26ms/step
Epoch 2/5
540/540 - 4s - loss: 0.1072 - accuracy: 0.9695 - val_loss: 0.1087 - val_accuracy: 0.9743 - 4s/epoch - 7ms/step
Epoch 3/5
540/540 - 4s - loss: 0.0795 - accuracy: 0.9774 - val_loss: 0.0749 - val_accuracy: 0.9823 - 4s/epoch - 7ms/step
Epoch 4/5
540/540 - 4s - loss: 0.0670 - accuracy: 0.9811 - val_loss: 0.0572 - val_accuracy: 0.9850 - 4s/epoch - 8ms/step
Epoch 5/5
540/540 - 4s - loss: 0.0506 - accuracy: 0.9860 - val_loss: 0.0604 - val_accuracy: 0.9855 - 4s/epoch - 7ms/step



## Test the model


In [6]:
test_loss, test_accuracy = model.evaluate(test_data)
print('Test loss: {0:.2f}. Test accuracy: {1:.2f}%'.format(test_loss, test_accuracy*100.))

Test loss: 0.09. Test accuracy: 97.72%









