# Deep Neural Network for MNIST Classification

We'll apply all the knowledge from the lectures in this section to write a deep neural network. The problem we've chosen is referred to as the "Hello World" of deep learning because for most students it is the first deep learning algorithm they see.

The dataset is called MNIST and consists of images of handwritten digits. You can find more about it on Yann LeCun's website (Director of AI Research, Facebook). He is one of the pioneers of what we've been talking about and of more complex approaches that are widely used today, such as convolutional neural networks (CNNs).

The dataset provides 70,000 images (28x28 pixels) of handwritten digits (1 digit per image).

The goal is to write an algorithm that detects which digit is written. Since there are only 10 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), this is a classification problem with 10 classes.

Our goal is to build a neural network with 2 hidden layers.

## Import the relevant packages

In [1]:
import numpy as np
import tensorflow as tf

import tensorflow_datasets as tfds







## Data

In [2]:
# tfds.load(name) loads a dataset from TensorFlow Datasets
# with_info=True also returns an info object describing the version, features and number of samples
# as_supervised=True loads each example as a 2-tuple (input, target)
mnist_dataset, mnist_info = tfds.load(name='mnist', with_info=True, as_supervised=True)

mnist_train, mnist_test = mnist_dataset['train'], mnist_dataset['test']

# Set the number of validation samples to 10% of the training set
num_validation_samples = 0.1 * mnist_info.splits['train'].num_examples

# Cast the number of validation samples to an integer (the product above is a float)
num_validation_samples = tf.cast(num_validation_samples, tf.int64)

# Store the number of test samples in a dedicated variable, also as an integer
num_test_samples = mnist_info.splits['test'].num_examples
num_test_samples = tf.cast(num_test_samples, tf.int64)

# Scale data
def scale(image, label):
    
    # Make sure values are floats
    image = tf.cast(image, tf.float32)
    
    # Scaling
    image /= 255.
    return image, label


scaled_train_and_validation_data = mnist_train.map(scale)

test_data = mnist_test.map(scale)

# Shuffling Data

# Define buffer size
BUFFER_SIZE = 10000

# NOTE
# buffer_size = 1 --> no shuffling will happen
# buffer_size >= num_samples --> the whole dataset is shuffled at once (uniform shuffle)
# 1 < buffer_size < num_samples --> shuffling happens within a moving buffer, which limits memory use

shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(BUFFER_SIZE)

validation_data = shuffled_train_and_validation_data.take(num_validation_samples)

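# Train data
# Note: shuffle() reshuffles on every iteration by default, so take() and skip() can select
# different samples each epoch; pass reshuffle_each_iteration=False to shuffle() for a fixed split
train_data = shuffled_train_and_validation_data.skip(num_validation_samples)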

BATCH_SIZE = 1000

# dataset.batch(batch_size) combines consecutive elements of a dataset into batches
train_data = train_data.batch(BATCH_SIZE)

# Validation data
validation_data = validation_data.batch(num_validation_samples)

test_data = test_data.batch(num_test_samples)

# iter() creates an object that can be iterated one element at a time
# next() loads the next element from that iterator
validation_inputs, validation_targets = next(iter(validation_data))
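
The buffer-size note in the shuffling step can be made concrete with a small pure-Python sketch. It mimics a buffered shuffle: fill a buffer, emit a random element from it, and refill the freed slot from the stream. `buffered_shuffle` is a hypothetical helper written for illustration; it is not part of TensorFlow and only approximates what `dataset.shuffle` does internally.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Yield the items in an order produced by a fixed-size shuffle buffer.

    Illustrative helper (an assumption for this sketch), not a TensorFlow API.
    """
    rng = random.Random(seed)
    iterator = iter(items)

    # Fill the buffer with the first `buffer_size` items.
    buffer = []
    for item in iterator:
        buffer.append(item)
        if len(buffer) == buffer_size:
            break

    # For every remaining item: emit a random buffer element,
    # then put the new item in the freed slot.
    for item in iterator:
        index = rng.randrange(len(buffer))
        yield buffer[index]
        buffer[index] = item

    # The stream is exhausted: emit what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


no_shuffle = list(buffered_shuffle(range(10), buffer_size=1))
partial_shuffle = list(buffered_shuffle(range(10), buffer_size=4))
full_shuffle = list(buffered_shuffle(range(10), buffer_size=100))

print(no_shuffle)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With a buffer of 1 the order is unchanged, with a small buffer each element can only move a limited distance forward, and with a buffer at least as large as the dataset every ordering is possible.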

## Model


#### Outline the model

In [3]:
input_size = 784
output_size = 10

# Number of units in each hidden layer
hidden_layer_size = 1000

# tf.keras.Sequential() lays down the model as a stack of layers
model = tf.keras.Sequential([
    
    # tf.keras.layers.Flatten(input_shape) flattens each input tensor into a vector
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)), # Input layer
    
    # tf.keras.layers.Dense(output_size) computes the dot product of its inputs and weights, adds the bias, then applies the activation
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 1st hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 2nd hidden layer
    
    tf.keras.layers.Dense(output_size, activation='softmax') # output layer
])
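
To get a feel for the size of this model, the number of trainable parameters can be worked out by hand: a dense layer with `n_in` inputs and `n_out` outputs has `n_in * n_out` weights plus `n_out` biases. The sketch below does that arithmetic for the layer sizes used above; `dense_param_count` is an illustrative helper, not a Keras function.

```python
def dense_param_count(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


# Flattened input, two hidden layers, output layer (sizes from the notebook)
layer_sizes = [784, 1000, 1000, 10]

per_layer = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(per_layer)

print(per_layer)     # [785000, 1001000, 10010]
print(total_params)  # 1796010
```

Calling `model.summary()` on the model above should report the same total, since the `Flatten` layer has no parameters of its own.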







## Choose the optimizer and the loss function

In [4]:
# model.compile(optimizer, loss, metrics) configures the model for training
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
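
The loss is the "sparse" variant because the MNIST targets are plain integers (0 to 9) rather than one-hot vectors. For one sample, the loss is the negative log of the probability the model assigns to the correct class. The following is a minimal pure-Python sketch of that idea; the function names are chosen for illustration, and the Keras implementation adds numerical safeguards that are omitted here.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(probabilities, label):
    """Negative log-probability of the correct class (label is an integer)."""
    return -math.log(probabilities[label])


# A model that knows nothing gives every digit the same score
uniform_probs = softmax([0.0] * 10)
uniform_loss = sparse_categorical_crossentropy(uniform_probs, label=3)

# A model that strongly favours the correct digit gets a much smaller loss
confident_probs = softmax([0.0, 0.0, 0.0, 8.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
confident_loss = sparse_categorical_crossentropy(confident_probs, label=3)

print(round(uniform_loss, 4))  # 2.3026, i.e. ln(10)
```

The uniform case gives a useful reference point: a loss near ln(10) ≈ 2.30 means the network is doing no better than guessing among 10 classes.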







## Training

In [5]:
# Maximum number of epochs
NUM_EPOCHS = 5

# Fit the model
# model.fit(training data, number of epochs, validation data in the format (inputs, targets))
model.fit(train_data, epochs=NUM_EPOCHS, validation_data=(validation_inputs, validation_targets), verbose=2)

Epoch 1/5
54/54 - 4s - loss: 0.4045 - accuracy: 0.8834 - val_loss: 0.1750 - val_accuracy: 0.9488 - 4s/epoch - 69ms/step
Epoch 2/5
54/54 - 3s - loss: 0.1294 - accuracy: 0.9616 - val_loss: 0.1064 - val_accuracy: 0.9688 - 3s/epoch - 49ms/step
Epoch 3/5
54/54 - 3s - loss: 0.0815 - accuracy: 0.9758 - val_loss: 0.0825 - val_accuracy: 0.9730 - 3s/epoch - 49ms/step
Epoch 4/5
54/54 - 3s - loss: 0.0554 - accuracy: 0.9837 - val_loss: 0.0641 - val_accuracy: 0.9803 - 3s/epoch - 49ms/step
Epoch 5/5
54/54 - 3s - loss: 0.0398 - accuracy: 0.9887 - val_loss: 0.0516 - val_accuracy: 0.9827 - 3s/epoch - 51ms/step


<keras.src.callbacks.History at 0x21badaf6130>
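
The `54/54` in the log is the number of batches processed per epoch. Assuming the standard MNIST training split of 60,000 images (the notebook reads this number from `mnist_info` rather than hard-coding it), the arithmetic can be sketched as:

```python
import math

num_train_examples = 60_000  # assumption: standard MNIST training split
batch_size = 1000            # BATCH_SIZE from the notebook

# 10% of the training split is held out for validation
num_validation = num_train_examples // 10
num_training = num_train_examples - num_validation

# One step per batch; a final partial batch would still count as a step
steps_per_epoch = math.ceil(num_training / batch_size)

print(num_validation, num_training, steps_per_epoch)  # 6000 54000 54
```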

## Test model

In [6]:
test_loss, test_accuracy = model.evaluate(test_data)



In [7]:
print('Test loss: {0:.2f}. Test accuracy: {1:.2f}%'.format(test_loss, test_accuracy*100.))

Test loss: 0.08. Test accuracy: 97.73%
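
The accuracy metric is the fraction of images whose most probable class matches the label. Below is a minimal sketch with made-up probabilities for a 3-class toy problem; the numbers are illustrative and are not outputs of the model above, and `argmax` and `accuracy` are helper names chosen for this example.

```python
def argmax(values):
    """Index of the largest value, i.e. the predicted class."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(probability_rows, labels):
    """Share of samples whose predicted class equals the integer label."""
    predictions = [argmax(row) for row in probability_rows]
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Made-up softmax outputs for four samples and three classes
toy_probabilities = [
    [0.7, 0.2, 0.1],  # predicts class 0
    [0.1, 0.8, 0.1],  # predicts class 1
    [0.3, 0.3, 0.4],  # predicts class 2
    [0.6, 0.3, 0.1],  # predicts class 0, but the label is 1
]
toy_labels = [0, 1, 2, 1]

toy_accuracy = accuracy(toy_probabilities, toy_labels)
print(toy_accuracy)  # 0.75
```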
