# Deep Neural Network for MNIST Classification

Source: 
- https://www.udemy.com/course/the-data-science-course-complete-data-science-bootcamp/
- Section 50: Deep Learning - Classifying on the MNIST Dataset
- https://de.wikipedia.org/wiki/MNIST-Datenbank

We'll apply all the knowledge from the lectures in this section to write a deep neural network. The problem we've chosen is referred to as the "Hello World" of deep learning because for most students it is the first deep learning algorithm they see.

The dataset is called MNIST and consists of images of handwritten digits. You can find more about it on Yann LeCun's website (Director of AI Research, Facebook). He is one of the pioneers of what we've been talking about and of more complex approaches that are widely used today, such as convolutional neural networks (CNNs).

The dataset provides 70,000 images (28x28 pixels) of handwritten digits (1 digit per image). 

The goal is to write an algorithm that detects which digit is written. Since there are only 10 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), this is a classification problem with 10 classes. 

Our goal is to build a neural network with 2 hidden layers.

## Import the relevant packages

In [119]:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# TensorFlow includes a data provider for MNIST that we'll use.
# It comes with the tensorflow-datasets module; if you haven't installed it yet, install the package using
# pip install tensorflow-datasets
# or
# conda install tensorflow-datasets

import tensorflow_datasets as tfds

# these datasets will be stored in C:\Users\*USERNAME*\tensorflow_datasets\...
# the first time you download a dataset, it is stored in the respective folder
# every other time, the copy on your computer is loaded automatically

## Load and preprocess data

In [120]:
# tfds.load loads a dataset (or downloads and then loads it the first time you use it)
# in our case, we are interested in MNIST; the name of the dataset is the only mandatory argument
# there are other arguments we can specify that we may find useful:
# with_info=True also returns an object with metadata about the dataset,
# as_supervised=True returns each example as an (image, label) tuple
# mnist_dataset = tfds.load(name='mnist', as_supervised=True)
mnist_dataset, mnist_info = tfds.load(name='mnist', with_info=True, as_supervised=True)

In [121]:
mnist_train, mnist_test = mnist_dataset['train'], mnist_dataset['test']

In [122]:
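# Reserve 10% of the training set for validation
num_validation_samples = 0.1 * mnist_info.splits['train'].num_examples
# Cast to an integer, since 0.1 * num_examples yields a float
num_validation_samples = tf.cast(num_validation_samples, tf.int64)

# Store the number of test samples as an integer as well
num_test_samples = mnist_info.splits['test'].num_examples
num_test_samples = tf.cast(num_test_samples, tf.int64)

def scale(image, label):
    # Convert pixel values to floats and rescale them from [0, 255] to [0, 1]
    image = tf.cast(image, tf.float32)
    image /= 255.
    return image, label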

scaled_train_and_validation_data = mnist_train.map(scale)

test_data = mnist_test.map(scale)
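
The arithmetic in this cell can be checked without TensorFlow. The following is a minimal sketch, assuming the standard MNIST split of 60,000 training and 10,000 test images (the notebook reads these counts from `mnist_info` rather than hard-coding them). The helper `scale_pixels` is a hypothetical stand-in for `scale` that works on plain Python lists.

```python
# Assumed split sizes; the notebook reads them from mnist_info.splits.
NUM_TRAIN_EXAMPLES = 60_000
NUM_TEST_EXAMPLES = 10_000

# 10% of the training split is held out for validation.
num_validation_samples = int(0.1 * NUM_TRAIN_EXAMPLES)
num_training_samples = NUM_TRAIN_EXAMPLES - num_validation_samples


def scale_pixels(pixels):
    """Map integer pixel intensities in [0, 255] to floats in [0.0, 1.0]."""
    return [p / 255.0 for p in pixels]


sample_pixels = [0, 51, 255]
scaled = scale_pixels(sample_pixels)

print(num_validation_samples, num_training_samples)
print(scaled)
```

Under these assumptions, 6,000 images go to validation and 54,000 remain for training, and every pixel ends up in the range from 0 to 1.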


In [123]:
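BUFFER_SIZE = 10000

# Shuffle once with a fixed order. With the default reshuffle_each_iteration=True,
# take() and skip() would select different examples on every pass over the data,
# so validation examples could leak into the training set.
shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(
    BUFFER_SIZE, reshuffle_each_iteration=False)

# Hold out the first num_validation_samples examples for validation and train on the rest
validation_data = shuffled_train_and_validation_data.take(num_validation_samples)
train_data = shuffled_train_and_validation_data.skip(num_validation_samples)
# Load the whole test set as a single batch
test_data = test_data.batch(num_test_samples)

BATCH_SIZE = 100

# Reshuffle the training examples every epoch, then group them into mini-batches
train_data = train_data.shuffle(BUFFER_SIZE).batch(BATCH_SIZE)
# Validation uses a single batch that contains all validation examples
validation_data = validation_data.batch(num_validation_samples)

# Extract the validation inputs and targets as tensors
validation_inputs, validation_targets = next(iter(validation_data))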

2023-08-07 17:52:35.668035: W tensorflow/core/kernels/data/cache_dataset_ops.cc:854] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
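
To build some intuition for what `shuffle`, `take`, `skip` and `batch` do in the cell above, here is a simplified pure-Python model. It is a sketch of the general buffered-shuffle idea, not TensorFlow's implementation, and the helpers `buffered_shuffle` and `batch` are hypothetical names introduced only for this illustration.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in a pseudo-random order using a fixed-size buffer.

    Simplified model of a buffered shuffle: fill the buffer, then repeatedly
    emit a random buffered element and replace it with the next input element.
    """
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)
            continue
        index = rng.randrange(buffer_size)
        yield buffer[index]
        buffer[index] = item
    # Input exhausted: drain what is left in the buffer in random order.
    rng.shuffle(buffer)
    yield from buffer


def batch(items, batch_size):
    """Group consecutive items into lists of at most batch_size elements."""
    items = list(items)
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# Toy stand-in for the dataset: 20 "examples" and a buffer of 5.
shuffled = list(buffered_shuffle(range(20), buffer_size=5, seed=0))
validation = shuffled[:2]   # analogous to .take(2)
training = shuffled[2:]     # analogous to .skip(2)
training_batches = batch(training, batch_size=6)

print(validation, [len(b) for b in training_batches])
```

In this model, an element can only be emitted once it has entered the buffer, so a buffer smaller than the dataset mixes examples locally rather than uniformly. A buffer at least as large as the dataset gives a full shuffle.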


## Define the multi-layer model

In [124]:
input_size = 784
output_size = 10
hidden_layer_size = 200

model = tf.keras.Sequential([
    # Flatten each 28x28x1 image into a vector of 784 values
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    # Two hidden layers with ReLU activation
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
    # tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
    # tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
    # tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
    # Output layer: softmax turns the 10 outputs into class probabilities
    tf.keras.layers.Dense(output_size, activation='softmax')
])
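
To get a feel for the size of this model, the number of trainable parameters can be worked out by hand. The sketch below assumes the architecture defined above (784 inputs, two hidden layers of 200 units, 10 outputs); `dense_parameter_count` is a hypothetical helper, not part of Keras.

```python
def dense_parameter_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the width of every layer, input first, output last.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # weight matrix
        total += fan_out           # bias vector
    return total


input_size = 28 * 28 * 1   # Flatten turns each 28x28x1 image into 784 values
hidden_layer_size = 200
output_size = 10

layer_sizes = [input_size, hidden_layer_size, hidden_layer_size, output_size]
print(dense_parameter_count(layer_sizes))  # 199210
```

The first hidden layer accounts for most of the parameters (157,000 of 199,210), because it connects to all 784 input pixels.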

## Choose the optimizer and the loss function

In [125]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
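
The loss chosen here works with integer class labels (0 to 9) rather than one-hot vectors. As a rough illustration of the underlying formula for a single example, here is a pure-Python sketch. It is not the Keras implementation (which also averages over the batch and guards against a logarithm of zero), and the raw scores used are made up.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(label, probabilities):
    """Loss for one example: negative log-probability of the true class.

    label is an integer class index (not a one-hot vector).
    """
    return -math.log(probabilities[label])


# Hypothetical raw scores for a 3-class toy problem.
probabilities = softmax([2.0, 1.0, 0.1])
loss_if_correct = sparse_categorical_crossentropy(0, probabilities)
loss_if_wrong = sparse_categorical_crossentropy(2, probabilities)

print(probabilities, loss_if_correct, loss_if_wrong)
```

The loss is small when the model assigns a high probability to the true class and grows as that probability shrinks. A model that guesses uniformly over 10 classes would score about ln(10) ≈ 2.30 per example.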

## Train the model

In [126]:
NUM_EPOCHS = 5

model.fit(train_data, epochs=NUM_EPOCHS, validation_data=(validation_inputs, validation_targets), verbose=2)

Epoch 1/5
540/540 - 1s - loss: 0.2758 - accuracy: 0.9217 - val_loss: 0.1289 - val_accuracy: 0.9638 - 1s/epoch - 2ms/step
Epoch 2/5
540/540 - 1s - loss: 0.1061 - accuracy: 0.9683 - val_loss: 0.0876 - val_accuracy: 0.9762 - 832ms/epoch - 2ms/step
Epoch 3/5
540/540 - 1s - loss: 0.0726 - accuracy: 0.9779 - val_loss: 0.0710 - val_accuracy: 0.9792 - 874ms/epoch - 2ms/step
Epoch 4/5
540/540 - 1s - loss: 0.0531 - accuracy: 0.9833 - val_loss: 0.0724 - val_accuracy: 0.9775 - 843ms/epoch - 2ms/step
Epoch 5/5
540/540 - 1s - loss: 0.0418 - accuracy: 0.9868 - val_loss: 0.0481 - val_accuracy: 0.9858 - 863ms/epoch - 2ms/step


<keras.src.callbacks.History at 0x2d51b7a50>
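
The `540/540` in each log line is the number of batches processed per epoch. A quick check of that number, assuming 54,000 training examples (60,000 training images minus the 6,000 reserved for validation):

```python
import math

num_training_samples = 54_000  # assumed: 60,000 training images minus 6,000 for validation
BATCH_SIZE = 100
NUM_EPOCHS = 5

# One step processes one batch; a final partial batch would still count as a step.
steps_per_epoch = math.ceil(num_training_samples / BATCH_SIZE)
total_weight_updates = steps_per_epoch * NUM_EPOCHS

print(steps_per_epoch, total_weight_updates)  # 540 2700
```

Since the optimizer updates the weights once per batch, five epochs amount to 2,700 updates in total.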

## Test the model

In [127]:
test_loss, test_accuracy = model.evaluate(test_data)



In [128]:
print(f"test loss:     {test_loss}")
print(f"test accuracy: {test_accuracy}")

test loss:     0.07256560772657394
test accuracy: 0.9779000282287598
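
An accuracy figure is easier to interpret as a number of mistakes. The sketch below converts the accuracies reported above into approximate error counts, assuming 10,000 test images and 6,000 validation images; `misclassified_count` is a hypothetical helper.

```python
def misclassified_count(accuracy, num_examples):
    """Estimate how many examples were classified incorrectly."""
    return round((1.0 - accuracy) * num_examples)


test_accuracy = 0.9779000282287598   # value printed above
validation_accuracy = 0.9858         # val_accuracy after epoch 5

# Assumed set sizes: 10,000 test images and 6,000 validation images.
print(misclassified_count(test_accuracy, 10_000))       # 221
print(misclassified_count(validation_accuracy, 6_000))  # 85
```

Under these assumptions, the model misclassifies roughly 221 of the test images, and the test accuracy is somewhat lower than the final validation accuracy.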
