# Deep Neural Network for MNIST Classification

The dataset is called MNIST and is a standard benchmark for handwritten digit recognition. You can find more about it on the website of Yann LeCun (Director of AI Research, Facebook).

The dataset provides 70,000 images (28x28 pixels) of handwritten digits (1 digit per image).

The goal is to write an algorithm that detects which digit is written. Since there are only 10 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), this is a classification problem with 10 classes.

Our goal is to build a neural network with hidden layers.

In [20]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
import keras_tuner as kt
import tensorflow_datasets as tfds

## Data Preprocessing

In [10]:
# Load the data from tfds
mnist_datasets, mnist_info = tfds.load(name='mnist', with_info=True, as_supervised=True)

# with_info=True makes tfds.load return a tuple (datasets, info), where info describes the version, features, number of samples
# as_supervised=True loads each dataset element as a 2-tuple (input, target)

In [14]:
# Check the info of the mnist datasets
mnist_info

tfds.core.DatasetInfo(
    name='mnist',
    full_name='mnist/3.0.1',
    description="""
    The MNIST database of handwritten digits.
    """,
    homepage='http://yann.lecun.com/exdb/mnist/',
    data_path='C:\\Users\\josep\\tensorflow_datasets\\mnist\\3.0.1',
    file_format=tfrecord,
    download_size=11.06 MiB,
    dataset_size=21.00 MiB,
    features=FeaturesDict({
        'image': Image(shape=(28, 28, 1), dtype=uint8),
        'label': ClassLabel(shape=(), dtype=int64, num_classes=10),
    }),
    supervised_keys=('image', 'label'),
    disable_shuffling=False,
    splits={
        'test': <SplitInfo num_examples=10000, num_shards=1>,
        'train': <SplitInfo num_examples=60000, num_shards=1>,
    },
    citation="""@article{lecun2010mnist,
      title={MNIST handwritten digit database},
      author={LeCun, Yann and Cortes, Corinna and Burges, CJ},
      journal={ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist},
      volume={2},
      year={2010}
    }""",
)

In [16]:
# Check the mnist datasets
mnist_datasets

{'train': <_PrefetchDataset element_spec=(TensorSpec(shape=(28, 28, 1), dtype=tf.uint8, name=None), TensorSpec(shape=(), dtype=tf.int64, name=None))>,
 'test': <_PrefetchDataset element_spec=(TensorSpec(shape=(28, 28, 1), dtype=tf.uint8, name=None), TensorSpec(shape=(), dtype=tf.int64, name=None))>}

In [18]:
# once we have loaded the dataset, we can easily extract the training and testing dataset
mnist_train, mnist_test = mnist_datasets['train'], mnist_datasets['test']

In [22]:
# total number of training data in the mnist datasets
tf.cast(mnist_info.splits['train'].num_examples, tf.int64)

<tf.Tensor: shape=(), dtype=int64, numpy=60000>

By default, the MNIST dataset in TFDS comes with train and test splits but no validation split,
so we must create the validation set ourselves.

In [25]:
# we start by defining the number of validation samples as a % of the train samples
num_validation_samples = 0.1 * mnist_info.splits["train"].num_examples
# let's cast this number to an integer, as a float may cause an error along the way
num_validation_samples = tf.cast(num_validation_samples, tf.int64)
num_validation_samples

<tf.Tensor: shape=(), dtype=int64, numpy=6000>

In [27]:
# let's also store the number of test samples in a dedicated variable
num_test_samples = tf.cast(mnist_info.splits['test'].num_examples, tf.int64)
num_test_samples

<tf.Tensor: shape=(), dtype=int64, numpy=10000>

Normally, we would like to scale our data in some way to make training more numerically stable. \
In this case we simply rescale the inputs to lie between 0 and 1. \
The method dataset.map(custom_function) applies a custom transformation to every element of a given dataset.

In [30]:
# let's define a function called: scale, that will take an MNIST image and its label
def scale(image, label):
    image = tf.cast(image, tf.float32) #float values
    # since the possible values for the inputs are 0 to 255 (256 different shades of grey)
    # if we divide each element by 255, we would get the desired result -> all elements will be between 0 and 1 
    image /= 255.

    return image, label

# we will apply the scaling transformation on the mnist_train and store it as both train and validation data
scaled_train_and_validation_data = mnist_train.map(scale)
# same thing for the test data so it has the same magnitude as the train and validation
scaled_test_data = mnist_test.map(scale)
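
To make the arithmetic of the scaling step concrete, here is a minimal pure-Python sketch. It does not use TensorFlow: the toy dataset, the flattened three-pixel "images", and the helper name `scale_pixels` are assumptions made only for illustration.

```python
# Pure-Python stand-in for dataset.map(scale): every element is an
# (image, label) pair, and the same function is applied to each one.

def scale_pixels(image, label):
    # Map 8-bit grey levels (0..255) onto floats in [0.0, 1.0].
    return [pixel / 255.0 for pixel in image], label

# Hypothetical toy data: two "images" of three pixels each, with labels.
toy_dataset = [([0, 128, 255], 7), ([64, 64, 64], 3)]

scaled_toy_dataset = [scale_pixels(image, label) for image, label in toy_dataset]

for image, label in scaled_toy_dataset:
    print(label, [round(pixel, 3) for pixel in image])
```

The labels pass through unchanged; only the pixel intensities are rescaled, which is also what the `scale` function above does for the real images.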

We will also shuffle the combined train and validation data. \
There is no need to shuffle the test data, because we won't be training on it.

The BUFFER_SIZE parameter exists for enormous datasets that cannot fit in memory and so cannot be shuffled in one go.
Instead, TF keeps only BUFFER_SIZE samples in memory at a time and draws the next sample at random from that buffer.
* if BUFFER_SIZE = 1 -- no shuffling will actually happen
* if BUFFER_SIZE >= num samples -- shuffling is uniform
* BUFFER_SIZE in between -- a computational optimization to approximate uniform shuffling
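
The three cases above can be checked with a small simulation. This is a simplified pure-Python model of buffer-based shuffling, written for illustration; it is not TensorFlow's implementation, and the function name `buffered_shuffle` is hypothetical.

```python
import random

def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in an order produced by a fixed-size shuffle buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Emit a random buffered element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # The input is exhausted: emit what is left in random order.
    rng.shuffle(buffer)
    yield from buffer

data = list(range(10))
no_shuffle = list(buffered_shuffle(data, buffer_size=1))
partial_shuffle = list(buffered_shuffle(data, buffer_size=3))
full_shuffle = list(buffered_shuffle(data, buffer_size=len(data)))

print(no_shuffle)       # a buffer of size 1 leaves the order unchanged
print(partial_shuffle)  # elements only move a limited distance forward
print(full_shuffle)     # every element can land anywhere
```

In this model, with a buffer of size 3 an element can be emitted at most 2 positions earlier than its original position, which is why a small buffer only approximates a uniform shuffle.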

In [33]:
BUFFER_SIZE = 10000

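# there is a shuffle method readily available and we just need to specify the buffer size
# reshuffle_each_iteration=False keeps the shuffled order fixed across iterations, so the
# take()/skip() split below cannot mix train and validation samples between epochs
shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(
    BUFFER_SIZE, reshuffle_each_iteration=False)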

# once we have scaled and shuffled the data, we can proceed to actually extracting the train and validation
# our validation data would be equal to 10% of the training set, which we've already calculated
# we use the .take() method to take that many samples
validation_data = shuffled_train_and_validation_data.take(num_validation_samples)

# similarly, the train_data is everything else, so we skip as many samples as there are in the validation dataset
train_data = shuffled_train_and_validation_data.skip(num_validation_samples)

# determine the batch size
BATCH_SIZE = 100

# batch the data
train_data = train_data.batch(BATCH_SIZE)

validation_data = validation_data.batch(num_validation_samples)

test_data = scaled_test_data.batch(num_test_samples)

# take the next batch (it is the only batch)
# because as_supervised=True, we've got a 2-tuple structure
validation_inputs, validation_targets = next(iter(validation_data))
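
The sizes produced by the split and batching steps can be worked out with a pure-Python sketch. The helpers `take`, `skip`, and `batch` below are hypothetical stand-ins that only mimic the behaviour of the corresponding `tf.data` methods on a list of sample indices, assuming the order of the samples stays fixed.

```python
from itertools import islice

def take(items, n):
    # First n elements, like dataset.take(n).
    return list(islice(items, n))

def skip(items, n):
    # Everything after the first n elements, like dataset.skip(n).
    return list(islice(items, n, None))

def batch(items, batch_size):
    # Consecutive groups of batch_size elements, like dataset.batch(batch_size).
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

num_examples = 60_000                 # size of the MNIST train split
num_validation = num_examples // 10   # 10% held out for validation

samples = list(range(num_examples))   # sample indices stand in for images
validation = take(samples, num_validation)
train = skip(samples, num_validation)

train_batches = batch(train, 100)
validation_batches = batch(validation, num_validation)

print(len(train), len(train_batches), len(validation_batches))  # 54000 540 1
```

With a fixed order the two subsets are disjoint and together cover the whole train split: 54,000 training samples in 540 batches of 100, and a single validation batch of 6,000.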

In [35]:
# Batched train dataset
train_data

<_BatchDataset element_spec=(TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int64, name=None))>

## Model

### DNN using TensorFlow