# Deep Neural Network for MNIST Classification

MNIST classification is the task of recognizing handwritten digits. The MNIST dataset provides 70,000 grayscale images (28x28 pixels) of handwritten digits, one digit per image.

The goal is to build a model that predicts which digit (0 to 9) is written in each image.

## Import the relevant packages

In [1]:
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds

## Data

In [2]:
mnist_dataset, mnist_info = tfds.load(name='mnist', with_info=True, as_supervised=True)
# with_info=True also returns a DatasetInfo object describing the version, features, and number of samples
# as_supervised=True loads each example as a 2-tuple (input, target)

# extract the training and test datasets using the built-in split names
mnist_train, mnist_test = mnist_dataset['train'], mnist_dataset['test']

# by default, the TFDS MNIST dataset has training and test splits, but no validation split,
# so we must create one ourselves: here we set aside 10% of the training examples
num_validation_samples = 0.1 * mnist_info.splits['train'].num_examples
# cast this number to an integer
num_validation_samples = tf.cast(num_validation_samples, tf.int64)

num_test_samples = mnist_info.splits['test'].num_examples
num_test_samples = tf.cast(num_test_samples, tf.int64)


# scale the inputs to make training more numerically stable
# here we want the pixel values to lie between 0 and 1
def scale(image, label):
    # make sure the values are floats
    image = tf.cast(image, tf.float32)
    # pixel values range from 0 to 255 (256 shades of grey),
    # so dividing by 255 maps every element into the range [0, 1]
    image /= 255.

    return image, label


scaled_train_and_validation_data = mnist_train.map(scale)

test_data = mnist_test.map(scale)


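# shuffle the data once, before splitting off the validation set
# reshuffle_each_iteration=False keeps the shuffled order fixed, so take() and skip() below
# always select disjoint examples (otherwise validation samples can leak into training);
# the trade-off is that the training order is then the same in every epoch
BUFFER_SIZE = 10000
shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(
    BUFFER_SIZE, reshuffle_each_iteration=False)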

validation_data = shuffled_train_and_validation_data.take(num_validation_samples)

train_data = shuffled_train_and_validation_data.skip(num_validation_samples)

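# split the training data into mini-batches
BATCH_SIZE = 150
train_data = train_data.batch(BATCH_SIZE)

# validation and test data are not used for weight updates,
# so each is kept as a single batch containing the whole set
validation_data = validation_data.batch(num_validation_samples)
test_data = test_data.batch(num_test_samples)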

validation_inputs, validation_targets = next(iter(validation_data))
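
As a quick sanity check on the split and batching above, the arithmetic can be worked out by hand. This is only a sketch: it assumes the standard MNIST training split of 60,000 examples, and the constant names are ours for illustration rather than part of the notebook.

```python
import math

# Assumption: the standard MNIST 'train' split contains 60,000 examples.
NUM_TRAIN_EXAMPLES = 60_000
VALIDATION_FRACTION = 0.1   # same 10% used for num_validation_samples
BATCH_SIZE = 150            # same batch size as in the cell above

# Number of examples set aside for validation and left for training
num_validation = int(VALIDATION_FRACTION * NUM_TRAIN_EXAMPLES)
num_training = NUM_TRAIN_EXAMPLES - num_validation

# Number of mini-batches (gradient updates) in one epoch
steps_per_epoch = math.ceil(num_training / BATCH_SIZE)

print(num_validation, num_training, steps_per_epoch)
```

Under these assumptions there are 6,000 validation examples, 54,000 training examples and 360 batches per epoch, which is the `360/360` step counter that Keras prints during training.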

## Model

In [3]:
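input_size = 784  # 28 * 28 pixels per image (for reference only; Flatten infers it from input_shape)
output_size = 10  # one output per digit, 0 to 9
hidden_layer_size = 200

# define what the model will look like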
model = tf.keras.Sequential([
    
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)), # input layer

    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 1st hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 2nd hidden layer
    
    tf.keras.layers.Dense(output_size, activation='softmax') # output layer
])
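
To get a feel for the size of this network, we can count its trainable parameters by hand. The sketch below assumes the layer sizes defined above (784 inputs, two hidden layers of 200 units, 10 outputs); `dense_param_count` is a hypothetical helper written for this illustration, not a Keras function.

```python
# Layer widths of the network above: flattened input, two hidden layers, output
layer_sizes = [784, 200, 200, 10]


def dense_param_count(n_inputs, n_outputs):
    """Parameters of one fully connected layer: a weight matrix plus one bias per output."""
    return n_inputs * n_outputs + n_outputs


# Pair each layer width with the next one to get (inputs, outputs) per Dense layer
per_layer_counts = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total_params = sum(per_layer_counts)

print(per_layer_counts, total_params)
```

The `Flatten` layer has no parameters of its own; it only reshapes each 28x28x1 image into a vector of 784 values. If the layer sizes above are right, the total should agree with the figure reported by `model.summary()`.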

### Choose the optimizer and the loss function

In [4]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
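
To make the loss less of a black box: with integer targets (as we have here), sparse categorical cross-entropy is the average of `-log(p)`, where `p` is the probability the model assigns to the correct class. The following is a minimal pure-Python sketch of that idea, not the Keras implementation; library versions typically also guard against `log(0)`, which this sketch does not. The probabilities and labels are made up for illustration.

```python
import math


def sparse_categorical_crossentropy(probabilities, labels):
    """Mean negative log-probability of the true class.

    probabilities: one list of class probabilities per sample (e.g. softmax outputs)
    labels: one integer class index per sample
    """
    losses = [-math.log(row[label]) for row, label in zip(probabilities, labels)]
    return sum(losses) / len(losses)


# Made-up outputs for two samples over three classes (illustrative only)
example_probabilities = [
    [0.7, 0.2, 0.1],  # true class is 0, predicted with probability 0.7
    [0.1, 0.8, 0.1],  # true class is 1, predicted with probability 0.8
]
example_labels = [0, 1]

example_loss = sparse_categorical_crossentropy(example_probabilities, example_labels)
print('Loss: {0:.4f}'.format(example_loss))
```

The more probability the model puts on the correct class, the closer the loss gets to 0, which is why the loss in the training log falls as the accuracy rises.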

### Training
This is where we train the model we have built.

In [5]:
# the maximum number of epochs
NUM_EPOCHS = 5

model.fit(train_data, epochs=NUM_EPOCHS, validation_data=(validation_inputs, validation_targets), validation_steps=1, verbose=2)

Epoch 1/5
360/360 - 4s - loss: 0.3054 - accuracy: 0.9132 - val_loss: 0.1430 - val_accuracy: 0.9578
Epoch 2/5
360/360 - 2s - loss: 0.1170 - accuracy: 0.9658 - val_loss: 0.1000 - val_accuracy: 0.9708
Epoch 3/5
360/360 - 2s - loss: 0.0786 - accuracy: 0.9770 - val_loss: 0.0771 - val_accuracy: 0.9792
Epoch 4/5
360/360 - 2s - loss: 0.0575 - accuracy: 0.9832 - val_loss: 0.0594 - val_accuracy: 0.9848
Epoch 5/5
360/360 - 2s - loss: 0.0433 - accuracy: 0.9863 - val_loss: 0.0513 - val_accuracy: 0.9870


<tensorflow.python.keras.callbacks.History at 0x2a8b0abca90>

## Test the model

In [6]:
test_loss, test_accuracy = model.evaluate(test_data)

      1/Unknown - 1s 708ms/step - loss: 0.0752 - accuracy: 0.9777

In [7]:
# format the results to make them easier to read
print('Test loss: {0:.2f}. Test accuracy: {1:.2f}%'.format(test_loss, test_accuracy*100.))

Test loss: 0.08. Test accuracy: 97.77%
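
The accuracy reported above is the fraction of images for which the class with the highest predicted probability matches the target. As a rough sketch of that calculation (in pure Python, with made-up outputs and hypothetical helper names rather than the Keras internals):

```python
def predicted_class(probabilities):
    """Return the index of the largest probability (the argmax)."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


def accuracy(batch_probabilities, labels):
    """Fraction of samples whose predicted class equals the integer label."""
    correct = sum(
        predicted_class(probs) == label
        for probs, label in zip(batch_probabilities, labels)
    )
    return correct / len(labels)


# Made-up softmax outputs for four images over three classes (illustrative only)
example_outputs = [
    [0.8, 0.1, 0.1],  # predicts class 0
    [0.2, 0.7, 0.1],  # predicts class 1
    [0.3, 0.3, 0.4],  # predicts class 2
    [0.6, 0.3, 0.1],  # predicts class 0
]
example_labels = [0, 1, 2, 1]  # the last prediction is wrong

example_accuracy = accuracy(example_outputs, example_labels)
print('Accuracy: {0:.2f}%'.format(example_accuracy * 100.))
```

Here three of the four predictions are correct, so the sketch reports an accuracy of 75%.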
