# Deep Neural Network for MNIST Classification

MNIST is a dataset of handwritten digits and a standard benchmark for digit recognition.

The dataset provides 70,000 images (28x28 pixels) of handwritten digits (1 digit per image). 

The goal is to write an algorithm that detects which digit is written. Since there are only 10 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), this is a classification problem with 10 classes. 

The plan is to build a neural network with 2 or 3 hidden layers and evaluate it on the test dataset.

## Import the relevant packages

In [2]:
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds

## Data

This is where we load and preprocess the data.

In [3]:
mnist_dataset, mnist_info = tfds.load(name='mnist', with_info=True, as_supervised=True)

Dataset mnist downloaded and prepared to C:\Users\Mohsen\tensorflow_datasets\mnist\3.0.1. Subsequent calls will reuse this data.

In [16]:
mnist_dataset

{Split('train'): <PrefetchDataset shapes: ((28, 28, 1), ()), types: (tf.uint8, tf.int64)>,
 Split('test'): <PrefetchDataset shapes: ((28, 28, 1), ()), types: (tf.uint8, tf.int64)>}

In [19]:
mnist_train , mnist_test = mnist_dataset['train'] , mnist_dataset['test']

In [35]:
train_size = sum(1 for _ in mnist_train)
test_size = sum(1 for _ in mnist_test)

print('The size of the train dataset is: ', train_size)
print('The size of the test dataset is: ', test_size)

The size of the train dataset is:  60000
The size of the test dataset is:  10000


In [38]:
# Reserve 10% of the training dataset as a validation set
minst_validation = 0.1 * mnist_info.splits['train'].num_examples
# Cast to an integer so the value can be used as a sample count
num_minst_validation = tf.cast(minst_validation, tf.int64)

In [39]:
# Store the number of test samples as an integer tensor as well
num_test_samples = mnist_info.splits['test'].num_examples
num_test_samples = tf.cast(num_test_samples, tf.int64)

In [44]:
# Scale the inputs: each pixel is a shade of grey stored as an integer
# from 0 to 255, so dividing by 255 maps it into the range [0, 1].
def scale(image, label):
    image = tf.cast(image, tf.float32)
    image = image / 255.
    return image, label

scaled_train_validation = mnist_train.map(scale)
scaled_test = mnist_test.map(scale)
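The effect of `scale` can be checked on a few numbers without TensorFlow. The pixel values below are illustrative and are not taken from the dataset; this is only a sketch of the arithmetic.

```python
# Illustrative pixel intensities (assumed values, not read from MNIST)
pixels = [0, 51, 102, 255]

# Same arithmetic as scale(): convert to float and divide by 255
scaled = [float(p) / 255.0 for p in pixels]

# The darkest value maps to 0.0 and the brightest to 1.0
print(min(scaled), max(scaled))
```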

In [48]:
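# Shuffle the data before splitting off the validation set
BUFFER_SIZE = 5000  # number of samples held in the shuffle buffer at a time

shuffled_train_validation = scaled_train_validation.shuffle(BUFFER_SIZE)

# Note: shuffle() reshuffles on every iteration by default, so take/skip
# may not give a fixed, strictly disjoint split across epochs.
validation_data = shuffled_train_validation.take(num_minst_validation)
train_data = shuffled_train_validation.skip(num_minst_validation)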
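To build intuition for what the buffer size does, here is a simplified pure-Python sketch of buffer-based shuffling. It is not TensorFlow's implementation, and `buffered_shuffle` is a hypothetical helper written only for this illustration. The idea is that samples are drawn at random from a fixed-size buffer that is refilled from the input stream.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in pseudo-random order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything
            buffer.append(item)
            continue
        # Emit a random buffered item and put the new item in its slot
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: emit whatever is left in random order
    rng.shuffle(buffer)
    yield from buffer

shuffled = list(buffered_shuffle(range(20), buffer_size=5, seed=0))
print(shuffled)
```

In this sketch an element can only move forward by a limited number of positions, so a buffer much smaller than the dataset shuffles locally rather than uniformly. A buffer at least as large as the dataset would be needed for a uniform shuffle.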

In [49]:
# Batching
BATCH_SIZE = 100
train_data = train_data.batch(BATCH_SIZE)
# Validation and test data do not need mini-batches, so each goes into a single batch
validation_data = validation_data.batch(num_minst_validation)
test_data = scaled_test.batch(num_test_samples)
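The split and batch sizes can be sanity-checked with plain arithmetic. This sketch assumes the 60,000 training examples printed earlier and the 10% validation share and batch size of 100 used above.

```python
import math

num_train_examples = 60_000   # size of the 'train' split printed earlier
validation_fraction = 0.1     # share reserved for validation
batch_size = 100              # same value as BATCH_SIZE

num_validation = int(validation_fraction * num_train_examples)
num_training = num_train_examples - num_validation
steps_per_epoch = math.ceil(num_training / batch_size)

print(num_validation, num_training, steps_per_epoch)  # 6000 54000 540
```

The 540 steps per epoch should correspond to the `540/540` progress counter that Keras reports during training.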

In [50]:
validation_inputs , validation_targets = next(iter(validation_data))

## Model

In [60]:
input_size = 784 # 28 * 28 * 1
output_size = 10  # 0,1,...,9
hidden_layer_size = 200

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),            # the input layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # the first hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # the second hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # the third hidden layer
    tf.keras.layers.Dense(output_size, activation='softmax'),    # the output is probabilities
])
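As a rough check on the size of this network, the number of trainable parameters can be worked out by hand. Each dense layer has one weight per input-output pair plus one bias per output, and the flatten layer has no parameters. The sketch below only does this arithmetic and does not touch the Keras model.

```python
# Layer widths: flattened input, three hidden layers, output
layer_sizes = [784, 200, 200, 200, 10]

# weights (n_in * n_out) + biases (n_out) for each dense layer
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer)  # [157000, 40200, 40200, 2010]
print(total_params)      # 239410
```

Most of the parameters sit in the first dense layer, because it connects all 784 input pixels to 200 hidden units.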

## Choosing the optimizer & loss function

In [61]:
# The targets are integer class labels (0-9), not one-hot vectors,
# so sparse_categorical_crossentropy is the matching loss

model.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy',metrics = ['accuracy'])
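The following pure-Python sketch illustrates why the "sparse" loss fits integer labels. The helper functions are written for this illustration and are not the Keras implementations; the logits are made-up values. For a single sample, the sparse loss on an integer label gives the same number as the ordinary categorical loss on the one-hot version of that label.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(label, probs):
    """Loss for an integer label: minus log of the true class probability."""
    return -math.log(probs[label])

def categorical_crossentropy(one_hot_target, probs):
    """Loss for a one-hot target vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs) if t > 0)

# Illustrative logits for 10 classes (assumed values)
logits = [0.1, 0.3, 2.5, 0.0, -1.0, 0.2, 0.4, 0.1, 0.0, -0.5]
probs = softmax(logits)

label = 2                                              # integer label
one_hot = [1.0 if i == label else 0.0 for i in range(10)]

sparse_loss = sparse_categorical_crossentropy(label, probs)
dense_loss = categorical_crossentropy(one_hot, probs)
print(sparse_loss, dense_loss)
```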

## Training the Model

In [62]:
NUM_EPOCHS = 10

model.fit(train_data, epochs = NUM_EPOCHS, validation_data = (validation_inputs, validation_targets), verbose = 2)

Epoch 1/10
540/540 - 8s - loss: 0.2675 - accuracy: 0.9204 - val_loss: 0.1308 - val_accuracy: 0.9598
Epoch 2/10
540/540 - 7s - loss: 0.0987 - accuracy: 0.9692 - val_loss: 0.0924 - val_accuracy: 0.9747
Epoch 3/10
540/540 - 7s - loss: 0.0664 - accuracy: 0.9790 - val_loss: 0.0757 - val_accuracy: 0.9780
Epoch 4/10
540/540 - 7s - loss: 0.0484 - accuracy: 0.9847 - val_loss: 0.0684 - val_accuracy: 0.9792
Epoch 5/10
540/540 - 7s - loss: 0.0397 - accuracy: 0.9873 - val_loss: 0.0658 - val_accuracy: 0.9808
Epoch 6/10
540/540 - 7s - loss: 0.0327 - accuracy: 0.9894 - val_loss: 0.0564 - val_accuracy: 0.9820
Epoch 7/10
540/540 - 7s - loss: 0.0267 - accuracy: 0.9914 - val_loss: 0.0696 - val_accuracy: 0.9798
Epoch 8/10
540/540 - 7s - loss: 0.0238 - accuracy: 0.9922 - val_loss: 0.0560 - val_accuracy: 0.9850
Epoch 9/10
540/540 - 7s - loss: 0.0227 - accuracy: 0.9925 - val_loss: 0.0394 - val_accuracy: 0.9888
Epoch 10/10
540/540 - 7s - loss: 0.0225 - accuracy: 0.9924 - val_loss: 0.0414 - val_accuracy: 0.9875

<tensorflow.python.keras.callbacks.History at 0x208924a2e50>
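To read the log above more easily, the validation numbers can be copied into lists and inspected. In the notebook itself the same values would be available from the `history` attribute of the object returned by `model.fit`; the sketch below simply hard-codes the figures printed above.

```python
# Validation metrics copied from the training log above (epochs 1 to 10)
val_loss = [0.1308, 0.0924, 0.0757, 0.0684, 0.0658,
            0.0564, 0.0696, 0.0560, 0.0394, 0.0414]
val_accuracy = [0.9598, 0.9747, 0.9780, 0.9792, 0.9808,
                0.9820, 0.9798, 0.9850, 0.9888, 0.9875]

# Epochs are 1-based in the log, list indices are 0-based
best_loss_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_acc_epoch = max(range(len(val_accuracy)), key=val_accuracy.__getitem__) + 1

print(best_loss_epoch, val_loss[best_loss_epoch - 1])      # 9 0.0394
print(best_acc_epoch, val_accuracy[best_acc_epoch - 1])    # 9 0.9888
```

In this run, both the lowest validation loss and the highest validation accuracy occur at epoch 9, and epoch 10 is slightly worse on both.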

## Testing the model

In [63]:
test_loss , test_accuracy = model.evaluate(test_data)
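The accuracy metric used here can be sketched in a few lines: the predicted class is the index of the largest output probability, and accuracy is the share of samples where that index equals the label. The `accuracy` helper and the tiny three-class data below are made up for illustration and are not outputs of the model.

```python
def accuracy(probabilities, labels):
    """Fraction of samples whose most probable class equals the label."""
    correct = 0
    for probs, label in zip(probabilities, labels):
        predicted = max(range(len(probs)), key=probs.__getitem__)  # argmax
        correct += int(predicted == label)
    return correct / len(labels)

# Illustrative predictions for 4 samples and 3 classes (assumed values)
example_probs = [
    [0.7, 0.2, 0.1],   # predicts class 0
    [0.1, 0.8, 0.1],   # predicts class 1
    [0.3, 0.3, 0.4],   # predicts class 2
    [0.6, 0.3, 0.1],   # predicts class 0
]
example_labels = [0, 1, 2, 1]  # the last sample is misclassified

example_accuracy = accuracy(example_probs, example_labels)
print(example_accuracy)  # 0.75
```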

