Deep Neural Network for MNIST Classification

The MNIST dataset is a benchmark for handwritten digit recognition. It provides 70,000 images (28x28 pixels) of handwritten digits (1 digit per image).
  
The goal is to write an algorithm that detects which digit is written. Since there are only 10 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), this is a classification problem with 10 classes.
Fares Yassen

# Importing the relevant packages 

In [1]:
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds

# Data
We will load and preprocess the data.

In [7]:
mnist_dataset, mnist_info = tfds.load(name='mnist',with_info=True, as_supervised=True)

mnist_train, mnist_test = mnist_dataset['train'], mnist_dataset['test']


num_validation_samples = 0.1 * mnist_info.splits['train'].num_examples
num_validation_samples = tf.cast(num_validation_samples, tf.int64)

num_test_samples = mnist_info.splits['test'].num_examples
num_test_samples = tf.cast(num_test_samples, tf.int64)


def scale(image, label):
    image = tf.cast(image, tf.float32)
    image /= 255.
    return image, label

scaled_train_and_validation_data = mnist_train.map(scale)

test_data = mnist_test.map(scale)

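# Shuffle the data so that batches do not depend on the order in which the samples are stored.
# BUFFER_SIZE is the number of samples tf.data keeps in memory and draws from at a time,
# so the whole dataset does not have to be loaded at once.
BUFFER_SIZE = 10000

# reshuffle_each_iteration=False keeps the shuffled order fixed across iterations, so the
# take/skip split below stays the same and validation samples do not leak into training.
shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(
    BUFFER_SIZE, reshuffle_each_iteration=False)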

# Take the first 10% of the shuffled data as validation data, using the count computed above.
validation_data = shuffled_train_and_validation_data.take(num_validation_samples)
# The rest is training data, so skip the validation samples.
train_data = shuffled_train_and_validation_data.skip(num_validation_samples)


BATCH_SIZE = 100

train_data = train_data.batch(BATCH_SIZE)
# Validation data does not strictly need batching: it is only forward propagated, which is computationally cheap.
# However, the model expects validation data in batched form too, so a single batch is enough.
validation_data = validation_data.batch(num_validation_samples)
test_data = test_data.batch(num_test_samples)

# Extract the single validation batch as an (inputs, targets) pair, matching the structure of the train and test datasets.
validation_inputs, validation_targets = next(iter(validation_data))
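
The scale, shuffle, take and skip steps above can be illustrated without TensorFlow. This is only a minimal sketch in plain Python, assuming a toy list of (pixels, label) pairs in place of the real dataset; `scale_pixels` and `split_validation` are hypothetical helper names, not TensorFlow functions.

```python
import random

def scale_pixels(pixels):
    """Map integer pixel values in 0..255 to floats in 0..1."""
    return [p / 255.0 for p in pixels]

def split_validation(samples, validation_fraction=0.1, seed=0):
    """Shuffle once, then take the first part for validation and skip it for training."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    num_validation = int(validation_fraction * len(shuffled))
    validation = shuffled[:num_validation]   # analogous to .take(n)
    train = shuffled[num_validation:]        # analogous to .skip(n)
    return train, validation

# Toy data (assumption): 20 "images" of 4 pixels each, labelled with their index.
toy_samples = [(scale_pixels([0, 64, 128, 255]), i) for i in range(20)]
train, validation = split_validation(toy_samples)
print(len(train), len(validation))
```

Because the list is shuffled exactly once, the two parts cannot overlap. With tf.data the same property holds only if the shuffled order is identical every time the dataset is iterated.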

# Model

## Outline the model

In [71]:
input_size = 784
output_size = 10
hidden_layer_size = 280

model = tf.keras.Sequential([
                            #input layer
                            tf.keras.layers.Flatten(input_shape=(28,28,1)),
                            #hidden layer 1
                            tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
                            #hidden layer 2
                            tf.keras.layers.Dense(hidden_layer_size, activation='relu'),
                            #output layer
                            # softmax turns the 10 outputs into class probabilities, which is what a classifier needs
                            tf.keras.layers.Dense(output_size, activation='softmax')    
                            ])
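
To get a feel for the size of this network, the number of trainable parameters can be counted by hand. A small sketch, where `dense_parameter_count` is a hypothetical helper written for this illustration:

```python
def dense_parameter_count(layer_sizes):
    """Count weights and biases of a stack of fully connected layers."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
    return total

# 28 * 28 = 784 flattened inputs, two hidden layers of 280 units, 10 outputs.
layer_sizes = [28 * 28, 280, 280, 10]
print(dense_parameter_count(layer_sizes))  # 301290
```

Calling `model.summary()` on the model above should report the same total.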

## Choose the optimizer and the loss function

In [72]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
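
The "sparse" in `sparse_categorical_crossentropy` means the targets are integer class labels (0 to 9) rather than one-hot vectors. For a single sample, the loss is the negative log of the probability the model assigns to the true class. The sketch below reproduces that calculation in plain Python for illustration only; Keras computes it on whole batches and averages the result.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(probabilities, label):
    """Loss for one sample: negative log of the probability given to the true class."""
    return -math.log(probabilities[label])

probabilities = softmax([0.0] * 10)  # a model that is undecided between all 10 digits
loss = sparse_categorical_crossentropy(probabilities, label=3)
print(round(loss, 4))  # 2.3026, i.e. ln(10)
```

A loss near ln(10) therefore means the model is no better than guessing, which is a useful reference point when reading the training log.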

## Training
Here we train the model we have built, checking the loss and accuracy on the validation data after every epoch.

In [73]:
NUM_EPOCHS = 5

model.fit(train_data, epochs = NUM_EPOCHS, validation_data=(validation_inputs, validation_targets), verbose=2)

Epoch 1/5
540/540 - 5s - loss: 0.2490 - accuracy: 0.9273 - val_loss: 0.1317 - val_accuracy: 0.9632 - 5s/epoch - 10ms/step
Epoch 2/5
540/540 - 6s - loss: 0.0948 - accuracy: 0.9713 - val_loss: 0.0846 - val_accuracy: 0.9752 - 6s/epoch - 10ms/step
Epoch 3/5
540/540 - 5s - loss: 0.0621 - accuracy: 0.9808 - val_loss: 0.0634 - val_accuracy: 0.9820 - 5s/epoch - 9ms/step
Epoch 4/5
540/540 - 5s - loss: 0.0450 - accuracy: 0.9861 - val_loss: 0.0527 - val_accuracy: 0.9842 - 5s/epoch - 9ms/step
Epoch 5/5
540/540 - 6s - loss: 0.0337 - accuracy: 0.9889 - val_loss: 0.0437 - val_accuracy: 0.9882 - 6s/epoch - 11ms/step


<keras.src.callbacks.History at 0x1453f3730a0>
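
The `540/540` in each log line is the number of batches processed per epoch. As a quick check of that figure, assuming the standard MNIST training split of 60,000 images (this number is not printed anywhere in the notebook):

```python
import math

num_train_examples = 60_000  # standard MNIST training split (assumption)
validation_fraction = 0.1
batch_size = 100

num_validation = int(validation_fraction * num_train_examples)
num_training = num_train_examples - num_validation
steps_per_epoch = math.ceil(num_training / batch_size)
print(num_validation, num_training, steps_per_epoch)  # 6000 54000 540
```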

# Test the model

In [74]:
test_loss, test_accuracy = model.evaluate(test_data)



In [75]:
print('Test loss: {0:.2f}. Test accuracy: {1:.2f}%'.format(test_loss, test_accuracy*100.))

Test loss: 0.07. Test accuracy: 97.99%
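
The accuracy metric is the fraction of images whose most probable predicted class equals the true label. A minimal sketch of that calculation on made-up predictions; `argmax` and `accuracy` are hypothetical helpers, not the Keras implementation:

```python
def argmax(values):
    """Index of the largest value, i.e. the predicted class."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(predicted_probabilities, labels):
    """Fraction of samples whose most probable class equals the true label."""
    correct = sum(argmax(p) == y for p, y in zip(predicted_probabilities, labels))
    return correct / len(labels)

# Three toy predictions over 3 classes (assumption); the last one is wrong.
toy_predictions = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.5, 0.3, 0.2]]
toy_labels = [0, 1, 2]
print(accuracy(toy_predictions, toy_labels))
```

Assuming the standard 10,000-image MNIST test split, the reported 97.99% corresponds to 9,799 correctly classified digits.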
