# Deep Neural Network for MNIST Classification


The dataset provides 70,000 images (28x28 pixels) of handwritten digits (1 digit per image). 

The goal is to write an algorithm that detects which digit is written. Since there are only 10 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), this is a classification problem with 10 classes. 

Our goal is to build a neural network with 2 hidden layers.

## Import the relevant packages

In [2]:
%pip install tensorflow_datasets


In [1]:
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds




## Data

This is where we load and preprocess the data.

In [2]:

# tfds.load loads the dataset (downloading it first if this is the first time you use it)
# with_info=True also returns the dataset metadata; as_supervised=True yields (image, label) tuples
mnist_dataset, mnist_info = tfds.load(name='mnist', with_info=True, as_supervised=True)

# once the dataset is loaded, we can extract the built-in training and test splits
mnist_train, mnist_test = mnist_dataset['train'], mnist_dataset['test']

# we start by defining the number of validation samples as a % of the train samples
# this is also where we make use of mnist_info 
num_validation_samples = 0.1 * mnist_info.splits['train'].num_examples
# let's cast this number to an integer, as a float may cause an error along the way
num_validation_samples = tf.cast(num_validation_samples, tf.int64)
# let's also store the number of test samples in a dedicated variable 
num_test_samples = mnist_info.splits['test'].num_examples
# once more, we'd prefer an integer
num_test_samples = tf.cast(num_test_samples, tf.int64)



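# let's define a function called scale that takes an MNIST image and its label
# and rescales the pixel values from the range [0, 255] to [0, 1]
def scale(image, label):
    # cast to float first, since the pixels are stored as uint8
    image = tf.cast(image, tf.float32)
    image /= 255.
    return image, label

# .map() applies scale to every (image, label) pair in the dataset
scaled_train_and_validation_data = mnist_train.map(scale)

# the test data must be scaled in exactly the same way
test_data = mnist_test.map(scale)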


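# let's also shuffle the data; BUFFER_SIZE is the number of samples shuffled at a time
BUFFER_SIZE = 10000

# reshuffle_each_iteration=False keeps the shuffled order fixed across passes, so that the
# take/skip split below stays the same and validation samples do not leak into the training set
shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(
    BUFFER_SIZE, reshuffle_each_iteration=False)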

# we use the .take() method to extract the first num_validation_samples samples as the validation set
validation_data = shuffled_train_and_validation_data.take(num_validation_samples)

# similarly, the train_data is everything else, so we skip as many samples as there are in the validation dataset
train_data = shuffled_train_and_validation_data.skip(num_validation_samples)

# determine the batch size
BATCH_SIZE = 100

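# batch the train data, so the weights are updated once per batch of BATCH_SIZE samples
train_data = train_data.batch(BATCH_SIZE)

# the validation data is only forward propagated, so we put all of it in a single batch
validation_data = validation_data.batch(num_validation_samples)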

# batch the test data
test_data = test_data.batch(num_test_samples)


validation_inputs, validation_targets = next(iter(validation_data))
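
As a sanity check on the sizes involved, the arithmetic behind the split can be reproduced in plain Python. This is a minimal sketch, not part of the original notebook. The 60,000 figure is the size of the train split as reported by `mnist_info`, and the lowercase variable names are only illustrative.

```python
import math

# Size of the train split, as reported by mnist_info.splits['train'].num_examples
num_train_and_validation = 60000
validation_fraction = 0.1  # the 10% share used in the cell above
batch_size = 100           # same value as BATCH_SIZE

# Samples set aside for validation, and samples left for training
num_validation = int(validation_fraction * num_train_and_validation)
num_train = num_train_and_validation - num_validation

# Number of batches (weight updates) in one pass over the training data
steps_per_epoch = math.ceil(num_train / batch_size)

print(num_validation, num_train, steps_per_epoch)  # 6000 54000 540
```

The 540 steps per epoch agree with the `540/540` counter printed in the training log.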

In [3]:
mnist_info

tfds.core.DatasetInfo(
    name='mnist',
    full_name='mnist/3.0.1',
    description="""
    The MNIST database of handwritten digits.
    """,
    homepage='http://yann.lecun.com/exdb/mnist/',
    data_path='~\\tensorflow_datasets\\mnist\\3.0.1',
    file_format=tfrecord,
    download_size=11.06 MiB,
    dataset_size=21.00 MiB,
    features=FeaturesDict({
        'image': Image(shape=(28, 28, 1), dtype=tf.uint8),
        'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
    }),
    supervised_keys=('image', 'label'),
    disable_shuffling=False,
    splits={
        'test': <SplitInfo num_examples=10000, num_shards=1>,
        'train': <SplitInfo num_examples=60000, num_shards=1>,
    },
    citation="""@article{lecun2010mnist,
      title={MNIST handwritten digit database},
      author={LeCun, Yann and Cortes, Corinna and Burges, CJ},
      journal={ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist},
      volume={2},
      year={2010}
    }""",
)
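
The `dtype=tf.uint8` entry above means each pixel is stored as an integer between 0 and 255, which is why `scale` casts to float and divides by 255. The effect can be illustrated without TensorFlow. The sketch below uses a hypothetical 2x2 "image" and a hypothetical helper `scale_pixels`; neither comes from the notebook.

```python
# A hypothetical 2x2 "image" with 8-bit intensities, standing in for a 28x28 MNIST digit
tiny_image = [[0, 51], [204, 255]]

def scale_pixels(image):
    """Rescale integer intensities in [0, 255] to floats in [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in image]

scaled = scale_pixels(tiny_image)
print(scaled)  # [[0.0, 0.2], [0.8, 1.0]]
```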

## Model

### Outline the model
When thinking about a deep learning algorithm, we mostly imagine building the model. So, let's do it :)

In [4]:
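input_size = 784  # 28 * 28 * 1 pixel values per image once flattened
output_size = 10  # one output per digit class

# number of units in each hidden layer
hidden_layer_size = 50

model = tf.keras.Sequential([
    # flatten each (28, 28, 1) image into a vector of 784 values
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),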
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 1st hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 2nd hidden layer
    tf.keras.layers.Dense(output_size, activation='softmax') # output layer
])
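
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand. The sketch below is only an illustration under the layer widths used above; `dense_param_count` is a hypothetical helper, not a Keras function.

```python
# Layer widths: flattened input, two hidden layers, output (values from the cell above)
layer_sizes = [784, 50, 50, 10]

def dense_param_count(n_inputs, n_units):
    """One weight per input-unit pair, plus one bias per unit."""
    return n_inputs * n_units + n_units

# Pair each layer width with the next one to get the (inputs, units) of every Dense layer
per_layer = [dense_param_count(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer, total)  # [39250, 2550, 510] 42310
```

Calling `model.summary()` should report the same total. The `Flatten` layer only reshapes its input, so it contributes no parameters.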

### Choose the optimizer and the loss function

In [5]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
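
The "sparse" in `sparse_categorical_crossentropy` means the targets are integer class labels rather than one-hot vectors. For a single sample, the loss is the negative log of the probability that the softmax output assigns to the true class; by default Keras averages this over the batch. A minimal pure-Python sketch of that per-sample computation follows. The scores are hypothetical and use 3 classes instead of 10 to keep it short.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    # subtracting the largest score leaves the result unchanged but avoids overflow
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(target_index, probabilities):
    """Loss for one sample: negative log of the probability of the true class."""
    return -math.log(probabilities[target_index])

# Hypothetical scores for a 3-class problem (MNIST has 10 classes)
probs = softmax([2.0, 1.0, 0.1])
loss_if_correct = sparse_categorical_crossentropy(0, probs)  # true class has the top score
loss_if_wrong = sparse_categorical_crossentropy(2, probs)    # true class has the lowest score

print(probs, loss_if_correct, loss_if_wrong)
```

The loss is small when the model puts most of the probability on the true class and grows as that probability shrinks.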

### Training
This is where we train the model we have built.

In [6]:
# determine the maximum number of epochs
NUM_EPOCHS = 5
# verbose=2 prints one summary line per epoch
model.fit(train_data, epochs=NUM_EPOCHS, validation_data=(validation_inputs, validation_targets), verbose=2)

Epoch 1/5
540/540 - 3s - loss: 0.4212 - accuracy: 0.8835 - val_loss: 0.2094 - val_accuracy: 0.9408 - 3s/epoch - 6ms/step
Epoch 2/5
540/540 - 2s - loss: 0.1801 - accuracy: 0.9478 - val_loss: 0.1525 - val_accuracy: 0.9578 - 2s/epoch - 4ms/step
Epoch 3/5
540/540 - 2s - loss: 0.1408 - accuracy: 0.9593 - val_loss: 0.1318 - val_accuracy: 0.9645 - 2s/epoch - 4ms/step
Epoch 4/5
540/540 - 2s - loss: 0.1194 - accuracy: 0.9651 - val_loss: 0.1111 - val_accuracy: 0.9683 - 2s/epoch - 4ms/step
Epoch 5/5
540/540 - 2s - loss: 0.1010 - accuracy: 0.9696 - val_loss: 0.1010 - val_accuracy: 0.9713 - 2s/epoch - 4ms/step


<keras.callbacks.History at 0x20267dcb0d0>

### Test the Model

In [7]:
test_loss, test_accuracy = model.evaluate(test_data)



In [8]:
print('Test loss: {0:.2f}. Test accuracy: {1:.2f}%'.format(test_loss, test_accuracy*100.))

Test loss: 0.12. Test accuracy: 96.43%
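
For reference, the accuracy metric is the fraction of samples for which the class with the highest predicted probability equals the target. The sketch below illustrates that idea on hypothetical outputs for a 3-class problem; it is not how Keras implements the metric internally.

```python
def accuracy(predicted_probabilities, targets):
    """Fraction of samples whose highest-probability class equals the target."""
    correct = 0
    for probs, target in zip(predicted_probabilities, targets):
        # argmax: index of the largest probability
        predicted_class = max(range(len(probs)), key=lambda i: probs[i])
        correct += int(predicted_class == target)
    return correct / len(targets)

# Hypothetical model outputs for four samples
predictions = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.6, 0.3, 0.1],
]
labels = [0, 1, 2, 0]

print(accuracy(predictions, labels))  # 0.75
```

On the 10,000-image test split, an accuracy of 96.43% corresponds to 9,643 correctly classified digits.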
