In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import tensorflow as tf
from tensorflow import keras

MNIST is a database of handwritten digits from 0 to 9. It contains 60,000 28x28 grayscale training images of the 10 digits, along with a test set of 10,000 images. The goal is to build an ANN model that identifies the digit in each handwritten image.

In [2]:
# keras.datasets.mnist is a dataset module, not a DataFrame, so name it accordingly.
mnist = keras.datasets.mnist
(x_train_full, y_train_full), (x_test, y_test) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [None]:
x_train_full[6]

In [None]:
plt.imshow(x_train_full[6])

### Data Normalisation

In [4]:
x_train_norm = x_train_full/255.0
x_test_norm = x_test/255.0
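
As a quick check of the scaling above, here is a minimal sketch on a small made-up array. `fake_image` is a hypothetical stand-in for an MNIST digit, not real data; it only illustrates that dividing `uint8` pixels by 255.0 gives floats in the range 0 to 1.

```python
import numpy as np

# Hypothetical 2x2 "image" with uint8 pixel values, standing in for an MNIST digit.
fake_image = np.array([[0, 51], [204, 255]], dtype=np.uint8)

# Dividing by the float 255.0 promotes the array to float64
# and maps the pixel range 0..255 onto 0..1.
fake_image_norm = fake_image / 255.0

print(fake_image_norm.dtype)
print(fake_image_norm.min(), fake_image_norm.max())
```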

### Train-Val-Test Split

In [5]:
x_val, x_train = x_train_norm[:6000], x_train_norm[6000:]
y_val, y_train = y_train_full[:6000], y_train_full[6000:]
x_test = x_test_norm
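
The slicing above can be sanity-checked without loading MNIST. This sketch uses an index array as a placeholder for the 60,000 training samples (the names `n_total`, `n_val`, `val_idx` and `train_idx` are illustrative only) and confirms that the first 6,000 go to validation and the remaining 54,000 to training, with no overlap.

```python
import numpy as np

n_total = 60_000  # size of the MNIST training set
n_val = 6_000     # number of samples held out for validation

# Placeholder for the sample axis of x_train_norm / y_train_full.
indices = np.arange(n_total)

# Same slicing pattern as the split cell: first n_val for validation, the rest for training.
val_idx, train_idx = indices[:n_val], indices[n_val:]

print(len(val_idx), len(train_idx))
# The two sets share no samples.
print(np.intersect1d(val_idx, train_idx).size)
```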

### Delete Model

In [None]:
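# Run this cell only when a model from a previous run exists;
# otherwise `del model` raises a NameError.
del model
# Free the resources held by the old model before training the next one.
keras.backend.clear_session()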

Setting the random seeds makes the results reproducible: the code gives the same output each time I run it and, as far as possible, the same output on every machine.

In [6]:
np.random.seed(42)
tf.random.set_seed(42)
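
To see what seeding does, here is a small NumPy-only sketch (it does not touch TensorFlow). Resetting the seed to the same value replays the same sequence of random numbers.

```python
import numpy as np

# Seed, then draw three random numbers.
np.random.seed(42)
first_draw = np.random.rand(3)

# Re-seeding with the same value restarts the generator from the same state.
np.random.seed(42)
second_draw = np.random.rand(3)

print(np.array_equal(first_draw, second_draw))
```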

### Model architecture / structure

ANN model with two hidden dense layers of 200 and 100 neurons, followed by a 10-unit softmax output layer

In [7]:
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28,28]))
model.add(keras.layers.Dense(200, activation='relu'))
model.add(keras.layers.Dense(100, activation='relu'))
model.add(keras.layers.Dense(10, activation='softmax'))
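
As a rough check on the size of this network, the number of trainable parameters can be worked out by hand. This sketch assumes each dense layer has one weight per input-output pair plus one bias per output; `dense_param_count` is a helper written for this illustration, not a Keras function. The total should match what `model.summary()` reports.

```python
# Layer widths: flattened 28x28 input, two hidden layers, 10 output classes.
layer_sizes = [28 * 28, 200, 100, 10]

def dense_param_count(n_in, n_out):
    """Weights (n_in * n_out) plus biases (n_out) of one dense layer."""
    return n_in * n_out + n_out

# Pair each layer's input width with its output width.
params_per_layer = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

print(params_per_layer)       # [157000, 20100, 1010]
print(sum(params_per_layer))  # 178110
```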

### Compile

In [8]:
model.compile(optimizer = "sgd",
             loss = "sparse_categorical_crossentropy",
             metrics = ['accuracy'])
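
The "sparse" loss is used here because the labels are integer class ids (0 to 9) rather than one-hot vectors. The sketch below shows the idea in plain NumPy on made-up probabilities for a 3-class problem: the loss for each sample is the negative log of the probability assigned to the true class. It is an illustration of the formula, not the Keras implementation.

```python
import numpy as np

# Made-up softmax outputs for two samples over three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
# Integer labels, one per sample (no one-hot encoding needed).
labels = np.array([0, 2])

# Pick the predicted probability of the true class for each sample.
true_class_probs = probs[np.arange(len(labels)), labels]

# Cross-entropy per sample, then the mean over the batch.
per_sample_loss = -np.log(true_class_probs)
mean_loss = per_sample_loss.mean()

print(per_sample_loss, mean_loss)
```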

### Using Callbacks

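Training on large datasets can take many hours (for example, 8-10 hours). For such a scenario, we can use callbacks. Callbacks are objects that Keras calls at set points during training, such as the end of each epoch; the ModelCheckpoint callback, for instance, saves the model while training is still in progress.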

### Saving the best model

In [None]:
checkpoint_cb = keras.callbacks.ModelCheckpoint("ANN_Digits_Model.h5", save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=4, restore_best_weights=True)

When we train for a large number of epochs, such as epochs=200, we keep an eye on the validation score. Suppose the validation score stops improving after 60 epochs; we can then stop training at that point & use the model with the best validation score so far. <br> Patience is the number of consecutive epochs without improvement in the monitored metric after which training is interrupted. Both callbacks above use the default monitor, val_loss, not val_accuracy.<br>
Without a checkpoint callback, a model is saved only after training completes; ModelCheckpoint with save_best_only=True instead writes the model to disk during training, each time val_loss improves. <br>
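
The patience rule described above can be sketched in plain Python. This is a simplified illustration, not the Keras implementation (it ignores options such as `min_delta`); the function name `early_stopping_epoch` and the loss values are made up for the example.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops once `patience` consecutive epochs pass
    without a new lowest validation loss.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best: remember it and reset the counter.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Patience never ran out: training ran for all epochs.
    return len(val_losses) - 1, best_epoch

# Made-up validation losses; the last value is never reached.
val_losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.36, 0.38, 0.30]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=4)
print(stop_epoch, best_epoch)  # 6 2
```

With patience=4, training stops at epoch 6, four epochs after the best epoch 2, which is the epoch whose weights `restore_best_weights=True` would bring back.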

### Train

In [None]:
model_history = model.fit(x_train, y_train, epochs=200, validation_data=(x_val, y_val),
                         callbacks = [checkpoint_cb, early_stopping_cb])

In [None]:
pd.DataFrame(model_history.history).plot(figsize=(8,5))
plt.grid(True)
plt.gca().set_ylim(0,1)
plt.show()

### Restoring the best saved model & Evaluating the model on test set

In [None]:
model = keras.models.load_model("ANN_Digits_Model.h5")
model.evaluate(x_test,y_test)

In [None]:
y_pred = np.argmax(model.predict(x_test[:5]),axis=1)
y_pred
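
The `np.argmax(..., axis=1)` step turns each row of class probabilities into a single predicted digit. Here is a minimal sketch on made-up probabilities for a 3-class problem (the real model outputs 10 values per image).

```python
import numpy as np

# Made-up softmax outputs: one row per image, one column per class.
fake_probs = np.array([[0.05, 0.90, 0.05],
                       [0.60, 0.30, 0.10]])

# axis=1 takes the index of the largest value within each row,
# which is the predicted class for that image.
predicted_classes = np.argmax(fake_probs, axis=1)

print(predicted_classes)  # [1 0]
```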

In [None]:
plt.imshow(x_test[2])

## Saving the Model