**MLP classifier trained on the MNIST dataset**

# Image classification using the MNIST dataset
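We will train and evaluate a multilayer perceptron (MLP) on the MNIST dataset. It consists of 70,000 grayscale images of handwritten digits (60,000 for training and 10,000 for testing), each 28x28 pixels, and there are 10 classes (the digits 0 to 9).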

## Setup

In [1]:
# Common imports
import sys
import os
import sklearn
import numpy as np
import tensorflow as tf
from tensorflow import keras

# To make this notebook's output stable across runs
np.random.seed(42)
tf.random.set_seed(42)

# To plot pretty figures
%matplotlib inline
import matplotlib.pyplot as plt

# Ignore useless warnings (see SciPy issue #5998)
import warnings
warnings.filterwarnings(action="ignore", message="^internal gelsd")

## Load the data

In [2]:
mnist = tf.keras.datasets.mnist
(X_train_full, y_train_full), (X_test, y_test) = mnist.load_data()

In [3]:
# Show the shape of the full training set: (number of images, height, width).
X_train_full.shape

(60000, 28, 28)

In [4]:
# Each pixel intensity is represented as a byte (0 to 255).
X_train_full.dtype

dtype('uint8')

In [5]:
# Split the full training set into a validation set and a (smaller) training set.
X_valid, X_train = X_train_full[:5000], X_train_full[5000:]
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
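
As a quick sanity check on the split above, the following sketch repeats the same slicing on placeholder arrays that only mimic the shape and dtype of the MNIST training set. The zero-filled arrays and the variable names are stand-ins introduced for illustration, not the real data.

```python
import numpy as np

# Placeholder arrays with the same shape and dtype as the MNIST training set
# (assumption: zeros stand in for the real pixel values and labels).
X_full = np.zeros((60000, 28, 28), dtype=np.uint8)
y_full = np.zeros(60000, dtype=np.uint8)

# Same slicing as in the cell above: the first 5,000 samples become the
# validation set, the remaining 55,000 the (smaller) training set.
X_val, X_tr = X_full[:5000], X_full[5000:]
y_val, y_tr = y_full[:5000], y_full[5000:]

print(X_val.shape, X_tr.shape)  # (5000, 28, 28) (55000, 28, 28)
```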

## Hyperparameter tuning with KerasTuner

In [6]:
# Install KerasTuner if you are using Google Colab (if you are running on your own computer
# it may also be necessary to install KerasTuner once from the command prompt).
if "google.colab" in sys.modules:
    %pip install -q -U keras_tuner

In [7]:
# Create a function that builds, compiles and returns a Keras model.
# Note that the function takes a HyperParameters object (hp) as a parameter, which it can
# use to define hyperparameters along with their range of possible values.
import keras_tuner as kt

def build_model(hp):
    n_hidden = hp.Int("n_hidden", min_value=2, max_value=12)
    n_neurons = hp.Int("n_neurons", min_value=150, max_value=300)

    # SGD with momentum optimization (learning rate left at its default value)
    optimizer = tf.keras.optimizers.SGD(momentum=0.9)
    
    model = tf.keras.Sequential()

    # Rescaling layer (divides each pixel by 255):
    model.add(keras.layers.Rescaling(1./255))

    model.add(tf.keras.layers.Flatten())
    for _ in range(n_hidden):
        model.add(tf.keras.layers.Dense(n_neurons, activation="relu"))
        
    # Dropout (applied once, after the last hidden layer)
    model.add(keras.layers.Dropout(rate=0.2))
    
    model.add(tf.keras.layers.Dense(10, activation="softmax"))
    
    model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer,
                  metrics=["accuracy"])
    return model
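
To make the first two layers of `build_model` more concrete, here is a minimal NumPy sketch of what `Rescaling(1./255)` followed by `Flatten` does to a batch of images. The random batch is a hypothetical stand-in for MNIST images, and the sketch only mirrors the arithmetic, not the Keras implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch of 4 fake "images" with uint8 pixel values in [0, 255].
batch = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Rescaling(1./255): multiply every pixel by 1/255, giving floats in [0, 1].
scaled = batch.astype(np.float32) * (1.0 / 255)

# Flatten: collapse each 28x28 image into a vector of 784 features.
flat = scaled.reshape(len(batch), -1)

print(flat.shape, flat.dtype)  # (4, 784) float32
```

Putting the rescaling inside the model means the raw `uint8` arrays can be passed directly to `search()`, `fit()` and `evaluate()`.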

In [8]:
# Use KerasTuner RandomSearch to tune hyperparameters
random_search_tuner = kt.RandomSearch(
    build_model, objective="val_accuracy", max_trials=100, overwrite=True,
    directory="my_mnist", project_name="my_rnd_search", seed=42)

# Early stopping
early_stopping_cb = keras.callbacks.EarlyStopping(patience=8, restore_best_weights=True)

# Learning rate scheduling
lr_scheduler = keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=3)

# The parameters that you can pass to the search() method are similar to the
# parameters that you can pass to the fit() method of a Keras model.
# Here I have doubled the batch size from the default value of 32 to 64 to
# speed up training, and I pass the two callbacks defined above.
random_search_tuner.search(X_train, y_train, epochs=80, batch_size=64, 
                           callbacks=[lr_scheduler, early_stopping_cb],
                           validation_data=(X_valid, y_valid))

Trial 50 Complete [00h 00m 49s]
val_accuracy: 0.9846000075340271

Best val_accuracy So Far: 0.9868000149726868
Total elapsed time: 00h 34m 30s
INFO:tensorflow:Oracle triggered exit
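
The interplay of the two callbacks used in the search can be sketched in plain Python. This is a simplified simulation under stated assumptions, not the Keras implementation: it ignores options such as `min_delta`, `cooldown` and `min_lr`, the function name `simulate_callbacks` is hypothetical, and the validation losses are made up. Both callbacks monitor `val_loss` by default, and the starting learning rate of 0.01 is the default of the Keras SGD optimizer.

```python
def simulate_callbacks(val_losses, lr=0.01, factor=0.5,
                       lr_patience=3, stop_patience=8):
    """Simplified sketch of ReduceLROnPlateau + EarlyStopping on val_loss."""
    best_lr_loss = float("inf")    # best loss seen by the LR scheduler
    best_stop_loss = float("inf")  # best loss seen by early stopping
    lr_wait = stop_wait = 0
    best_epoch = None
    for epoch, loss in enumerate(val_losses):
        # Learning rate scheduling: halve the rate after 3 epochs without
        # improvement, then start counting again.
        if loss < best_lr_loss:
            best_lr_loss, lr_wait = loss, 0
        else:
            lr_wait += 1
            if lr_wait >= lr_patience:
                lr *= factor
                lr_wait = 0
        # Early stopping: give up after 8 epochs without improvement.
        if loss < best_stop_loss:
            best_stop_loss, stop_wait, best_epoch = loss, 0, epoch
        else:
            stop_wait += 1
            if stop_wait >= stop_patience:
                return epoch, best_epoch, lr
    return len(val_losses) - 1, best_epoch, lr


# Made-up history: three improving epochs followed by a plateau.
history = [0.30, 0.20, 0.15] + [0.16] * 10
stopped_epoch, best_epoch, final_lr = simulate_callbacks(history)
print(stopped_epoch, best_epoch, final_lr)
```

In this made-up run the learning rate is halved twice during the plateau before early stopping ends training. With `restore_best_weights=True`, Keras would then restore the weights from the best epoch.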


In [9]:
# Display a summary of the best trial (its hyperparameters and score):
best_trial = random_search_tuner.oracle.get_best_trials(num_trials=1)[0]
best_trial.summary()

Trial summary
Hyperparameters:
n_hidden: 7
n_neurons: 234
Score: 0.9868000149726868
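
To get a feel for the size of the network selected above, the number of trainable parameters of such an MLP can be worked out by hand. The helper `count_mlp_params` below is a hypothetical illustration and not part of KerasTuner. It assumes the architecture from `build_model`: 784 inputs, `n_hidden` dense layers of `n_neurons` units each, and a 10-unit output layer. The Rescaling, Flatten and Dropout layers have no trainable parameters.

```python
def count_mlp_params(n_inputs, n_hidden, n_neurons, n_outputs):
    """Count weights and biases of a fully connected MLP."""
    sizes = [n_inputs] + [n_neurons] * n_hidden + [n_outputs]
    # Each dense layer has (inputs * units) weights plus one bias per unit.
    return sum(a * b + b for a, b in zip(sizes[:-1], sizes[1:]))


# Best trial reported above: n_hidden=7, n_neurons=234.
print(count_mlp_params(28 * 28, 7, 234, 10))  # 515980
```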


In [10]:
# Display the best model's accuracy measured on the validation set:
best_trial.metrics.get_last_value("val_accuracy")

0.9868000149726868

In [11]:
# Evaluate the model's accuracy on the test set:
best_model = random_search_tuner.get_best_models(num_models=1)[0]
test_loss, test_accuracy = best_model.evaluate(X_test, y_test)
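
The two values returned by `evaluate()` correspond to the loss and the metric passed to `compile()`. As a minimal sketch of what they measure, the NumPy code below computes sparse categorical cross-entropy and accuracy for a tiny, made-up set of softmax outputs. The numbers are illustrative only and are not results from the model above.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples and 3 classes, with true labels.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3]])
labels = np.array([0, 1, 2])

# Sparse categorical cross-entropy: mean of -log(probability of the true class).
loss = -np.log(probs[np.arange(len(labels)), labels]).mean()

# Accuracy: fraction of samples whose most probable class is the true label.
accuracy = (probs.argmax(axis=1) == labels).mean()

print(loss, accuracy)
```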

