<a href="https://colab.research.google.com/github/nyp-sit/it3103/blob/main/week13/keras_tuner.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to the Keras Tuner

## Overview

The Keras Tuner is a library that helps you pick the optimal set of hyperparameters for your TensorFlow program. The process of selecting the right set of hyperparameters for your machine learning (ML) application is called *hyperparameter tuning* or *hypertuning*.

Hyperparameters are the variables that govern the training process and the topology of an ML model. These variables remain constant over the training process and directly impact the performance of your ML program. Hyperparameters are of two types:
1. **Model hyperparameters** which influence model selection such as the number and width of hidden layers
2. **Algorithm hyperparameters** which influence the speed and quality of the learning algorithm such as the learning rate for Stochastic Gradient Descent (SGD) and the number of nearest neighbors for a k Nearest Neighbors (KNN) classifier

In this lab, you will use the Keras Tuner to perform hypertuning for an image classification application.


*Acknowledgement*: This notebook is adapted from https://www.tensorflow.org/tutorials/keras/keras_tuner

## Setup

In [None]:
import tensorflow as tf
from tensorflow import keras

Install and import the Keras Tuner. The hyperparameter search is later visualized in TensorBoard through a logging callback.

In [None]:
!pip install -q -U keras-tuner

In [None]:
# Clean up any previous logs
!rm -rf ./logs
!rm -rf ./my_dir

In [None]:
import keras_tuner as kt

## Download and prepare the dataset

In this tutorial, you will use the Keras Tuner to find the best hyperparameters for a machine learning model that classifies images of clothing from the [Fashion MNIST dataset](https://github.com/zalandoresearch/fashion-mnist).

Load the data.

In [None]:
(img_train, label_train), (img_test, label_test) = keras.datasets.fashion_mnist.load_data()

In [None]:
# Normalize pixel values between 0 and 1
img_train = img_train.astype('float32') / 255.0
img_test = img_test.astype('float32') / 255.0

## Define the model

When you build a model for hypertuning, you also define the hyperparameter search space in addition to the model architecture. The model you set up for hypertuning is called a *hypermodel*.

You can define a hypermodel through two approaches:

* By using a model builder function
* By subclassing the `HyperModel` class of the Keras Tuner API

You can also use two pre-defined `HyperModel` classes - [HyperXception](https://keras-team.github.io/keras-tuner/documentation/hypermodels/#hyperxception-class) and [HyperResNet](https://keras-team.github.io/keras-tuner/documentation/hypermodels/#hyperresnet-class) for computer vision applications.

In this tutorial, you use a model builder function to define the image classification model. The model builder function returns a compiled model and uses hyperparameters you define inline to hypertune the model.

In [None]:
def model_builder(hp):
  model = tf.keras.Sequential()
  model.add(tf.keras.layers.Flatten(input_shape=(28,28)))
  for i in range(hp.Int('num_layers', 1, 2)):
    model.add(tf.keras.layers.Dense(units=hp.Int('units_' + str(i), 32, 64, step=16),
                                    activation='relu'))
  model.add(tf.keras.layers.Dense(10, activation='softmax'))
  model.compile(optimizer=tf.keras.optimizers.Adam(), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
  return model
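
To get a feel for how large this search space is, the ranges used in `model_builder` can be enumerated in plain Python. This is only an illustrative sketch that does not use the Keras Tuner: the names `UNIT_CHOICES`, `LAYER_CHOICES`, and `enumerate_search_space` are made up for this example, and it assumes that the upper bound of `hp.Int` is inclusive, so each `units_i` ranges over 32, 48, and 64.

```python
from itertools import product

# Assumed values produced by hp.Int('units_i', 32, 64, step=16), upper bound inclusive.
UNIT_CHOICES = list(range(32, 64 + 1, 16))
# Values produced by hp.Int('num_layers', 1, 2).
LAYER_CHOICES = [1, 2]


def enumerate_search_space():
    """List every distinct architecture described by the search space."""
    configs = []
    for num_layers in LAYER_CHOICES:
        # One width per hidden layer, so the combinations grow with depth.
        for units in product(UNIT_CHOICES, repeat=num_layers):
            config = {"num_layers": num_layers}
            config.update({f"units_{i}": u for i, u in enumerate(units)})
            configs.append(config)
    return configs


configs = enumerate_search_space()
print(f"{len(configs)} distinct architectures")
print(configs[0])
print(configs[-1])
```

Under these assumptions there are 3 one-layer and 9 two-layer architectures, 12 in total.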

## Instantiate the tuner and perform hypertuning

Instantiate the tuner to perform the hypertuning. The Keras Tuner has four tuners available - `RandomSearch`, `Hyperband`, `BayesianOptimization`, and `Sklearn`. In this tutorial, you use the [Hyperband](https://arxiv.org/pdf/1603.06560.pdf) tuner.

To instantiate the Hyperband tuner, you must specify the hypermodel, the `objective` to optimize and the maximum number of epochs to train (`max_epochs`).

In [None]:
tuner = kt.Hyperband(model_builder,
                     objective='val_accuracy',
                     max_epochs=30,
                     factor=3,
                     project_name='intro_to_kt',
                     directory='my_dir')

The Hyperband tuning algorithm uses adaptive resource allocation and early-stopping to quickly converge on a high-performing model. This is done using a sports championship style bracket. The algorithm trains a large number of models for a few epochs and carries forward only the top-performing fraction of models (roughly one in every `factor`) to the next round. Hyperband determines the number of models to train in a bracket by computing 1 + log<sub>`factor`</sub>(`max_epochs`) and rounding it up to the nearest integer.
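
The bracket idea can be sketched in plain Python. The following is a toy illustration, not the Keras Tuner implementation: `successive_halving` and `toy_score` are hypothetical names, the scores are made up rather than obtained by training, and the sketch assumes the elimination rule from the Hyperband paper, in which each round keeps roughly the top 1/`factor` of the candidates and multiplies the epoch budget by `factor`.

```python
def toy_score(candidate, epochs):
    """Made-up validation score: a fixed per-candidate quality that grows with epochs."""
    quality = ((candidate * 10) % 27) / 26  # distinct value in [0, 1] for candidates 0..26
    return quality * epochs / (epochs + 1)


def successive_halving(num_models, min_epochs, factor, score_fn):
    """Run one championship-style bracket; return the survivor and a log of the rounds."""
    candidates = list(range(num_models))
    epochs = min_epochs
    rounds = []
    while True:
        # Record how many candidates receive the current epoch budget.
        rounds.append((len(candidates), epochs))
        if len(candidates) == 1:
            break
        # Score every surviving candidate, then keep roughly the top 1/factor.
        scores = {c: score_fn(c, epochs) for c in candidates}
        keep = max(1, len(candidates) // factor)
        candidates = sorted(candidates, key=scores.get, reverse=True)[:keep]
        # Survivors get a larger budget in the next round.
        epochs *= factor
    return candidates[0], rounds


winner, rounds = successive_halving(num_models=27, min_epochs=1, factor=3, score_fn=toy_score)
for num_candidates, epochs in rounds:
    print(f"{num_candidates:2d} model(s) trained for {epochs:2d} epoch(s)")
print("surviving candidate:", winner)
```

With 27 candidates and a factor of 3, the rounds shrink as 27, 9, 3, 1 models while the budget grows as 1, 3, 9, 27 epochs, so most of the training time is spent on the few configurations that looked promising early.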

Run the hyperparameter search. The arguments for the search method are the same as those used for `tf.keras.Model.fit`. Here, a `TensorBoard` callback is also passed so that each trial is logged to `logs/hparams`.

In [None]:
tuner.search(img_train, label_train, epochs=30, validation_split=0.2, callbacks=[keras.callbacks.TensorBoard("logs/hparams")])
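
As with `fit`, `validation_split=0.2` holds out a fraction of the training data, and `val_accuracy` is computed on that held-out part. The sketch below only illustrates the arithmetic. `split_sizes` is a hypothetical helper, and it assumes the documented Keras behaviour of taking the validation samples from the end of the arrays before any shuffling.

```python
def split_sizes(num_samples, validation_split):
    """Return (train_size, val_size) when the last fraction of the data is held out."""
    val_size = int(num_samples * validation_split)
    return num_samples - val_size, val_size


# The Fashion MNIST training set has 60,000 images.
train_size, val_size = split_sizes(60_000, 0.2)
print(f"training samples: {train_size}, validation samples: {val_size}")
```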


In [None]:
# Get the optimal hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
best_num_layers = best_hps.get('num_layers')
print(f'best number of layers = {best_num_layers}')

for i in range(best_num_layers):
    best_units = best_hps.get('units_' + str(i))
    print(f'best number of units for layer {i} = {best_units}')

## Visualize the results in TensorBoard's HParams plugin

The HParams dashboard can now be opened. Start TensorBoard and click on "HParams" at the top.

In [None]:
%load_ext tensorboard
%tensorboard --logdir logs/hparams

The left pane of the dashboard provides filtering capabilities that are active across all the views in the HParams dashboard:

- Filter which hyperparameters/metrics are shown in the dashboard
- Filter which hyperparameter/metrics values are shown in the dashboard
- Filter on run status (running, success, ...)
- Sort by hyperparameter/metric in the table view
- Number of session groups to show (useful for performance when there are many experiments)


The HParams dashboard has three different views, with various useful information:

* The **Table View** lists the runs, their hyperparameters, and their metrics.
* The **Parallel Coordinates View** shows each run as a line going through an axis for each hyperparameter and metric. Click and drag the mouse on any axis to mark a region, which highlights only the runs that pass through it. This can be useful for identifying which groups of hyperparameters are most important. The axes themselves can be re-ordered by dragging them.
* The **Scatter Plot View** shows plots comparing each hyperparameter/metric with each metric. This can help identify correlations. Click and drag to select a region in a specific plot and highlight those sessions across the other plots. 

A table row, a parallel coordinates line, or a scatter plot marker can be clicked to see a plot of the metrics as a function of training steps for that session (although in this tutorial only one step is used for each run).

## Train the model

Now train the model with the hyperparameters obtained from the search, and save the best checkpoint (based on validation accuracy).
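
The "keep only the best checkpoint" rule can be sketched without Keras. This is an illustration of the idea rather than the callback's implementation: `best_checkpoint_epochs` is a hypothetical helper, the accuracy values are made up, and it assumes that a checkpoint is written only when the monitored value strictly improves on the best value seen so far.

```python
def best_checkpoint_epochs(val_accuracies):
    """Return the 1-based epochs at which a 'save best only' rule would write a checkpoint."""
    best = float("-inf")
    saved = []
    for epoch, accuracy in enumerate(val_accuracies, start=1):
        # Maximizing the metric: only a strict improvement overwrites the checkpoint.
        if accuracy > best:
            best = accuracy
            saved.append(epoch)
    return saved


# Made-up validation accuracies for six epochs.
val_history = [0.81, 0.85, 0.84, 0.88, 0.87, 0.88]
saved_epochs = best_checkpoint_epochs(val_history)
print("checkpoint written after epochs:", saved_epochs)
print("weights on disk come from epoch:", saved_epochs[-1])
```

Because later epochs can score worse than earlier ones, the weights left on disk are not necessarily those of the final epoch, which is why the checkpoint is reloaded before the final evaluation.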

In [None]:
# Build the model with the optimal hyperparameters and print its summary
model = tuner.hypermodel.build(best_hps)
model.summary()

In [None]:
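# Newer Keras releases (Keras 3) require a '.weights.h5' suffix when
# save_weights_only=True; the same filename also works with older versions.
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath="bestcheckpoint.weights.h5",
    save_weights_only=True,
    monitor='val_accuracy',
    mode='max',
    save_best_only=True)

# Train for 30 epochs, keeping only the weights with the best validation accuracy
history = model.fit(img_train, label_train, epochs=30, validation_split=0.2, callbacks=[model_checkpoint_callback])

To finish this lab, evaluate the trained model on the test data.

In [None]:
# Restore the best weights saved during training before evaluating
model.load_weights('bestcheckpoint.weights.h5')
eval_result = model.evaluate(img_test, label_test)
print("[test loss, test accuracy]:", eval_result)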

The `my_dir/intro_to_kt` directory contains detailed logs and checkpoints for every trial (model configuration) run during the hyperparameter search. If you re-run the hyperparameter search, the Keras Tuner uses the existing state from these logs to resume the search. To disable this behavior, pass an additional `overwrite=True` argument while instantiating the tuner.

## Exercise

Using the best hyperparameters found earlier (the number of layers and the number of units per layer), modify the code above to perform a hyperparameter search on the learning rate of the Adam optimizer, using the search space `[0.001, 0.01]`.

You can use [`hp.Choice()`](https://keras.io/api/keras_tuner/hyperparameters/#choice-method) to search among the list of choices. 

Change the tuner to use [BayesianOptimization](https://keras.io/api/keras_tuner/tuners/bayesian/). Use the default parameter values for the BayesianOptimization search.

Which learning rate gives you the best validation accuracy?