**Keras Tuner** is a library that helps us pick the **optimal set of hyperparameters** for our TensorFlow program. Hyperparameters are of two types:
- **Model hyperparameters**, which influence model selection, such as the number and width of hidden layers
- **Algorithm hyperparameters**, which influence the speed and quality of the learning algorithm, such as the learning rate for Stochastic Gradient Descent (SGD) and the number of nearest neighbors for a k-Nearest Neighbors (KNN) classifier


Summary:
- create a `model_builder()` function
- create a **Hyperband** tuner
- use `tuner.search()` to search for the best hyperparameters
    - retrieve the best hyperparameters via `tuner.get_best_hyperparameters()`
- build the model from these hyperparameters with `tuner.hypermodel.build(best_hps)`

## Setup

In [1]:
import tensorflow as tf
from tensorflow import keras
import IPython

!pip install -q -U keras-tuner
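# Recent keras-tuner releases expose the package as `keras_tuner`;
# the older `kerastuner` import name is deprecated.
import keras_tuner as kt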

## Download and prepare the dataset

In [2]:
(img_train, label_train), (img_test, label_test) = tf.keras.datasets.mnist.load_data()

In [3]:
img_train = img_train.astype('float32') / 255.0
img_test  = img_test.astype('float32') / 255.0

## Define the model
When you build a model for hypertuning, you also define the hyperparameter search space in addition to the model architecture. The model you set up for hypertuning is called a **hypermodel**.

You can define a hypermodel through two approaches:

- By using a model builder function
- By subclassing the `HyperModel` class of the Keras Tuner API

In [17]:
def model_builder(hp):
    model = keras.Sequential()
    model.add(keras.layers.Flatten(input_shape=(28, 28)))
    
    # Tune the number of units in the first Dense layer
    # Search over 60, 62, and 64 (max_value is inclusive)
    hp_units = hp.Int('units', min_value=60, max_value=64, step=2)
    model.add(keras.layers.Dense(units=hp_units, activation='relu'))
    model.add(keras.layers.Dense(10))
    
    # Tune the learning rate for the optimizer 
    # Choose an optimal value from 0.01, 0.001, or 0.0001
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
    
    model.compile(optimizer = keras.optimizers.Adam(learning_rate=hp_learning_rate),
                  loss = keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics = ['accuracy'])
    
    return model
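
To make the search space concrete: `hp.Int('units', min_value=60, max_value=64, step=2)` covers 60, 62, and 64 (Keras Tuner treats `max_value` as inclusive), and `hp.Choice` lists three learning rates, so there are nine possible configurations. The plain-Python sketch below enumerates them without using Keras Tuner itself; the names `unit_values`, `learning_rates`, and `search_space` are illustrative only and are not part of the library.

```python
from itertools import product

# Values covered by hp.Int('units', min_value=60, max_value=64, step=2);
# the upper bound is included, hence the +1 in range().
unit_values = list(range(60, 64 + 1, 2))

# Values listed in hp.Choice('learning_rate', ...).
learning_rates = [1e-2, 1e-3, 1e-4]

# Every (units, learning_rate) pair the tuner could try.
search_space = [
    {"units": units, "learning_rate": lr}
    for units, lr in product(unit_values, learning_rates)
]

print(len(search_space))
for config in search_space:
    print(config)
```

A space this small could be searched exhaustively; tuners such as Hyperband become more useful as the number of hyperparameters and candidate values grows.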

## Instantiate the tuner and perform hypertuning

Instantiate the tuner to perform the hypertuning. The Keras Tuner has four tuners available:
- RandomSearch
- Hyperband
- BayesianOptimization
- Sklearn

We'll use `Hyperband`. To instantiate the Hyperband tuner, we must specify the hypermodel, the `objective` to optimize and the maximum number of epochs to train (`max_epochs`).

In [18]:
tuner = kt.Hyperband(model_builder,
                     objective = 'val_accuracy',
                     max_epochs = 10,
                     factor = 3,
                     directory = 'my_dir',
                     project_name = 'intro_to_kt')

INFO:tensorflow:Reloading Oracle from existing project my_dir/intro_to_kt/oracle.json
INFO:tensorflow:Reloading Tuner from my_dir/intro_to_kt/tuner0.json


The **Hyperband** tuning algorithm uses adaptive resource allocation and **early-stopping to quickly converge** on a high-performing model. 
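
The idea of adaptive resource allocation can be illustrated with a simplified, plain-Python sketch of successive halving, the building block that Hyperband repeats with different trade-offs between the number of configurations and the training budget. This is not Keras Tuner's implementation: the function `successive_halving`, the made-up `toy_score` that stands in for validation accuracy, and the starting budget of one epoch are all assumptions chosen for illustration.

```python
import math
from itertools import product


def toy_score(config, epochs):
    """Made-up stand-in for validation accuracy (no real training happens)."""
    lr_penalty = abs(math.log10(config["learning_rate"]) + 3)
    units_penalty = (64 - config["units"]) / 100
    return 0.01 * epochs - lr_penalty - units_penalty


def successive_halving(configs, score_fn, max_epochs=10, factor=3):
    """Train many configs briefly, keep the top 1/factor, then train longer."""
    survivors = list(configs)
    epochs = 1  # assumed starting budget
    schedule = []  # (epochs, number of configs trained) for each round
    while True:
        ranked = sorted(survivors, key=lambda c: score_fn(c, epochs), reverse=True)
        schedule.append((epochs, len(ranked)))
        if len(ranked) == 1 or epochs >= max_epochs:
            return ranked[0], schedule
        # Early stopping: drop all but the best 1/factor of the configs.
        survivors = ranked[: max(1, len(ranked) // factor)]
        # Give the survivors a larger budget, capped at max_epochs.
        epochs = min(max_epochs, epochs * factor)


configs = [
    {"units": u, "learning_rate": lr}
    for u, lr in product([60, 62, 64], [1e-2, 1e-3, 1e-4])
]
best, schedule = successive_halving(configs, toy_score)
print(best)
print(schedule)
```

With nine configurations and `factor = 3`, the sketch trains 9, then 3, then 1 configuration, for 1, 3, and 9 epochs respectively, so most of the budget goes to the candidates that looked promising early on.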

Before running the hyperparameter search, define a callback to **clear the training outputs at the end of every training run**, so the notebook does not fill up with the logs of each trial.

In [19]:
class ClearTrainingOutput(tf.keras.callbacks.Callback):
    # Called by Keras once each call to fit() finishes
    def on_train_end(self, logs=None):
        IPython.display.clear_output(wait=True)

Run the hyperparameter search. The **arguments** for the **search method** are the **same** as those used for `tf.keras.Model.fit`, plus the callback defined above.

In [None]:
# Hyperparameter search
# This can take a long time to run
tuner.search(img_train,
             label_train,
             epochs = 10,
             validation_data = (img_test, label_test),
             callbacks = [ClearTrainingOutput()]
            )

In [None]:
best_hps = tuner.get_best_hyperparameters(num_trials = 1)[0]

print(f"""
The hyperparameter search is complete. The optimal number of units in the first densely-connected
layer is {best_hps.get('units')} and the optimal learning rate for the optimizer
is {best_hps.get('learning_rate')}.
""")

To finish, retrain the model with the optimal hyperparameters from the search.

In [None]:
model = tuner.hypermodel.build(best_hps)
model.fit(img_train, label_train, epochs = 10, validation_data = (img_test, label_test))