The Keras Tuner is a library that helps you pick the optimal set of hyperparameters for your TensorFlow program. The process of selecting the right set of hyperparameters for your machine learning (ML) application is called hyperparameter tuning or hypertuning.

Hyperparameters are the variables that govern the training process and the topology of an ML model. These variables remain constant over the training process and directly impact the performance of your ML program. Hyperparameters are of two types:

Model hyperparameters, which influence model selection, such as the number and width of hidden layers.
Algorithm hyperparameters, which influence the speed and quality of the learning algorithm, such as the learning rate for Stochastic Gradient Descent (SGD) and the number of nearest neighbors for a k-Nearest Neighbors (KNN) classifier.

In [1]:
import tensorflow as tf
from tensorflow import keras

In [2]:
pip install -q -U keras-tuner

Note: you may need to restart the kernel to use updated packages.


In [3]:
import keras_tuner as kt

In [4]:
(img_train, label_train), (img_test, label_test) = keras.datasets.fashion_mnist.load_data()

In [5]:
# Normalize pixel values between 0 and 1
img_train = img_train.astype('float32') / 255.0
img_test = img_test.astype('float32') / 255.0

Define the model
When you build a model for hypertuning, you define the hyperparameter search space in addition to the model architecture. The model you set up for hypertuning is called a hypermodel.

hp.Int: Creates an integer parameter.
'units': Name of the parameter.
min_value=32: Minimum allowed value for the parameter (inclusive).
max_value=512: Maximum allowed value for the parameter (inclusive).
step=32: Incremental step size for the parameter.
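As a rough illustration of what these arguments imply, the candidate values can be enumerated in plain Python. This is only a sketch of the arithmetic, assuming the default linear spacing between min_value and max_value; it does not call Keras Tuner, and the variable names are chosen here for illustration.

```python
# Enumerate the values described by min_value=32, max_value=512, step=32.
# Assumption: values are spaced linearly and both bounds are inclusive.
min_value, max_value, step = 32, 512, 32

candidate_units = list(range(min_value, max_value + 1, step))

print(len(candidate_units))                       # number of candidates
print(candidate_units[:3], candidate_units[-1])   # first few values and the last one
```

Under that assumption the tuner has 16 possible widths to choose from for the first Dense layer.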

In [None]:
def model_builder(hp): # hp == hyperparameter
  model = keras.Sequential()
  model.add(keras.layers.Flatten(input_shape=(28, 28)))

  # Tune the number of units in the first Dense layer
  # Choose an optimal value between 32-512
  hp_units = hp.Int('units', min_value=32, max_value=512, step=32)
  
  model.add(keras.layers.Dense(units=hp_units, activation='relu'))
  model.add(keras.layers.Dense(10))

  # Tune the learning rate for the optimizer
  # Choose an optimal value from 0.01, 0.001, or 0.0001
  hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])

  model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
                loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])

  return model

hp.Choice: Creates a categorical parameter with discrete values.
'learning_rate': Name of the parameter.
values=[1e-2, 1e-3, 1e-4]: List of possible values for the parameter.
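Taken together, the two hyperparameters define a small grid. The sketch below counts the combinations in plain Python, again assuming linearly spaced unit values; it is meant to give a feel for the size of the search space, not to reproduce how the tuner samples it.

```python
from itertools import product

# Assumed candidate values, mirroring the hp.Int and hp.Choice definitions above.
unit_options = list(range(32, 512 + 1, 32))
learning_rate_options = [1e-2, 1e-3, 1e-4]

# Every (units, learning_rate) pair the search could in principle visit.
search_space = list(product(unit_options, learning_rate_options))

print(len(search_space))
print(search_space[0], search_space[-1])
```

With only 48 combinations an exhaustive search would be feasible here; tuners such as Hyperband matter more as the number of hyperparameters grows.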

Instantiate the tuner and perform hypertuning
Instantiate the tuner to perform the hypertuning. The Keras Tuner has four tuners available: RandomSearch, Hyperband, BayesianOptimization, and Sklearn. In this tutorial, you use the Hyperband tuner.

In [7]:
tuner = kt.Hyperband(model_builder,
                     objective='val_accuracy',
                     max_epochs=10,
                     factor=3,
                     directory='my_dir',
                     project_name='intro_to_kt')


The Hyperband tuning algorithm uses adaptive resource allocation and early stopping to converge quickly on a high-performing model, using a sports-championship-style bracket. The algorithm trains a large number of models for a few epochs and carries forward only the top-performing models to the next round (roughly the top 1/factor of them, so about a third with factor=3 as set above). Hyperband determines the number of models to train in a bracket by computing 1 + log_factor(max_epochs), where the logarithm is taken to base factor, and rounding it up to the nearest integer.
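The rounding formula described above (one plus the base-factor logarithm of max_epochs) can be evaluated for this tutorial's settings. The snippet is a plain-Python sketch of that arithmetic only; it is not Keras Tuner's internal code, and the variable names are chosen here for illustration.

```python
import math

# Settings passed to kt.Hyperband in this tutorial.
max_epochs = 10
factor = 3

# 1 + log_factor(max_epochs), then rounded up to the nearest integer.
raw_value = 1 + math.log(max_epochs, factor)
rounded_up = math.ceil(raw_value)

print(round(raw_value, 3), rounded_up)
```

For max_epochs=10 and factor=3 the unrounded value is a little above 3, so rounding up gives 4.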

Create a callback that stops training early when the validation loss stops improving.

In [8]:
stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)

This TensorFlow Keras code defines an EarlyStopping callback to limit overfitting during model training.
Parameters:
monitor='val_loss': Tracks validation loss as the metric to monitor.
patience=5: Stops training if validation loss does not improve for 5 consecutive epochs.
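To make the patience rule concrete, the following sketch replays it over a hand-written list of validation losses. It is a simplified stand-in written for this explanation (the function name and the loss values are hypothetical), and it ignores options of the real callback such as min_delta and baseline.

```python
def epochs_run_with_patience(val_losses, patience=5):
    """Return the 1-based epoch at which a patience-based stop would trigger.

    Simplified model of the rule: stop once the loss has failed to improve
    on its best value for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss   # new best value: reset the counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                return epoch
    return len(val_losses)  # never triggered: all epochs ran


# Hypothetical validation losses: the best value is reached at epoch 3.
losses = [0.60, 0.50, 0.45, 0.46, 0.47, 0.46, 0.48, 0.49, 0.44, 0.43]
stopped_at = epochs_run_with_patience(losses, patience=5)
print(stopped_at)
```

Here training would halt after epoch 8, five epochs after the best loss at epoch 3, even though later epochs in the list would have improved on it. A run shorter than the patience window can never be stopped early by this rule.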

Run the hyperparameter search. The arguments for the search method are the same as those used for tf.keras.Model.fit, with the callback above passed in as well.

In [None]:
tuner.search(img_train, label_train, epochs=5, validation_split=0.2, callbacks=[stop_early])

Trial 11 Complete [00h 00m 11s]
val_accuracy: 0.847083330154419

Best val_accuracy So Far: 0.8576666712760925
Total elapsed time: 00h 02m 37s

Search: Running Trial #12

Value             |Best Value So Far |Hyperparameter
32                |96                |units
0.0001            |0.001             |learning_rate
2                 |2                 |tuner/epochs
0                 |0                 |tuner/initial_epoch
2                 |2                 |tuner/bracket
0                 |0                 |tuner/round

Epoch 1/2
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - accuracy: 0.5978 - loss: 1.2951 - val_accuracy: 0.7963 - val_loss: 0.6268
Epoch 2/2
 227/1500 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.7917 - loss: 0.6321

KeyboardInterrupt: 

In [10]:
# Get the optimal hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]

In [13]:
import os

In [16]:
best_hps

<keras_tuner.src.engine.hyperparameters.hyperparameters.HyperParameters at 0x1f598204990>

In [11]:

print(f"""
The hyperparameter search is complete. The optimal number of units in the first densely-connected
layer is {best_hps.get('units')} and the optimal learning rate for the optimizer
is {best_hps.get('learning_rate')}.
""")


The hyperparameter search is complete. The optimal number of units in the first densely-connected
layer is 96 and the optimal learning rate for the optimizer
is 0.001.



In [17]:
# Build the model with the optimal hyperparameters; it is trained for 5 epochs in the next cell
model = tuner.hypermodel.build(best_hps)


In [18]:
history = model.fit(img_train, label_train, epochs=5, validation_split=0.2)

Epoch 1/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - accuracy: 0.7661 - loss: 0.6764 - val_accuracy: 0.8502 - val_loss: 0.4275
Epoch 2/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8544 - loss: 0.4048 - val_accuracy: 0.8592 - val_loss: 0.3871
Epoch 3/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8686 - loss: 0.3642 - val_accuracy: 0.8643 - val_loss: 0.3700
Epoch 4/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8782 - loss: 0.3301 - val_accuracy: 0.8725 - val_loss: 0.3640
Epoch 5/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8857 - loss: 0.3090 - val_accuracy: 0.8771 - val_loss: 0.3390


In [20]:
val_acc_per_epoch = history.history['val_accuracy']

In [21]:
val_acc_per_epoch

[0.8501666784286499,
 0.85916668176651,
 0.8643333315849304,
 0.8725000023841858,
 0.8770833611488342]

In [22]:
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print('Best epoch: %d' % (best_epoch,))

Best epoch: 5
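The selection above relies on list.index returning the first position of the maximum. The sketch below replays it on the recorded validation accuracies and on a small hypothetical list with a tie, to show which epoch is picked in each case.

```python
# Validation accuracies recorded in the run above.
val_acc_per_epoch = [
    0.8501666784286499,
    0.85916668176651,
    0.8643333315849304,
    0.8725000023841858,
    0.8770833611488342,
]

# index() is 0-based, epochs are 1-based, hence the + 1.
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print("Best epoch: %d" % best_epoch)

# Hypothetical values with a tie: index() returns the earliest matching epoch.
tied_acc = [0.80, 0.85, 0.85]
tied_best_epoch = tied_acc.index(max(tied_acc)) + 1
print("Best epoch with a tie: %d" % tied_best_epoch)
```

Because accuracy was still rising at the last of the 5 epochs, the best epoch coincides with the final one; training for more epochs would be needed to see where validation accuracy actually peaks.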


In [23]:
# Re-instantiate the hypermodel and train it with the optimal number of epochs from above.


hypermodel = tuner.hypermodel.build(best_hps)

# Retrain the model
hypermodel.fit(img_train, label_train, epochs=best_epoch, validation_split=0.2)

Epoch 1/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - accuracy: 0.7687 - loss: 0.6687 - val_accuracy: 0.8263 - val_loss: 0.4747
Epoch 2/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8571 - loss: 0.4004 - val_accuracy: 0.8585 - val_loss: 0.3853
Epoch 3/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8731 - loss: 0.3490 - val_accuracy: 0.8593 - val_loss: 0.3927
Epoch 4/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8793 - loss: 0.3270 - val_accuracy: 0.8783 - val_loss: 0.3430
Epoch 5/5
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8864 - loss: 0.3079 - val_accuracy: 0.8758 - val_loss: 0.3493


<keras.src.callbacks.history.History at 0x1f5b9491cd0>

In [24]:
eval_result = hypermodel.evaluate(img_test, label_test)
print("[test loss, test accuracy]:", eval_result)

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.8688 - loss: 0.3689
[test loss, test accuracy]: [0.38127461075782776, 0.8639000058174133]
