In [1]:
!pip install -U keras_tuner

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting keras_tuner
  Downloading keras_tuner-1.1.3-py3-none-any.whl (135 kB)
Collecting kt-legacy
  Downloading kt_legacy-1.0.4-py3-none-any.whl (9.6 kB)
Collecting jedi>=0.10
  Downloading jedi-0.18.1-py2.py3-none-any.whl (1.6 MB)
Installing collected packages: jedi, kt-legacy, keras-tuner
Successfully installed jedi-0.18.1 keras-tuner-1.1.3 kt-legacy-1.0.4


Keras Tuner is an easy-to-use, distributable hyperparameter optimization framework that solves the pain points of performing a hyperparameter search. Keras Tuner makes it easy to define a search space and leverage included algorithms to find the best hyperparameter values. Keras Tuner comes with Bayesian Optimization, Hyperband, and Random Search algorithms built-in, and is also designed to be easy for researchers to extend in order to experiment with new search algorithms.

In [2]:
import keras_tuner

print("Keras Tuner Version : {}".format(keras_tuner.__version__))

Keras Tuner Version : 1.1.3


In [3]:

from tensorflow import keras

print("Keras Version : {}".format(keras.__version__))

Keras Version : 2.8.0


# 2. Classification Example (Random Hyperparameters Search) 

In this section, we use a random search tuner for a classification task. We load the Fashion MNIST dataset below. It contains grayscale images of 28x28 pixels covering 10 different fashion items.

The dataset is already divided into train (60k images) and test (10k images) sets. We'll try various convolutional neural networks on this dataset to check which one gives the best results.




In [4]:
import numpy as np
from tensorflow.keras import datasets

(X_train_classif, Y_train_classif), (X_test_classif, Y_test_classif) = datasets.fashion_mnist.load_data()

X_train_classif, X_test_classif = X_train_classif.reshape(-1,28,28,1), X_test_classif.reshape(-1,28,28,1)

classes = np.unique(Y_train_classif)

X_train_classif.shape, X_test_classif.shape, Y_train_classif.shape, Y_test_classif.shape

((60000, 28, 28, 1), (10000, 28, 28, 1), (60000,), (10000,))
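
The reshape step above can be illustrated on a small stand-in array. This is a minimal sketch that uses zero-filled NumPy data instead of the real dataset; the names fake_images and fake_labels are made up for illustration.

```python
import numpy as np

# Stand-in for the raw image array: 5 grayscale images of 28x28 pixels.
fake_images = np.zeros((5, 28, 28), dtype=np.uint8)
# Stand-in labels; the real dataset has 10 distinct class ids.
fake_labels = np.array([0, 3, 3, 9, 0])

# -1 lets NumPy infer the number of images; the trailing 1 adds the
# single grayscale channel axis that Conv2D layers expect.
with_channel = fake_images.reshape(-1, 28, 28, 1)

# np.unique returns the sorted distinct labels, which is how the
# number of output classes is derived in the cell above.
fake_classes = np.unique(fake_labels)

print(fake_images.shape, "->", with_channel.shape)
print("distinct labels:", fake_classes)
```

The pixel values are unchanged by the reshape; only the array layout gains a channel dimension.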

# Build model

In the cell below, we create a new class that extends the HyperModel class. The class has a build() method that takes a HyperParameters instance as input and returns a compiled Keras model. It defines the hyperparameters using methods of the HyperParameters instance, such as Choice() and Int(). We'll pass an instance of this class to the RandomSearch() constructor later.



In [5]:
from keras_tuner import HyperModel
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

class ConvNetwork(HyperModel):
    def build(self, hp):
        model = Sequential()
        model.add(layers.Input(shape=X_train_classif.shape[1:]))
        # Top-level choice between a 2-layer and a 3-layer convolutional stack.
        model_type = hp.Choice("ConvNetType", ["Conv1","Conv2"])

        if model_type == "Conv1":
            # Hyperparameters below are only active when ConvNetType == "Conv1".
            with hp.conditional_scope("ConvNetType", ["Conv1"]):
                activation = hp.Choice("activation", ["relu", "tanh"])
                kern_init = hp.Choice("kernel_initializer", ["random_normal", "lecun_normal","he_normal"])

                model.add(layers.Conv2D(filters=hp.Int("Conv1_1", 16, 33, step=16), kernel_size=(3,3), padding="same", kernel_initializer=kern_init, activation=activation))
                model.add(layers.Conv2D(filters=hp.Int("Conv1_2", 16, 33, step=16), kernel_size=(3,3), padding="same", kernel_initializer=kern_init, activation=activation))
        elif model_type == "Conv2":
            # Hyperparameters below are only active when ConvNetType == "Conv2".
            with hp.conditional_scope("ConvNetType", ["Conv2"]):
                activation = hp.Choice("activation", ["relu", "tanh"])
                kern_init = hp.Choice("kernel_initializer", ["random_normal", "lecun_normal","he_normal"])

                model.add(layers.Conv2D(filters=hp.Int("Conv2_1", 16, 33, step=16), kernel_size=(3,3), padding="same", kernel_initializer=kern_init, activation=activation))
                model.add(layers.Conv2D(filters=hp.Int("Conv2_2", 16, 33, step=16), kernel_size=(3,3), padding="same", kernel_initializer=kern_init, activation=activation))
                model.add(layers.Conv2D(filters=hp.Int("Conv2_3", 8, 17, step=8), kernel_size=(3,3), padding="same", kernel_initializer=kern_init, activation=activation))

        # Classification head: one softmax unit per class.
        model.add(layers.Flatten())
        model.add(layers.Dense(units=len(classes), activation="softmax"))

        # Labels are integer class ids, hence the sparse loss.
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

        return model
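
To get a feel for how large the search space defined above is, the sketch below enumerates it in plain Python. It assumes that an integer range includes its upper bound when a step lands on it, which is consistent with the values 16 and 32 that appear in the tuner logs later on. The helper int_values is hypothetical and is not part of Keras Tuner.

```python
from itertools import product

def int_values(min_value, max_value, step):
    """Values of an integer range, assuming max_value is inclusive."""
    return list(range(min_value, max_value + 1, step))

activations = ["relu", "tanh"]
initializers = ["random_normal", "lecun_normal", "he_normal"]

# "Conv1" branch: two convolution layers with a tunable filter count each.
conv1_space = list(product(
    activations, initializers,
    int_values(16, 33, 16), int_values(16, 33, 16),
))

# "Conv2" branch: three convolution layers.
conv2_space = list(product(
    activations, initializers,
    int_values(16, 33, 16), int_values(16, 33, 16), int_values(8, 17, 8),
))

total_combinations = len(conv1_space) + len(conv2_space)
print("filter options:", int_values(16, 33, 16), int_values(8, 17, 8))
print("Conv1:", len(conv1_space), "Conv2:", len(conv2_space), "total:", total_combinations)
```

With only a few dozen combinations the space is small, so even a modest number of trials covers a meaningful share of it.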

# RandomSearch

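In the cell below, we create a random search tuner. We give it our HyperModel instance and ask it to maximize validation accuracy using an Objective instance. The max_trials parameter sets how many hyperparameter combinations are tried; the cell uses max_trials=1 to keep the run short, and a larger value such as 5 explores more of the search space.

We run the tuning process by calling the search() method with the train data, validation data, batch size, and number of epochs. Note that the cell trains for a single epoch with batch_size=512812, which is larger than the 60,000-image training set, so each epoch is a single gradient step; this explains the low validation accuracy in the output. A batch size of 512 with around 10 epochs is a more typical setting.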



In [6]:
from keras_tuner import RandomSearch
from keras_tuner import Objective

conv2 = ConvNetwork()
tuner2 =  RandomSearch(hypermodel=conv2,
                      objective=Objective(name="val_accuracy",direction="max"),
                      max_trials=1,
                      #seed=123,
                      project_name="Classification",
                      overwrite=True
                    )

tuner2.search(X_train_classif, Y_train_classif, batch_size=512812, epochs=1, validation_data=(X_test_classif, Y_test_classif))

Trial 1 Complete [00h 01m 32s]
val_accuracy: 0.2870999872684479

Best val_accuracy So Far: 0.2870999872684479
Total elapsed time: 00h 01m 32s
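
Conceptually, random search samples hyperparameter combinations at random, scores each one, and keeps the best. The sketch below shows that loop in plain Python. It is only an illustration of the idea, not Keras Tuner's implementation: toy_score is a made-up stand-in for training a model and reading its validation accuracy, and SEARCH_SPACE is a simplified version of the space used above.

```python
import random

# Simplified, hypothetical search space for illustration.
SEARCH_SPACE = {
    "activation": ["relu", "tanh"],
    "kernel_initializer": ["random_normal", "lecun_normal", "he_normal"],
    "filters": [16, 32],
}

def sample_config(rng):
    """Draw one value per hyperparameter, uniformly at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def toy_score(config):
    """Made-up stand-in for validation accuracy (higher is better)."""
    score = 0.5
    score += 0.2 if config["activation"] == "relu" else 0.1
    score += 0.1 if config["filters"] == 32 else 0.0
    return score

def random_search(max_trials, seed=0):
    """Evaluate max_trials random configurations and return the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(max_trials):
        config = sample_config(rng)
        score = toy_score(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

one_trial = random_search(max_trials=1)
many_trials = random_search(max_trials=20)
print("1 trial  :", one_trial)
print("20 trials:", many_trials)
```

Because both runs share the same seed, the 20-trial run starts with the same first sample and can only match or improve on the 1-trial result, which is the practical reason to raise max_trials.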


In the next cells, we retrieve the best hyperparameters and the best model, print the model summary, and evaluate the model on the test dataset, which we also used as the validation dataset during the search.

In [7]:
best_params = tuner2.get_best_hyperparameters()

best_params[0].values

{'ConvNetType': 'Conv2',
 'activation': 'relu',
 'kernel_initializer': 'random_normal',
 'Conv2_1': 16,
 'Conv2_2': 16,
 'Conv2_3': 8}

In [8]:
best_model = tuner2.get_best_models()[0]

best_model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 28, 28, 16)        160       
                                                                 
 conv2d_1 (Conv2D)           (None, 28, 28, 16)        2320      
                                                                 
 conv2d_2 (Conv2D)           (None, 28, 28, 8)         1160      
                                                                 
 flatten (Flatten)           (None, 6272)              0         
                                                                 
 dense (Dense)               (None, 10)                62730     
                                                                 
Total params: 66,370
Trainable params: 66,370
Non-trainable params: 0
_________________________________________________________________


In [9]:
best_model.evaluate(X_test_classif, Y_test_classif)




[1.9415361881256104, 0.2870999872684479]
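
One likely contributor to the low accuracy above is the number of weight updates per epoch, which follows from the batch size passed to search(). The short calculation below is a sketch of that arithmetic; steps_per_epoch is a hypothetical helper, not a Keras function.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches (and hence gradient updates) in one epoch."""
    return math.ceil(num_samples / batch_size)

train_size = 60000  # Fashion MNIST training images

# Batch size used in the search() call above: larger than the dataset.
steps_large_batch = steps_per_epoch(train_size, 512812)
# A more conventional batch size, for comparison.
steps_small_batch = steps_per_epoch(train_size, 512)

print(steps_large_batch)  # 1
print(steps_small_batch)  # 118
```

With a single update in a single epoch the network barely moves from its initial weights, so a smaller batch size or more epochs would be expected to give a far more informative comparison between trials.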

# Hyperband Algorithm

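In this section, we perform hyperparameter optimization using the Hyperband algorithm. It is a variation of random search that balances exploration and exploitation to find good hyperparameter settings. It speeds up random search through adaptive resource allocation and early stopping.

It treats the search as a bandit problem: many randomly sampled hyperparameter settings each receive a small training budget (a few epochs), the underperforming settings are eliminated, and the survivors are trained for longer. Keras Tuner provides an implementation of this algorithm through the Hyperband() constructor.

Most of its parameters are the same as for random search, with a few additional ones listed below.

max_epochs - This parameter accepts an integer specifying the maximum number of epochs to train any one model.

factor - This parameter accepts an integer specifying the reduction factor for the number of epochs and the number of models in each bracket.

hyperband_iterations - This parameter accepts an integer specifying how many times to run the full Hyperband algorithm.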
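
The elimination idea behind Hyperband can be sketched as a single round of successive halving in plain Python. This is a simplified illustration under stated assumptions, not Keras Tuner's implementation: toy_score is a made-up stand-in for validation accuracy after a given number of epochs, and the configurations are hypothetical.

```python
import math

def toy_score(config, epochs):
    """Made-up validation score that improves with training budget."""
    return config["quality"] * (1.0 - math.exp(-epochs))

def successive_halving(configs, score_fn, min_epochs=1, factor=3):
    """Repeatedly keep the top 1/factor of configs and train them longer."""
    survivors = list(configs)
    epochs = min_epochs
    history = []  # (epochs per config, number of configs) for each round
    while True:
        ranked = sorted(survivors, key=lambda c: score_fn(c, epochs), reverse=True)
        history.append((epochs, len(ranked)))
        if len(ranked) == 1:
            return ranked[0], history
        keep = max(1, len(ranked) // factor)
        survivors = ranked[:keep]
        epochs *= factor  # survivors get a larger budget in the next round

# Nine hypothetical configurations with different underlying quality.
candidates = [{"name": f"cfg{i}", "quality": 0.5 + 0.05 * i} for i in range(9)]

winner, schedule = successive_halving(candidates, toy_score)
print("winner:", winner["name"])
print("schedule (epochs, configs):", schedule)
```

Most of the training budget ends up on the few promising configurations, which is why this approach can be cheaper than training every sampled configuration for the full number of epochs.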



In [10]:
from keras_tuner import Hyperband
from keras_tuner import Objective

In [None]:
conv6 = ConvNetwork()
tuner6 =  Hyperband(hypermodel=conv6,
                   objective=Objective(name="val_accuracy",direction="max"),
                   hyperband_iterations=1,
                   #seed=123
                   project_name="Hyperband",
                   overwrite=True
                  )

tuner6.search(X_train_classif, Y_train_classif, batch_size=128, epochs=1, validation_data=(X_test_classif, Y_test_classif))

Trial 13 Complete [00h 05m 23s]
val_accuracy: 0.8873000144958496

Best val_accuracy So Far: 0.8985000252723694
Total elapsed time: 00h 59m 53s

Search: Running Trial #14

Value             |Best Value So Far |Hyperparameter
Conv2             |Conv2             |ConvNetType
tanh              |tanh              |activation
he_normal         |random_normal     |kernel_initializer
32                |16                |Conv2_1
32                |16                |Conv2_2
16                |8                 |Conv2_3
2                 |2                 |tuner/epochs
0                 |0                 |tuner/initial_epoch
4                 |4                 |tuner/bracket
0                 |0                 |tuner/round

Epoch 1/2
Epoch 2/2

In [None]:
best_params = tuner6.get_best_hyperparameters()

best_params[0].values

In [None]:
best_model = tuner6.get_best_models()[0]

best_model.summary()

In [None]:
best_model.evaluate(X_test_classif, Y_test_classif)


# Bayesian Optimization Algorithm 

In this section, we use the Bayesian optimization tuner available in Keras Tuner. Bayesian optimization fits a probabilistic model (a Gaussian process in Keras Tuner) to the results of the trials run so far and uses it to choose the next hyperparameter settings to try. We can create the tuner with the BayesianOptimization() constructor. It has almost the same parameters as the random search tuner, with a few additional parameters listed below.

num_initial_points - This parameter accepts an integer specifying the number of randomly generated samples used as initial training data for the Bayesian optimization process (not for the network itself). The example below sets it to 2 explicitly; the default may differ between Keras Tuner releases.

alpha - This parameter accepts a float value that is added to the diagonal of the kernel matrix during fitting. It represents the expected amount of noise in the observed performances in the Bayesian optimization process. The default value is 1e-4.

beta - This parameter accepts a float value specifying the balancing factor between exploration and exploitation. A larger value means more exploration. The default value is 2.6.
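
To make the roles of alpha and beta more concrete, the sketch below fits a tiny Gaussian process with NumPy and scores candidates with an upper confidence bound. It is a simplified illustration under stated assumptions (a fixed RBF kernel, a zero prior mean, a 1-D toy search space), not Keras Tuner's implementation; all function names are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a (n, d) and b (m, d)."""
    sq_dist = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale ** 2)

def gp_posterior(x_seen, y_seen, x_query, alpha=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    # alpha is added to the diagonal of the kernel matrix (observation noise).
    k_seen = rbf_kernel(x_seen, x_seen) + alpha * np.eye(len(x_seen))
    k_query = rbf_kernel(x_query, x_seen)
    mean = k_query @ np.linalg.solve(k_seen, y_seen)
    solved = np.linalg.solve(k_seen, k_query.T)
    variance = 1.0 - np.sum(k_query * solved.T, axis=1)
    return mean, np.sqrt(np.maximum(variance, 0.0))

def upper_confidence_bound(mean, std, beta=2.6):
    """Larger beta rewards uncertain (unexplored) candidates more."""
    return mean + beta * std

# Two already evaluated settings and their toy validation accuracies.
x_seen = np.array([[0.0], [1.0]])
y_seen = np.array([0.5, 0.6])
candidates = np.linspace(0.0, 3.0, 31)[:, None]

mean, std = gp_posterior(x_seen, y_seen, candidates)
exploit_pick = candidates[np.argmax(upper_confidence_bound(mean, std, beta=0.0)), 0]
explore_pick = candidates[np.argmax(upper_confidence_bound(mean, std, beta=10.0)), 0]
print("beta=0 picks x =", exploit_pick)
print("beta=10 picks x =", explore_pick)
```

With beta set to 0 the next candidate stays close to the best point seen so far, while a large beta pushes the choice toward the region with no observations, which is the exploration and exploitation trade-off described above.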

Below, we have initialized the bayesian optimization tuner and tried to find good hyperparameters settings for our classification task network (CNN). As usual, we have performed a search by calling search() method on the tuner object.


In [None]:


from keras_tuner import BayesianOptimization
from keras_tuner import Objective

conv7 = ConvNetwork()
tuner7 =  BayesianOptimization(hypermodel=conv7,
                               objective=Objective(name="val_accuracy",direction="max"),
                               max_trials=10,
                               num_initial_points=2,
                               #seed=123
                               project_name="BayesianOptimization",
                               overwrite=True
                              )

tuner7.search(X_train_classif, Y_train_classif, batch_size=128, epochs=5, validation_data=(X_test_classif, Y_test_classif))

In [None]:
best_params = tuner7.get_best_hyperparameters()

best_params[0].values

In [None]:
best_model = tuner7.get_best_models()[0]

best_model.summary()

In [None]:
best_model.evaluate(X_test_classif, Y_test_classif)


This ends our small tutorial explaining how to use the tuners covered here (random search, Hyperband, and Bayesian optimization) from Keras Tuner to find good hyperparameters for a given model.

