
Keras Tuner is a library that helps you pick the optimal set of hyperparameters for your TensorFlow program.

There are two types of hyperparameters:
- *Model hyperparameters:* These influence model selection, such as the number and width of hidden layers.
- *Algorithm hyperparameters:* These influence the speed and quality of the learning algorithm, such as the learning rate for SGD or the number of neighbors for a KNN classifier.

In [5]:
import tensorflow as tf
from tensorflow import keras
try:
    import keras_tuner as kt
except ImportError:
    !pip install keras-tuner
    import keras_tuner as kt

Collecting keras-tuner
  Downloading keras_tuner-1.4.7-py3-none-any.whl (129 kB)
Collecting kt-legacy (from keras-tuner)
  Downloading kt_legacy-1.0.5-py3-none-any.whl (9.6 kB)
Installing collected packages: kt-legacy, keras-tuner
Successfully installed keras-tuner-1.4.7 kt-legacy-1.0.5


In [2]:
# Download and Prepare the Dataset
# We will be using the Fashion MNIST dataset
(img_train, label_train), (img_test, label_test) = keras.datasets.fashion_mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz


In [3]:
# Normalize pixel values between 0 and 1
img_train = img_train.astype('float32') / 255.0
img_test = img_test.astype('float32') / 255.0

## Define the model
When building a model for hypertuning, you need to define the hyperparameter search space in addition to the model architecture. The model you set up for hypertuning is called a hypermodel. This can be achieved in two ways:
1. Using a model builder function
2. Subclassing the *HyperModel* class of the Keras Tuner API

This notebook uses a model builder function to define the image classification model. The model builder function returns a compiled model and uses hyperparameters you define inline to hypertune the model.

In [4]:
def model_builder(hp):
    model = keras.Sequential()
    model.add(keras.layers.Flatten(input_shape=(28, 28)))

    # Tune the number of units in the first Dense layer
    # Choose an optimal value between 32-512
    hp_units = hp.Int('units', min_value=32, max_value=512, step=32)
    model.add(keras.layers.Dense(units=hp_units, activation='relu'))
    model.add(keras.layers.Dense(10))

    # Tune the learning rate for the optimizer
    # Choose an optimal value from 0.01, 0.001, or 0.0001
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])

    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=['accuracy'],
    )

    return model
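
To get a feel for the size of the search space defined by `model_builder`, the short sketch below enumerates every combination of the two hyperparameters in plain Python. It does not use Keras Tuner; the variable names are illustrative, and it assumes that `max_value` is inclusive, which matches the Keras Tuner documentation for `hp.Int`.

```python
from itertools import product

# Candidate values implied by hp.Int('units', min_value=32, max_value=512, step=32)
units_options = list(range(32, 513, 32))

# Candidate values from hp.Choice('learning_rate', ...)
learning_rate_options = [1e-2, 1e-3, 1e-4]

# Every (units, learning_rate) pair a tuner could try
search_space = list(product(units_options, learning_rate_options))

print(len(units_options), len(learning_rate_options), len(search_space))
# prints: 16 3 48
```

With only 48 combinations an exhaustive search would be feasible here, but the space grows multiplicatively with every hyperparameter added, which is where a tuner becomes useful.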

## Instantiate the tuner and perform hypertuning
Instantiate the tuner to perform the hypertuning. Keras Tuner has four tuners available: *RandomSearch*, *Hyperband*, *BayesianOptimization*, and *Sklearn*. The **Hyperband** tuner is used here.

In [7]:
tuner = kt.Hyperband(
    model_builder,
    objective='val_accuracy',
    max_epochs=10,
    factor=3,
    directory='my_dir',
    project_name='intro_to_kt',
)

The Hyperband tuning algorithm uses adaptive resource allocation and early stopping to converge quickly on a high-performing model. This is done using a sports-championship-style bracket. The algorithm trains a large number of models for a few epochs and carries forward only the top-performing fraction of them (roughly one in every `factor` models, so about a third with `factor=3`) to the next round. Hyperband determines the number of models to train in a bracket by computing $1 + \log_{\text{factor}}(\text{max\_epochs})$ and rounding it up to the nearest integer.
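
The arithmetic can be checked with a few lines of plain Python. This is a simplified sketch, not Keras Tuner's implementation: both function names are hypothetical, and `models_per_round` assumes each round keeps about one in every `factor` models.

```python
import math


def bracket_size_term(max_epochs, factor):
    """Compute 1 + log_factor(max_epochs), rounded up to the nearest integer."""
    return math.ceil(1 + math.log(max_epochs, factor))


def models_per_round(initial_models, factor):
    """Illustrative schedule: keep roughly 1/factor of the models each round."""
    if factor < 2:
        raise ValueError("factor must be at least 2")
    counts = [initial_models]
    while counts[-1] > 1:
        counts.append(max(1, counts[-1] // factor))
    return counts


print(bracket_size_term(max_epochs=10, factor=3))  # prints: 4
print(models_per_round(initial_models=9, factor=3))  # prints: [9, 3, 1]
```

For the settings used here (`max_epochs=10`, `factor=3`), $\log_3 10 \approx 2.1$, so the formula gives $\lceil 3.1 \rceil = 4$.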

Create a callback that stops training early once the validation loss has not improved for five consecutive epochs (`patience=5`).

In [8]:
stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
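
The patience rule can be sketched in plain Python. This is a simplified illustration rather than the Keras implementation: the function name and loss values are made up, and options such as `min_delta` and `restore_best_weights` are ignored.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # New best validation loss: reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical validation losses: the best value (0.40) is reached at epoch 3.
losses = [0.50, 0.42, 0.40, 0.41, 0.43, 0.40, 0.44, 0.45]
print(early_stopping_epoch(losses, patience=5))  # prints: 8
```

Note that matching the best loss at epoch 6 does not count as an improvement in this sketch, so the counter keeps running until it reaches five.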

In [10]:
# Run the hyperparameter search. The arguments for the search method are the same as
# those used for tf.keras.Model.fit, in addition to the callback above.
tuner.search(img_train, label_train, epochs=50, validation_split=0.2, callbacks=[stop_early])

# Get the optimal hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print(f'''The hyperparameter search is complete. The optimal learning rate
 for the optimizer is {best_hps.get('learning_rate')}''')

The hyperparameter search is complete. The optimal learning rate
 for the optimizer is 0.001


## Train the model
Find the optimal number of epochs to train the model with the hyperparameters obtained from the search.

In [11]:
# Build the model with the optimal hyperparameters and train it on the data for 50 epochs
model = tuner.hypermodel.build(best_hps)
history = model.fit(img_train, label_train, epochs=50, validation_split=0.2)
val_acc_per_epoch = history.history['val_accuracy']
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print('Best epoch: %d' % (best_epoch,))

Epoch 1/50
...
Epoch 50/50
Best epoch: 43
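
The best-epoch calculation relies on `list.index` returning the first position of the maximum and on epochs being counted from 1. A tiny example with made-up accuracies (not taken from this run) shows both details:

```python
# Hypothetical validation accuracies for five epochs; 0.88 occurs twice.
val_acc_per_epoch = [0.81, 0.86, 0.88, 0.87, 0.88]

# index() returns the first occurrence of the maximum (0-based), so add 1
# to convert it to an epoch number.
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print(best_epoch)  # prints: 3
```

When several epochs tie, the earliest one wins, which keeps the retraining run as short as possible.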


Re-instantiate the hypermodel and train it with the optimal number of epochs from above.

In [12]:
hypermodel = tuner.hypermodel.build(best_hps)

# Retrain the model
hypermodel.fit(img_train, label_train, epochs=best_epoch, validation_split=0.2)

Epoch 1/43
...
Epoch 43/43


<keras.src.callbacks.History at 0x7db7f0443d00>

In [13]:
# Evaluate the model on the test data
eval_result = hypermodel.evaluate(img_test, label_test)
print("[test loss, test accuracy]:", eval_result)

[test loss, test accuracy]: [0.5517007112503052, 0.8906999826431274]
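
The second number is the fraction of test images whose highest-scoring class matches the true label. Because the final `Dense(10)` layer outputs raw logits, the predicted class is the index of the largest logit. The sketch below shows that computation in plain Python; the function name and the toy values are made up and use three classes instead of ten.

```python
def accuracy_from_logits(logits, labels):
    """Fraction of rows whose largest logit sits at the true label index."""
    predictions = [row.index(max(row)) for row in logits]
    correct = sum(int(pred == label) for pred, label in zip(predictions, labels))
    return correct / len(labels)


# Toy logits for four samples and three classes (illustrative values only).
toy_logits = [
    [2.0, 0.1, -1.0],  # predicts class 0
    [0.3, 1.5, 0.2],   # predicts class 1
    [0.0, 0.4, 0.9],   # predicts class 2
    [1.2, 0.8, 0.1],   # predicts class 0
]
toy_labels = [0, 1, 2, 1]

print(accuracy_from_logits(toy_logits, toy_labels))  # prints: 0.75
```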
