#### CSC 219 Machine Learning (Fall 2023)

#### Dr. Haiquan Chen, Dept of Computer Science

#### California State University, Sacramento


<a name='section0'></a>
# Lab 6:  Keras Tuner:  Automatic Hyperparameter Tuning

Several libraries have been developed for tuning the hyperparameters of neural networks. One of them is ***Keras Tuner***, which tunes Keras models.

Keras Tuner is similar in spirit to grid search and random search in scikit-learn: you define a search space for the hyperparameters, the tuner fits the model at points in that space, and it returns the best set of hyperparameters it found.

Keras Tuner is not part of the `Keras` package, so it must be installed and imported separately.

In [1]:
#!pip install -q -U keras-tuner

In [None]:
import tensorflow as tf
import keras_tuner as kt

### Load Fashion-MNIST Dataset

To demonstrate the use of Keras Tuner, we will work with the Fashion-MNIST dataset, which has the same format as MNIST: 28x28 grayscale images in 10 classes.

In [4]:
(img_train, label_train), (img_test, label_test) = tf.keras.datasets.fashion_mnist.load_data()


img_train = img_train.reshape(img_train.shape[0], 28, 28, 1)
img_test = img_test.reshape(img_test.shape[0], 28, 28, 1)


# Normalize pixel values between 0 and 1
img_train = img_train.astype('float32') / 255.0
img_test = img_test.astype('float32') / 255.0

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
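
The reshape and scaling steps in the loading cell can be checked without downloading anything. The following is a minimal sketch on a synthetic batch; `fake_images` is a made-up stand-in for the real image array, not data from the dataset.

```python
import numpy as np

# Synthetic stand-in for a batch of 8-bit grayscale images (not real data).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Add a trailing channel axis so each image has shape (28, 28, 1),
# which is the layout Conv2D expects for single-channel input.
fake_images = fake_images.reshape(fake_images.shape[0], 28, 28, 1)

# Scale from the integer range [0, 255] to floats in [0, 1].
fake_images = fake_images.astype("float32") / 255.0

print(fake_images.shape, fake_images.dtype)
print(fake_images.min() >= 0.0, fake_images.max() <= 1.0)
```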


In [10]:
num_classes = 10

# Convert integer class labels to a one-hot (binary class) matrix, for use with categorical_crossentropy.
label_train = tf.keras.utils.to_categorical(label_train, num_classes)
label_test = tf.keras.utils.to_categorical(label_test, num_classes)
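
To make the one-hot encoding concrete, here is a small NumPy sketch of the same idea. The helper `one_hot` is a hypothetical illustration, not the Keras implementation, and it assumes the labels are integers in the range 0 to `num_classes - 1`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[labels]

example = one_hot([0, 3, 9], num_classes=10)
print(example.shape)  # (3, 10)
print(example[1])     # a 1 at index 3, zeros elsewhere
```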

### Model Builder

The cell below defines a function called `model_builder`, which builds the model and declares the search space for two hyperparameters:

- Number of neurons in the first Dense layer,
- Learning rate. 

The line `hp_units = hp.Int('units', min_value=32, max_value=512, step=32)` defines the search space for the number of neurons in the Dense layer as the integers 32, 64, 96, ..., 512.

Next, `hp.Choice` restricts the learning rate to one of the values `[1e-2, 1e-3, 1e-4]`. These calls only declare the candidate values; which combinations are actually tried depends on the tuner selected later.
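
The size of this search space can be worked out in plain Python. This is only an enumeration of the candidate values described above, not a call into Keras Tuner.

```python
from itertools import product

# Candidate values for the 'units' hyperparameter: 32, 64, ..., 512.
unit_values = list(range(32, 512 + 1, 32))

# Candidate values for the 'learning_rate' hyperparameter.
learning_rates = [1e-2, 1e-3, 1e-4]

# Every (units, learning_rate) pair in the search space.
combinations = list(product(unit_values, learning_rates))

print(len(unit_values))   # 16
print(len(combinations))  # 48
```

A tuner limited to a handful of trials therefore visits only a small part of the 48 possible combinations.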

In [11]:
def model_builder(hp):
  model = tf.keras.Sequential()
  
  model.add(tf.keras.layers.Conv2D(64, kernel_size=(3, 3), strides=(1, 1),
                 activation='relu',
                 input_shape=(28, 28, 1)))
  
  model.add(tf.keras.layers.Flatten())

  # Tune the number of units in the first Dense layer
  # Choose an optimal value between 32-512
  hp_units = hp.Int('units', min_value=32, max_value=512, step=32)
  model.add(tf.keras.layers.Dense(units=hp_units, activation='relu'))
  model.add(tf.keras.layers.Dense(10, activation='softmax'))

  # Tune the learning rate for the optimizer
  # Choose an optimal value from 0.01, 0.001, or 0.0001
  hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])

  model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=hp_learning_rate),
                loss=tf.keras.losses.categorical_crossentropy,
                metrics=['accuracy'])

  return model

### Hyperparameter Tuning

Keras Tuner has four tuners available:

- RandomSearch Tuner (`RandomSearch`): similar to `RandomizedSearchCV` in scikit-learn, it samples hyperparameter combinations at random from the search space.
- Hyperband Tuner (`Hyperband`): trains a large number of models for a few epochs and carries forward only the top-performing fraction (roughly 1/`factor`, so about the top third with `factor=3`) to the next round, to converge to a high-performing model.
- BayesianOptimization Tuner (`BayesianOptimization`): fits a probabilistic model that maps hyperparameter values to the objective, and uses it to choose promising sets of hyperparameters to evaluate next.
- Sklearn Tuner (`SklearnTuner`): designed for use with scikit-learn models.
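
The elimination idea behind Hyperband can be illustrated with a toy successive-halving loop. This is a minimal sketch under stated assumptions: `toy_score` is a made-up stand-in for validation accuracy, the candidate list is random, and only a single elimination bracket is shown, whereas Hyperband runs several brackets with different budgets. It does not use Keras Tuner.

```python
import random

def toy_score(config, epochs):
    """Hypothetical stand-in for validation accuracy after `epochs` epochs."""
    # Peaks at units=256 and improves with more epochs; purely illustrative.
    closeness = 1.0 - abs(config["units"] - 256) / 512
    return closeness * (1.0 - 0.5 ** epochs)

def successive_halving(configs, min_epochs=1, factor=3):
    """Keep the top 1/factor of configs each round, growing the epoch budget."""
    epochs = min_epochs
    survivors = list(configs)
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: toy_score(c, epochs), reverse=True)
        keep = max(1, len(ranked) // factor)
        survivors = ranked[:keep]
        epochs *= factor  # survivors get a larger training budget next round
    return survivors[0]

random.seed(0)
candidates = [{"units": random.choice(range(32, 513, 32))} for _ in range(9)]
best = successive_halving(candidates)
print(best)
```

With 9 candidates and `factor=3`, the loop keeps 3 configurations after the first round and 1 after the second.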


In [12]:
#tuner = kt.Hyperband(model_builder,
                 #    objective='val_accuracy',
                 #    max_epochs=10,
                 #    factor=3)

# max_trials is the number of hyperparameter combinations that the tuner will test,
# while executions_per_trial is the number of models built and fit for each trial, for robustness.

tuner = kt.BayesianOptimization(
    model_builder,
    objective="val_accuracy",
    max_trials=2,
    executions_per_trial=1,
    directory="mnist_kt_test",
    overwrite=True,
)


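The search below uses only the first 1,000 training images and the first 1,000 test images so that it finishes quickly. On the ENTIRE dataset, each trial takes about 1 minute.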


In [None]:
tuner.search(
    img_train[0:1000],
    label_train[0:1000],
    epochs=2,
    validation_data=(img_test[0:1000], label_test[0:1000]),
    callbacks=[tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)],
)
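
Note that the search call combines `epochs=2` with an `EarlyStopping` patience of 5, so the callback has no chance to stop training early. The sketch below is a simplified model of patience-based stopping, written in plain Python; `epochs_run` is a hypothetical helper that ignores options such as `min_delta`, and the loss values are made up.

```python
def epochs_run(val_losses, patience):
    """Return how many epochs run before a patience-based stop (simplified)."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # Improvement: remember it and reset the counter.
            best = loss
            wait = 0
        else:
            # No improvement: stop once `patience` such epochs have accumulated.
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses)

print(epochs_run([0.80, 0.90], patience=5))             # 2
print(epochs_run([0.80] + [0.90] * 9, patience=5))      # 6
```

Under this model, a patience of 5 needs at least 6 epochs before it can trigger, so a smaller patience or a larger epoch count is needed for early stopping to have any effect.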

In [3]:
# Get the optimal hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]          # num_trials: Optional number of HyperParameters objects to return.

print(f"Optimal number of neurons in the Dense layer: {best_hps.get('units')}")
print(f"Optimal learning rate: {best_hps.get('learning_rate')}")


### Train and Evaluate the Model

Next, we will use the optimal hyperparameters from the Keras Tuner to create a model, and afterward we will evaluate the accuracy on the test dataset. 


In [59]:
# Build the model with the optimal hyperparameters and train it on the data for 2 epochs
model = tuner.hypermodel.build(best_hps)
model.fit(
    img_train[0:1000],
    label_train[0:1000],
    epochs=2,
    validation_data=(img_test[0:1000], label_test[0:1000]),
    verbose=2,
    callbacks=[tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)],
)

Epoch 1/2
32/32 - 4s - loss: 3.0903 - accuracy: 0.6050 - val_loss: 0.8005 - val_accuracy: 0.7130
Epoch 2/2
32/32 - 3s - loss: 0.4335 - accuracy: 0.8470 - val_loss: 0.6588 - val_accuracy: 0.7920


<tensorflow.python.keras.callbacks.History at 0x1789343a820>

In [61]:
eval_result = model.evaluate(img_test[0:1000], label_test[0:1000])
print("[val loss, val accuracy]:", eval_result)

[val loss, val accuracy]: [0.6587802767753601, 0.7919999957084656]
