<a href="https://colab.research.google.com/github/hsabaghpour/Searching-Indexing-Algorithm/blob/main/Fine_Tuning_Neural_Network_Hyperparameters.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
%pip install -q -U keras-tuner

The Keras Tuner library (keras_tuner, usually imported as kt) lets you search over hyperparameters. You write a function that builds, compiles, and returns a Keras model. The function takes a kt.HyperParameters object as its argument and uses it to define hyperparameters such as the number of hidden layers, the number of neurons per layer, the learning rate, and the optimizer type, along with their ranges of possible values. The function below, for example, builds an MLP for classifying Fashion MNIST images.






In [None]:
import tensorflow as tf

import keras_tuner as kt



def build_model(hp):
    n_hidden = hp.Int("n_hidden", min_value=0, max_value=8, default=2)
    n_neurons = hp.Int("n_neurons", min_value=16, max_value=256)
    learning_rate = hp.Float("learning_rate", min_value=1e-4, max_value=1e-2,
                             sampling="log")
    optimizer = hp.Choice("optimizer", values=["sgd", "adam"])
    if optimizer == "sgd":
        optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate)
    else:
        optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Flatten())
    for _ in range(n_hidden):
        model.add(tf.keras.layers.Dense(n_neurons, activation="relu"))
    model.add(tf.keras.layers.Dense(10, activation="softmax"))
    model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer,
                  metrics=["accuracy"])
    return model




If you're interested in performing a simple random search, you can set up a random search tuner with kt.RandomSearch. You provide the build_model function to the tuner's constructor and then invoke the search() method.
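
Conceptually, a random search samples a configuration, trains and scores a model with it, and keeps the best result. The following is a minimal pure-Python sketch of that loop, not Keras Tuner's implementation. The helpers sample_config and toy_score are hypothetical, and toy_score is a made-up stand-in for training a model and reading its validation accuracy.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration from the same ranges build_model() declares."""
    return {
        "n_hidden": rng.randint(0, 8),
        "n_neurons": rng.randint(16, 256),
        # Log-uniform draw between 1e-4 and 1e-2.
        "learning_rate": 10 ** rng.uniform(-4, -2),
        "optimizer": rng.choice(["sgd", "adam"]),
    }


def toy_score(config):
    """Hypothetical stand-in for 'train the model, return val_accuracy'."""
    # Peaks when learning_rate is 1e-3 and n_hidden is 3; purely illustrative.
    lr_penalty = abs(math.log10(config["learning_rate"]) + 3)
    depth_penalty = abs(config["n_hidden"] - 3) / 8
    return 1.0 - 0.1 * lr_penalty - 0.1 * depth_penalty


def random_search(max_trials, seed=42):
    rng = random.Random(seed)
    trials = []
    for _ in range(max_trials):
        config = sample_config(rng)
        trials.append((toy_score(config), config))
    # Higher score is better, mirroring objective="val_accuracy".
    trials.sort(key=lambda t: t[0], reverse=True)
    return trials


trials = random_search(max_trials=5)
best_score, best_config = trials[0]
print(best_score, best_config)
```

Each trial is independent of the others, which is why a random search is easy to resume or to extend with more trials later.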






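The first part of the function defines the hyperparameters. For instance, hp.Int("n_hidden", min_value=0, max_value=8, default=2) checks whether a hyperparameter named "n_hidden" already exists in hp and, if so, returns its value; if not, it registers a new integer hyperparameter ranging from 0 to 8 and returns its default value, 2. "n_neurons" is handled the same way, except that no default is given, so it defaults to min_value (16). "learning_rate" is registered as a floating-point number between 10^-4 and 10^-2, and sampling="log" means that learning rates of all scales in that range are sampled equally. The optimizer is either "sgd" or "adam"; with no explicit default, it defaults to the first value, "sgd". Depending on that choice, the function creates an SGD or Adam optimizer with the given learning rate.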

In summary, this function manages hyperparameters for a model, providing default values and allowable ranges for each hyperparameter. It's a way to configure and experiment with different model settings.
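
The sampling="log" argument on the learning rate matters because the range spans two orders of magnitude. The sketch below compares linear and log-uniform sampling in plain Python to show the difference. It illustrates the general idea only; the exact sampling procedure inside Keras Tuner is not reproduced here, and the helper frac_below is hypothetical.

```python
import random

rng = random.Random(0)
low, high = 1e-4, 1e-2
n = 10_000

# Linear sampling: uniform in the value itself.
linear = [rng.uniform(low, high) for _ in range(n)]

# Log sampling: uniform in the exponent, then mapped back to a value.
log_samples = [10 ** rng.uniform(-4, -2) for _ in range(n)]


def frac_below(samples, threshold=1e-3):
    """Fraction of samples that fall below the threshold."""
    return sum(s < threshold for s in samples) / len(samples)


print(f"linear: {frac_below(linear):.3f} of samples below 1e-3")
print(f"log:    {frac_below(log_samples):.3f} of samples below 1e-3")
```

With linear sampling only about 9% of the draws land in the lower decade (10^-4 to 10^-3), whereas log sampling puts roughly half of them there, so small learning rates are not neglected.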


The second part of the function constructs the model based on the hyperparameter values. It starts with a Sequential model, adds a Flatten layer, followed by the specified number of hidden layers with ReLU activation, and an output layer with 10 neurons (one for each class) using the softmax activation. The model is then compiled and returned.



In [5]:
fashion_mnist = tf.keras.datasets.fashion_mnist.load_data()
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist
X_train, y_train = X_train_full[:-5000], y_train_full[:-5000]
X_valid, y_valid = X_train_full[-5000:], y_train_full[-5000:]



random_search_tuner = kt.RandomSearch(
    build_model, objective="val_accuracy", max_trials=5, overwrite=True,
    directory="my_fashion_mnist", project_name="my_rnd_search", seed=42)
random_search_tuner.search(X_train, y_train, epochs=10,
                           validation_data=(X_valid, y_valid))


Trial 5 Complete [00h 01m 12s]
val_accuracy: 0.8353999853134155

Best val_accuracy So Far: 0.8628000020980835
Total elapsed time: 00h 08m 02s


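Because overwrite=True, any earlier results in the my_rnd_search directory are discarded before the search starts. If you rerun this code with overwrite=False and max_trials=10, the tuner picks up where it left off and runs 5 more trials: this means you don’t have to run all the trials in one shot. Lastly, since objective is set to "val_accuracy", the tuner prefers models with a higher validation accuracy, so once the tuner has finished searching, you can get the best models like this: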

In [6]:
top3_models = random_search_tuner.get_best_models(num_models=3)
best_model = top3_models[0]


In [7]:
# You can also call get_best_hyperparameters() to get the kt.HyperParameters of the best models:


top3_params = random_search_tuner.get_best_hyperparameters(num_trials=3)
top3_params[0].values  # best hyperparameter values


{'n_hidden': 7,
 'n_neurons': 100,
 'learning_rate': 0.0012482904754698163,
 'optimizer': 'sgd'}

In [10]:
best_trial = random_search_tuner.oracle.get_best_trials(num_trials=1)[0]
best_trial.summary()


Trial 1 summary
Hyperparameters:
n_hidden: 7
n_neurons: 100
learning_rate: 0.0012482904754698163
optimizer: sgd
Score: 0.8628000020980835


In [11]:
# You can also access all the metrics directly:



best_trial.metrics.get_last_value("val_accuracy")


0.8628000020980835

In [12]:
# If you are happy with the best model’s performance, you can continue training it for a few epochs on the full training set (X_train_full and y_train_full), then evaluate it on the test set and deploy it to production.


best_model.fit(X_train_full, y_train_full, epochs=10)
test_loss, test_accuracy = best_model.evaluate(X_test, y_test)


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In some situations, you may want to tune data preprocessing or model.fit() arguments, such as the batch size. To do this, you subclass kt.HyperModel instead of writing a build_model() function, and define two methods, build() and fit().

The build() method does the same job as the build_model() function: it specifies the model architecture using hyperparameters.

The fit() method takes a HyperParameters object, a compiled model, the training data, and any model.fit() arguments. It fits the model and returns the History object. Because it receives the hyperparameters, it can use them to decide how to preprocess the data, which batch size to use, and more.

For instance, the class below builds the same model as before, with the same hyperparameters, and adds a Boolean "normalize" hyperparameter that controls whether the training data is standardized before fitting the model.

In summary, when subclassing the kt.HyperModel class, you can customize how your model is built and trained, including data preprocessing, based on hyperparameter settings.




In [13]:
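class MyClassificationHyperModel(kt.HyperModel):
    def build(self, hp):
        return build_model(hp)

    def fit(self, hp, model, X, y, **kwargs):
        if hp.Boolean("normalize"):
            norm_layer = tf.keras.layers.Normalization()
            # Learn the mean and variance from the training inputs; without
            # adapt(), the layer has no statistics to standardize with.
            norm_layer.adapt(X)
            X = norm_layer(X)
            # Apply the same transform to the validation inputs, assuming
            # validation_data is an (X_valid, y_valid) tuple.
            if "validation_data" in kwargs:
                X_val, y_val = kwargs["validation_data"]
                kwargs["validation_data"] = (norm_layer(X_val), y_val)
        return model.fit(X, y, **kwargs)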
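
Standardization, the preprocessing step that the "normalize" hyperparameter toggles, subtracts the mean and divides by the standard deviation. The statistics are normally estimated on the training inputs only and then reused for validation inputs. Here is a minimal pure-Python sketch of that idea; fit_standardizer is a hypothetical helper, not a Keras or Keras Tuner function, and the pixel values are made up.

```python
import statistics


def fit_standardizer(train_values):
    """Return a function that standardizes with the training mean and std."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values)
    std = std if std > 0 else 1.0  # guard against constant features

    def transform(values):
        return [(v - mean) / std for v in values]

    return transform


train_pixels = [0, 51, 102, 153, 204, 255]
valid_pixels = [25, 230]

standardize = fit_standardizer(train_pixels)
train_scaled = standardize(train_pixels)
valid_scaled = standardize(valid_pixels)  # reuses the training statistics
print(train_scaled)
print(valid_scaled)
```

After the transform, the training values have zero mean and unit standard deviation, while the validation values are merely expressed on the same scale.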


In [15]:
# You can then pass an instance of this class to the tuner of your choice, instead of passing the build_model function.
# For example, let’s build a kt.Hyperband tuner based on a MyClassificationHyperModel instance:



hyperband_tuner = kt.Hyperband(
    MyClassificationHyperModel(), objective="val_accuracy", seed=42,
    max_epochs=10, factor=3, hyperband_iterations=2,
    overwrite=True, directory="my_fashion_mnist", project_name="hyperband")
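
Hyperband builds on successive halving: start many configurations with a small epoch budget, keep roughly the best 1/factor of them, and give the survivors more epochs. The sketch below illustrates only that core idea in pure Python, under simplifying assumptions. The function successive_halving, the configs, and noisy_score are all hypothetical, and Keras Tuner's actual bracket schedule differs in its details.

```python
import random


def successive_halving(configs, score_fn, min_epochs=1, max_epochs=10, factor=3):
    """Repeatedly keep the top 1/factor of configs while growing the budget."""
    survivors = list(configs)
    epochs = min_epochs
    history = []  # (epochs used in the round, number of configs evaluated)
    while True:
        scored = sorted(survivors, key=lambda c: score_fn(c, epochs), reverse=True)
        history.append((epochs, len(scored)))
        if len(scored) == 1 or epochs >= max_epochs:
            return scored[0], history
        keep = max(1, len(scored) // factor)
        survivors = scored[:keep]
        epochs = min(max_epochs, epochs * factor)


rng = random.Random(42)
# Hypothetical configs: each has a hidden "quality" the search tries to uncover.
configs = [{"id": i, "quality": rng.random()} for i in range(9)]


def noisy_score(config, epochs):
    # Made-up score: more epochs give a less noisy estimate of quality.
    noise = random.Random(config["id"] * 1000 + epochs).uniform(-0.3, 0.3)
    return config["quality"] + noise / epochs


best, history = successive_halving(configs, noisy_score)
print(history)
print(best)
```

With nine configurations and factor=3, the rounds use 1, 3, and 9 epochs on 9, 3, and 1 configurations respectively, so most of the training budget goes to the candidates that looked best early on.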


Now, we will execute the Hyperband tuner. We'll also utilize the TensorBoard callback, specifying the root log directory (the tuner manages unique subdirectories for each trial). Additionally, we include an EarlyStopping callback.
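
As a rough illustration of what patience=2 means for the EarlyStopping callback, the sketch below stops once the monitored validation loss has failed to improve for two consecutive epochs. It is a simplified pure-Python model of the rule, not the Keras implementation (which has further options such as min_delta and restore_best_weights); early_stopping_epoch and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


val_losses = [0.60, 0.48, 0.45, 0.46, 0.47, 0.44]
print(early_stopping_epoch(val_losses))  # prints 5
```

Here the loss last improves at epoch 3, so training stops after epoch 5 and never reaches the better value at epoch 6; a larger patience trades extra training time for a lower risk of stopping too early.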






In [None]:
#%load_ext tensorboard
#%tensorboard --logdir=./my_logs

from pathlib import Path
from time import strftime

def get_run_logdir(root_logdir="my_logs"):
    return Path(root_logdir) / strftime("run_%Y_%m_%d_%H_%M_%S")

run_logdir = get_run_logdir()

root_logdir = Path(hyperband_tuner.project_dir) / "tensorboard"
tensorboard_cb = tf.keras.callbacks.TensorBoard(root_logdir)
early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience=2)
hyperband_tuner.search(X_train, y_train, epochs=10,
                       validation_data=(X_valid, y_valid),
                       callbacks=[early_stopping_cb, tensorboard_cb])


Trial 54 Complete [00h 00m 47s]
val_accuracy: 0.8479999899864197

Best val_accuracy So Far: 0.8705999851226807
Total elapsed time: 00h 38m 05s

Search: Running Trial #55

Value             |Best Value So Far |Hyperparameter
7                 |7                 |n_hidden
247               |100               |n_neurons
0.00039877        |0.00044489        |learning_rate
adam              |adam              |optimizer
True              |False             |normalize
10                |10                |tuner/epochs
4                 |4                 |tuner/initial_epoch
1                 |2                 |tuner/bracket
1                 |2                 |tuner/round
0049              |0042              |tuner/trial_id


Hyperband uses resources more efficiently than pure random search, but at its core it still explores the hyperparameter space randomly, so it is fast but coarse. Keras Tuner also offers a kt.BayesianOptimization tuner, which gradually learns which regions of the hyperparameter space are most promising by fitting a probabilistic model called a Gaussian process, and progressively narrows the search toward the best hyperparameters. Note that this tuner has its own hyperparameters: alpha is the level of noise you expect in the performance measurements across trials (default 10^-4), and beta controls how much the algorithm explores rather than simply exploiting known good regions (default 2.6). Besides that, you can use this tuner in the same way as the previous ones.
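
To make the role of beta more concrete, the sketch below assumes an upper-confidence-bound style rule, one common way for Bayesian optimization to trade exploration against exploitation: each candidate is scored by its predicted value plus beta times the model's uncertainty. This is an illustration under that assumption, not Keras Tuner's internal code; the function upper_confidence_bound and the candidate predictions are hypothetical.

```python
def upper_confidence_bound(mean, std, beta=2.6):
    """Score a candidate by predicted value plus an exploration bonus."""
    return mean + beta * std


# Hypothetical surrogate predictions of val_accuracy for three candidates:
# (predicted mean, predicted standard deviation).
candidates = {
    "well_explored": (0.86, 0.005),
    "promising_but_uncertain": (0.84, 0.02),
    "poor": (0.70, 0.01),
}

scores = {name: upper_confidence_bound(m, s) for name, (m, s) in candidates.items()}
next_trial = max(scores, key=scores.get)
print(scores)
print(next_trial)  # prints promising_but_uncertain
```

With beta=2.6 the uncertain candidate wins because its exploration bonus outweighs its slightly lower predicted accuracy; with beta=0 the rule would pick the candidate with the highest predicted mean instead.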






In [31]:
bayesian_opt_tuner = kt.BayesianOptimization(
    MyClassificationHyperModel(), objective="val_accuracy", seed=42,
    max_trials=10, alpha=1e-4, beta=2.6,
    overwrite=True, directory="my_fashion_mnist", project_name="bayesian_opt")
# Pass the same training arguments as in the earlier searches. Calling
# search([...]) with the literal placeholder fails with:
# TypeError: MyClassificationHyperModel.fit() missing 1 required positional argument: 'y'
bayesian_opt_tuner.search(X_train, y_train, epochs=10,
                          validation_data=(X_valid, y_valid),
                          callbacks=[early_stopping_cb])