# **DNN - Hyperparameter Tuning - KerasTuner**

- Hyperparameter tuning is a crucial step in building effective Deep Neural Networks.

- It involves finding the best set of hyperparameters (parameters that are not learned from the data but are set before the training process begins) for your model and dataset.

- Examples of hyperparameters include the learning rate, the number of layers, the number of neurons in each layer, the type and strength of regularization, and the batch size.

- Manually searching for the best combination of hyperparameters can be very time-consuming. Libraries like KerasTuner are designed to automate this search process.

In [6]:
! pip install keras-tuner

Collecting keras-tuner
  Downloading keras_tuner-1.4.7-py3-none-any.whl.metadata (5.4 kB)
Collecting kt-legacy (from keras-tuner)
  Downloading kt_legacy-1.0.5-py3-none-any.whl.metadata (221 bytes)
Downloading keras_tuner-1.4.7-py3-none-any.whl (129 kB)
Downloading kt_legacy-1.0.5-py3-none-any.whl (9.6 kB)
Installing collected packages: kt-legacy, keras-tuner
Successfully installed keras-tuner-1.4.7 kt-legacy-1.0.5


In [7]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical
import keras_tuner as kt # Import KerasTuner
import numpy as np


In [2]:
# Load and preprocess the MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Flatten each 28x28 image into a 784-element vector and scale pixel values from 0-255 to 0-1
x_train = x_train.reshape(-1, 28 * 28).astype('float32') / 255.0
x_test = x_test.reshape(-1, 28 * 28).astype('float32') / 255.0

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step


In [3]:
# Convert labels to one-hot encoding
num_classes = 10
y_train_one_hot = to_categorical(y_train, num_classes)
y_test_one_hot = to_categorical(y_test, num_classes)

In [9]:
print(f"Training data shape: {x_train.shape}")
print(f"Training labels shape: {y_train_one_hot.shape}")
print(f"Testing data shape: {x_test.shape}")
print(f"Testing labels shape: {y_test_one_hot.shape}")


Training data shape: (60000, 784)
Training labels shape: (60000, 10)
Testing data shape: (10000, 784)
Testing labels shape: (10000, 10)
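
To make the one-hot step concrete, here is a minimal pure-Python sketch of the encoding that `to_categorical` produces for a single integer label. The helper name `one_hot` is hypothetical and only illustrates the output format; it is not how Keras implements the conversion.

```python
def one_hot(label, num_classes):
    """Return a list with 1.0 at index `label` and 0.0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside [0, {num_classes})")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


# The digit 3 becomes a 10-element vector with a single 1.0 at position 3.
encoded = one_hot(3, 10)
print(encoded)
```

This is why the label arrays above have shape `(n_samples, 10)`: each label is expanded to one entry per class, which matches the 10-unit softmax output and the `categorical_crossentropy` loss.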


In [10]:
# 1. Define a function that builds the model with tunable hyperparameters
def build_model(hp):
    model = Sequential([
        keras.Input(shape=(28 * 28,)),  # inputs are already flattened to 784 features
    ])

    # Tune the number of neurons in the first Dense layer
    # Choose an optimal value between 32 and 128
    hp_units1 = hp.Int('units1', min_value=32, max_value=128, step=32)
    model.add(Dense(units=hp_units1, activation='relu'))

    # Tune the number of neurons in the second Dense layer
    # Choose an optimal value between 32 and 64
    hp_units2 = hp.Int('units2', min_value=32, max_value=64, step=16)
    model.add(Dense(units=hp_units2, activation='relu'))

    # Output layer
    model.add(Dense(num_classes, activation='softmax'))

    # Tune the learning rate for the optimizer
    # Choose an optimal value from a list of options
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])

    model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    return model


In [11]:
# 2. Instantiate a Tuner (e.g., RandomSearch)
# RandomSearch randomly tries different combinations of hyperparameters.
tuner = kt.RandomSearch(
    hypermodel=build_model, # The model-building function
    objective='val_accuracy', # The metric to optimize (maximize validation accuracy)
    max_trials=10, # The total number of hyperparameter combinations to try
    executions_per_trial=2, # The number of models to train for each combination
    overwrite=True, # Overwrite previous results
    directory='my_mnist_kt_dir', # Directory to store results
    project_name='mnist_hyperparameter_tuning' # Project name
)

# Display the search space
tuner.search_space_summary()


Search space summary
Default search space size: 3
units1 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 128, 'step': 32, 'sampling': 'linear'}
units2 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 64, 'step': 16, 'sampling': 'linear'}
learning_rate (Choice)
{'default': 0.01, 'conditions': [], 'values': [0.01, 0.001, 0.0001], 'ordered': True}
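
As a rough illustration of how large this search space is, the sketch below enumerates the discrete values from the summary in plain Python. It does not use KerasTuner, and the variable names are illustrative only.

```python
from itertools import product

# Discrete values implied by the search space summary above.
units1_values = list(range(32, 128 + 1, 32))  # 32, 64, 96, 128
units2_values = list(range(32, 64 + 1, 16))   # 32, 48, 64
learning_rates = [1e-2, 1e-3, 1e-4]

# Every possible combination of the three hyperparameters.
grid = list(product(units1_values, units2_values, learning_rates))

max_trials = 10
executions_per_trial = 2

print(f"Distinct combinations: {len(grid)}")
print(f"Combinations tried by the tuner: at most {max_trials}")
print(f"Models trained during the search: {max_trials * executions_per_trial}")
```

With 4 x 3 x 3 = 36 combinations and `max_trials=10`, the random search covers at most about 28% of the grid, and each trial trains two models because `executions_per_trial=2`.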




In [None]:
# 3. Run the hyperparameter search
print("\nRunning hyperparameter search...")
# The search process trains models with different hyperparameters
# and evaluates them on the validation data (using x_test, y_test here for simplicity).
tuner.search(x_train, y_train_one_hot, epochs=10, validation_data=(x_test, y_test_one_hot))



Trial 9 Complete [00h 03m 02s]
val_accuracy: 0.9492999911308289

Best val_accuracy So Far: 0.9773000180721283
Total elapsed time: 00h 29m 08s

Search: Running Trial #10

Value             |Best Value So Far |Hyperparameter
64                |96                |units1
64                |48                |units2
0.01              |0.001             |learning_rate

Epoch 1/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 8s 4ms/step - accuracy: 0.8830 - loss: 0.3836 - val_accuracy: 0.9395 - val_loss: 0.2131
Epoch 2/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 8s 4ms/step - accuracy: 0.9469 - loss: 0.1857 - val_accuracy: 0.9552 - val_loss: 0.1651
Epoch 3/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 7s 4ms/step - accuracy: 0.9562 - loss: 0.1585 - val_accuracy: 0.9578 - val_loss: 0.1605
Epoch 4/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 10s 4ms/step - accuracy: 0.9617 - loss: 0.1441 - val_accuracy: 0.9

In [17]:
# 4. Get the best hyperparameters and the best model
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0] # Get the top 1 best hyperparameters
best_model = tuner.get_best_models(num_models=1)[0] # Get the top 1 best model

print(f"\nBest hyperparameters found:")
print(f"Number of units in first dense layer: {best_hps.get('units1')}")
print(f"Number of units in second dense layer: {best_hps.get('units2')}")
print(f"Learning rate: {best_hps.get('learning_rate')}")



Best hyperparameters found:
Number of units in first dense layer: 96
Number of units in second dense layer: 48
Learning rate: 0.001


In [18]:
# 5. Evaluate the best model on the test set
print("\nEvaluating the best model found by the tuner:")
loss, accuracy = best_model.evaluate(x_test, y_test_one_hot, verbose=0)

print(f"Test Loss of best model: {loss:.4f}")
print(f"Test Accuracy of best model: {accuracy:.4f}")



Evaluating the best model found by the tuner:
Test Loss of best model: 0.0881
Test Accuracy of best model: 0.9786
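
One way to keep the test set out of the search is to hold out part of the training data for validation. The following is a minimal sketch using only the standard library; `split_indices` is a hypothetical helper, and the 80/20 ratio and the seed are arbitrary assumptions.

```python
import random


def split_indices(n_samples, val_fraction=0.2, seed=42):
    """Shuffle the indices 0..n_samples-1 and split them into train/validation lists."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_val = int(n_samples * val_fraction)
    return indices[n_val:], indices[:n_val]


# MNIST has 60,000 training images.
train_idx, val_idx = split_indices(60000)
print(len(train_idx), len(val_idx))
```

The index lists can then select rows from the NumPy arrays, for example `x_train[val_idx]` and `y_train_one_hot[val_idx]`, and that subset can be passed as `validation_data` to `tuner.search()` in place of the test set.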


In [19]:
# Optional: Continue training the best model for more epochs
# (validation_split=0.2 holds out the last 20% of the training data for validation)

print("\nTraining the best model for more epochs:")
history_best_model = best_model.fit(x_train, y_train_one_hot, epochs=50, batch_size=32, validation_split=0.2)



Training the best model for more epochs:
Epoch 1/50
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.9934 - loss: 0.0207 - val_accuracy: 0.9914 - val_loss: 0.0258
Epoch 2/50
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 11s 5ms/step - accuracy: 0.9942 - loss: 0.0163 - val_accuracy: 0.9915 - val_loss: 0.0242
Epoch 3/50
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 9s 4ms/step - accuracy: 0.9959 - loss: 0.0130 - val_accuracy: 0.9898 - val_loss: 0.0320
Epoch 4/50
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.9958 - loss: 0.0123 - val_accuracy: 0.9889 - val_loss: 0.0350
Epoch 5/50
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 10s 5ms/step - accuracy: 0.9967 - loss: 0.0096 - val_accuracy: 0.9877 - val_loss: 0.0375
Epoch 6/50
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 6s 4ms/step - accuracy: 0.9959 - loss: 0.0115 - val_accuracy






**Explanation of the Code:**

1.  **Import KerasTuner:** We import the `keras_tuner` library as `kt`. You'll need to install it (`pip install keras-tuner`).
2.  **`build_model(hp)` function:** This function is crucial for KerasTuner. It takes a `HyperParameters` object (`hp`) as input. Inside this function, you define your model architecture, and for each hyperparameter that you want to tune, you use a method provided by the `hp` object (e.g., `hp.Int()` for integer values within a range, `hp.Choice()` for selecting from a list of values).
3.  **Instantiate a Tuner:** We create an instance of a tuner. `kt.RandomSearch` is a simple choice that randomly samples hyperparameter combinations. Other tuners like `kt.Hyperband` are also available and can be more efficient.
    * `hypermodel=build_model`: We pass our model-building function.
    * `objective='val_accuracy'`: We tell the tuner to maximize the validation accuracy.
    * `max_trials`: The total number of different hyperparameter combinations to try.
    * `executions_per_trial`: How many times to train a model with the same hyperparameter combination to account for variability.
4.  **Run the Search:** The `tuner.search()` method starts the hyperparameter tuning process. It takes your training data and labels, the number of epochs to train each candidate model for, and validation data. KerasTuner will call your `build_model` function with different `hp` values, train the resulting models, and evaluate them on the validation data.
5.  **Get Best Hyperparameters and Model:** After the search is complete, `tuner.get_best_hyperparameters()` and `tuner.get_best_models()` allow you to retrieve the best performing hyperparameter combination and the corresponding model.
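6.  **Evaluate Best Model:** You can then evaluate the best model on your test set. Note that this notebook passes the test set as `validation_data` during the search, so the resulting score is optimistically biased; for an unbiased estimate, tune against a separate validation split and reserve the test set for this final evaluation.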
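
The steps above can be sketched as a small random-search loop in plain Python. This is a conceptual illustration only, not KerasTuner's implementation: `random_search` and `toy_objective` are hypothetical names, and the toy objective is a made-up stand-in for "train a model and return its validation accuracy".

```python
import random


def random_search(objective, space, max_trials, seed=0):
    """Sample `max_trials` random configurations and return the best one.

    `space` maps each hyperparameter name to a list of candidate values;
    `objective` returns a score where higher is better.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(max_trials):
        # Draw one value per hyperparameter, independently and uniformly.
        config = {name: rng.choice(values) for name, values in space.items()}
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Same discrete space as build_model above.
space = {
    "units1": [32, 64, 96, 128],
    "units2": [32, 48, 64],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}


def toy_objective(config):
    """Made-up score that peaks at units1=96, units2=48 (no training involved)."""
    return -abs(config["units1"] - 96) - abs(config["units2"] - 48)


best_config, best_score = random_search(toy_objective, space, max_trials=10)
print(best_config, best_score)
```

In the real workflow, the objective is replaced by building the model from the sampled values, fitting it, and reading `val_accuracy`, which is what `tuner.search()` automates.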



In [None]:

from tensorflow.keras.layers import Dropout
from tensorflow.keras.regularizers import l2 # Import L2 for potential regularization tuning


In [20]:
# Define a function that builds the model with more tunable hyperparameters
def build_model_extended(hp):
    model = Sequential([
        keras.Input(shape=(28 * 28,)),  # inputs are already flattened to 784 features
    ])

    # Tune the number of hidden layers
    hp_num_layers = hp.Int('num_layers', min_value=1, max_value=3, step=1)

    for i in range(hp_num_layers):
        # Tune the number of neurons in each hidden layer
        hp_units = hp.Int(f'units_{i}', min_value=32, max_value=128, step=32)
        model.add(Dense(units=hp_units, activation='relu'))

        # Optionally add dropout after each hidden layer
        # Tune the dropout rate
        if hp.Boolean(f'dropout_{i}'):  # Decide whether to add dropout for this layer
            hp_dropout_rate = hp.Float(f'dropout_rate_{i}', min_value=0.1, max_value=0.5, step=0.1)
            model.add(Dropout(rate=hp_dropout_rate))

        # Optional: to also tune L2 kernel regularization, replace the Dense layer above with:
        # hp_l2 = hp.Float(f'l2_{i}', min_value=1e-4, max_value=1e-2, sampling='log')
        # model.add(Dense(units=hp_units, activation='relu', kernel_regularizer=l2(hp_l2)))


    # Output layer
    model.add(Dense(num_classes, activation='softmax'))

    # Tune the optimizer
    hp_optimizer = hp.Choice('optimizer', values=['adam', 'sgd', 'rmsprop'])

    # Tune the learning rate (can be made conditional on the optimizer if needed)
    hp_learning_rate = hp.Float('learning_rate', min_value=1e-4, max_value=1e-2, sampling='log')


    if hp_optimizer == 'adam':
        optimizer = keras.optimizers.Adam(learning_rate=hp_learning_rate)
    elif hp_optimizer == 'sgd':
        optimizer = keras.optimizers.SGD(learning_rate=hp_learning_rate)
    else: # rmsprop
        optimizer = keras.optimizers.RMSprop(learning_rate=hp_learning_rate)


    model.compile(optimizer=optimizer,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    return model
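
The learning rate above uses `sampling='log'`, which spreads samples evenly across orders of magnitude rather than across raw values. The snippet below is a minimal sketch of that idea in plain Python, not KerasTuner's own sampling code; `sample_log_uniform` is a hypothetical helper.

```python
import math
import random


def sample_log_uniform(min_value, max_value, rng):
    """Draw a value whose logarithm is uniform on [log(min_value), log(max_value)]."""
    log_min, log_max = math.log(min_value), math.log(max_value)
    return math.exp(rng.uniform(log_min, log_max))


rng = random.Random(0)
samples = [sample_log_uniform(1e-4, 1e-2, rng) for _ in range(1000)]

# With log sampling, each decade (1e-4..1e-3 and 1e-3..1e-2) gets a similar share.
below_1e3 = sum(s < 1e-3 for s in samples) / len(samples)
print(f"Fraction of samples below 1e-3: {below_1e3:.2f}")
```

With linear sampling over the same range, only about 9% of the draws would fall below 1e-3, so small learning rates would rarely be tried.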


In [21]:
# Instantiate a Tuner (e.g., Hyperband for potentially better efficiency)
# Hyperband is often more efficient than RandomSearch for larger search spaces.
tuner_extended = kt.Hyperband(
    hypermodel=build_model_extended, # The extended model-building function
    objective='val_accuracy',       # The metric to optimize
    max_epochs=10,                  # Maximum number of epochs to train a model
    factor=3,                       # Factor by which to reduce the number of models and epochs
    hyperband_iterations=2,         # Number of iterations of Hyperband
    overwrite=True,                 # Overwrite previous results
    directory='my_mnist_kt_extended_dir', # Directory to store results
    project_name='mnist_hyperparameter_tuning_extended' # Project name
)

# Display the extended search space
tuner_extended.search_space_summary()


Search space summary
Default search space size: 5
num_layers (Int)
{'default': None, 'conditions': [], 'min_value': 1, 'max_value': 3, 'step': 1, 'sampling': 'linear'}
units_0 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 128, 'step': 32, 'sampling': 'linear'}
dropout_0 (Boolean)
{'default': False, 'conditions': []}
optimizer (Choice)
{'default': 'adam', 'conditions': [], 'values': ['adam', 'sgd', 'rmsprop'], 'ordered': False}
learning_rate (Float)
{'default': 0.0001, 'conditions': [], 'min_value': 0.0001, 'max_value': 0.01, 'step': None, 'sampling': 'log'}
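
Hyperband builds on successive halving: many configurations are trained briefly, only the best fraction survives, and the survivors are trained for longer. The sketch below shows one simplified bracket in plain Python. KerasTuner's actual bracket sizes and epoch counts are computed differently, and `n_configs=9` and `min_epochs=2` are illustrative assumptions rather than values taken from the tuner.

```python
def successive_halving(n_configs, min_epochs, max_epochs, factor=3):
    """Return (num_configs, epochs) for each round of one simplified bracket.

    Each round keeps roughly the top 1/factor of the configurations and
    trains the survivors `factor` times longer, capped at `max_epochs`.
    """
    schedule = []
    configs, epochs = n_configs, min_epochs
    while configs >= 1:
        schedule.append((configs, min(epochs, max_epochs)))
        if epochs >= max_epochs:
            break  # survivors have reached the full training budget
        configs //= factor
        epochs *= factor
    return schedule


schedule = successive_halving(n_configs=9, min_epochs=2, max_epochs=10, factor=3)
for n, e in schedule:
    print(f"{n} configuration(s) trained for up to {e} epoch(s)")
```

Because most configurations are discarded after only a few epochs, the total training cost is much lower than training every configuration for `max_epochs`.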


In [23]:
# Run the hyperparameter search
print("\nRunning extended hyperparameter search...")
tuner_extended.search(x_train, y_train_one_hot, epochs=50, validation_data=(x_test, y_test_one_hot))
# Note: Hyperband chooses the number of epochs for each trial itself (at most max_epochs=10),
# so the 'epochs' value passed here is overridden during the search.

# Get the best hyperparameters and the best model
best_hps_extended = tuner_extended.get_best_hyperparameters(num_trials=1)[0]
best_model_extended = tuner_extended.get_best_models(num_models=1)[0]

print(f"\nBest hyperparameters found:")
print(f"Number of hidden layers: {best_hps_extended.get('num_layers')}")
for i in range(best_hps_extended.get('num_layers')):
    print(f"Units in layer {i}: {best_hps_extended.get(f'units_{i}')}")
    if best_hps_extended.get(f'dropout_{i}'):
        print(f"Dropout rate after layer {i}: {best_hps_extended.get(f'dropout_rate_{i}')}")

print(f"Optimizer: {best_hps_extended.get('optimizer')}")
print(f"Learning rate: {best_hps_extended.get('learning_rate')}")




Trial 60 Complete [00h 01m 40s]
val_accuracy: 0.9757999777793884

Best val_accuracy So Far: 0.9807999730110168
Total elapsed time: 00h 44m 03s

Best hyperparameters found:
Number of hidden layers: 3
Units in layer 0: 128
Dropout rate after layer 0: 0.1
Units in layer 1: 96
Units in layer 2: 32
Optimizer: adam
Learning rate: 0.0002677699795927549


In [24]:

# Evaluate the best model on the test set
print("\nEvaluating the best model found by the tuner:")
loss_extended, accuracy_extended = best_model_extended.evaluate(x_test, y_test_one_hot, verbose=0)

print(f"Test Loss of best model: {loss_extended:.4f}")
print(f"Test Accuracy of best model: {accuracy_extended:.4f}")



Evaluating the best model found by the tuner:
Test Loss of best model: 0.0652
Test Accuracy of best model: 0.9808


In [25]:
# Train the best model for more epochs
print("\nTraining the best model for more epochs:")
history_best_model_extended = best_model_extended.fit(x_train, y_train_one_hot, epochs=50, batch_size=32, validation_split=0.2)



Training the best model for more epochs:
Epoch 1/50
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 13s 6ms/step - accuracy: 0.9887 - loss: 0.0343 - val_accuracy: 0.9929 - val_loss: 0.0228
Epoch 2/50
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 9s 6ms/step - accuracy: 0.9916 - loss: 0.0278 - val_accuracy: 0.9908 - val_loss: 0.0270
Epoch 3/50
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 10s 6ms/step - accuracy: 0.9919 - loss: 0.0267 - val_accuracy: 0.9907 - val_loss: 0.0284
Epoch 4/50
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 9s 6ms/step - accuracy: 0.9920 - loss: 0.0246 - val_accuracy: 0.9898 - val_loss: 0.0295
Epoch 5/50
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 11s 7ms/step - accuracy: 0.9921 - loss: 0.0233 - val_accuracy: 0.9869 - val_loss: 0.0380
Epoch 6/50
1500/1500 ━━━━━━━━━━━━━━━━━━━━ 9s 6ms/step - accuracy: 0.9929 - loss: 0.0208 - val_accurac