# __Hyperparameter Tuning__
- Hyperparameter tuning is the process of systematically searching for the best combination of hyperparameter values for a machine learning model.
- It involves selecting a subset of hyperparameters and exploring different values for each hyperparameter to find the configuration that optimizes the model's performance on a given dataset.

Let's understand how it works.

In [1]:
# prompt: load google drive

from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


### Import the required libraries


In [2]:
!pip install keras-tuner

Collecting keras-tuner
  Downloading keras_tuner-1.4.7-py3-none-any.whl.metadata (5.4 kB)
Collecting kt-legacy (from keras-tuner)
  Downloading kt_legacy-1.0.5-py3-none-any.whl.metadata (221 bytes)
Downloading keras_tuner-1.4.7-py3-none-any.whl (129 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m129.1/129.1 kB[0m [31m7.5 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading kt_legacy-1.0.5-py3-none-any.whl (9.6 kB)
Installing collected packages: kt-legacy, keras-tuner
Successfully installed keras-tuner-1.4.7 kt-legacy-1.0.5


In [3]:
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras import optimizers

import keras_tuner
from keras_tuner import HyperModel
from keras_tuner.tuners import RandomSearch
from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score

### Step 2: Load the dataset and normalise it

In [4]:
import pandas as pd
df = pd.read_csv('/content/drive/MyDrive/Indigo Training - Feb 2025/Indigo Training - 2025/Deep Learning/Exercises/In class Exercise - Baggage prediction/customer_booking.csv',encoding="latin-1")
df.head()

Unnamed: 0,num_passengers,sales_channel,trip_type,purchase_lead,length_of_stay,flight_hour,flight_day,route,booking_origin,wants_extra_baggage,wants_preferred_seat,wants_in_flight_meals,flight_duration,booking_complete
0,2,Internet,RoundTrip,262,19,7,Sat,AKLDEL,New Zealand,1,0,0,5.52,0
1,1,Internet,RoundTrip,112,20,3,Sat,AKLDEL,New Zealand,0,0,0,5.52,0
2,2,Internet,RoundTrip,243,22,17,Wed,AKLDEL,India,1,1,0,5.52,0
3,1,Internet,RoundTrip,96,31,4,Sat,AKLDEL,New Zealand,0,0,1,5.52,0
4,2,Internet,RoundTrip,68,22,15,Wed,AKLDEL,India,1,0,1,5.52,0


In [5]:
df = pd.get_dummies(df,columns=['sales_channel','trip_type','flight_day','route', 'booking_origin'],drop_first = True)

In [6]:
X = df.drop('wants_extra_baggage',axis=1)
y = df['wants_extra_baggage']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify = y)

# Standardize the data
scaler = MinMaxScaler()
train_sc = scaler.fit_transform(X_train)
test_sc = scaler.transform(X_test)

### Build and Train a Basic Deep Learning Model

- Constructs a sequential neural network with two hidden layers of 32 and 16 neurons respectively, using ReLU activation, and a dropout layer to reduce overfitting. The output layer uses a sigmoid activation function for binary classification.

- Prepares the model for training by specifying the Adam optimizer, binary cross-entropy loss function for binary classification, and tracks the accuracy metric.

- Fits the model on the standardized training data for 100 epochs, using 10% of it as a validation set to monitor performance, without verbosity to minimize output during training.

- Assesses the model's performance on the standardized test data, obtaining the loss and accuracy, then prints the accuracy to give an indication of how well the model predicts unseen data.

In [8]:
# Build a basic model
basic_model = Sequential([
    Dense(32, activation='relu', input_shape=(train_sc.shape[1],)),
    Dropout(0.2),
    Dense(16, activation='relu'),
    Dense(1, activation='sigmoid')
])

# Compile the model
basic_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
basic_model.fit(train_sc, y_train, epochs=100, validation_split=0.1, verbose=0)

# Evaluate the model on the test set
basic_loss, basic_accuracy = basic_model.evaluate(test_sc, y_test)
print("Basic Model Accuracy: ", basic_accuracy)

[1m313/313[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step - accuracy: 0.7113 - loss: 0.7574
Basic Model Accuracy:  0.7124000191688538


### Define the HyperModel
Defines a custom class MyHyperModel that extends the HyperModel class from Keras Tuner, used for hyperparameter tuning.

- The class `MyHyperModel` is designed to construct a neural network model dynamically, with varying hyperparameters.
- The `__init__` method initializes the class with an `input_shape`, which is the shape (number of features) of the input data that the model will expect. This is stored as a class attribute to be used later in building the model.

- The build method creates a neural network model architecture with tunable hyperparameters.
- `hp.Int('units', min_value=10, max_value=100, step=10)` This line specifies that the number of units in the Dense layers should be treated as a hyperparameter, with possible values ranging from 10 to 100 in steps of 10.

- `hp.Float('dropout', min_value=0.0, max_value=0.5, step=0.1)` This specifies that the dropout rate should also be a hyperparameter, ranging from 0.0 to 0.5 with a step of 0.1.

- `model.add(Dense(1, activation='sigmoid'))` Adds an output layer with a single unit and sigmoid activation suitable for binary classification.

- Learning rate for the Adam optimizer is configured with another tunable parameter `(hp.Float('learning_rate', ...))` which varies logarithmically from 0.0001 to 0.01.


In [10]:
class MyHyperModel(HyperModel):
    def __init__(self, input_shape):
        self.input_shape = input_shape

    def build(self, hp):
        model = Sequential()
        model.add(Dense(
            units=hp.Int('units', min_value=10, max_value=100, step=10),
            activation='relu', input_shape=(self.input_shape,)
        ))
        model.add(Dropout(
            hp.Float('dropout', min_value=0.0, max_value=0.5, step=0.1)
        ))
        model.add(Dense(
            units=hp.Int('units', min_value=10, max_value=100, step=10),
            activation='relu'
        ))
        model.add(Dense(1, activation='sigmoid'))
        model.compile(
            optimizer=tf.keras.optimizers.Adam(
                hp.Float('learning_rate', min_value=1e-4, max_value=1e-2, sampling='LOG')),
            loss='binary_crossentropy',
            metrics=['accuracy']
        )
        return model

### Instantiate the Tuner and Perform Hyperparameter Tuning
- Conducts hyperparameter tuning using Keras Tuner's RandomSearch, optimizing the neural network's configuration to maximize validation accuracy by testing different combinations of model parameters and identifying the best performing model.

- It sets up a hyperparameter optimization process targeting the validation accuracy for a model defined by hypermodel.

- The process will try up to 10 different sets of hyperparameters, running each configuration twice to ensure stability in the reported performance metrics, all within the specified project directory for organized storage and potential review.

- This approach is useful for exploring a potentially vast hyperparameter space more efficiently than exhaustively testing all combinations.

In [11]:
# Assuming 'train_sc' and 'y_train' are defined as your scaled training data and labels
input_shape = train_sc.shape[1]  # Extract the number of features

# Create an instance of the HyperModel
hypermodel = MyHyperModel(input_shape=input_shape)

# Instantiate the tuner
tuner = RandomSearch(
    hypermodel,
    objective='val_accuracy',
    max_trials=10,
    executions_per_trial=2,
    directory='tuner_data',
    project_name='Baggage prediction'
)

# Perform hyperparameter tuning
tuner.search(train_sc, y_train, epochs=50, validation_split=0.2)

# Get the best model
best_model = tuner.get_best_models(num_models=1)[0]


Search: Running Trial #1

Value             |Best Value So Far |Hyperparameter
60                |60                |units
0.3               |0.3               |dropout
0.0016902         |0.0016902         |learning_rate

Epoch 1/50
[1m1000/1000[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 4ms/step - accuracy: 0.6825 - loss: 0.5863 - val_accuracy: 0.7010 - val_loss: 0.5589
Epoch 2/50
[1m1000/1000[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 3ms/step - accuracy: 0.7102 - loss: 0.5477 - val_accuracy: 0.7049 - val_loss: 0.5567
Epoch 3/50
[1m1000/1000[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 5ms/step - accuracy: 0.7210 - loss: 0.5406 - val_accuracy: 0.7074 - val_loss: 0.5548
Epoch 4/50
[1m1000/1000[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4ms/step - accuracy: 0.7296 - loss: 0.5290 - val_accuracy: 0.7065 - val_loss: 0.5556
Epoch 5/50
[1m1000/1000[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 4ms/step - accuracy: 0.7389 - loss: 0.5206

KeyboardInterrupt: 

In [None]:
# from kerastuner.tuners import BayesianOptimization

# tuner = BayesianOptimization(
#     MyHyperModel(input_shape=20),
#     objective='val_accuracy',
#     max_trials=10,  # Number of different hyperparameter sets to try
#     executions_per_trial=2,  # Number of times to train each model for stability
#     directory='tuner_results',
#     project_name='bayesian_tuning'
# )

# tuner.search(x_train, y_train, epochs=10, validation_data=(x_val, y_val))

In [None]:
# Predict probabilities for the basic and best models
basic_predictions_proba = basic_model.predict(test_sc)
basic_predictions = (basic_predictions_proba > 0.5).astype(int)

best_predictions_proba = best_model.predict(test_sc)
best_predictions = (best_predictions_proba > 0.5).astype(int)

# Print classification report for basic model
print("Basic Model Classification Report:")
print(classification_report(y_test, basic_predictions, target_names=['Benign', 'Malignant']))

# Calculate and print ROC AUC for the basic model
basic_auc = roc_auc_score(y_test, basic_predictions_proba)
print("Basic Model ROC AUC:", basic_auc)

# Print classification report for best model
print("Best Model Classification Report:")
print(classification_report(y_test, best_predictions, target_names=['Benign', 'Malignant']))

# Calculate and print ROC AUC for the best model
best_auc = roc_auc_score(y_test, best_predictions_proba)
print("Best Model ROC AUC:", best_auc)

Basic Model Classification Report:
              precision    recall  f1-score   support

      Benign       0.98      0.95      0.96        43
   Malignant       0.97      0.99      0.98        71

    accuracy                           0.97       114
   macro avg       0.97      0.97      0.97       114
weighted avg       0.97      0.97      0.97       114

Basic Model ROC AUC: 0.9931215198165739
Best Model Classification Report:
              precision    recall  f1-score   support

      Benign       1.00      0.95      0.98        43
   Malignant       0.97      1.00      0.99        71

    accuracy                           0.98       114
   macro avg       0.99      0.98      0.98       114
weighted avg       0.98      0.98      0.98       114

Best Model ROC AUC: 0.9934490664919752
