Advanced Hyperparameter Tuning with Bayesian Optimisation

Adavanced method for hyperparamter tunign that balances exploration and exploitation. 
uses a probabilistic model to guide the search for optimal hyperparameters
Bayesian optimisation works by building a probabilistic model (usually a Gaussian Process) of the objective function. It uses this model to predict which hyperparameters are likely to yield the best results, balancing exploration (trying new areas) and exploitation (refining known good areas). At each step, it selects the next set of hyperparameters to evaluate based on an acquisition function, updates the model with the new result, and repeats the process until it finds the optimal values.

Why use it?
Efficient for high dimensional and expensive to evaluate functions 
reduces the numer of evaluations required to find the best parameters

Apply bayesian optimisation using optuna to tuna an xgboost model and compare the results with grid search and random search

In [44]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV, RandomizedSearchCV
from sklearn.preprocessing import StandardScaler
import pandas as pd
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score
import optuna

Load the dataset

In [45]:
data = load_breast_cancer()
X, y = data.data, data.target

Split into training and test sets

In [46]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

Standardise features

In [47]:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
print(f"Training data shape {X_train.shape}")
print(f"Test data shape {X_test.shape}")

Training data shape (455, 30)
Test data shape (114, 30)


Train baseline XGBoost model

In [48]:
baseline_model = XGBClassifier(eval_metrix='logloss',random_state=42)
baseline_model.fit(X_train,y_train)

Parameters: { "eval_metrix" } are not used.

  bst.update(dtrain, iteration=i, fobj=obj)


Evaluate the model

In [49]:
baseline_preds = baseline_model.predict(X_test)
baseline_accruacy = accuracy_score(y_test,baseline_preds)
print(f"Baseline XGBoost Accuracy {baseline_accruacy:.4f}")

Baseline XGBoost Accuracy 0.9561


Define theobjective function for optuna

In [50]:
def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 500),
        "max_depth": trial.suggest_int("max_depth", 3, 100),
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3),
        "subsample": trial.suggest_float("subsample", 0.6, 1.0),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.6, 1.0),
        "gamma": trial.suggest_float("gamma", 0.0, 5),
        "reg_alpha": trial.suggest_float("reg_alpha", 0, 10),
        "reg_lambda": trial.suggest_float("reg_alpha", 0, 10),
    }
    # train XGBoost model with suggested parrams
    model = XGBClassifier(eval_metric="logloss", random_state=42)
    model.fit(X_train, y_train)

    # evaluate model on validation set
    preds = model.predict(X_test)
    accuracy = accuracy_score(y_test, preds)
    return accuracy

Create an optuna study

In [51]:
study = optuna.create_study(direction="maximize")
study.optimize(objective,n_trials=50)

[I 2025-06-11 10:12:14,944] A new study created in memory with name: no-name-bcce6678-a338-4cc6-8b34-ca370cd2ab92
[I 2025-06-11 10:12:14,991] Trial 0 finished with value: 0.956140350877193 and parameters: {'n_estimators': 492, 'max_depth': 97, 'learning_rate': 0.12207077738795719, 'subsample': 0.8761719323279196, 'colsample_bytree': 0.7313084930135565, 'gamma': 4.363595283019187, 'reg_alpha': 0.4333722328445011}. Best is trial 0 with value: 0.956140350877193.
[I 2025-06-11 10:12:15,035] Trial 1 finished with value: 0.956140350877193 and parameters: {'n_estimators': 53, 'max_depth': 41, 'learning_rate': 0.17571810939344557, 'subsample': 0.641244331750831, 'colsample_bytree': 0.8435960377010879, 'gamma': 3.3206810427864513, 'reg_alpha': 9.261860282040459}. Best is trial 0 with value: 0.956140350877193.
[I 2025-06-11 10:12:15,080] Trial 2 finished with value: 0.956140350877193 and parameters: {'n_estimators': 419, 'max_depth': 61, 'learning_rate': 0.09443704723083032, 'subsample': 0.79095

Best hyperparameters

In [52]:
print("Best hyperparameters", study.best_params)
print("Best accuracy", study.best_value)

Best hyperparameters {'n_estimators': 492, 'max_depth': 97, 'learning_rate': 0.12207077738795719, 'subsample': 0.8761719323279196, 'colsample_bytree': 0.7313084930135565, 'gamma': 4.363595283019187, 'reg_alpha': 0.4333722328445011}
Best accuracy 0.956140350877193


Define the parameter grid

In [53]:
param_grid = {
    'n_estimators': [100,200,300],
    'max_depth': [3,5,7],
    'learning_rate': [0.01,0.1,0.2],
    'subsample': [0.6,0.8,1.0]
    }

Train XGbBoost with Grid Search

In [54]:
grid_search = GridSearchCV(
    estimator=XGBClassifier(eval_metric="logloss", random_state=42),
    param_grid=param_grid,
    scoring='accuracy',
    cv=3,
    verbose=1
)
grid_search.fit(X_train,y_train)
# best parameters and accuracy
print("Grid search best parameters:", grid_search.best_params_)
print("Grid search best accuracy:", grid_search.best_score_)

Fitting 3 folds for each of 81 candidates, totalling 243 fits
Grid search best parameters: {'learning_rate': 0.1, 'max_depth': 3, 'n_estimators': 300, 'subsample': 0.6}
Grid search best accuracy: 0.9757900546067155


Deine parameter distribution

In [55]:
param_dist = {
    "n_estimators": [50, 100, 200, 300, 400],
    "max_depth": [3, 5, 7, 9],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "subsample": [0.6, 0.7, 0.8, 0.9, 1.0],
    "colsample_bytree": [0.6, 0.7, 0.8, 0.9, 1.0],
}

Train xgboost with random search

In [56]:
random_search = RandomizedSearchCV(
    estimator=XGBClassifier(eval_metric="logloss", random_state=42),
    param_distributions=param_dist,
    n_iter=50,
    scoring="accuracy",
    cv=3,
    verbose=1,
    random_state=42,
)

random_search.fit(X_train,y_train)

# best parameters and accuracy
print("Random Search best parameters", random_search.best_params_)
print("Random Search best accuracy", random_search.best_score_)

Fitting 3 folds for each of 50 candidates, totalling 150 fits
Random Search best parameters {'subsample': 0.6, 'n_estimators': 200, 'max_depth': 9, 'learning_rate': 0.05, 'colsample_bytree': 0.7}
Random Search best accuracy 0.9758045776693388
