## Hyperparameter Tuning XGBoost

For some time now, I've been wondering what default search space PyCaret's `tune_model` uses for the XGBoost model. I understand that no single search space yields the best result in every scenario, but I've gotten some pretty good results from `tune_model` with `search_library = 'optuna'`, so naturally I wanted to know the search space it uses.

Today, I finally figured this out with the help of [perplexity.ai](https://www.perplexity.ai/)!

Here's the search space from the [source](https://github.com/pycaret/pycaret/blob/97649adf8965fd02831c14982a48323b6ce7de4c/pycaret/containers/models/classification.py#L1230-L1240):

```Python
{'learning_rate': UniformDistribution(lower=1e-06, upper=0.5, log=True),
 'n_estimators': IntUniformDistribution(lower=10, upper=300, log=False),
 'subsample': UniformDistribution(lower=0.2, upper=1, log=False),
 'max_depth': IntUniformDistribution(lower=1, upper=11, log=False),
 'colsample_bytree': UniformDistribution(lower=0.5, upper=1, log=False),
 'min_child_weight': IntUniformDistribution(lower=1, upper=4, log=False),
 'reg_alpha': UniformDistribution(lower=1e-10, upper=10, log=True),
 'reg_lambda': UniformDistribution(lower=1e-10, upper=10, log=True),
 'scale_pos_weight': UniformDistribution(lower=1, upper=50, log=False)}
```
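
To get a feel for what the `log=True` entries in this search space mean, here is a small sketch using only the standard library. It illustrates log-uniform sampling under my own assumptions and is not Optuna's or PyCaret's actual implementation (Optuna's default sampler is adaptive, not purely random). The helper names `sample_uniform` and `sample_log_uniform` are made up for this example; the bounds are the `learning_rate` bounds from the space above.

```python
import math
import random


def sample_uniform(rng, lower, upper):
    """Draw from a plain uniform range (the log=False case)."""
    return rng.uniform(lower, upper)


def sample_log_uniform(rng, lower, upper):
    """Draw uniformly in log space, then map back (the log=True case)."""
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


rng = random.Random(0)  # fixed seed so repeated runs give the same draws
n = 10_000

# Same bounds as learning_rate in the search space: 1e-6 to 0.5.
linear = [sample_uniform(rng, 1e-6, 0.5) for _ in range(n)]
logged = [sample_log_uniform(rng, 1e-6, 0.5) for _ in range(n)]

# Share of draws that land below 0.01 under each scheme.
frac_linear = sum(v < 0.01 for v in linear) / n
frac_logged = sum(v < 0.01 for v in logged) / n

print(f"share below 0.01 with log=False: {frac_linear:.2f}")
print(f"share below 0.01 with log=True:  {frac_logged:.2f}")
```

In expectation, only about 2% of the linear draws fall below 0.01, versus about 70% of the log-scale draws. That is why a log scale suits parameters like `learning_rate`, `reg_alpha`, and `reg_lambda`, whose plausible values span several orders of magnitude.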

Here's the code to implement the same hyperparameter optimization with Optuna, without PyCaret:

```Python
import optuna
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# 1. Define an objective function to be maximized.
def objective(trial):
    X, y = load_breast_cancer(return_X_y=True)
    # No random_state is set, so each trial is scored on a different random
    # 80/20 split. Pass random_state=... to make trials directly comparable.
    X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2)

    # 2. Suggest values of the hyperparameters using a trial object.
    params = {
        # Boosting learning rate (xgb's "eta").
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 0.5, log=True),
        # Number of gradient boosted trees.
        "n_estimators": trial.suggest_int("n_estimators", 10, 300),
        # Sampling ratio for training data.
        "subsample": trial.suggest_float("subsample", 0.2, 1.0, log=False),
        # Maximum tree depth for base learners.
        "max_depth": trial.suggest_int("max_depth", 1, 11, step=1),
        # Sampling ratio of columns when constructing each tree.
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.5, 1.0, log=False),
        # Minimum child weight; the larger the value, the more conservative the tree.
        # PyCaret samples this as an integer (IntUniformDistribution), so
        # trial.suggest_int("min_child_weight", 1, 4) would match it exactly.
        "min_child_weight": trial.suggest_float("min_child_weight", 1, 4, log=False),
        # L1 regularization weight (xgb's alpha).
        "reg_alpha": trial.suggest_float("reg_alpha", 1e-10, 10.0, log=True),
        # L2 regularization weight (xgb's lambda).
        "reg_lambda": trial.suggest_float("reg_lambda", 1e-10, 10.0, log=True),
        # Balancing of positive and negative weights.
        "scale_pos_weight": trial.suggest_float("scale_pos_weight", 1, 50, log=False),
    }

    clf = XGBClassifier(**params)
    clf.fit(X_train, y_train)
    y_preds = clf.predict(X_valid)
    # PyCaret uses accuracy as the default metric.
    accuracy = accuracy_score(y_valid, y_preds)
    return accuracy

# 3. Create a study object and optimize the objective function.
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=10)

# 4. Print the best parameters.
print(study.best_params)
```

Output (abridged):

```
[I 2025-01-25 23:02:38,674] A new study created in memory with name: no-name-6174a67e-993e-44d7-ba94-5a53e4e2a8f3
[I 2025-01-25 23:02:38,861] Trial 0 finished with value: 0.6052631578947368 and parameters: {'learning_rate': 0.0013227553947418294, 'n_estimators': 212, 'subsample': 0.5075700892365237, 'max_depth': 5, 'colsample_bytree': 0.8548878710319846, 'min_child_weight': 1.4948683678333534, 'reg_alpha': 6.833867981674323e-07, 'reg_lambda': 6.550218955483345e-06, 'scale_pos_weight': 39.722840487958436}. Best is trial 0 with value: 0.6052631578947368.
[I 2025-01-25 23:02:38,957] Trial 1 finished with value: 0.956140350877193 and parameters: {'learning_rate': 0.13872333961293243, 'n_estimators': 262, 'subsample': 0.28200106543917, 'max_depth': 5, 'colsample_bytree': 0.5148418919728771, 'min_child_weight': 1.9886273601214315, 'reg_alpha': 2.9525558718308784e-05, 'reg_lambda': 5.593456535521854e-10, 'scale_pos_weight': 26.53166114100737}.

{'learning_rate': 0.0476769561387674, 'n_estimators': 244, 'subsample': 0.49749823079045474, 'max_depth': 7, 'colsample_bytree': 0.5798886887633423, 'min_child_weight': 3.430041859343055, 'reg_alpha': 2.0922234799437995e-06, 'reg_lambda': 1.2042849929927403e-09, 'scale_pos_weight': 42.228474492092225}
```
