This example notebook shows how to use `SearchSpace`.

In [None]:
# import necessary libraries
from HPSearchSpace import SearchSpace

import hyperopt
import optuna
from flaml.tune import tune

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score

To demonstrate how to use `SearchSpace` for different hyperparameter optimization libraries, we use the `iris` dataset to perform a simple classification task.

In [2]:
iris = load_iris()
x, y = iris.data, iris.target

The first step is to define which machine learning methods to try and which hyperparameters to optimize. This configuration is already stored in the `example.yaml` file, so we can simply load it.

In [3]:
search_space = SearchSpace(config_file='example.yaml')

The next step is to define an objective function that takes in the config and returns a score to minimize. We use the mean 5-fold cross-validation score (accuracy, the default for classifiers) and return its negative, so that minimizing the objective maximizes accuracy.

In [4]:
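def objective(config):
    # the sampled config holds the estimator class and its keyword arguments
    estimator = config['estimators']['estimator_class']
    params = config['estimators']['params']
    model = estimator(**params)
    # mean 5-fold cross-validation score, negated because the optimizers minimize
    score = cross_val_score(model, x, y, cv=5).mean()
    return -score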
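
As a quick sanity check, the objective can be called by hand before any optimizer is involved. The sketch below is self-contained and rebuilds the same objective. It assumes the config has the nested shape that appears in the FLAML trial logs later in this notebook (`estimators` → `estimator_class` and `params`). The `manual_config` values are illustrative and are not produced by `SearchSpace`.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

x, y = load_iris(return_X_y=True)

def objective(config):
    estimator = config["estimators"]["estimator_class"]
    params = config["estimators"]["params"]
    model = estimator(**params)
    # negative mean accuracy, so that lower is better
    return -cross_val_score(model, x, y, cv=5).mean()

# Hand-written config (an assumption for illustration, not sampled by SearchSpace).
manual_config = {
    "estimators": {
        "estimator_class": SVC,
        "params": {"C": 1.0, "kernel": "linear"},
    }
}

loss = objective(manual_config)
print(loss)  # a value in [-1, 0); closer to -1 means higher accuracy
```

If this call fails, the problem is in the objective or the config shape rather than in the optimizer.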

Then, we can use different packages to optimize the hyperparameters. Note that the results may vary from run to run, because the searches are stochastic and no random seed is fixed here.

- Hyperopt

In [5]:
hyperopt_space = search_space.get_hyperopt_space()
hp_trials = hyperopt.Trials()
hp_best = hyperopt.fmin(objective, hyperopt_space, trials=hp_trials, algo=hyperopt.tpe.suggest, max_evals=100)

100%|██████████| 100/100 [01:43<00:00,  1.03s/trial, best loss: -0.9866666666666667]


In [6]:
print(hp_trials.best_trial["result"], hp_trials.best_trial["misc"]["vals"])

{'loss': -0.9866666666666667, 'status': 'ok'} {'estimators': [np.int64(0)], 'estimators_sklearn.ensemble.GradientBoostingClassifier_max_depth': [], 'estimators_sklearn.ensemble.GradientBoostingClassifier_n_estimators': [], 'estimators_sklearn.ensemble.RandomForestClassifier_max_depth': [], 'estimators_sklearn.ensemble.RandomForestClassifier_n_estimators': [], 'estimators_sklearn.neighbors.KNeighborsClassifier_n_neighbors': [], 'estimators_sklearn.svm.SVC_C': [np.float64(1.6508939957015805)], 'estimators_sklearn.svm.SVC_kernel': [np.int64(0)]}
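
The `vals` dictionary printed above lists every hyperparameter in the space. Branches that were not selected have empty lists, and categorical choices are stored as indices into their option lists. A minimal sketch of how one might keep only the sampled entries is shown below. `active_params` is a hypothetical helper, and the input is a trimmed copy of the output above with plain Python numbers.

```python
def active_params(vals):
    """Keep only the hyperparameters that were actually sampled in a trial."""
    return {name: picked[0] for name, picked in vals.items() if picked}

# Trimmed copy of hp_trials.best_trial["misc"]["vals"] from the run above.
vals = {
    "estimators": [0],
    "estimators_sklearn.ensemble.RandomForestClassifier_max_depth": [],
    "estimators_sklearn.neighbors.KNeighborsClassifier_n_neighbors": [],
    "estimators_sklearn.svm.SVC_C": [1.6508939957015805],
    "estimators_sklearn.svm.SVC_kernel": [0],
}

sampled = active_params(vals)
print(sampled)
# {'estimators': 0, 'estimators_sklearn.svm.SVC_C': 1.6508939957015805, 'estimators_sklearn.svm.SVC_kernel': 0}
```

The remaining `0` values are still choice indices, not option values. With the notebook's own objects, `hyperopt.space_eval(hyperopt_space, hp_best)` should map them back to the chosen options.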


- Optuna

In [7]:
# optuna objective function
def objective_optuna(trial):
    # Optuna's objective function takes in a trial object, and sampling is done through that trial object.
    # As a workaround, we wrap the original objective: get_optuna_space takes in the trial object
    # and returns the sampled hyperparameters, which are then passed to objective.
    config = search_space.get_optuna_space(trial)
    return objective(config)

In [8]:
# optuna optimization
study = optuna.create_study(direction="minimize")
study.optimize(objective_optuna, n_trials=100)

[I 2025-02-05 19:43:29,831] A new study created in memory with name: no-name-a1fc2360-483f-4c5b-bdbc-dacd2704a1ae
[I 2025-02-05 19:43:31,823] Trial 0 finished with value: -0.96 and parameters: {'estimators': 'sklearn.ensemble.RandomForestClassifier', 'estimators_sklearn.ensemble.RandomForestClassifier_n_estimators': 464, 'estimators_sklearn.ensemble.RandomForestClassifier_max_depth': 19}. Best is trial 0 with value: -0.96.
[I 2025-02-05 19:43:36,026] Trial 1 finished with value: -0.9666666666666668 and parameters: {'estimators': 'sklearn.ensemble.RandomForestClassifier', 'estimators_sklearn.ensemble.RandomForestClassifier_n_estimators': 983, 'estimators_sklearn.ensemble.RandomForestClassifier_max_depth': 26}. Best is trial 1 with value: -0.9666666666666668.
[I 2025-02-05 19:43:39,142] Trial 2 finished with value: -0.9666666666666668 and parameters: {'estimators': 'sklearn.ensemble.GradientBoostingClassifier', 'estimators_sklearn.ensemble.GradientBoostingClassifier_n_estimators': 503, '

In [9]:
print(study.best_trial.value, study.best_trial.params)

-0.9866666666666667 {'estimators': 'sklearn.svm.SVC', 'estimators_sklearn.svm.SVC_C': 0.9808419850049673, 'estimators_sklearn.svm.SVC_kernel': 'linear'}


- FLAML

In [10]:
flaml_result = tune.run(objective,
                        config=search_space.get_flaml_space(),
                        mode="min", num_samples=100, use_ray=False)

[flaml.tune.tune: 02-05 19:44:31] {582} INFO - Using search algorithm BlendSearch.


[I 2025-02-05 19:44:31,865] A new study created in memory with name: optuna


[flaml.tune.tune: 02-05 19:44:31] {884} INFO - trial 1 config: {'estimators': {'params': {'n_estimators': 106, 'max_depth': 5}, 'estimator_name': 'sklearn.ensemble.GradientBoostingClassifier', 'estimator_class': <class 'sklearn.ensemble._gb.GradientBoostingClassifier'>}}
[flaml.tune.tune: 02-05 19:44:33] {884} INFO - trial 2 config: {'estimators': {'params': {'C': 9.662068742486023e-06, 'kernel': 'linear'}, 'estimator_name': 'sklearn.svm.SVC', 'estimator_class': <class 'sklearn.svm._classes.SVC'>}}
[flaml.tune.tune: 02-05 19:44:33] {884} INFO - trial 3 config: {'estimators': {'params': {'n_estimators': 97, 'max_depth': 22}, 'estimator_name': 'sklearn.ensemble.GradientBoostingClassifier', 'estimator_class': <class 'sklearn.ensemble._gb.GradientBoostingClassifier'>}}
[flaml.tune.tune: 02-05 19:44:33] {884} INFO - trial 4 config: {'estimators': {'params': {'C': 0.00013343220665499586, 'kernel': 'linear'}, 'estimator_name': 'sklearn.svm.SVC', 'estimator_class': <class 'sklearn.svm._classes

In [11]:
print(flaml_result.best_result)

{'_metric': np.float64(-0.9666666666666668), 'training_iteration': 0, 'config': {'estimators': {'estimator_name': 'sklearn.ensemble.GradientBoostingClassifier', 'estimator_class': <class 'sklearn.ensemble._gb.GradientBoostingClassifier'>, 'params': {'n_estimators': 88, 'max_depth': 18}}}, 'config/estimators': {'estimator_name': 'sklearn.ensemble.GradientBoostingClassifier', 'estimator_class': <class 'sklearn.ensemble._gb.GradientBoostingClassifier'>, 'params': {'n_estimators': 88, 'max_depth': 18}}, 'experiment_tag': 'exp', 'time_total_s': 0.7919940948486328}
