<img src="https://raw.githubusercontent.com/optuna/optuna/master/docs/image/optuna-logo.png"
     align="right"
     width="30%"/>

# Hyper-Parameter Optimization with Optuna

This notebook trains an XGBoost classifier and tunes its hyper-parameters, using Optuna to search the parameter space and scikit-learn to score each candidate with cross-validation.

## Launch Cluster


In [None]:
%%time

import coiled

cluster = coiled.Cluster(
    n_workers=20,
    name="hpo",
)

client = cluster.get_client()

## Optuna Study

The study keeps its state in `DaskStorage`, which lives on the Dask scheduler, so trials running on different workers all read from and write to the same history.

In [None]:
import optuna
from optuna.integration.dask import DaskStorage

study = optuna.create_study(
    direction="maximize",
    storage=DaskStorage(),
)

## Objective Function

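Each trial calls the objective function, which:

-   Loads the breast cancer dataset
-   Gets suggested hyper-parameters from Optuna
-   Builds an XGBoost classifier with those hyper-parameters
-   Reports the mean 5-fold cross-validation score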

In [None]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score, KFold
import xgboost as xgb

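def objective(trial):
    # Small built-in binary classification dataset, loaded on whichever worker runs the trial
    X, y = load_breast_cancer(return_X_y=True)
    params = {
        # Fixed settings shared by every trial
        "n_estimators": 10,
        "verbosity": 0,
        # L2 and L1 regularization strengths, searched on a log scale
        "lambda": trial.suggest_float("lambda", 1e-8, 100.0, log=True),
        "alpha": trial.suggest_float("alpha", 1e-8, 100.0, log=True),
        # Fraction of features sampled when building each tree
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.2, 1.0),
        "max_depth": trial.suggest_int("max_depth", 2, 10, step=1),
        "min_child_weight": trial.suggest_float("min_child_weight", 1e-8, 100, log=True),
        "learning_rate": trial.suggest_float("learning_rate", 1e-8, 1.0, log=True),
        # Minimum loss reduction required to split a node further
        "gamma": trial.suggest_float("gamma", 1e-8, 1.0, log=True),
        "grow_policy": "depthwise",
        "eval_metric": "logloss",
    }
    clf = xgb.XGBClassifier(**params)
    # Fixed random_state so every trial is scored on the same five splits
    fold = KFold(n_splits=5, shuffle=True, random_state=0)
    # Negated log loss: higher (closer to zero) is better
    score = cross_val_score(clf, X, y, cv=fold, scoring="neg_log_loss")
    return score.mean()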
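
Several of the ranges above pass `log=True`, which asks Optuna to treat the range in the log domain, so each order of magnitude between the bounds gets similar attention. The sketch below illustrates that idea with the standard library only. It is a conceptual example of log-uniform sampling, not Optuna's sampler, and `sample_log_uniform` is a hypothetical helper.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)
samples = [sample_log_uniform(1e-8, 100.0, rng) for _ in range(10_000)]

# About half of the draws fall below the geometric midpoint of the range,
# sqrt(1e-8 * 100) = 1e-3, instead of below the arithmetic midpoint (~50).
geometric_mid = math.sqrt(1e-8 * 100.0)
fraction_below = sum(s < geometric_mid for s in samples) / len(samples)
print(f"fraction below {geometric_mid:.0e}: {fraction_below:.2f}")
```

A plain uniform draw over the same bounds would almost never produce values near `1e-8`, which is why parameters such as regularization strength and learning rate are usually searched this way.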


## Execute at Scale

Optuna handles all of the coordination between trials. Dask's job is only to supply compute, which we use by submitting `study.optimize` to the cluster 500 times, one trial per task.

In [None]:
from dask.distributed import wait

futures = [
    client.submit(study.optimize, objective, n_trials=1, pure=False)
    for _ in range(500)
]

_ = wait(futures)
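
The pattern above is many small, independent tasks that each run one trial and record the outcome in shared storage. A rough standard-library analogue is sketched here with a thread pool. It only illustrates the submit-and-wait shape; it is not how Dask or `DaskStorage` work internally, and `run_one_trial`, `results`, and the toy scoring function are hypothetical.

```python
import random
import threading
from concurrent.futures import ThreadPoolExecutor, wait

results = []             # stands in for the shared study storage
lock = threading.Lock()  # guards concurrent writes to `results`


def run_one_trial(seed):
    """Sample one candidate, score it, and record the outcome."""
    rng = random.Random(seed)
    x = rng.uniform(-10, 10)
    score = -((x - 2) ** 2)  # toy objective, maximized at x = 2
    with lock:
        results.append((score, x))
    return score


with ThreadPoolExecutor(max_workers=8) as pool:
    # One task per trial, mirroring n_trials=1 per submitted call
    futures = [pool.submit(run_one_trial, seed) for seed in range(500)]
    wait(futures)

best_score, best_x = max(results)
print(f"best score {best_score:.4f} at x = {best_x:.3f}")
```

Keeping each task to a single trial means a slow or failed task costs one trial at most, and the pool stays busy until the last task finishes.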

## Results

In [None]:
study.best_params

In [None]:
study.best_value
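
Because the objective returns scikit-learn's `neg_log_loss` score, `study.best_value` is a negated log loss: it is always at most zero, and values closer to zero are better. That is why the study was created with `direction="maximize"`. The sketch below shows the sign convention using the standard library only; the `log_loss` helper and the toy labels and probabilities are illustrative assumptions, not data from this notebook.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy for labels in {0, 1}."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


y_true = [1, 0, 1, 1]
confident = [0.9, 0.1, 0.8, 0.95]  # correct and confident predictions
hedging = [0.6, 0.4, 0.55, 0.6]    # correct but barely better than a coin flip

neg_confident = -log_loss(y_true, confident)
neg_hedging = -log_loss(y_true, hedging)
print(neg_confident, neg_hedging)
```

The more confident model has the higher (less negative) score, so maximizing the negated value is the same as minimizing log loss. To report a conventional log loss, negate `study.best_value`.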

In [None]:
from optuna.visualization.matplotlib import plot_optimization_history, plot_param_importances

plot_optimization_history(study)

In [None]:
plot_param_importances(study)

## Clean Up

This run cost us about $0.08. Shut the cluster down once you are done with it.

In [None]:
cluster.shutdown()