# Optuna hyperparameter optimization

https://github.com/optuna/optuna

Tutorials:
* LightGBM: https://towardsdatascience.com/how-to-beat-the-heck-out-of-xgboost-with-lightgbm-comprehensive-tutorial-5eba52195997
* Optuna: https://towardsdatascience.com/kagglers-guide-to-lightgbm-hyperparameter-tuning-with-optuna-in-2021-ed048d9838b5

## Set up libraries

In [2]:
import optuna

In [3]:
import sklearn.datasets
import sklearn.ensemble
import sklearn.model_selection
import sklearn.svm

In [4]:
# FYI: Objective functions can take additional arguments
# (https://optuna.readthedocs.io/en/stable/faq.html#objective-func-additional-args).
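def objective(trial):
    # Load the iris dataset (this happens again on every trial).
    iris = sklearn.datasets.load_iris()
    x, y = iris.data, iris.target

    # Let Optuna pick which classifier this trial evaluates.
    classifier_name = trial.suggest_categorical(
        "classifier", ["SVC", "RandomForest"]
    )
    if classifier_name == "SVC":
        # Sample the SVC regularization parameter C on a log scale.
        svc_c = trial.suggest_float("svc_c", 1e-10, 1e10, log=True)
        classifier_obj = sklearn.svm.SVC(C=svc_c, gamma="auto")
    else:
        # Sample the maximum tree depth on a log scale.
        rf_max_depth = trial.suggest_int("rf_max_depth", 2, 32, log=True)
        classifier_obj = sklearn.ensemble.RandomForestClassifier(
            max_depth=rf_max_depth, n_estimators=10
        )

    # Score the classifier with 3-fold cross-validation; the mean accuracy
    # is the value the study tries to maximize.
    score = sklearn.model_selection.cross_val_score(
        classifier_obj, x, y, n_jobs=-1, cv=3
    )
    accuracy = score.mean()
    return accuracy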

## Optimize hyperparameters

In [7]:
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100)

[I 2021-11-14 21:57:19,395] A new study created in memory with name: no-name-6c49a3c3-92b5-40de-9d78-2d2a02beeda2
[I 2021-11-14 21:57:19,432] Trial 0 finished with value: 0.9533333333333333 and parameters: {'classifier': 'RandomForest', 'rf_max_depth': 9}. Best is trial 0 with value: 0.9533333333333333.
[I 2021-11-14 21:57:19,468] Trial 1 finished with value: 0.9533333333333333 and parameters: {'classifier': 'RandomForest', 'rf_max_depth': 29}. Best is trial 0 with value: 0.9533333333333333.
[I 2021-11-14 21:57:19,481] Trial 2 finished with value: 0.96 and parameters: {'classifier': 'SVC', 'svc_c': 193110242.45379505}. Best is trial 2 with value: 0.96.
[I 2021-11-14 21:57:19,495] Trial 3 finished with value: 0.9466666666666667 and parameters: {'classifier': 'SVC', 'svc_c': 0.12544090855342385}. Best is trial 2 with value: 0.96.
[I 2021-11-14 21:57:19,507] Trial 4 finished with value: 0.32 and parameters: {'classi

## Results

In [6]:
print(study.best_trial)

FrozenTrial(number=43, values=[0.9866666666666667], datetime_start=datetime.datetime(2021, 11, 14, 21, 56, 7, 846108), datetime_complete=datetime.datetime(2021, 11, 14, 21, 56, 7, 861429), params={'classifier': 'SVC', 'svc_c': 4.34044996322569}, distributions={'classifier': CategoricalDistribution(choices=('SVC', 'RandomForest')), 'svc_c': LogUniformDistribution(high=10000000000.0, low=1e-10)}, user_attrs={}, system_attrs={}, intermediate_values={}, trial_id=43, state=TrialState.COMPLETE, value=None)


In [9]:
study.best_params

{'classifier': 'SVC', 'svc_c': 4.738517484230827}
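
The dictionary returned by `study.best_params` can be turned back into a fitted model. The sketch below is one possible way to do it, assuming the same two classifier branches as the objective. The helper `build_classifier` is hypothetical, and `best_params` is copied from the output above so the block runs without the study object.

```python
import sklearn.datasets
import sklearn.ensemble
import sklearn.svm


def build_classifier(params):
    # Hypothetical helper that mirrors the branches of the objective function.
    if params["classifier"] == "SVC":
        return sklearn.svm.SVC(C=params["svc_c"], gamma="auto")
    return sklearn.ensemble.RandomForestClassifier(
        max_depth=params["rf_max_depth"], n_estimators=10
    )


# Copied from the study.best_params output shown above.
best_params = {"classifier": "SVC", "svc_c": 4.738517484230827}

# Refit the chosen classifier on the full iris dataset.
iris = sklearn.datasets.load_iris()
best_classifier = build_classifier(best_params)
best_classifier.fit(iris.data, iris.target)

# Accuracy on the training data, which is optimistic compared with the
# cross-validated score the study reported.
print(best_classifier.score(iris.data, iris.target))
```

In a live session, `build_classifier(study.best_params)` would avoid copying the values by hand.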