# Tuning Hyperparameters

**Summary of the Article**
- High-level introduction to hyperparameters and Optuna.
- Hyperparameter tuning with Optuna.

## Intro to Hyperparameters

Hyperparameters are values that control the learning process of a model. Unlike the model parameters, they are not learned from the data; instead, they determine how the model parameters are optimized during training. Part of a machine learning engineer's job is to find good values for a model's hyperparameters, and there are several techniques for choosing them:
- Skipping tuning and using common default values.
- Random Search.
- Grid Search.
- Using tools like Optuna.
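
To make the difference between grid search and random search concrete, here is a minimal sketch in plain Python. The `validation_error` function is a made-up stand-in for training and scoring a model, and the parameter names and ranges are assumptions chosen only for illustration.

```python
import itertools
import random


def validation_error(learning_rate, max_depth):
    """Hypothetical stand-in for 'train a model and return its validation error'."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (max_depth - 5) ** 2


def grid_search(grid):
    """Evaluate every combination of the listed values and keep the best."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = validation_error(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def random_search(n_trials, seed=0):
    """Draw each hyperparameter at random from its range, n_trials times."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.001, 0.3),
            "max_depth": rng.randint(3, 9),
        }
        score = validation_error(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Both strategies get the same budget of 12 evaluations.
grid = {"learning_rate": [0.01, 0.1, 0.3], "max_depth": [3, 5, 7, 9]}
grid_best, grid_score = grid_search(grid)
random_best, random_score = random_search(n_trials=12)
print("grid search:  ", grid_best, grid_score)
print("random search:", random_best, random_score)
```

Grid search can only return values that were listed up front, while random search can land anywhere inside the ranges. Neither uses the results of earlier evaluations to decide where to look next, which is the gap that tools like Optuna aim to fill.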

## Intro to Optuna
Optuna is a Python hyperparameter optimization framework that works with machine learning frameworks and black-box optimization solvers. It provides a simple way to optimize the hyperparameters of machine learning models.

Optuna's optimization rests on two strategies, the sampling strategy and the pruning strategy:
- *Sampling Strategy* - Where to look? Optuna uses Bayesian optimization (by default, the Tree-structured Parzen Estimator sampler) to focus the search on the most promising hyperparameter values.
- *Pruning Strategy* - If a trial is not promising, it is terminated early to save time for better trials.

Note: An Optuna trial is a single run of a model with one set of hyperparameters.
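
The pruning idea can be illustrated without Optuna. The sketch below applies a simplified median rule: a trial is stopped at a given step when its intermediate error is worse than the median of the errors that earlier trials reported at that same step. The learning curves are invented numbers and `run_with_median_pruning` is a hypothetical helper; Optuna's own pruners are more configurable than this.

```python
from statistics import median


def run_with_median_pruning(learning_curves):
    """Replay per-step validation errors (lower is better) trial by trial.

    Returns one (steps_run, was_pruned) tuple per trial.
    """
    history = []  # intermediate errors of the trials that ran before
    outcomes = []
    for curve in learning_curves:
        reported = []
        pruned = False
        for step, error in enumerate(curve):
            reported.append(error)
            # Errors that earlier trials reported at this same step.
            previous = [h[step] for h in history if len(h) > step]
            if previous and error > median(previous):
                pruned = True  # worse than the typical earlier trial: stop early
                break
        history.append(reported)
        outcomes.append((len(reported), pruned))
    return outcomes


# Invented validation-error curves for three trials, four steps each.
curves = [
    [0.50, 0.40, 0.35, 0.30],
    [0.45, 0.38, 0.33, 0.29],
    [0.70, 0.65, 0.60, 0.55],
]
outcomes = run_with_median_pruning(curves)
for index, (steps_run, was_pruned) in enumerate(outcomes):
    print(f"trial {index}: ran {steps_run} step(s), pruned={was_pruned}")
```

In this toy run the third trial is stopped after its first step, so three of its four training steps are never paid for.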

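The structure of an Optuna search is as follows:
```python
import optuna


def objective(trial):
    # Your code here: suggest hyperparameters through `trial`, train the
    # model, and compute a validation score.
    return evaluation_score


study = optuna.create_study()  # minimizes the objective by default
num_trials = 100
study.optimize(objective, n_trials=num_trials)
```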

 

```python
"""
Note: The code below is adapted from the examples in the Optuna repository.

Optuna example that optimizes a multi-output regressor configuration for
electrical grid constraint detection forecasting using XGBoost.
In this example, we optimize the validation RMSE. We optimize both the choice
of booster model and its hyperparameters.
"""

import os
import sys

import optuna
import pandas as pd
import sklearn.metrics

# Make the local thesis_package importable from the parent directory.
sys.path.append('..')
from thesis_package import aimodels as my_ai, utils


def objective(trial):
    # Load the targets and the exogenous features. os.path.join keeps the
    # paths portable and avoids backslash escapes in string literals.
    y_min_u = pd.read_csv(
        os.path.join('..', 'data', 'ground_truth', 'res_line_percent_max_bool_constr.csv')
    ).drop(columns=['timestamps'])
    exogenous_data = pd.read_csv(
        os.path.join('..', 'data', 'processed', 'production', 'exogenous_data_extended.csv')
    ).drop(columns=['date'])
    train_x, valid_x, train_y, valid_y = utils.split_and_suffle(exogenous_data, y_min_u)
    data = {'X_train': train_x, 'X_test': valid_x, 'y_train': train_y, 'y_test': valid_y}
    param = {
        "verbosity": 0,
        "objective": "binary:logistic",
        # Use the exact tree method for small datasets.
        "tree_method": "exact",
        # Booster type; gblinear fits linear functions.
        "booster": trial.suggest_categorical("booster", ["gbtree", "gblinear", "dart"]),
        # L2 regularization weight.
        "lambda": trial.suggest_float("lambda", 1e-8, 1.0, log=True),
        # L1 regularization weight.
        "alpha": trial.suggest_float("alpha", 1e-8, 1.0, log=True),
        # Sampling ratio for the training data.
        "subsample": trial.suggest_float("subsample", 0.2, 1.0),
        # Column sampling ratio for each tree.
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.2, 1.0),
    }

    # Tree-specific hyperparameters are only suggested for tree boosters.
    if param["booster"] in ["gbtree", "dart"]:
        # Maximum depth of a tree; controls the complexity of the tree.
        param["max_depth"] = trial.suggest_int("max_depth", 3, 9, step=2)
        # Minimum child weight; the larger the value, the more conservative the tree.
        param["min_child_weight"] = trial.suggest_int("min_child_weight", 2, 10)
        # Learning rate.
        param["eta"] = trial.suggest_float("eta", 1e-8, 1.0, log=True)
        # Minimum loss reduction required to split; defines how selective the algorithm is.
        param["gamma"] = trial.suggest_float("gamma", 1e-8, 1.0, log=True)
        param["grow_policy"] = trial.suggest_categorical("grow_policy", ["depthwise", "lossguide"])

    # Dropout-specific hyperparameters are only suggested for the dart booster.
    if param["booster"] == "dart":
        param["sample_type"] = trial.suggest_categorical("sample_type", ["uniform", "weighted"])
        param["normalize_type"] = trial.suggest_categorical("normalize_type", ["tree", "forest"])
        param["rate_drop"] = trial.suggest_float("rate_drop", 1e-8, 1.0, log=True)
        param["skip_drop"] = trial.suggest_float("skip_drop", 1e-8, 1.0, log=True)

    model = my_ai.Context(my_ai.XGBoostRegressorStrategy(param))
    model.fit(data)
    prediction = model.predict(data)
    # Evaluate the regression performance with the RMSE on the validation set.
    # Note: scikit-learn 1.4+ deprecates `squared=False` in favor of
    # sklearn.metrics.root_mean_squared_error.
    rmse = sklearn.metrics.mean_squared_error(valid_y, prediction, squared=False)
    return rmse


study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=2, timeout=600)

print("Number of finished trials: ", len(study.trials))
print("Best trial:")
trial = study.best_trial

print("  Value: {}".format(trial.value))
print("  Params: ")
for key, value in trial.params.items():
    print("    {}: {}".format(key, value))

# Write the results to a CSV file, creating the output folder if needed.
os.makedirs("./hyper_params_results", exist_ok=True)
with open("./hyper_params_results/optuna_xgboost_regression_results.csv", "w") as f:
    f.write("params,value\n")
    for key, value in trial.params.items():
        f.write("{},{}\n".format(key, value))
    f.write("value,{}\n".format(trial.value))
```

```text
[I 2022-08-07 15:25:45,501] A new study created in memory with name: no-name-e72e6130-23ef-48c3-a007-247cd1c54fe3
<class 'thesis_package.aimodels.Context'>
[I 2022-08-07 15:26:32,622] Trial 0 finished with value: 0.49984508752822876 and parameters: {'booster': 'gbtree', 'lambda': 1.8406093472937077e-08, 'alpha': 0.07293031730603843, 'subsample': 0.4131826128046079, 'colsample_bytree': 0.3088078275355107, 'max_depth': 9, 'min_child_weight': 9, 'eta': 3.099182894861593e-06, 'gamma': 0.3783100528550611, 'grow_policy': 'lossguide'}. Best is trial 0 with value: 0.49984508752822876.
<class 'thesis_package.aimodels.Context'>
[I 2022-08-07 15:27:24,347] Trial 1 finished with value: 0.43325719237327576 and parameters: {'booster': 'gbtree', 'lambda': 1.9818016585265665e-07, 'alpha': 2.4424849536649444e-06, 'subsample': 0.3742400184110838, 'colsample_bytree': 0.22126096591807515, 'max_depth': 7, 'min_child_weight': 7, 'eta': 0.00143186761836147, 'gamma': 3.8016446963068855e-05, 'grow_policy': 'lossguide'}. Best is trial 1 with value: 0.43325719237327576.
Number of finished trials:  2
Best trial:
  Value: 0.43325719237327576
  Params: 
    booster: gbtree
    lambda: 1.9818016585265665e-07
    alpha: 2.4424849536649444e-06
    subsample: 0.3742400184110838
    colsample_bytree: 0.22126096591807515
    max_depth: 7
    min_child_weight: 7
    eta: 0.00143186761836147
    gamma: 3.8016446963068855e-05
    grow_policy: lossguide
```
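
Several ranges in the example above, such as `lambda` and `eta`, span many orders of magnitude and are sampled with `log=True`. The plain-Python sketch below illustrates what sampling on a log scale means: the logarithm of the value is drawn uniformly, so each decade of the range is equally likely. The helper `sample_log_uniform` is hypothetical and only mimics the idea; it is not Optuna's implementation.

```python
import math
import random


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)
low, high = 1e-8, 1.0
n_samples = 10_000

log_samples = [sample_log_uniform(rng, low, high) for _ in range(n_samples)]
linear_samples = [rng.uniform(low, high) for _ in range(n_samples)]

# Fraction of draws that fall below 1e-4, the midpoint of the range in log space.
log_share = sum(value < 1e-4 for value in log_samples) / n_samples
linear_share = sum(value < 1e-4 for value in linear_samples) / n_samples
print(f"log-scale sampling:    {log_share:.3f} of draws below 1e-4")
print(f"linear-scale sampling: {linear_share:.3f} of draws below 1e-4")
```

With linear sampling, almost no draws explore very small values, which is why a log scale is a common choice for regularization weights and learning rates.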
