# A Quick Introduction to Optuna (https://optuna.org/)

This Jupyter notebook goes through the basic usage of Optuna.

- Install Optuna
- Write a training algorithm that involves hyperparameters
  - Read train/valid data
  - Define and train model
  - Evaluate model
- Use Optuna to tune the hyperparameters (hyperparameter optimization, HPO)
- Visualize HPO

## Install `optuna`

Optuna can be installed via `pip` or `conda`.

In [None]:
!pip install --quiet optuna

In [None]:
import optuna

optuna.__version__

## Optimize Hyperparameters

### Define a simple scikit-learn model

We start with a simple random forest model to classify flowers in the Iris dataset. We define a function called `objective` that encapsulates the whole training process and outputs the accuracy of the model.

In [None]:
import sklearn.datasets
import sklearn.ensemble
import sklearn.model_selection

def objective():
    iris = sklearn.datasets.load_iris()  # Prepare the data.
    
    # Which hyperparameters matter most for a random forest (an ensemble of decision trees)?
    # The number of trees (n_estimators) and the depth of each tree (max_depth).
    clf = sklearn.ensemble.RandomForestClassifier(
        n_estimators=5, max_depth=3)  # Define the model.

    # cv: number of cross-validation folds
    return sklearn.model_selection.cross_val_score(
        clf, iris.data, iris.target, n_jobs=-1, cv=3).mean()  # Train and evaluate the model.

print('Accuracy: {}'.format(objective()))
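
The accuracy printed above is the mean over the cross-validation folds. As a minimal sketch of what sits behind that mean, the snippet below prints the score of each fold separately. The `random_state=0` argument is an addition made only so the sketch is repeatable; it is not part of the original `objective`.

```python
import sklearn.datasets
import sklearn.ensemble
import sklearn.model_selection

iris = sklearn.datasets.load_iris()

# random_state is fixed only to make this sketch repeatable (not in the original objective).
clf = sklearn.ensemble.RandomForestClassifier(
    n_estimators=5, max_depth=3, random_state=0)

# cross_val_score returns one accuracy per fold; cv=3 gives three train/validation splits.
fold_scores = sklearn.model_selection.cross_val_score(
    clf, iris.data, iris.target, cv=3)

print('Per-fold accuracy: {}'.format(fold_scores))
print('Mean accuracy: {:.3f}'.format(fold_scores.mean()))
```

A large spread between folds suggests that a single mean is a noisy target for the optimizer, in which case more folds may help.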

### Optimize hyperparameters of the model

The hyperparameters of the above algorithm are `n_estimators` and `max_depth`, for which we can try different values to see whether the model accuracy improves. The `objective` function is modified to accept a `trial` object, whose `suggest_*` methods sample hyperparameter values. We then create a study to run the hyperparameter optimization and finally read the best hyperparameters.

In [None]:
import optuna

def objective(trial):
    iris = sklearn.datasets.load_iris()
    
    # Define the search space.
    n_estimators = trial.suggest_int('n_estimators', 2, 20)  # integer in [2, 20]
    max_depth = int(trial.suggest_float('max_depth', 1, 32, log=True))  # log-scale float, truncated to an integer
    
    clf = sklearn.ensemble.RandomForestClassifier(
        n_estimators=n_estimators, max_depth=max_depth)
    
    return sklearn.model_selection.cross_val_score(
        clf, iris.data, iris.target, n_jobs=-1, cv=3).mean()

study = optuna.create_study(direction='maximize')  # 'maximize' for accuracy; use 'minimize' for a loss
study.optimize(objective, n_trials=100)  # 100 trials; an objective that trained for 20 epochs would run 20 * 100 epochs in total

trial = study.best_trial

print('Accuracy: {}'.format(trial.value))
print("Best hyperparameters: {}".format(trial.params))
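
The `log=True` argument in the search space above asks for sampling on a log scale, so small depths are tried about as often as large ones. The following is a rough sketch of that idea using only the standard library. It is not Optuna's internal implementation, and the helper name `sample_log_uniform` is hypothetical.

```python
import math
import random

def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [sample_log_uniform(1, 32, rng) for _ in range(10000)]

# On a log scale, about half of the samples fall below the geometric mean sqrt(1 * 32).
geometric_mean = math.sqrt(1 * 32)
share_below = sum(s < geometric_mean for s in samples) / len(samples)
print('Share of samples below {:.2f}: {:.2f}'.format(geometric_mean, share_below))

# Truncating with int(), as the objective above does, turns each sample into a tree depth.
depths = [int(s) for s in samples]
print('Smallest and largest depth: {} {}'.format(min(depths), max(depths)))
```

With uniform sampling on a linear scale, only about 15% of the draws would land below 5.66, so shallow trees would rarely be tried.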

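### Optuna and Bayesian optimization

Bayesian optimization estimates how uncertain it is about each region of the search space and weighs two kinds of candidates against each other: regions with low uncertainty and high expected performance, and regions with high uncertainty that may still hold something better. This lets it find good settings even with few trials, which is why it is used for hyperparameter optimization. Optuna's default sampler, TPE (Tree-structured Parzen Estimator), belongs to this family.

https://distill.pub/2020/bayesian-optimization/

It is possible to condition hyperparameters using Python `if` statements. For instance, we can include another classifier, a support vector machine, in our HPO and define hyperparameters specific to the random forest model and to the support vector machine.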

In [None]:
# The choice of model is itself a hyperparameter, e.g. RandomForest vs. SVM vs. neural network.
import sklearn.svm

def objective(trial):
    iris = sklearn.datasets.load_iris()

    classifier = trial.suggest_categorical('classifier', ['RandomForest', 'SVC']) # SVC: SVM Classifier
    
    if classifier == 'RandomForest':
        # integer-valued hyperparameters
        n_estimators = trial.suggest_int('n_estimators', 2, 20) 
        max_depth = int(trial.suggest_float('max_depth', 1, 32, log=True))

        # build classifier
        clf = sklearn.ensemble.RandomForestClassifier(
            n_estimators=n_estimators, max_depth=max_depth)
    else:
        # float-valued hyperparameter, sampled on a log scale
        c = trial.suggest_float('svc_c', 1e-10, 1e10, log=True)
        
        # build classifier
        clf = sklearn.svm.SVC(C=c, gamma='auto')

    return sklearn.model_selection.cross_val_score(
        clf, iris.data, iris.target, n_jobs=-1, cv=3).mean()

study = optuna.create_study(direction='maximize') # maximize accuracy
study.optimize(objective, n_trials=100)

trial = study.best_trial

print('Accuracy: {}'.format(trial.value))
print("Best hyperparameters: {}".format(trial.params))
# In this run SVC scores far higher than RandomForest, so later trials mostly sample SVC.

### Plotting the study

Plotting the optimization history of the study.

In [None]:
# Early trials show many low scores: uncertainty is still high, so the search is close to random and often misses.
# Later trials rarely score low because most of that uncertainty has been resolved.
optuna.visualization.plot_optimization_history(study)
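
The history plot shows each trial's objective value together with the best value found so far. For a study that maximizes, that best-so-far line is the running maximum of the trial values. A minimal sketch with hypothetical accuracies (not taken from an actual run):

```python
from itertools import accumulate

# Hypothetical accuracies in trial order; not taken from an actual run.
trial_values = [0.62, 0.91, 0.35, 0.94, 0.93, 0.96, 0.95, 0.96]

# For direction='maximize', the best value so far is the running maximum.
best_so_far = list(accumulate(trial_values, max))
print(best_so_far)
```

A low value such as the third one leaves the best-so-far line unchanged, which is why the line never goes down even when individual trials fail.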

Plotting the accuracies for each hyperparameter for each trial.

In [None]:
# In this study SVC outperforms RandomForest.
optuna.visualization.plot_slice(study)

Plotting the accuracy surface for the hyperparameters involved in the random forest model.

In [None]:
# Contour plot of the objective: the color darkens as accuracy rises from the lowest to the highest region.
# Related line of work: visualizing the loss landscape of neural networks.
# A smoother loss landscape has fewer local minima, which tends to make training more stable.
optuna.visualization.plot_contour(study, params=['n_estimators', 'max_depth'])