<img src="https://github.com/pfnet-research/optuna-hands-on/blob/master/en/files/logo.jpg?raw=1"/>

Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning. It features an imperative, define-by-run style user API.

- GitHub: https://github.com/pfnet/optuna
- Documentation: https://optuna.readthedocs.io/en/stable/

This notebook describes the basic usage of Optuna through two simple optimization tasks: minimizing a quadratic function and tuning a linear regression model.

## Installation
First of all, install Optuna by running the following cell.

In [1]:
!pip install optuna


## Basic Usage
Below is the basic usage of Optuna. You can start an optimization task right away by filling in the following template with your machine learning logic and the number of trials.

```python
import optuna

def objective(trial):  # `trial` is an object passed by Optuna.
    evaluation_score = some_machine_learning_logic(trial)  # Write your machine learning logic here.
    return evaluation_score  # Return the evaluation score of the trained model.

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=N_TRIALS)  # Specify the number of trials. 
```

## Optimize Quadratic Function
Before optimizing a machine learning model, let's see how Optuna solves a very simple task that minimizes the output of $f(x) = (x - 2)^2$.
Although the answer is obviously $x = 2$, where $f(x) = 0$, Optuna has no knowledge of the function's form and has to find the minimum by trial and error.

In [2]:
import optuna  # Remember to install optuna with `!pip install optuna` first.

def objective(trial):
    x = trial.suggest_uniform('x', -100, 100)
    return (x - 2) ** 2

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=100)

[I 2021-05-11 12:16:20,254] A new study created in memory with name: no-name-1c3333e5-1d54-40d1-9c99-2ff940259462
[I 2021-05-11 12:16:20,261] Trial 0 finished with value: 130.9826012005561 and parameters: {'x': -9.444763046937936}. Best is trial 0 with value: 130.9826012005561.
[I 2021-05-11 12:16:20,264] Trial 1 finished with value: 117.55878805560563 and parameters: {'x': 12.842453046041086}. Best is trial 1 with value: 117.55878805560563.
[I 2021-05-11 12:16:20,266] Trial 2 finished with value: 9031.231279042751 and parameters: {'x': -93.03279054643588}. Best is trial 1 with value: 117.55878805560563.
[I 2021-05-11 12:16:20,269] Trial 3 finished with value: 8736.81288303605 and parameters: {'x': -91.47091998603656}. Best is trial 1 with value: 117.55878805560563.
[I 2021-05-11 12:16:20,273] Trial 4 finished with value: 856.499727969174 and parameters: {'x': 31.266016605769465}. Best is trial 1 with value: 117.

Executing the cell above prints one log line per trial, 100 in total. Optuna calls `objective` 100 times, changing the value of `x` each time; the range of `x` is specified as $[-100, 100)$ by `trial.suggest_uniform('x', -100, 100)`. The `trial` object is passed in by Optuna, corresponds to a single call of `objective`, and provides the interface for obtaining the next hyperparameter value to try. (In Optuna 3.0 and later, `suggest_uniform` is deprecated in favor of `trial.suggest_float`.)

Note that `objective` is a black-box function to Optuna: the library observes only the input `x` and the output of the function. It gradually improves `x` using its sampling algorithm, which by default is the Tree-structured Parzen Estimator (TPE), a form of Bayesian optimization.

You can access the best result with `study.best_value` and `study.best_params`; the best `x` should be near $2$.
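As a rough illustration of what "black-box" means, here is a minimal sketch of the simplest possible black-box optimizer, plain random search, written with the standard library only. This is not the algorithm Optuna uses; the function name `random_search` and its parameters are hypothetical and exist only for this example. The point is that the optimizer never inspects `f`, it only feeds inputs in and reads outputs back.

```python
import random


def f(x):
    # The black-box function: the optimizer never looks inside it.
    return (x - 2) ** 2


def random_search(func, low, high, n_trials, seed=0):
    """Minimize `func` by trying uniformly random inputs (not Optuna's algorithm)."""
    rng = random.Random(seed)
    best_x, best_value = None, float('inf')
    for _ in range(n_trials):
        x = rng.uniform(low, high)  # choose an input
        value = func(x)             # observe only the output
        if value < best_value:
            best_x, best_value = x, value
    return best_x, best_value


best_x, best_value = random_search(f, -100, 100, n_trials=100)
print(best_x, best_value)
```

Random search ignores the history of past trials when choosing the next input, whereas Optuna's samplers use that history to concentrate on promising regions.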

In [3]:
print('Minimum objective value: ' + str(study.best_value))
print('Best parameter: ' + str(study.best_params))

Minimum objective value: 0.0037056432530147457
Best parameter: {'x': 1.9391260051170063}


In summary, you need the following steps to set up the optimization.

- Define the objective function that calculates the minimization/maximization target.
- Inside the objective function, set the hyperparameters to be optimized with `suggest` methods.
- Instantiate the `study` object.
- Start the optimization with `study.optimize`, specifying the number of trials with `n_trials`.

## Optimize Machine Learning Models

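Let's optimize the following machine learning logic, in which a linear regression model (Lasso) is trained on the Boston Housing dataset. Note that `load_boston` was removed in scikit-learn 1.2, so the cells below require an older scikit-learn version.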

In [5]:
import sklearn.datasets
import sklearn.linear_model
import sklearn.metrics
import sklearn.model_selection  # needed for train_test_split

# hyperparameter setting
alpha = 1.0

# data loading and train-test split
X, y = sklearn.datasets.load_boston(return_X_y=True)
X_train, X_val, y_train, y_val = sklearn.model_selection.train_test_split(X, y, random_state=0)

# model training and evaluation
model = sklearn.linear_model.Lasso(alpha=alpha)
model.fit(X_train, y_train)
y_pred = model.predict(X_val)
error = sklearn.metrics.mean_squared_error(y_val, y_pred)

# output: evaluation score
print('Mean squared error: ' + str(error))

Mean squared error: 36.63182007429979
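For reference, the `alpha` set in the cell above weights the L1 penalty in the objective that scikit-learn's `Lasso` minimizes. Written out, with $n$ training samples, feature matrix $X$, targets $y$, and coefficient vector $w$ (the intercept is omitted for brevity):

$$
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1
$$

A larger $\alpha$ shrinks more coefficients to exactly zero, while $\alpha = 0$ reduces the problem to ordinary least squares.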


The performance of Lasso regression is sensitive to the L1 regularization constant, `alpha`, and searching for an appropriate value by hand is tedious. With Optuna, you can search for `alpha` as follows. Note that you only need to wrap the machine learning logic from the previous cell in `objective` and set up a `study` object.

In [6]:
import optuna
import sklearn.datasets
import sklearn.linear_model
import sklearn.metrics
import sklearn.model_selection  # needed for train_test_split

def objective(trial):
    # hyperparameter setting
    alpha = trial.suggest_uniform('alpha', 0.0, 2.0)
    
    # data loading and train-test split
    X, y = sklearn.datasets.load_boston(return_X_y=True)
    X_train, X_val, y_train, y_val = sklearn.model_selection.train_test_split(X, y, random_state=0)
    
    # model training and evaluation
    model = sklearn.linear_model.Lasso(alpha=alpha)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_val)
    error = sklearn.metrics.mean_squared_error(y_val, y_pred)

    # output: evaluation score
    return error

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=20)

[I 2021-05-11 12:18:05,372] A new study created in memory with name: no-name-f3c4d668-34dc-437f-b280-63d6864205d7
[I 2021-05-11 12:18:05,383] Trial 0 finished with value: 32.42122355648804 and parameters: {'alpha': 0.1560741952063125}. Best is trial 0 with value: 32.42122355648804.
[I 2021-05-11 12:18:05,397] Trial 1 finished with value: 33.68211342686745 and parameters: {'alpha': 0.5530928636676258}. Best is trial 0 with value: 32.42122355648804.
[I 2021-05-11 12:18:05,408] Trial 2 finished with value: 30.095883002309282 and parameters: {'alpha': 0.010135813642137226}. Best is trial 2 with value: 30.095883002309282.
[I 2021-05-11 12:18:05,420] Trial 3 finished with value: 32.71802593550103 and parameters: {'alpha': 0.2984572330128732}. Best is trial 2 with value: 30.095883002309282.
[I 2021-05-11 12:18:05,432] Trial 4 finished with value: 41.24822872893029 and parameters: {'alpha': 1.979482703691161}. Best is tr

Let's see the best result among the 20 trials.

In [7]:
print('Minimum mean squared error: ' + str(study.best_value))
print('Best parameter: ' + str(study.best_params))

Minimum mean squared error: 30.095883002309282
Best parameter: {'alpha': 0.010135813642137226}


To access the results of all trials, you can use `study.trials_dataframe()`, which returns the details of the trials as a pandas DataFrame.

In [8]:
study.trials_dataframe()

| | number | value | datetime_start | datetime_complete | duration | params_alpha | state |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 32.421224 | 2021-05-11 12:18:05.375570 | 2021-05-11 12:18:05.383662 | 0 days 00:00:00.008092 | 0.156074 | COMPLETE |
| 1 | 1 | 33.682113 | 2021-05-11 12:18:05.386106 | 2021-05-11 12:18:05.396963 | 0 days 00:00:00.010857 | 0.553093 | COMPLETE |
| 2 | 2 | 30.095883 | 2021-05-11 12:18:05.398558 | 2021-05-11 12:18:05.408643 | 0 days 00:00:00.010085 | 0.010136 | COMPLETE |
| 3 | 3 | 32.718026 | 2021-05-11 12:18:05.410797 | 2021-05-11 12:18:05.420454 | 0 days 00:00:00.009657 | 0.298457 | COMPLETE |
| 4 | 4 | 41.248229 | 2021-05-11 12:18:05.422158 | 2021-05-11 12:18:05.431939 | 0 days 00:00:00.009781 | 1.979483 | COMPLETE |
| 5 | 5 | 40.485274 | 2021-05-11 12:18:05.433528 | 2021-05-11 12:18:05.443208 | 0 days 00:00:00.009680 | 1.775391 | COMPLETE |
| 6 | 6 | 33.653372 | 2021-05-11 12:18:05.445233 | 2021-05-11 12:18:05.455901 | 0 days 00:00:00.010668 | 0.547315 | COMPLETE |
| 7 | 7 | 39.754666 | 2021-05-11 12:18:05.457684 | 2021-05-11 12:18:05.468363 | 0 days 00:00:00.010679 | 1.557534 | COMPLETE |
| 8 | 8 | 40.154693 | 2021-05-11 12:18:05.471269 | 2021-05-11 12:18:05.483665 | 0 days 00:00:00.012396 | 1.679884 | COMPLETE |
| 9 | 9 | 38.824082 | 2021-05-11 12:18:05.486125 | 2021-05-11 12:18:05.495831 | 0 days 00:00:00.009706 | 1.234119 | COMPLETE |
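Because the result is an ordinary pandas DataFrame, the usual pandas operations apply. The sketch below rebuilds a small frame by hand from the first five rows shown above (three columns only), so it runs without Optuna; with a real study you would start from `study.trials_dataframe()` instead.

```python
import pandas as pd

# First five trials copied from the table above (subset of columns).
df = pd.DataFrame({
    'number': [0, 1, 2, 3, 4],
    'value': [32.421224, 33.682113, 30.095883, 32.718026, 41.248229],
    'params_alpha': [0.156074, 0.553093, 0.010136, 0.298457, 1.979483],
})

# Rank the trials from best (lowest error) to worst.
ranked = df.sort_values('value').reset_index(drop=True)
print(ranked)

# The best trial is the first row after sorting.
best = ranked.iloc[0]
print(int(best['number']), best['params_alpha'])  # 2 0.010136
```

Sorting by `value` gives a quick view of which regions of `alpha` perform well: in these five rows the smallest `alpha` has the lowest error.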


## Imperative Interface: Search Conditional Hyperparameters

Optuna handles conditional hyperparameters through its imperative (define-by-run) interface.
Suppose that you are wondering which regularization method is better: `Ridge` or `Lasso`. You also want to optimize the regularization constant of each method.
In this case, you have three hyperparameters to be optimized.

- `regression_method`: `'ridge'` or `'lasso'`
- `ridge_alpha`: the regularization constant of `Ridge`
- `lasso_alpha`: the regularization constant of `Lasso`

Note that `ridge_alpha` and `lasso_alpha` are conditional hyperparameters:
`ridge_alpha` appears in the search space only when `regression_method` is `'ridge'`, and `lasso_alpha` appears only when `regression_method` is `'lasso'`.

In [9]:
import optuna
import sklearn.datasets
import sklearn.linear_model
import sklearn.metrics
import sklearn.model_selection  # needed for train_test_split

def objective(trial):
    # hyperparameter setting
    regression_method = trial.suggest_categorical('regression_method', ('ridge', 'lasso'))
    if regression_method == 'ridge':
        ridge_alpha = trial.suggest_uniform('ridge_alpha', 0.0, 2.0)
        model = sklearn.linear_model.Ridge(alpha=ridge_alpha)
    else:
        lasso_alpha = trial.suggest_uniform('lasso_alpha', 0.0, 2.0)
        model = sklearn.linear_model.Lasso(alpha=lasso_alpha)
    
    # data loading and train-test split
    X, y = sklearn.datasets.load_boston(return_X_y=True)
    X_train, X_val, y_train, y_val = sklearn.model_selection.train_test_split(X, y, random_state=0)

    # model training and evaluation
    model.fit(X_train, y_train)
    y_pred = model.predict(X_val)
    error = sklearn.metrics.mean_squared_error(y_val, y_pred)
  
    # output: evaluation score
    return error

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=20)

[I 2021-05-11 12:18:36,153] A new study created in memory with name: no-name-d983cd16-9b35-4657-bf41-08968d040c84
[I 2021-05-11 12:18:36,198] Trial 0 finished with value: 30.744666831147693 and parameters: {'regression_method': 'ridge', 'ridge_alpha': 1.5943692531509752}. Best is trial 0 with value: 30.744666831147693.
[I 2021-05-11 12:18:36,213] Trial 1 finished with value: 40.86245935582949 and parameters: {'regression_method': 'lasso', 'lasso_alpha': 1.8787930328582632}. Best is trial 0 with value: 30.744666831147693.
[I 2021-05-11 12:18:36,239] Trial 2 finished with value: 32.95664981547832 and parameters: {'regression_method': 'lasso', 'lasso_alpha': 0.37760367272116535}. Best is trial 0 with value: 30.744666831147693.
[I 2021-05-11 12:18:36,255] Trial 3 finished with value: 33.62561568447709 and parameters: {'regression_method': 'lasso', 'lasso_alpha': 0.5417186682692967}. Best is trial 0 with value: 30.744666831147693.

Let's see the optimization results.

In [10]:
print('Minimum mean squared error: ' + str(study.best_value))
print('Best parameter: ' + str(study.best_params))

study.trials_dataframe()

Minimum mean squared error: 29.785789658362532
Best parameter: {'regression_method': 'ridge', 'ridge_alpha': 0.003617832783764343}


| | number | value | datetime_start | datetime_complete | duration | params_lasso_alpha | params_regression_method | params_ridge_alpha | state |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 30.744667 | 2021-05-11 12:18:36.156783 | 2021-05-11 12:18:36.197972 | 0 days 00:00:00.041189 | | ridge | 1.594369 | COMPLETE |
| 1 | 1 | 40.862459 | 2021-05-11 12:18:36.200466 | 2021-05-11 12:18:36.213495 | 0 days 00:00:00.013029 | 1.878793 | lasso | | COMPLETE |
| 2 | 2 | 32.95665 | 2021-05-11 12:18:36.215501 | 2021-05-11 12:18:36.238998 | 0 days 00:00:00.023497 | 0.377604 | lasso | | COMPLETE |
| 3 | 3 | 33.625616 | 2021-05-11 12:18:36.241734 | 2021-05-11 12:18:36.254582 | 0 days 00:00:00.012848 | 0.541719 | lasso | | COMPLETE |
| 4 | 4 | 40.577762 | 2021-05-11 12:18:36.257312 | 2021-05-11 12:18:36.269495 | 0 days 00:00:00.012183 | 1.801082 | lasso | | COMPLETE |
| 5 | 5 | 30.571829 | 2021-05-11 12:18:36.272239 | 2021-05-11 12:18:36.288398 | 0 days 00:00:00.016159 | | ridge | 1.143756 | COMPLETE |
| 6 | 6 | 34.560505 | 2021-05-11 12:18:36.291137 | 2021-05-11 12:18:36.300163 | 0 days 00:00:00.009026 | 0.712536 | lasso | | COMPLETE |
| 7 | 7 | 30.341155 | 2021-05-11 12:18:36.302340 | 2021-05-11 12:18:36.310927 | 0 days 00:00:00.008587 | | ridge | 0.700342 | COMPLETE |
| 8 | 8 | 30.595682 | 2021-05-11 12:18:36.313096 | 2021-05-11 12:18:36.321468 | 0 days 00:00:00.008372 | | ridge | 1.19864 | COMPLETE |
| 9 | 9 | 30.699234 | 2021-05-11 12:18:36.324244 | 2021-05-11 12:18:36.332549 | 0 days 00:00:00.008305 | | ridge | 1.462936 | COMPLETE |


## Conclusion
This notebook summarized the basic usage of Optuna and its imperative interface. To optimize an ML model, you only need to define an objective function that contains the usual training and evaluation logic. See also the official [documentation](https://optuna.readthedocs.io/en/stable/index.html) and [tutorial](https://optuna.readthedocs.io/en/stable/tutorial/index.html) for more details.