<img src="https://github.com/pfnet-research/optuna-hands-on/blob/master/en/files/logo.jpg?raw=1"/>

Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning. It features an imperative, define-by-run style user API.

- GitHub: https://github.com/pfnet/optuna
- Documentation: https://optuna.readthedocs.io/en/stable/

This notebook describes the basic usage of Optuna through two simple optimization tasks: minimizing a quadratic function and tuning a linear regression model.

## Installation
First of all, install Optuna by running the following cell.

In [None]:
!pip install optuna

## Basic Usage
Below is the basic usage of Optuna. You can start an optimization task immediately by filling in the following template with your machine learning logic and the number of trials.

```python
import optuna

def objective(trial):  # `trial` is an object passed by Optuna.
    evaluation_score = some_machine_learning_logic(trial)  # Write your machine learning logic here.
    return evaluation_score  # Return the evaluation score of the trained model.

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=N_TRIALS)  # Specify the number of trials. 
```

## Optimize Quadratic Function
Before optimizing a machine learning model, let's see how Optuna solves a very simple task that minimizes the output of $f(x) = (x - 2)^2$.
Although the answer is obviously $x = 2$, where $f(x) = 0$, Optuna has no knowledge of the function's form and has to find the minimum by trial and error.
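
For reference, the analytic minimum follows from setting the derivative to zero:

$$
f'(x) = 2(x - 2) = 0 \;\Longrightarrow\; x = 2, \qquad f(2) = 0, \qquad f''(x) = 2 > 0.
$$

Because the second derivative is positive everywhere, $x = 2$ is the global minimum. Optuna is expected to approach this value numerically.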

In [None]:
import optuna  # Remember to install optuna with `!pip install optuna` first.

def objective(trial):
    x = trial.suggest_uniform('x', -100, 100)
    return (x - 2) ** 2

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=100)

After executing the cell above, you should see 100 lines of execution log. Optuna calls `objective` 100 times with different values of `x`, where the range of `x` is specified as $[-100, 100)$ by `trial.suggest_uniform('x', -100, 100)`. The `trial` object passed by Optuna corresponds to a single call of `objective` and provides the interface for obtaining the next hyperparameter value to try.

Note that `objective` is a black-box function for Optuna: the library only observes the input `x` and the output of the function. It gradually steers `x` toward better values using an internal Bayesian optimization algorithm (the default sampler is the Tree-structured Parzen Estimator, TPE).
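
To make the black-box idea concrete, here is a minimal sketch of the simplest possible optimizer of this kind, plain random search, written with the standard library only. This is not Optuna's algorithm; the helper name `random_search` and the fixed seed are assumptions made for illustration. The point is that the loop only ever sees an input and the returned value.

```python
import random

def objective(x):
    # Same quadratic as above; the search loop never looks inside it.
    return (x - 2) ** 2

def random_search(func, low, high, n_trials, seed=0):
    # Hypothetical helper: samples x uniformly and remembers the best result.
    rng = random.Random(seed)
    best_x, best_value = None, float('inf')
    for _ in range(n_trials):
        x = rng.uniform(low, high)
        value = func(x)  # The only information the optimizer receives.
        if value < best_value:
            best_x, best_value = x, value
    return best_x, best_value

best_x, best_value = random_search(objective, -100, 100, n_trials=100)
print('Best x: ' + str(best_x))
print('Best value: ' + str(best_value))
```

A Bayesian method differs from this baseline in that it uses the history of past trials to decide where to sample next, instead of sampling blindly.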

You can access the best result with `study.best_value` and `study.best_params`. The best value should be near $0$ and the best parameter near $x = 2$.

In [None]:
print('Minimum objective value: ' + str(study.best_value))
print('Best parameter: ' + str(study.best_params))

In summary, you need the following steps to set up the optimization.

- Define the objective function that calculates the minimization/maximization target.
- Inside the objective function, set the hyperparameters to be optimized with `suggest` methods.
- Instantiate the `study` object.
- Start the optimization with `study.optimize`, specifying the number of trials with `n_trials`.

## Optimize Machine Learning Models

Let's optimize the following machine learning logic, where a linear regression model (Lasso) is trained on the Boston Housing dataset.

In [None]:
import sklearn.datasets
import sklearn.linear_model
import sklearn.metrics
import sklearn.model_selection

# hyperparameter setting
alpha = 1.0

# data loading and train-test split
# (load_boston was removed in scikit-learn 1.2, so this cell needs an older version)
X, y = sklearn.datasets.load_boston(return_X_y=True)
X_train, X_val, y_train, y_val = sklearn.model_selection.train_test_split(X, y, random_state=0)

# model training and evaluation
model = sklearn.linear_model.Lasso(alpha=alpha)
model.fit(X_train, y_train)
y_pred = model.predict(X_val)
error = sklearn.metrics.mean_squared_error(y_val, y_pred)

# output: evaluation score
print('Mean squared error: ' + str(error))

The performance of Lasso regression is sensitive to the L1 regularization constant, `alpha`, and searching for an appropriate value by hand is tedious. With Optuna, you can search for `alpha` as follows. Note that you only need to wrap the machine learning logic of the previous cell in `objective` and set up a `study` object.

In [None]:
import optuna
import sklearn.datasets
import sklearn.linear_model
import sklearn.metrics
import sklearn.model_selection

def objective(trial):
    # hyperparameter setting
    alpha = trial.suggest_uniform('alpha', 0.0, 2.0)
    
    # data loading and train-test split
    # (load_boston was removed in scikit-learn 1.2, so this cell needs an older version)
    X, y = sklearn.datasets.load_boston(return_X_y=True)
    X_train, X_val, y_train, y_val = sklearn.model_selection.train_test_split(X, y, random_state=0)
    
    # model training and evaluation
    model = sklearn.linear_model.Lasso(alpha=alpha)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_val)
    error = sklearn.metrics.mean_squared_error(y_val, y_pred)

    # output: evaluation score
    return error

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=20)

Let's see the best result among 20 trials.

In [None]:
print('Minimum mean squared error: ' + str(study.best_value))
print('Best parameter: ' + str(study.best_params))

To access the results of all trials, you can use `study.trials_dataframe()`, which returns the details of the trials as a pandas DataFrame.

In [None]:
study.trials_dataframe()

## Imperative Interface: Search Conditional Hyperparameters

Optuna handles conditional hyperparameters through its imperative (define-by-run) interface.
Suppose that you are wondering which regularization method is better: `Ridge` or `Lasso`. You also want to optimize the regularization constant of each method.
In this case, you have three hyperparameters to be optimized.

- `regression_method`: `'ridge'` or `'lasso'`
- `ridge_alpha`: the regularization constant of `ridge`
- `lasso_alpha`: the regularization constant of `lasso`

Note that `ridge_alpha` and `lasso_alpha` are conditional hyperparameters:
`ridge_alpha` appears in the search space only when `regression_method` is `ridge`, and `lasso_alpha` appears only when `regression_method` is `lasso`.
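
To see what "conditional" means in a define-by-run setting, here is a toy sketch that uses only the standard library. `ToyTrial` is a hypothetical stand-in for Optuna's trial object that samples at random and records each suggestion; it does not reflect how Optuna is implemented. The set of recorded parameters depends on which branch of the `if` statement runs.

```python
import random

class ToyTrial:
    """Hypothetical stand-in for a trial: samples at random and records each suggestion."""

    def __init__(self, rng):
        self._rng = rng
        self.params = {}

    def suggest_categorical(self, name, choices):
        self.params[name] = self._rng.choice(choices)
        return self.params[name]

    def suggest_uniform(self, name, low, high):
        self.params[name] = self._rng.uniform(low, high)
        return self.params[name]

def define_search_space(trial):
    # The search space is built while the code runs: only one alpha is ever requested.
    method = trial.suggest_categorical('regression_method', ('ridge', 'lasso'))
    if method == 'ridge':
        trial.suggest_uniform('ridge_alpha', 0.0, 2.0)
    else:
        trial.suggest_uniform('lasso_alpha', 0.0, 2.0)
    return trial.params

rng = random.Random(0)
sampled = [define_search_space(ToyTrial(rng)) for _ in range(5)]
for params in sampled:
    print(params)
```

Each printed dictionary contains `regression_method` plus exactly one of the two regularization constants, never both.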

In [None]:
import optuna
import sklearn.datasets
import sklearn.linear_model
import sklearn.metrics
import sklearn.model_selection

def objective(trial):
    # hyperparameter setting
    regression_method = trial.suggest_categorical('regression_method', ('ridge', 'lasso'))
    if regression_method == 'ridge':
        ridge_alpha = trial.suggest_uniform('ridge_alpha', 0.0, 2.0)
        model = sklearn.linear_model.Ridge(alpha=ridge_alpha)
    else:
        lasso_alpha = trial.suggest_uniform('lasso_alpha', 0.0, 2.0)
        model = sklearn.linear_model.Lasso(alpha=lasso_alpha)
    
    # data loading and train-test split
    # (load_boston was removed in scikit-learn 1.2, so this cell needs an older version)
    X, y = sklearn.datasets.load_boston(return_X_y=True)
    X_train, X_val, y_train, y_val = sklearn.model_selection.train_test_split(X, y, random_state=0)

    # model training and evaluation
    model.fit(X_train, y_train)
    y_pred = model.predict(X_val)
    error = sklearn.metrics.mean_squared_error(y_val, y_pred)
  
    # output: evaluation score
    return error

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=20)

Let's see the optimization results.

In [None]:
print('Minimum mean squared error: ' + str(study.best_value))
print('Best parameter: ' + str(study.best_params))

study.trials_dataframe()

## Conclusion
This notebook summarized the basic usage of Optuna and its imperative interface. To optimize an ML model, you only need to define an objective function that contains the usual training and evaluation logic. See also the official [documentation](https://optuna.readthedocs.io/en/stable/index.html) and [tutorial](https://optuna.readthedocs.io/en/stable/tutorial/index.html) for more details.