# Introduction to the *Optuna* framework

*Optuna* is an automatic hyperparameter optimization software framework, designed for machine learning. It implements an imperative *define-by-run* API, meaning that the user of *Optuna* can dynamically construct the search spaces for the hyperparameters.

In [None]:
# Optuna library import
import optuna

# Additional imports
import numpy
from bokeh.io import output_notebook
from bokeh.plotting import (figure, 
                            show)

# Disable default logging on standard error
optuna.logging.disable_default_handler()

# Redirect Bokeh output to the notebook
output_notebook()

## Optimization 101: minimizing a quadratic function

To get acquainted with the *Optuna* framework, let's use its hyperparameter optimization engine to minimize a simple, analytically defined quadratic function, *y = (x - 2)<sup>2</sup>*, whose minimum lies at *x = 2*.

*Optuna* is designed to optimize the outcome of a user-defined objective function, performing the number of iterations requested by the user:

In [None]:
# Objective function to be minimized
def objective(trial):
    x = trial.suggest_uniform('x', -20, 20)
    
    return (x - 2) ** 2

study = optuna.create_study()

First of all, let's clarify the terminology *Optuna* follows:
- **Study**: An optimization session, i.e., a set of trials.
- **Trial**: A single call of the objective function.
- **Parameter**: A variable whose value is to be optimized, e.g., *x* in the above example.

Our goal is to find the value of *x* that minimizes the output of the objective function. During the optimization, *Optuna* repeatedly invokes and evaluates the objective function with different values of *x*. The suggest APIs (e.g., `optuna.trial.Trial.suggest_uniform()`) are called inside `objective` to obtain the parameters for a trial. To start the optimization, we pass the objective function to the `optuna.study.Study.optimize()` method of the study object created above:

In [None]:
study.optimize(objective,
               n_trials=5)

Done! *Optuna* performed the requested number of trials, each time proposing a value for the parameter to be optimized and observing the output of the `objective` function.
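
To make the study / trial / parameter terminology concrete, the loop below sketches what an optimization session does, using plain random search from the standard library. It is only an illustration: `toy_objective` and `run_toy_study` are hypothetical names, and *Optuna*'s samplers choose new values far more cleverly than uniform random draws.

```python
import random

def toy_objective(x):
    # Same quadratic as above, minimized at x = 2
    return (x - 2) ** 2

def run_toy_study(n_trials, seed=0):
    rng = random.Random(seed)
    trials = []  # the "study": a record of every trial performed
    for number in range(n_trials):
        x = rng.uniform(-20, 20)   # the "parameter" proposed for this trial
        value = toy_objective(x)   # one "trial": a single objective call
        trials.append({'number': number, 'params': {'x': x}, 'value': value})
    # The best trial is the one with the lowest objective value
    best = min(trials, key=lambda t: t['value'])
    return trials, best

trials, best = run_toy_study(n_trials=5)
print(best['params'], best['value'])
```

The dictionary returned as `best` plays the role of the best trial of a study: it carries both the winning parameter and the objective value it produced.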

The `optuna.study.Study` class provides methods for inspecting the optimization outcome:
- `optuna.study.Study.best_params` returns the best hyperparameter values discovered so far.

In [None]:
study.best_params

- `optuna.study.Study.best_value` returns the value of the objective function computed with the best parameters found.

In [None]:
study.best_value

- `optuna.study.Study.best_trial` reports exhaustive information about the best trial performed so far.

In [None]:
study.best_trial

- `optuna.study.Study.trials` reports exhaustive information about all the trials the study has performed so far.

In [None]:
study.trials

- `optuna.study.Study.trials_dataframe()` provides the same information as `optuna.study.Study.trials` but exports it as a `pandas.DataFrame`.

In [None]:
study.trials_dataframe()

## Optimization process control

The optimization procedure offers many handles for controlling its behaviour.

### Controlling the amount of time for which the optimizer runs

Since real-world hyperparameter optimization procedures, especially those involving machine learning, may require a huge amount of time, `optuna.study.Study.optimize()` lets the user limit how long the optimization engine is allowed to run:
- The `n_trials` parameter sets the maximum number of trials the study object can do.
- The `timeout` parameter sets the maximum number of seconds for which the study can run.

If both parameters are `None`, the optimizer keeps running until it receives a termination signal such as Ctrl+C or `SIGTERM`. In the following snippet, the optimizer is forced to stop after 1 second.

In [None]:
def objective(trial):
    x = trial.suggest_uniform('x', -20, 20)
    
    return (x - 2) ** 2

study = optuna.create_study()
study.optimize(objective,
               timeout=1)
study.trials_dataframe()

### Maximizing objective functions

A study can solve either a minimization or a maximization problem: set the `direction` keyword argument when the study is created (the default is `'minimize'`):

In [None]:
def objective(trial):
    x = trial.suggest_uniform('x', -20, 20)
    
    return (x - 2) ** 2

study = optuna.create_study(direction='maximize')
study.optimize(objective,
               timeout=1)
print(f'x at the maximum: {study.best_params["x"]}')
study = optuna.create_study()
study.optimize(objective,
               timeout=1)
print(f'x at the minimum: {study.best_params["x"]}')
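
The maximization result can be checked by hand. Since $(x - 2)^2$ is convex, its maximum over the closed interval $[-20, 20]$ must lie at one of the two endpoints, so comparing them is enough. The helper name `quadratic` below is introduced only for this check.

```python
def quadratic(x):
    return (x - 2) ** 2

# A convex function on a closed interval peaks at an endpoint
endpoints = [-20, 20]
for x in endpoints:
    print(f'f({x}) = {quadratic(x)}')

x_max = max(endpoints, key=quadratic)
print(f'Maximum on [-20, 20] is at x = {x_max}')
```

A study created with `direction='maximize'` should therefore report a best *x* close to -20, while the minimizing study should approach *x = 2*.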

### Defining parameter spaces

*Optuna* offers five methods for declaring the search space of each hyperparameter to be optimized:
- `optuna.trial.Trial.suggest_categorical` for choosing the parameter from a list of categorical values.
- `optuna.trial.Trial.suggest_int` for integer parameters within a range.
- `optuna.trial.Trial.suggest_uniform` for real parameters within a range.
- `optuna.trial.Trial.suggest_loguniform` for real parameters within a range, sampled on a logarithmic scale.
- `optuna.trial.Trial.suggest_discrete_uniform` for real parameters within a range with a fixed step.

In *Optuna* 3.0 and later, the three real-valued variants are deprecated in favour of `optuna.trial.Trial.suggest_float`, which covers them through its `log` and `step` arguments.

Let's run the same quadratic minimization again, once with a real-valued *x* and once forcing *x* to be an integer:

In [None]:
def objective_real(trial):
    x = trial.suggest_uniform('x', -20, 20)
    
    return (x - 2) ** 2

def objective_int(trial):
    x = trial.suggest_int('x', -20, 20)
    
    return (x - 2) ** 2

study_real = optuna.create_study()
study_real.optimize(objective_real,
                    n_trials=15)
print(f'Best real-valued x: {study_real.best_params["x"]}')
study_int = optuna.create_study()
study_int.optimize(objective_int,
                   n_trials=15)
print(f'Best integer x    : {study_int.best_params["x"]}')

## Exercises: minimization of analytic functions

Now that the basic functionality of the *Optuna* library is clear, it is interesting to test the optimization engine on harder problems. In this section we look for the minimum of two analytical functions that are often used as benchmarks for general-purpose optimization engines: the *Rosenbrock* function and the *Branin RCOS* function. The former has a single global minimum, whereas the latter has three, so it serves as a test case for checking how *Optuna* behaves in spaces where several points minimize the objective function.

### Minimize the *Rosenbrock* function

*Rosenbrock*'s banana function is continuous, differentiable, non-separable, scalable and unimodal, and it is widely used as a test case for optimization software. It can be defined for an arbitrary number of input variables, but for this exercise we use its two-input form:
$$f(x_0, x_1) = (1 - x_0)^2 + 100(x_1 - x_0^2)^2$$
evaluated in the region such that
$$x_0 \in [-30,30], x_1 \in [-30,30]$$
where a single global minimum is located at
$$P(1, 1)$$ 
where 
$$f(P) = 0$$
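
That $P(1, 1)$ is the only global minimum follows directly from the formula: $f$ is a sum of two squares, so $f \ge 0$, and it vanishes only when both squares are zero:
$$1 - x_0 = 0, \qquad x_1 - x_0^2 = 0 \quad\Longrightarrow\quad x_0 = 1,\ x_1 = 1$$
The gradient confirms that this point is stationary:
$$\frac{\partial f}{\partial x_0} = -2(1 - x_0) - 400x_0(x_1 - x_0^2), \qquad \frac{\partial f}{\partial x_1} = 200(x_1 - x_0^2)$$
Both components are zero at $(1, 1)$.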

In [None]:
# Compute the value of Rosenbrock's function
def rosenbrock(x0, x1):
    return (1 - x0) ** 2 + 100 * (x1 - x0 ** 2) ** 2

# Define optimization cost function
def objective(trial):
    x0 = trial.suggest_uniform('x0', -30, 30)
    x1 = trial.suggest_uniform('x1', -30, 30)
    
    return rosenbrock(x0, x1)

# Run optimization
study = optuna.create_study()
study.optimize(objective,
               timeout=30)
print(f'Trials performed : {len(study.trials)}')
print(f'Best value       : {study.best_value}')
print(f'Best parameters  : x0={study.best_params["x0"]}, x1={study.best_params["x1"]}')

We can analyze how *Optuna* chooses the trial parameters by plotting each trial point $P_t(x_0, x_1)$ in the search space.

In [None]:
# Configure plot
plot = figure(width=900,
              height=450)
plot.circle(x=[trial.params['x0'] 
               for trial in study.trials],
            y=[trial.params['x1']
               for trial in study.trials],
            color='green',
            line_color='black',
            alpha=0.5,
            size=6,
            legend_label='Trials')
plot.circle(x=[1],
            y=[1],
            color='red',
            line_color='black',
            size=10,
            legend_label='Minimum')
plot.xaxis.axis_label='x0'
plot.yaxis.axis_label='x1'

# Plot
show(plot)

### Minimize the *Branin RCOS* function

The *Branin RCOS* function is a continuous, differentiable, non-separable, non-scalable, multimodal function. Its analytical formulation is:
$$f(x_0, x_1) = (x_1 - \frac{5.1x_0^2}{4\pi^2} + \frac{5x_0}{\pi} - 6)^2 + 10(1 - \frac{1}{8\pi})\cos({x_0}) + 10$$
evaluated in the region such that
$$x_0 \in [-5,10], x_1 \in [0,15]$$
where three global minima are located at
$$P_0(-\pi, 12.275), P_1(\pi, 2.275), P_2(3\pi, 2.475)$$ 
where 
$$f(P_0) = f(P_1) = f(P_2) = 0.3978873$$
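
The minima can be checked numerically without *Optuna*. The function reaches its lowest possible value when $\cos(x_0) = -1$ and the squared term is zero at the same time, so the sketch below takes $x_0 \in \{-\pi, \pi, 3\pi\}$ and solves the squared term for $x_1$. The helper names `branin_value` and `x1_zeroing_square` are introduced here for illustration only.

```python
import math

def branin_value(x0, x1):
    square = (x1 - 5.1 * x0 ** 2 / (4 * math.pi ** 2) + 5 * x0 / math.pi - 6) ** 2
    cosine = 10 * (1 - 1 / (8 * math.pi)) * math.cos(x0)
    return square + cosine + 10

def x1_zeroing_square(x0):
    # Value of x1 for which the squared term vanishes
    return 5.1 * x0 ** 2 / (4 * math.pi ** 2) - 5 * x0 / math.pi + 6

# cos(x0) = -1 at the odd multiples of pi inside [-5, 10]
minima = [(x0, x1_zeroing_square(x0))
          for x0 in (-math.pi, math.pi, 3 * math.pi)]
for x0, x1 in minima:
    print(f'x0={x0:.5f}, x1={x1:.3f}, f={branin_value(x0, x1):.7f}')
```

At these points only the constant part survives, so the common minimum value equals $10 / (8\pi)$.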

In [None]:
# Compute the value of the Branin RCOS function
def branin_rcos(x0, x1):
    f0 = (x1 - 5.1 * (x0 ** 2) / (4 * numpy.pi ** 2) + 5 * x0 / numpy.pi - 6) ** 2
    f1 = 10 * (1 - 1 / (8 * numpy.pi)) * numpy.cos(x0)
    return f0 + f1 + 10

# Define optimization cost function
def objective(trial):
    x0 = trial.suggest_uniform('x0', -5, 10)
    x1 = trial.suggest_uniform('x1', 0, 15)
    
    return branin_rcos(x0, x1)

# Run optimization
study = optuna.create_study()
study.optimize(objective,
               timeout=30)
print(f'Trials performed : {len(study.trials)}')
print(f'Best value       : {study.best_value}')
print(f'Best parameters  : x0={study.best_params["x0"]}, x1={study.best_params["x1"]}')

Again, let's analyze the minimization process by plotting each trial point $P_t(x_0, x_1)$ in the search space.

In [None]:
# Configure plot
plot = figure(width=900,
              height=450,
              x_range=(-6, 11),
              y_range=(-1, 16))
plot.circle(x=[trial.params['x0'] 
               for trial in study.trials],
            y=[trial.params['x1']
               for trial in study.trials],
            color='green',
            line_color='black',
            alpha=0.5,
            size=6,
            legend_label='Trials')
plot.circle(x=[-numpy.pi, numpy.pi, 3 * numpy.pi],
            y=[12.275, 2.275, 2.475],
            color='red',
            line_color='black',
            size=10,
            legend_label='Minima')
plot.xaxis.axis_label='x0'
plot.yaxis.axis_label='x1'

# Plot
show(plot)