# Tuning hyperparameters with Optuna

Our solvers expose many hyperparameters, and the optimization result can change significantly depending on their values.

In this notebook, we will see how we can make use of [Optuna](https://optuna.readthedocs.io/en/stable/) to tune them for a given problem (or family of problems).

Some work has been done in the library to ease this tuning:

- main hyperparameters of each solver have been identified, with default values and possible ranges registered;
- some utility methods have been coded to get default hyperparameters and to make use of optuna hyperparameters auto-suggestion with as little work as possible from the user.

After applying this to tune the hyperparameters of a solver, further examples will show that:

- we can also use optuna to select the solver class itself as another meta-hyperparameter;
- some solvers are meta-solvers whose hyperparameters include subsolvers, each with its own set of hyperparameters that can also be tuned.

To illustrate this, we will use the [coloring problem](https://en.wikipedia.org/wiki/Graph_coloring): it consists of coloring the vertices of a graph with the minimal number of colors, such that two adjacent vertices never have the same color.

<img src="https://upload.wikimedia.org/wikipedia/commons/9/90/Petersen_graph_3-coloring.svg" alt="Petersen graph 3-coloring.svg"  width="280">
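As a minimal illustration of the problem itself (independent of the library), the sketch below colors a toy 5-cycle with a simple greedy rule and checks that no two adjacent vertices share a color. The graph data and the helper names `greedy_coloring` and `is_valid_coloring` are made up for this example; a greedy pass gives a valid coloring but not necessarily a minimal one.

```python
from __future__ import annotations

# Toy graph (hypothetical data): a cycle on 5 vertices, as an adjacency mapping.
graph = {
    0: [1, 4],
    1: [0, 2],
    2: [1, 3],
    3: [2, 4],
    4: [3, 0],
}


def greedy_coloring(graph: dict[int, list[int]]) -> dict[int, int]:
    """Give each vertex the smallest color not used by its already colored neighbors."""
    colors: dict[int, int] = {}
    for vertex in graph:
        taken = {colors[n] for n in graph[vertex] if n in colors}
        color = 0
        while color in taken:
            color += 1
        colors[vertex] = color
    return colors


def is_valid_coloring(graph: dict[int, list[int]], colors: dict[int, int]) -> bool:
    """Check that the two endpoints of every edge have different colors."""
    return all(colors[u] != colors[v] for u in graph for v in graph[u])


coloring = greedy_coloring(graph)
nb_colors = len(set(coloring.values()))
print(coloring, "uses", nb_colors, "colors")
```

The number of colors used (`nb_colors`) is the quantity the solvers in this notebook try to minimize.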


## Prerequisites

Concerning the python kernel to use for this notebook:
- If running locally, be sure to use an environment with discrete-optimization, minizinc, and optuna (and optionally optuna-dashboard);
- If running on colab, the next cell does it for you;
- If running on binder, the environment should be ready.


In [None]:
# On Colab: install the library
on_colab = "google.colab" in str(get_ipython())
if on_colab:
    import os
    import sys  # noqa: avoid having this import removed by pycln

    !{sys.executable} -m pip install -U pip

    # uninstall google protobuf conflicting with ray and sb3
    ! pip uninstall -y protobuf

    # install dev version for dev doc, or release version for release doc
    !{sys.executable} -m pip install git+https://github.com/airbus/discrete-optimization@master#egg=discrete-optimization

    # install and configure minizinc
    !curl -o minizinc.AppImage -L https://github.com/MiniZinc/MiniZincIDE/releases/download/2.8.5/MiniZincIDE-2.8.5-x86_64.AppImage
    !chmod +x minizinc.AppImage
    !./minizinc.AppImage --appimage-extract
    os.environ["PATH"] = f"{os.getcwd()}/squashfs-root/usr/bin/:{os.environ['PATH']}"
    # use .get() so that this also works when LD_LIBRARY_PATH is not set yet
    os.environ["LD_LIBRARY_PATH"] = (
        f"{os.getcwd()}/squashfs-root/usr/lib/:{os.environ.get('LD_LIBRARY_PATH', '')}"
    )

    # install optuna and optuna-dashboard
    !{sys.executable} -m pip install optuna optuna-dashboard

### Imports

In [None]:
from __future__ import annotations

import logging
import socket

import nest_asyncio
import optuna
from optuna.storages import JournalFileStorage, JournalStorage
from optuna.trial import TrialState

from discrete_optimization.coloring.parser import get_data_available, parse_file
from discrete_optimization.coloring.solvers.cpsat import (
    CpSatColoringSolver,
    ModelingCpSat,
)
from discrete_optimization.coloring.solvers.greedy import NxGreedyColoringMethod
from discrete_optimization.coloring.solvers.lns_cp import LnsCpColoringSolver
from discrete_optimization.coloring.solvers_map import look_for_solver
from discrete_optimization.datasets import fetch_data_from_coursera
from discrete_optimization.generic_tools.callbacks.loggers import ObjectiveLogger
from discrete_optimization.generic_tools.callbacks.optuna import OptunaCallback
from discrete_optimization.generic_tools.cp_tools import CpSolverName, ParametersCp
from discrete_optimization.generic_tools.do_problem import ModeOptim
from discrete_optimization.generic_tools.sequential_metasolver import (
    SequentialMetasolver,
)

# patch asyncio so that applications using async functions can run in jupyter
nest_asyncio.apply()

# set logging level
logging.basicConfig(level=logging.WARNING, format="%(asctime)s:%(message)s")

### Download datasets

If not yet available, we import the datasets from [coursera](https://github.com/discreteoptimization/assignment).

In [None]:
needed_datasets = ["gc_70_9"]
download_needed = False
try:
    files_available_paths = get_data_available()
    for dataset in needed_datasets:
        if len([f for f in files_available_paths if dataset in f]) == 0:
            download_needed = True
            break
except Exception:  # e.g. the data directory does not exist yet
    download_needed = True

if download_needed:
    fetch_data_from_coursera()

In [None]:
file = [f for f in get_data_available() if "gc_70_9" in f][0]
problem = parse_file(file)
print(type(problem))

## Hyperparameters presentation

Each solver has some hyperparameters that can be tuned. In this section, we will see how to get the list of them.
As a reminder, the hyperparameters here are the keyword arguments gathered in a `kwargs` dictionary,
which can be used to initialize and run the solver as follows:

```python
solver = solver_class(problem=problem, **kwargs)
solver.init_model(**kwargs)
res = solver.solve(**kwargs)
```
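To make the pattern above concrete, here is a self-contained sketch with a hypothetical `ToySolver` (not a class of the library). It assumes that each method picks the keyword arguments it knows and ignores the others through `**kwargs`, which is what allows the same dictionary to be passed to all three calls.

```python
class ToySolver:
    """Hypothetical solver mimicking the three-step pattern shown above."""

    def __init__(self, problem, nb_restarts: int = 1, **kwargs):
        # only `nb_restarts` is used here, other keys are ignored
        self.problem = problem
        self.nb_restarts = nb_restarts

    def init_model(self, use_symmetry_breaking: bool = False, **kwargs):
        # only `use_symmetry_breaking` is used here
        self.use_symmetry_breaking = use_symmetry_breaking

    def solve(self, time_limit: float = 10.0, **kwargs):
        # only `time_limit` is used here; return what was actually taken into account
        return {
            "nb_restarts": self.nb_restarts,
            "use_symmetry_breaking": self.use_symmetry_breaking,
            "time_limit": time_limit,
        }


toy_kwargs = {"nb_restarts": 3, "use_symmetry_breaking": True, "time_limit": 5}
toy_solver = ToySolver(problem="toy problem", **toy_kwargs)
toy_solver.init_model(**toy_kwargs)
toy_res = toy_solver.solve(**toy_kwargs)
```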


Let us take a look at the solvers available for the chosen problem.

In [None]:
solver_classes = look_for_solver(problem)
solver_classes

### Example: CpSatColoringSolver
We can list the hyperparameters available for `CpSatColoringSolver` with their definition:

In [None]:
CpSatColoringSolver.hyperparameters

Note that there are several types of hyperparameters, which partially match how Optuna classifies hyperparameters (integer, float, and categorical). Here is a (non-exhaustive) list of the possible types:

- IntegerHyperparameter: taking integer values within a range;
- FloatHyperparameter: taking float values within a range;
- CategoricalHyperparameter: taking categorical values within a list of choices, that should be (for Optuna) either strings, booleans, integers, or floats;
- EnumHyperparameter: extension of categorical hyperparameters, taking value from an enumeration;
- SubBrickHyperparameter: whose value is a subbrick with its own hyperparameters,
  generally a subsolver for meta-solver iterating over a wrapped solver (like [LNS solvers](https://airbus.github.io/discrete-optimization/master/api/discrete_optimization.generic_tools.html#discrete_optimization.generic_tools.lns_cp.LNS_CP)),
  but also potentially another brick like a constraint handler, an initial solution provider, or a post-processor (also present in LNS solvers). This will be further explained in the [example on LNS](#LNS) below.
- ListHyperparameter: whose value is a list of values of a given hyperparameter template. This is used by the sequential metasolver which is chaining subsolvers (which can be generated by a subbrick hyperparameter), as shown [below](#Sequential-metasolver).
- Variants of ListHyperparameter exist to avoid repetitions in the list, or to ignore its order so that Optuna does not suggest a permutation of an already suggested list as a different one.

See the [documentation for `discrete_optimization.generic_tools.hyperparameters.hyperparameter` module](https://airbus.github.io/discrete-optimization/master/api/discrete_optimization.generic_tools.hyperparameters.html#module-discrete_optimization.generic_tools.hyperparameters.hyperparameter) for more details.
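The difference between a categorical and an enum hyperparameter can be sketched as follows. This is only an illustration under assumptions, not the library's implementation: `ToyModeling`, `ToyEnumHyperparameter`, and `FakeTrial` (a random stand-in for an Optuna trial) are hypothetical names. The idea is that the enumeration members are exposed to Optuna through their names (strings), and the suggested name is converted back into the member.

```python
import random
from enum import Enum


class ToyModeling(Enum):
    """Hypothetical enumeration used as a hyperparameter domain."""

    BINARY = 0
    INTEGER = 1


class FakeTrial:
    """Stand-in for an Optuna trial: picks a value at random and records it."""

    def __init__(self, seed: int = 0):
        self._rng = random.Random(seed)
        self.params = {}

    def suggest_categorical(self, name, choices):
        value = self._rng.choice(list(choices))
        self.params[name] = value
        return value


class ToyEnumHyperparameter:
    """Categorical hyperparameter whose values are members of an enumeration."""

    def __init__(self, name, enum):
        self.name = name
        self.enum = enum

    def suggest(self, trial):
        # expose only the member names, which are plain strings
        choice = trial.suggest_categorical(self.name, [m.name for m in self.enum])
        # convert the suggested string back into the enumeration member
        return self.enum[choice]


toy_trial = FakeTrial()
toy_modeling = ToyEnumHyperparameter("modeling", ToyModeling).suggest(toy_trial)
```

The trial records a string (`toy_trial.params["modeling"]`) while the solver would receive the enumeration member (`toy_modeling`).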

As it can be a bit confusing to have all these details, one can also list only their names:

In [None]:
CpSatColoringSolver.get_hyperparameters_names()

We can create a dictionary with their default values to be used to initialize a solver.

In [None]:
kwargs = CpSatColoringSolver.get_default_hyperparameters()
kwargs

In [None]:
solver = CpSatColoringSolver(problem=problem, **kwargs)
solver.init_model(**kwargs)

Before solving, we add a timeout parameter:

In [None]:
kwargs["time_limit"] = 20
res = solver.solve(**kwargs)
print(f"Found {len(res)} solution(s)")

### Meta-solver examples

#### LNS
Meta-solvers have sub-brick hyperparameters:

In [None]:
LnsCpColoringSolver.get_hyperparameters_by_name()

A SubBrickHyperparameter returns a `SubBrick` value which is a wrapper around
- a subclass of Hyperparametrizable,
- a dictionary `kwargs` to be used as keyword arguments for `__init__()`, `init_model()`, and `solve()`.
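Conceptually, such a wrapper can be pictured as a small record holding a class and its keyword arguments. The sketch below uses hypothetical names (`ToySubBrick`, `ToySubsolver`, `run_subbrick`) and is only meant to show how the pair is consumed; the actual `SubBrick` class of the library may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToySubBrick:
    """Hypothetical wrapper: a class plus the kwargs to build and run it."""

    cls: type
    kwargs: dict[str, Any] = field(default_factory=dict)


class ToySubsolver:
    """Hypothetical subsolver following the init / init_model / solve pattern."""

    def __init__(self, problem, **kwargs):
        self.problem = problem

    def init_model(self, **kwargs):
        self.model_ready = True

    def solve(self, max_iterations: int = 10, **kwargs):
        return (self.problem, max_iterations)


def run_subbrick(subbrick: ToySubBrick, problem):
    """Instantiate the wrapped class and run it, passing the same kwargs everywhere."""
    instance = subbrick.cls(problem=problem, **subbrick.kwargs)
    instance.init_model(**subbrick.kwargs)
    return instance.solve(**subbrick.kwargs)


toy_subbrick = ToySubBrick(cls=ToySubsolver, kwargs={"max_iterations": 3})
toy_result = run_subbrick(toy_subbrick, problem="toy problem")
```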

Let us have a look at a value suggested for the subsolver. Here we use the method `suggest_with_optuna()`, which relies on an optuna trial and its suggest methods `suggest_int()`, `suggest_float()`, and `suggest_categorical()`:

In [None]:
# select subsolver hyperparameter
hp_subsolver = LnsCpColoringSolver.get_hyperparameter("subsolver")

# generate an optuna trial
study = optuna.create_study()
trial = study.ask()

# get the suggested subbrick
subbrick = hp_subsolver.suggest_with_optuna(trial)
print("class:", subbrick.cls)
print("kwargs:", subbrick.kwargs)

We can use this subbrick to instantiate a coloring solver, initialize its model and solve the problem with it.

In [None]:
# update cp_solver_name to ensure using "chuffed"
# this is to avoid issues if you have not installed all possible minizinc backends on your machine
subbrick.kwargs["cp_solver_name"] = CpSolverName.CHUFFED

# init the subsolver with it
subsolver = subbrick.cls(problem=problem, **subbrick.kwargs)
subsolver.init_model(**subbrick.kwargs)

# solve with it
res = subsolver.solve(time_limit=5, **subbrick.kwargs)

#### Sequential metasolver
We take another example to show the use of a list hyperparameter.



In [None]:
SequentialMetasolver.get_hyperparameters_by_name()

The list hyperparameter generates a list of values that are suggested from copies of a specific hyperparameter, the hyperparameter template.
The list length has lower and upper bounds, between which Optuna chooses the actual length.
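The mechanism can be sketched in plain Python as follows: first a length is suggested within the bounds, then one value is suggested per position from the template. The naming scheme of the generated parameters (`next_subsolvers.length`, `next_subsolvers_0`, ...) and the `FakeTrial` stand-in are assumptions made for this illustration, not the library's actual conventions.

```python
import random


class FakeTrial:
    """Stand-in for an Optuna trial: draws values at random and records them."""

    def __init__(self, seed: int = 0):
        self._rng = random.Random(seed)
        self.params = {}

    def suggest_int(self, name, low, high):
        value = self._rng.randint(low, high)  # both bounds included
        self.params[name] = value
        return value

    def suggest_categorical(self, name, choices):
        value = self._rng.choice(list(choices))
        self.params[name] = value
        return value


def suggest_list(trial, name, template_choices, length_low, length_high):
    """Suggest a list length, then one value per position from the template choices."""
    length = trial.suggest_int(f"{name}.length", length_low, length_high)
    return [
        trial.suggest_categorical(f"{name}_{i}", template_choices)
        for i in range(length)
    ]


toy_choices = ["greedy", "cpsat", "lns"]  # hypothetical subsolver names
toy_list_trial = FakeTrial()
toy_subsolvers = suggest_list(
    toy_list_trial, "next_subsolvers", toy_choices, length_low=1, length_high=3
)
```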


In [None]:
next_subsolvers_hp = SequentialMetasolver.get_hyperparameter("next_subsolvers")
print("hyperparameter template:", next_subsolvers_hp.hyperparameter_template)
print("minimal number of subbricks:", next_subsolvers_hp.length_low)
print("maximal number of subbricks:", next_subsolvers_hp.length_high)

Here, `SequentialMetasolver` will chain several subbricks, warmstarting each of them with the result of the previous one.
As it is a generic solver, we still have to specify the choices for the hyperparameter template (the possible subsolver classes),
and we could also update the bounds on the list length according to the needs of the problem at hand.

See the [dedicated tutorial](./sequential_metasolver.ipynb) for more details.

## Example using Optuna

### Without discrete-optimization help

To use optuna, we need to define an `objective()` function that returns the objective value to optimize. The hyperparameters are defined and suggested by optuna on the fly,
through the methods of the optuna trial passed as argument.

In [None]:
parameters_cp = ParametersCp.default_cpsat()
time_limit = 20


def objective(trial: optuna.Trial) -> float:
    # make optuna suggest hyperparameters (and define them doing so)
    warmstart = trial.suggest_categorical(name="warmstart", choices=[True, False])
    value_sequence_chain = trial.suggest_categorical(
        name="value_sequence_chain", choices=[True, False]
    )
    used_variable = trial.suggest_categorical(
        name="used_variable", choices=[True, False]
    )
    symmetry_on_used = trial.suggest_categorical(
        name="symmetry_on_used", choices=[True, False]
    )
    modeling_str = trial.suggest_categorical(
        name="modeling", choices=[m.name for m in ModelingCpSat]
    )
    greedy_method_str = trial.suggest_categorical(
        name="greedy_method", choices=[m.name for m in NxGreedyColoringMethod]
    )

    # convert optuna values into proper format for d-o
    modeling = ModelingCpSat[modeling_str]
    greedy_method = NxGreedyColoringMethod[greedy_method_str]

    print(f"Launching trial {trial.number} with parameters: {trial.params}")

    # init solver
    kwargs = dict(
        warmstart=warmstart,
        value_sequence_chain=value_sequence_chain,
        used_variable=used_variable,
        symmetry_on_used=symmetry_on_used,
        modeling=modeling,
        greedy_method=greedy_method,
    )
    solver = CpSatColoringSolver(problem=problem, **kwargs)
    solver.init_model(**kwargs)

    # solve
    sol, fit = solver.solve(
        parameters_cp=parameters_cp, time_limit=time_limit, **kwargs
    ).get_best_solution_fit()

    return fit

Then we create an optuna study and optimize it. Here we choose a limited number of trials for practical reasons, but one should allow many more trials to explore the domain of the hyperparameters.

In [None]:
objective_register = problem.get_objective_register()
if objective_register.objective_sense == ModeOptim.MINIMIZATION:
    direction = "minimize"
else:
    direction = "maximize"

study = optuna.create_study(
    direction=direction,
)
study.optimize(objective, n_trials=4)

Some statistics on the study.

In [None]:
pruned_trials = study.get_trials(deepcopy=False, states=[TrialState.PRUNED])
complete_trials = study.get_trials(deepcopy=False, states=[TrialState.COMPLETE])
print("Study statistics: ")
print(f"  Number of finished trials: {len(study.trials)}")
print(f"  Number of pruned trials: {len(pruned_trials)}")
print(f"  Number of complete trials: {len(complete_trials)}")
print("")
print("Best trial:")
print(f"  value={study.best_trial.value}")
print(f"  params={study.best_trial.params}")

We can convert trials into a dataframe to visualize them.

In [None]:
df = study.trials_dataframe()
df.sort_values("value", ascending=False)

### Taking advantage of discrete-optimization integration of Optuna to choose the hyperparameters

Even though the use of optuna is quite easy, typing all hyperparameters can be tedious and prone to errors. 
And we have seen that some hyperparameters require conversion before being passed to the solver.

Discrete-optimization integrates some utility methods that handle it.
 - Each hyperparameter has a method `suggest_with_optuna()` that calls the appropriate optuna method, potentially with choices/ranges restrictions, and makes the conversion if needed.
 - Each solver has a method `suggest_hyperparameters_with_optuna()` that suggests directly all (or some) hyperparameters, with the options available for above methods.
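The second point can be pictured as a loop over the registered hyperparameters that dispatches to the right suggest method. The sketch below is a simplified illustration under assumptions: `ToyTunableSolver`, its registry format, the `names` option, and `FakeTrial` are hypothetical, and the real `suggest_hyperparameters_with_optuna()` offers more options than shown here.

```python
import random


class FakeTrial:
    """Stand-in for an Optuna trial: draws values at random and records them."""

    def __init__(self, seed: int = 0):
        self._rng = random.Random(seed)
        self.params = {}

    def _record(self, name, value):
        self.params[name] = value
        return value

    def suggest_int(self, name, low, high):
        return self._record(name, self._rng.randint(low, high))

    def suggest_float(self, name, low, high):
        return self._record(name, self._rng.uniform(low, high))

    def suggest_categorical(self, name, choices):
        return self._record(name, self._rng.choice(list(choices)))


class ToyTunableSolver:
    # hypothetical registry: name -> (kind, domain)
    hyperparameters = {
        "warmstart": ("categorical", [True, False]),
        "nb_workers": ("int", (1, 8)),
        "tolerance": ("float", (0.0, 1.0)),
    }

    @classmethod
    def suggest_hyperparameters(cls, trial, names=None):
        """Suggest all registered hyperparameters, or only those listed in `names`."""
        kwargs = {}
        for name, (kind, domain) in cls.hyperparameters.items():
            if names is not None and name not in names:
                continue
            if kind == "categorical":
                kwargs[name] = trial.suggest_categorical(name, domain)
            elif kind == "int":
                kwargs[name] = trial.suggest_int(name, *domain)
            else:
                kwargs[name] = trial.suggest_float(name, *domain)
        return kwargs


all_kwargs = ToyTunableSolver.suggest_hyperparameters(FakeTrial())
some_kwargs = ToyTunableSolver.suggest_hyperparameters(FakeTrial(), names=["warmstart"])
```

The returned dictionary can then be passed directly to the solver constructor, `init_model()`, and `solve()`.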

This leads to the simplified script below:

In [None]:
parameters_cp = ParametersCp.default_cpsat()
time_limit = 20


def objective(trial: optuna.Trial) -> float:
    # make optuna suggest hyperparameters (and define them doing so)
    kwargs = CpSatColoringSolver.suggest_hyperparameters_with_optuna(trial)

    print(f"Launching trial {trial.number} with parameters: {trial.params}")

    # init solver
    solver = CpSatColoringSolver(problem=problem, **kwargs)
    solver.init_model(**kwargs)

    # solve
    sol, fit = solver.solve(
        parameters_cp=parameters_cp, time_limit=time_limit, **kwargs
    ).get_best_solution_fit()

    return fit


study = optuna.create_study(
    direction=problem.get_optuna_study_direction(),
)
study.optimize(objective, n_trials=4)

In [None]:
df = study.trials_dataframe()
df.sort_values("value", ascending=False)

As we did not fix the seed, the result may differ from the previous study.

### Making Optuna prune unpromising trials and visualize intermediate values with optuna-dashboard

Optuna is also able to prune unpromising trials if we provide the intermediate objective values at the end of each optimization step.
We can achieve this easily by using a dedicated callback during the solve. See the [tutorials on callbacks](./callbacks.ipynb) for more information about how it works.

Moreover, we can make use of this reporting to see the study progress "live" with optuna-dashboard.
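To give an intuition of what pruning on intermediate values means, here is a simplified median rule written in plain Python: a running trial is stopped at a given step if its current value is worse than the median of the values reached by previous trials at that same step. This is only a sketch of the idea (Optuna's own median pruner has additional options such as warm-up steps), and the function name and data are made up.

```python
from statistics import median


def should_prune(current_value, step, previous_curves, minimize=True):
    """Return True if `current_value` is worse than the median of previous trials at `step`."""
    values_at_step = [curve[step] for curve in previous_curves if step in curve]
    if not values_at_step:
        # nothing to compare with: never prune
        return False
    reference = median(values_at_step)
    return current_value > reference if minimize else current_value < reference


# hypothetical intermediate values (step -> number of colors) of three earlier trials
previous_curves = [
    {0: 30, 1: 25, 2: 20},
    {0: 28, 1: 22, 2: 19},
    {0: 35, 1: 33, 2: 31},
]

prune_bad = should_prune(32, step=1, previous_curves=previous_curves)
prune_good = should_prune(21, step=1, previous_curves=previous_curves)
prune_unknown_step = should_prune(10, step=5, previous_curves=previous_curves)
```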

In [None]:
parameters_cp = ParametersCp.default_cpsat()
time_limit = 20


def objective(trial: optuna.Trial) -> float:
    # make optuna suggest hyperparameters (and define them doing so)
    kwargs = CpSatColoringSolver.suggest_hyperparameters_with_optuna(trial)

    print(f"Launching trial {trial.number} with parameters: {trial.params}")

    # init solver
    solver = CpSatColoringSolver(problem=problem, **kwargs)
    solver.init_model(**kwargs)

    # optuna callback
    callbacks = [
        OptunaCallback(trial=trial),
        ObjectiveLogger(
            step_verbosity_level=logging.WARNING
        ),  # here we set a warning level because `logging` has been set above to display only warning messages.
    ]

    # solve
    sol, fit = solver.solve(
        parameters_cp=parameters_cp,
        time_limit=time_limit,
        callbacks=callbacks,
        **kwargs,
    ).get_best_solution_fit()

    return fit

To allow visualizing the study (even during the optimization) with optuna-dashboard, we set a storage for the optuna study.
We choose a file-based storage, but this could also be a database. If you choose a file on NFS, you can launch parallel optuna instances to speed up the tuning.

If the study already exists (because you have already run this notebook, for instance), you can either:
- set the option `load_if_exists=True`, and the study will add the new trials to the already existing study (and thus use the knowledge of previous trials)
- change the name of the study to keep the previous results but not reuse them
- delete the study to overwrite it

In [None]:
optuna_journal_filepath = "optuna-journal.log"
study_name = "optuna-coloring-with-pruning"
overwrite = False

storage = JournalStorage(JournalFileStorage(optuna_journal_filepath))
if overwrite:
    try:
        optuna.delete_study(study_name=study_name, storage=storage)
    except KeyError:  # the study does not exist yet
        pass
    load_if_exists = False
else:
    load_if_exists = True

study = optuna.create_study(
    study_name=study_name,
    direction=problem.get_optuna_study_direction(),
    storage=storage,
    load_if_exists=load_if_exists,
)

While the study runs, we can watch the optimization progress thanks to optuna-dashboard with

    optuna-dashboard optuna-journal.log

The next cell does it according to your jupyter environment:
- if running locally, we need to install optuna-dashboard and run it (in a separate thread);
- if running on colab, we make use of `google.colab.output` as suggested [here](https://stackoverflow.com/a/76033378);
- if running on binder, we sadly did not succeed in using `jupyter-server-proxy` to access the served dashboard, as done for tensorboard [here](https://github.com/binder-examples/tensorboard).


In [None]:
on_colab = "google.colab" in str(get_ipython())  # running on colab?
on_binder = socket.gethostname().startswith(
    "jupyter-"
)  # running on binder? (not 100% sure but rather robust)


def start_optuna_dashboard(port=1234):
    import threading
    import time
    from wsgiref.simple_server import make_server

    from optuna_dashboard import wsgi

    app = wsgi(storage)
    httpd = make_server("localhost", port, app)
    thread = threading.Thread(target=httpd.serve_forever)
    thread.start()
    time.sleep(3)  # Wait until the server startup
    return port


if on_colab:
    port = start_optuna_dashboard()
    from google.colab import output

    print("Visit optuna-dashboard on:")
    output.serve_kernel_port_as_window(port, path="/dashboard/")

elif on_binder:
    print("Not yet working on binder...")
else:
    try:
        import optuna_dashboard  # nopycln: import
    except ImportError:
        !pip install optuna-dashboard
    port = start_optuna_dashboard()
    print(f"Visit optuna-dashboard on http://localhost:{port}/")

We set a greater number of trials to see pruning in action. (By default, optuna is using a [MedianPruner](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.pruners.MedianPruner.html).)

In [None]:
study.optimize(objective, n_trials=100)

## Full example selecting solver classes with their hyperparameters

If we see the main solver class as a categorical hyperparameter itself (as in the meta-solver examples), we can let Optuna choose it as well.
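A minimal sketch of this idea, with hypothetical solver classes and a random stand-in for the Optuna trial: the solver is first chosen by name among the candidates, then only the hyperparameters of the chosen class are suggested. Prefixing the parameter names with the solver name is an assumption made here to avoid clashes between solvers that share a hyperparameter name with different domains.

```python
import random


class FakeTrial:
    """Stand-in for an Optuna trial: picks values at random and records them."""

    def __init__(self, seed: int = 0):
        self._rng = random.Random(seed)
        self.params = {}

    def suggest_categorical(self, name, choices):
        value = self._rng.choice(list(choices))
        self.params[name] = value
        return value


class GreedyToy:
    """Hypothetical solver with one categorical hyperparameter."""

    hyperparameters = {"strategy": ["largest_first", "random"]}


class CpToy:
    """Hypothetical solver with two categorical hyperparameters."""

    hyperparameters = {"warmstart": [True, False], "modeling": ["BINARY", "INTEGER"]}


solvers_by_name = {cls.__name__: cls for cls in (GreedyToy, CpToy)}


def suggest_solver_and_kwargs(trial):
    """Choose the solver class first, then only the hyperparameters of that class."""
    solver_name = trial.suggest_categorical("solver", sorted(solvers_by_name))
    solver_cls = solvers_by_name[solver_name]
    kwargs = {
        name: trial.suggest_categorical(f"{solver_name}.{name}", choices)
        for name, choices in solver_cls.hyperparameters.items()
    }
    return solver_cls, kwargs


meta_trial = FakeTrial()
chosen_cls, chosen_kwargs = suggest_solver_and_kwargs(meta_trial)
```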

A full example can be found as a script in the repository that
- chooses the solving method
- chooses the related hyperparameters
- specifies some fixed parameters like timeout limits
- freezes some hyperparameters
- restricts the choices for some hyperparameters
- stores optuna results in a file
  - potentially distributed on NFS for parallel tuning
  - allowing real-time visualization with optuna-dashboard
- prunes unpromising trials according to the computation time (instead of steps)
  as we compare different solvers between them that have different notions of optimization step
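The last point can be illustrated as follows: instead of reporting the objective at each solver-specific step, the elapsed time is cut into buckets of fixed duration and the bucket index plays the role of the step, which makes different solvers comparable. This is a sketch under assumptions (the names `time_to_step` and `TimedReporter` are hypothetical, and the script mentioned below may proceed differently); with Optuna, the `report` callable would typically wrap `trial.report(value, step)`.

```python
def time_to_step(elapsed_seconds: float, report_period: float) -> int:
    """Map an elapsed time to the index of its time bucket."""
    return int(elapsed_seconds // report_period)


class TimedReporter:
    """Forward an objective value at most once per time bucket."""

    def __init__(self, report, report_period: float):
        self.report = report  # callable taking (value, step)
        self.report_period = report_period
        self.last_step = -1

    def on_new_value(self, value: float, elapsed_seconds: float) -> None:
        step = time_to_step(elapsed_seconds, self.report_period)
        if step > self.last_step:
            self.report(value, step)
            self.last_step = step


reports = []
reporter = TimedReporter(
    report=lambda value, step: reports.append((step, value)), report_period=5.0
)
# hypothetical (elapsed time, objective) pairs produced by a solver
for elapsed, value in [(0.4, 40), (1.2, 35), (6.0, 30), (7.5, 29), (16.1, 25)]:
    reporter.on_new_value(value, elapsed)
```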


This is "examples/coloring/optuna_full_example_all_solvers_timed_pruning.py" ([local link](../../examples/coloring/optuna_full_example_all_solvers_timed_pruning.py), [github link](https://github.com/airbus/discrete-optimization/tree/master/examples/coloring/optuna_full_example_all_solvers_timed_pruning.py)).

To make life easier for the user, all these features are wrapped into a utility function, `generic_optuna_experiment_monoproblem`, that creates and launches the optuna study. The aforementioned example actually uses it.

In [None]:
from discrete_optimization.generic_tools.optuna.utils import (  # noqa: avoid having this import removed by pycln
    generic_optuna_experiment_monoproblem,
)

generic_optuna_experiment_monoproblem?