# Sequential Parameter Optimization: Using `scipy` Optimizers {#sec-scipy-optimizers}

By default, `spotPython` uses `differential_evolution` from the `scipy.optimize` package to search on the surrogate. Alternatively, any other optimizer from the `scipy.optimize` package can be used. This chapter describes how different optimizers from the `scipy.optimize` package can be used on the surrogate.
The optimization algorithms are documented at [https://docs.scipy.org/doc/scipy/reference/optimize.html](https://docs.scipy.org/doc/scipy/reference/optimize.html).


In [1]:
import numpy as np
from math import inf
from spotPython.fun.objectivefunctions import analytical
from spotPython.spot import spot
from scipy.optimize import shgo
from scipy.optimize import direct
from scipy.optimize import differential_evolution
from scipy.optimize import dual_annealing
from scipy.optimize import basinhopping
from spotPython.utils.init import fun_control_init, design_control_init, optimizer_control_init, surrogate_control_init

Seed set to 123


## The Objective Function Branin

The `spotPython` package provides several classes of objective functions. We will use an analytical objective function, i.e., a function that can be described by a (closed) formula. Here we will use the Branin function. The 2-dim Branin function is

$$
y = a \left(x_2 - b x_1^2 + c x_1 - r\right)^2 + s (1 - t) \cos(x_1) + s,
$$

where the values of $a$, $b$, $c$, $r$, $s$, and $t$ are $a = 1$, $b = 5.1 / (4\pi^2)$, $c = 5 / \pi$, $r = 6$, $s = 10$, and $t = 1 / (8\pi)$.

* It has three global minima:
    
    $f(x) = 0.397887$ at $(-\pi, 12.275)$, $(\pi, 2.275)$, and $(9.42478, 2.475)$.

* Input Domain: This function is usually evaluated on the square $x_1 \in [-5, 10]$, $x_2 \in [0, 15]$.
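
The formula and the three minimizers listed above can be checked numerically. The following is a minimal sketch that uses plain NumPy rather than the `spotPython` implementation; the function name `branin` and the variable names are chosen here for illustration only.

```python
import numpy as np


def branin(x1, x2):
    """Branin function with the constants given in the text."""
    a = 1.0
    b = 5.1 / (4.0 * np.pi**2)
    c = 5.0 / np.pi
    r = 6.0
    s = 10.0
    t = 1.0 / (8.0 * np.pi)
    return a * (x2 - b * x1**2 + c * x1 - r) ** 2 + s * (1.0 - t) * np.cos(x1) + s


# The three global minimizers quoted in the text
minimizers = [(-np.pi, 12.275), (np.pi, 2.275), (9.42478, 2.475)]
values = [branin(x1, x2) for x1, x2 in minimizers]

for (x1, x2), y in zip(minimizers, values):
    print(f"f({x1:.5f}, {x2:.3f}) = {y:.6f}")
```

At each of these points the squared term vanishes and $\cos(x_1) = -1$, so the function value reduces to $s \cdot t = 10 / (8\pi) \approx 0.397887$.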


In [2]:
# Box constraints of the search space: x1 in [-5, 10], x2 in [0, 15]
lower = np.array([-5, 0])
upper = np.array([10, 15])
# Branin objective function from spotPython's analytical test functions
fun = analytical(seed=123).fun_branin

## The Optimizer

Differential Evolution (DE) from the `scipy.optimize` package, see [https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html#scipy.optimize.differential_evolution](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html#scipy.optimize.differential_evolution), is the default optimizer for the search on the surrogate.
Other optimizers from `scipy.optimize` that can be used in `spotPython`, see [https://docs.scipy.org/doc/scipy/reference/optimize.html#global-optimization](https://docs.scipy.org/doc/scipy/reference/optimize.html#global-optimization), are:

  * `dual_annealing`
  * `direct`
  * `shgo`
  * `basinhopping`

These optimizers can be selected as follows:

  `surrogate_control = {"model_optimizer": differential_evolution}`

As noted above, we will use `differential_evolution`. The optimizer is given a budget of `1000`. This value is passed to the `differential_evolution` method as the argument `maxiter` (int). It defines the maximum number of generations over which the entire differential evolution population is evolved, so it limits generations rather than individual function evaluations, see [https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html#scipy.optimize.differential_evolution](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html#scipy.optimize.differential_evolution).
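
To make the idea of a search on the surrogate concrete, the following sketch fits a cheap model to a few evaluations of the Branin function and then lets `differential_evolution` minimize the model instead of the true objective. It is only an illustration under stated assumptions: `scipy.interpolate.RBFInterpolator` stands in for the Kriging surrogate that `spotPython` builds internally, the 20-point random design is arbitrary, and the names `branin`, `surrogate_prediction`, and `x_next` are hypothetical.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.optimize import differential_evolution


def branin(X):
    """Branin function for an (n, 2) array of points."""
    X = np.atleast_2d(X)
    x1, x2 = X[:, 0], X[:, 1]
    b = 5.1 / (4.0 * np.pi**2)
    c = 5.0 / np.pi
    t = 1.0 / (8.0 * np.pi)
    return (x2 - b * x1**2 + c * x1 - 6.0) ** 2 + 10.0 * (1.0 - t) * np.cos(x1) + 10.0


lower = np.array([-5.0, 0.0])
upper = np.array([10.0, 15.0])

# Small random initial design, evaluated on the true objective
rng = np.random.default_rng(123)
X_design = rng.uniform(lower, upper, size=(20, 2))
y_design = branin(X_design)

# Stand-in surrogate (assumption: spotPython uses its own Kriging model here)
surrogate = RBFInterpolator(X_design, y_design)


def surrogate_prediction(x):
    """Scalar surrogate prediction at a single point x."""
    return float(surrogate(x.reshape(1, -1))[0])


# Search the cheap surrogate, not the true objective
result = differential_evolution(
    surrogate_prediction,
    bounds=list(zip(lower, upper)),
    maxiter=1000,  # maximum number of generations
    seed=123,
)
x_next = result.x
print("suggested next point:", x_next)
print("surrogate prediction:", result.fun, "true value:", branin(x_next)[0])
```

The point `x_next` is only as good as the surrogate. In a sequential procedure it would be evaluated on the true objective and added to the design before the surrogate is refitted.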


:::{.callout-note}
#### TensorBoard

Similar to the one-dimensional case, which is discussed in @sec-visualizing-tensorboard-01, we can use TensorBoard to monitor the progress of the optimization. We will use similar code; only the prefix is different:


In [3]:
fun_control=fun_control_init(
                    lower = lower,
                    upper = upper,
                    fun_evals = 20,
                    PREFIX = "04_DE_"
                    )
surrogate_control=surrogate_control_init(
                    n_theta=len(lower))

Seed set to 123


Created spot_tensorboard_path: runs/spot_logs/04_DE__p040025_2024-01-09_20-00-19 for SummaryWriter()


:::


In [4]:
spot_de = spot.Spot(fun=fun,
                    fun_control=fun_control,
                    surrogate_control=surrogate_control)
spot_de.run()

spotPython tuning: 3.8004580634289518 [######----] 55.00% 


spotPython tuning: 3.8004580634289518 [######----] 60.00% 


spotPython tuning: 3.158983526047736 [######----] 65.00% 


spotPython tuning: 3.1338444083542836 [#######---] 70.00% 


spotPython tuning: 2.917653611971259 [########--] 75.00% 


spotPython tuning: 0.40458354637529226 [########--] 80.00% 


spotPython tuning: 0.40458354637529226 [########--] 85.00% 


spotPython tuning: 0.3987528104146545 [#########-] 90.00% 


spotPython tuning: 0.3987528104146545 [##########] 95.00% 


spotPython tuning: 0.3987528104146545 [##########] 100.00% Done...



<spotPython.spot.spot.Spot at 0x2d353b050>

### TensorBoard

If the `PREFIX` argument in `fun_control_init()` is not `None` (as above, where `PREFIX` was set to `04_DE_`), we can start TensorBoard in the background with the following command:



```{raw}
tensorboard --logdir="./runs"
```



We can access the TensorBoard web server with the following URL:



```{raw}
http://localhost:6006/
```



The TensorBoard plot illustrates how `spotPython` can be used as a microscope for the internal mechanisms of the surrogate-based optimization process. Here, one important parameter, the activity parameter $\theta$ of the Kriging surrogate, is plotted against the number of optimization steps.

![TensorBoard visualization of the spotPython optimization process and the surrogate model.](figures_static/05_tensorboard_01.png){width="100%"}
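
To give some intuition for what $\theta$ controls, the sketch below evaluates the Gaussian correlation function $\exp\left(-\sum_j \theta_j (x_j - x'_j)^2\right)$ that is commonly used in Kriging, with one $\theta_j$ per input dimension. Whether the Kriging model in `spotPython` uses exactly this parametrization (for example, it may store $\theta$ on a logarithmic scale) is an assumption, and the function name `gaussian_correlation` is hypothetical.

```python
import numpy as np


def gaussian_correlation(x, x_prime, theta):
    """Correlation exp(-sum_j theta_j * (x_j - x'_j)^2) between two points."""
    x, x_prime, theta = map(np.asarray, (x, x_prime, theta))
    return float(np.exp(-np.sum(theta * (x - x_prime) ** 2)))


x = [0.0, 0.0]
x_shifted = [1.0, 0.0]  # differs from x only in the first dimension

# Small theta in dimension 1: the two points stay strongly correlated
corr_small_theta = gaussian_correlation(x, x_shifted, theta=[0.1, 0.1])
# Large theta in dimension 1: the correlation decays quickly along that axis
corr_large_theta = gaussian_correlation(x, x_shifted, theta=[10.0, 0.1])

print(corr_small_theta, corr_large_theta)
```

Under this form, a large $\theta_j$ means that the model output changes quickly along dimension $j$, which is why monitoring $\theta$ over the optimization steps can indicate which inputs the surrogate treats as important.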


## Print the Results


In [5]:
spot_de.print_results()

min y: 0.3987528104146545
x0: 3.14748607975711
x1: 2.2968413897617554


[['x0', 3.14748607975711], ['x1', 2.2968413897617554]]
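
The reported point can be compared with the known minimizers of the Branin function. The following sketch copies the values from the output above and computes the Euclidean distance to each of the three global minimizers; the variable names are chosen for illustration.

```python
import numpy as np

# Global minimizers of the Branin function as listed earlier in this chapter
known_minimizers = np.array([
    [-np.pi, 12.275],
    [np.pi, 2.275],
    [9.42478, 2.475],
])

# Best point reported by print_results() above
x_best = np.array([3.14748607975711, 2.2968413897617554])

# Euclidean distance from the reported point to each known minimizer
distances = np.linalg.norm(known_minimizers - x_best, axis=1)
nearest = int(np.argmin(distances))

print("nearest minimizer:", known_minimizers[nearest])
print("distance:", distances[nearest])
```

The nearest minimizer is $(\pi, 2.275)$, so this run ended in the basin of the second of the three global minima.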

## Show the Progress


In [6]:
spot_de.plot_progress(log_y=True)

<Figure size 2700x1800 with 1 Axes>

In [7]:
spot_de.surrogate.plot()

<Figure size 2700x1800 with 6 Axes>

## Exercises


### `dual_annealing`

* Describe the optimization algorithm, see [scipy.optimize.dual_annealing](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.dual_annealing.html).
* Use the algorithm as an optimizer on the surrogate.

:::{.callout-tip}
##### Tip: Selecting the Optimizer for the Surrogate

We can run spotPython with the `dual_annealing` optimizer as follows:


In [8]:
spot_da = spot.Spot(fun=fun,
                    fun_control=fun_control,
                    optimizer=dual_annealing,
                    surrogate_control=surrogate_control)
spot_da.run()
spot_da.print_results()
spot_da.plot_progress(log_y=True)
spot_da.surrogate.plot()

spotPython tuning: 3.8004506180745494 [######----] 55.00% 


spotPython tuning: 3.8004506180745494 [######----] 60.00% 


spotPython tuning: 3.159026259576751 [######----] 65.00% 


spotPython tuning: 3.1343634581305215 [#######---] 70.00% 


spotPython tuning: 2.8965876127956935 [########--] 75.00% 


spotPython tuning: 0.41905758842574414 [########--] 80.00% 


spotPython tuning: 0.4020256285473973 [########--] 85.00% 


spotPython tuning: 0.39921903284476734 [#########-] 90.00% 


spotPython tuning: 0.39921903284476734 [##########] 95.00% 


min y: 0.39921903284476734
x0: 3.1508062869062012
x1: 2.2982248293554903


<Figure size 2700x1800 with 1 Axes>

<Figure size 2700x1800 with 6 Axes>

:::


### `direct`

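* Describe the optimization algorithm, see [scipy.optimize.direct](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.direct.html).
* Use the algorithm as an optimizer on the surrogate.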

:::{.callout-tip}
##### Tip: Selecting the Optimizer for the Surrogate

We can run spotPython with the `direct` optimizer as follows:


In [9]:
spot_di = spot.Spot(fun=fun,
                    fun_control=fun_control,
                    optimizer=direct,
                    surrogate_control=surrogate_control)
spot_di.run()
spot_di.print_results()
spot_di.plot_progress(log_y=True)
spot_di.surrogate.plot()

spotPython tuning: 3.78192024900577 [######----] 55.00% 


spotPython tuning: 3.78192024900577 [######----] 60.00% 


spotPython tuning: 3.1707843299428866 [######----] 65.00% 


spotPython tuning: 3.1253295886690413 [#######---] 70.00% 


spotPython tuning: 2.6673899789334117 [########--] 75.00% 


spotPython tuning: 0.48037397889434175 [########--] 80.00% 


spotPython tuning: 0.41903636460779303 [########--] 85.00% 


spotPython tuning: 0.4025655271878108 [#########-] 90.00% 


spotPython tuning: 0.4010733027914206 [##########] 95.00% 


spotPython tuning: 0.4010733027914206 [##########] 100.00% Done...



min y: 0.4010733027914206
x0: 3.1561499771376322
x1: 2.3102423411065387


<Figure size 2700x1800 with 1 Axes>

<Figure size 2700x1800 with 6 Axes>

:::

### `shgo`

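* Describe the optimization algorithm, see [scipy.optimize.shgo](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.shgo.html).
* Use the algorithm as an optimizer on the surrogate.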

:::{.callout-tip}
##### Tip: Selecting the Optimizer for the Surrogate

We can run spotPython with the `shgo` optimizer as follows:


In [10]:
spot_sh = spot.Spot(fun=fun,
                    fun_control=fun_control,
                    optimizer=shgo,
                    surrogate_control=surrogate_control)
spot_sh.run()
spot_sh.print_results()
spot_sh.plot_progress(log_y=True)
spot_sh.surrogate.plot()

spotPython tuning: 30.69410528614059 [######----] 55.00% 


spotPython tuning: 3.8670090115148232 [######----] 60.00% 


spotPython tuning: 3.8060289138706764 [######----] 65.00% 


spotPython tuning: 3.8060289138706764 [#######---] 70.00% 


spotPython tuning: 2.355799878130849 [########--] 75.00% 


spotPython tuning: 1.3905725351665694 [########--] 80.00% 


spotPython tuning: 1.3905725351665694 [########--] 85.00% 



Values in x were outside bounds during a minimize step, clipping to bounds



spotPython tuning: 1.3864169812254215 [#########-] 90.00% 


spotPython tuning: 1.3386472543572232 [##########] 95.00% 


spotPython tuning: 0.4315864524692383 [##########] 100.00% Done...



min y: 0.4315864524692383
x0: 3.087070911243333
x1: 2.457298105703394


<Figure size 2700x1800 with 1 Axes>

<Figure size 2700x1800 with 6 Axes>

:::



### `basinhopping`

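* Describe the optimization algorithm, see [scipy.optimize.basinhopping](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.basinhopping.html).
* Use the algorithm as an optimizer on the surrogate.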

:::{.callout-tip}
##### Tip: Selecting the Optimizer for the Surrogate

We can run spotPython with the `basinhopping` optimizer as follows:


In [11]:
spot_bh = spot.Spot(fun=fun,
                    fun_control=fun_control,
                    optimizer=basinhopping,
                    surrogate_control=surrogate_control)
spot_bh.run()
spot_bh.print_results()
spot_bh.plot_progress(log_y=True)
spot_bh.surrogate.plot()

spotPython tuning: 3.800453417609317 [######----] 55.00% 


spotPython tuning: 3.800453417609317 [######----] 60.00% 


spotPython tuning: 3.1590142664203835 [######----] 65.00% 


spotPython tuning: 3.1341409786773404 [#######---] 70.00% 


spotPython tuning: 2.8923708554157326 [########--] 75.00% 


spotPython tuning: 0.4484530687273516 [########--] 80.00% 


spotPython tuning: 0.404489521852879 [########--] 85.00% 


spotPython tuning: 0.39929273099035534 [#########-] 90.00% 


min y: 0.39929273099035534
x0: 3.152060522318641
x1: 2.2965036205182314


<Figure size 2700x1800 with 1 Axes>

<Figure size 2700x1800 with 6 Axes>

:::


### Performance Comparison

Compare the performance and run time of the five different optimizers:

  * `differential_evolution`
  * `dual_annealing`
  * `direct`
  * `shgo`
  * `basinhopping`

The Branin function has three global minima:

* $f(x) = 0.397887$  at 
  * $(-\pi, 12.275)$, 
  * $(\pi, 2.275)$, and 
  * $(9.42478, 2.475)$.    
* Which optima are found by the optimizers?
* Does the `seed` argument in `fun = analytical(seed=123).fun_branin` change this behavior?
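
As a starting point for this comparison, the sketch below times the five optimizers when they are applied directly to the Branin function. This is a baseline only and an assumption of this example: it bypasses `spotPython` and its surrogate entirely, uses the default settings of each `scipy.optimize` routine, and starts `basinhopping` from the centre of the domain. The helper names `branin`, `optimizers`, and `results` are hypothetical.

```python
import time

import numpy as np
from scipy.optimize import (basinhopping, differential_evolution, direct,
                            dual_annealing, shgo)


def branin(x):
    """Branin function for a single point x = (x1, x2)."""
    x1, x2 = x
    b = 5.1 / (4.0 * np.pi**2)
    c = 5.0 / np.pi
    t = 1.0 / (8.0 * np.pi)
    return (x2 - b * x1**2 + c * x1 - 6.0) ** 2 + 10.0 * (1.0 - t) * np.cos(x1) + 10.0


bounds = [(-5.0, 10.0), (0.0, 15.0)]
x0 = np.array([2.5, 7.5])  # centre of the domain, start point for basinhopping

# Each entry wraps one optimizer call so that all can be timed the same way
optimizers = {
    "differential_evolution": lambda: differential_evolution(branin, bounds, seed=123),
    "dual_annealing": lambda: dual_annealing(branin, bounds, seed=123),
    "direct": lambda: direct(branin, bounds),
    "shgo": lambda: shgo(branin, bounds),
    "basinhopping": lambda: basinhopping(
        branin, x0, seed=123,
        minimizer_kwargs={"method": "L-BFGS-B", "bounds": bounds}),
}

results = {}
for name, run in optimizers.items():
    start = time.perf_counter()
    res = run()
    elapsed = time.perf_counter() - start
    results[name] = (res.x, res.fun, elapsed)
    print(f"{name:24s} y = {res.fun:.6f}  x = {np.round(res.x, 4)}  time = {elapsed:.3f} s")
```

The same loop can be reused with the `spot.Spot` runs from the tips above in place of the direct optimizer calls, which gives the comparison on the surrogate that the exercise asks for.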

## Jupyter Notebook

:::{.callout-note}

* The Jupyter Notebook of this lecture is available on GitHub in the [Hyperparameter-Tuning-Cookbook Repository](https://github.com/sequential-parameter-optimization/Hyperparameter-Tuning-Cookbook/blob/main/004_spot_sklearn_optimization.ipynb).

:::