In [1]:
import numpy as np
import pandas as pd

# Numerical optimization

This tutorial uses simple examples to show how to run an optimization with estimagic. More details on the topics covered here can be found in the [how-to guides](../how_to_guides/index.rst).


## Basic usage of `minimize`

In [2]:
from estimagic import minimize


def sphere(params):
    return params @ params


res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
)

res["solution_params"].round(5)

array([ 0., -0., -0., -0., -0.])

## `params` do not have to be vectors

In estimagic, params can be arbitrary [pytrees](https://jax.readthedocs.io/en/latest/pytrees.html). Examples are (nested) dictionaries of numbers, arrays and pandas objects.

In [3]:
def dict_sphere(params):
    return params["a"] ** 2 + params["b"] ** 2 + (params["c"] ** 2).sum()


res = minimize(
    criterion=dict_sphere,
    params={"a": 0, "b": 1, "c": pd.Series([2, 3, 4])},
    algorithm="scipy_neldermead",
)

res["solution_params"]

{'a': 3.885293415171424e-09,
 'b': -9.144916295625622e-09,
 'c': 0   -6.348281e-09
 1   -4.599364e-09
 2   -2.724007e-09
 dtype: float64}
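
To give an intuition for what working with pytrees involves, here is a minimal sketch of how a flat dictionary of scalars and 1d arrays could be converted to a single vector and back. The helper names `flatten` and `unflatten` are hypothetical and this is not estimagic's implementation; it only illustrates the idea that an optimizer can work on a flat vector while the criterion function sees the original structure.

```python
import numpy as np


def flatten(params):
    """Flatten a dict of scalars and 1d arrays into one 1d float vector.

    Hypothetical helper for illustration, not part of estimagic.
    """
    keys = sorted(params)
    leaves = [np.atleast_1d(np.asarray(params[k], dtype=float)) for k in keys]
    sizes = [leaf.size for leaf in leaves]
    flat = np.concatenate(leaves)

    def unflatten(vector):
        out, start = {}, 0
        for key, size in zip(keys, sizes):
            chunk = vector[start : start + size]
            # scalars come back as floats, everything else as 1d arrays
            out[key] = float(chunk[0]) if np.ndim(params[key]) == 0 else chunk
            start += size
        return out

    return flat, unflatten


params = {"a": 0, "b": 1, "c": np.array([2, 3, 4])}
flat, unflatten = flatten(params)
restored = unflatten(flat)
```

Here `flat` contains all five numbers in one vector and `restored` has the same keys and shapes as `params`.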

## There are many optimizers

If you install some optional dependencies, you can choose from a large (and growing) set of optimization algorithms -- all with the same interface!

For example, we wrap optimizers from `scipy.optimize`, `nlopt`, `cyipopt`, `pygmo`, `fides`, `tao` and others. 

We also have some optimizers that are not part of other packages. Examples are a `parallel Nelder-Mead` algorithm, the `BHHH` algorithm and a `parallel Pounders` algorithm.

The full list is [here](../how_to_guides/optimization/how_to_specify_algorithm_and_algo_options.rst).

## You can add bounds

In [4]:
res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
    lower_bounds=np.arange(5) - 2,
    upper_bounds=np.array([10, 10, 10, np.inf, np.inf]),
)

res["solution_params"].round(5)

array([0., 0., 0., 1., 2.])
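
The result above can be checked by hand. The sphere function is convex and separable, so its bounded minimizer is simply the unconstrained minimizer (the zero vector) clipped to the box defined by the bounds. The following sketch uses only numpy and reproduces the bounds from the example:

```python
import numpy as np

lower_bounds = np.arange(5) - 2
upper_bounds = np.array([10, 10, 10, np.inf, np.inf])

# The unconstrained minimizer of the sphere function is the zero vector.
unconstrained_solution = np.zeros(5)

# For a separable convex criterion, the bounded minimizer is the
# unconstrained minimizer clipped to the box.
expected = np.clip(unconstrained_solution, lower_bounds, upper_bounds)
print(expected)
```

Only the last two lower bounds are binding, which is why the last two parameters end up at 1 and 2.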

## You can fix parameters 

In [5]:
res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
    constraints=[{"loc": [1, 3], "type": "fixed"}],
)

res["solution_params"].round(5)

array([0., 1., 0., 3., 0.])

## Or impose other constraints

As an example, let's impose the constraint that the first three parameters are valid probabilities, i.e. they are between zero and one and sum to one:

In [6]:
res = minimize(
    criterion=sphere,
    params=np.array([0.1, 0.5, 0.4, 4, 5]),
    algorithm="scipy_lbfgsb",
    constraints=[{"loc": [0, 1, 2], "type": "probability"}],
)

res["solution_params"].round(5)

array([ 0.33334,  0.33333,  0.33333, -0.     ,  0.     ])

For a full overview of the constraints we support and the syntax, check out [the documentation](../how_to_guides/optimization/how_to_specify_constraints.rst).

Note that `"scipy_lbfgsb"` is not a constrained optimizer. If you want to know how we can impose constraints anyway, check out [the explanations](../explanations/optimization/implementation_of_constraints.rst).
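
One general way to handle such constraints with an unconstrained optimizer is reparametrization: the optimizer works on unrestricted internal parameters, and a transformation maps them to parameters that satisfy the constraint. The sketch below uses a softmax transformation as an illustrative choice; the function names are hypothetical and the transformation estimagic actually uses may differ (see the linked explanations).

```python
import numpy as np


def to_probabilities(internal):
    """Map any real vector to nonnegative entries that sum to one (softmax)."""
    shifted = internal - internal.max()  # improves numerical stability
    weights = np.exp(shifted)
    return weights / weights.sum()


def internal_criterion(internal):
    """Sphere function evaluated on the transformed parameters."""
    probs = to_probabilities(internal)
    return probs @ probs


# Any internal vector yields valid probabilities.
internal = np.array([-3.0, 0.5, 2.0])
probs = to_probabilities(internal)
```

An optimizer can then minimize `internal_criterion` without knowing about the constraint at all.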

## There is also maximize

If you ever forgot to switch back the sign of your criterion function after doing a maximization with `scipy.optimize.minimize`, there is good news:

In [7]:
from estimagic import maximize


def upside_down_sphere(params):
    return -params @ params


res = maximize(
    criterion=upside_down_sphere,
    params=np.arange(5),
    algorithm="scipy_bfgs",
)

res["solution_params"].round(5)

array([ 0., -0., -0.,  0., -0.])
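
For comparison, here is a sketch of the manual sign flipping that is needed when maximizing with `scipy.optimize.minimize` directly. It assumes scipy is installed; the variable names are only for illustration.

```python
import numpy as np
from scipy.optimize import minimize as scipy_minimize


def upside_down_sphere(params):
    return -params @ params


# scipy only minimizes, so the sign has to be flipped by hand ...
res = scipy_minimize(
    lambda x: -upside_down_sphere(x),
    x0=np.arange(5.0),
    method="BFGS",
)

# ... and flipped back to recover the maximal criterion value.
maximum = -res.fun
```

Forgetting the second flip leaves you with the negative of the criterion value, which is the mistake `maximize` helps you avoid.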

## You can provide closed form derivatives

In [8]:
def sphere_gradient(params):
    return 2 * params


res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
    derivative=sphere_gradient,
)
res["solution_params"].round(5)

array([ 0., -0., -0., -0., -0.])
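
Before passing a closed form derivative to an optimizer, it can be worth checking it against a finite difference approximation. The following is a minimal numpy sketch of such a check; `central_differences` is a hypothetical helper and not an estimagic function.

```python
import numpy as np


def sphere(params):
    return params @ params


def sphere_gradient(params):
    return 2 * params


def central_differences(func, x, step=1e-6):
    """Approximate the gradient of func at x, one coordinate at a time."""
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        shift = np.zeros_like(x, dtype=float)
        shift[i] = step
        grad[i] = (func(x + shift) - func(x - shift)) / (2 * step)
    return grad


x = np.arange(5, dtype=float)
numerical = central_differences(sphere, x)
analytical = sphere_gradient(x)
max_abs_error = np.abs(numerical - analytical).max()
```

If `max_abs_error` is not close to zero, the closed form derivative most likely contains a mistake.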

## Or use parallelized numerical derivatives

In [9]:
res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
    numdiff_options={"n_cores": 6},
)

res["solution_params"].round(5)

array([ 0., -0., -0., -0., -0.])

## Turn local optimizers global with multistart

In [10]:
res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
    soft_lower_bounds=np.full(5, -10),
    soft_upper_bounds=np.full(5, 10),
    multistart=True,
)

res["solution_params"]

array([0., 0., 0., 0., 0.])
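
The sphere function has only one minimum, so multistart does not change the result here. To see why it helps in general, consider the simplest form of the idea: run a local optimizer from several random start values and keep the best result. The sketch below does this with scipy on a hypothetical test function with two minima. It is only an illustration of the principle and much simpler than what happens when you set `multistart=True`.

```python
import numpy as np
from scipy.optimize import minimize as scipy_minimize


def two_minima(x):
    # Hypothetical test function with a local minimum near x = 1.13
    # and the global minimum near x = -1.30.
    return x[0] ** 4 - 3 * x[0] ** 2 + x[0]


# Draw start values uniformly between -10 and 10.
rng = np.random.default_rng(0)
starts = rng.uniform(-10, 10, size=(10, 1))

# Run a local optimizer from each start value and keep the best result.
results = [scipy_minimize(two_minima, x0=s, method="L-BFGS-B") for s in starts]
best = min(results, key=lambda r: r.fun)
```

A single local optimization started at a positive value can get stuck in the local minimum, whereas the best of several runs is much more likely to find the global one.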

## Exploit the structure of your optimization problem

Many estimation problems have a least-squares structure. If so, specialized optimizers that exploit this structure can be faster than standard optimizers. Other problems have at least a sum-structure that can be exploited by optimizers (e.g. likelihood functions).

If you define your criterion function a bit differently, you can seamlessly switch between least-squares, sum-structure and standard optimizers.

In [11]:
def general_sphere(params):
    contribs = params**2
    out = {
        # root_contributions are the least squares residuals.
        # if you square and sum them, you get the criterion value
        "root_contributions": params,
        # if you sum up contributions, you get the criterion value
        "contributions": contribs,
        # this is the standard output
        "value": contribs.sum(),
    }
    return out


res = minimize(
    criterion=general_sphere,
    params=np.arange(5),
    algorithm="pounders",
)
res["solution_params"].round(5)

array([ 0.,  0., -0.,  0., -0.])
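
To illustrate why the least-squares structure is valuable, here is a sketch of a single Gauss-Newton step on the residuals of the sphere function. Gauss-Newton is chosen only because it is easy to write down; it needs a Jacobian and is not how `pounders`, a derivative-free algorithm, works internally. The function names are hypothetical.

```python
import numpy as np


def residuals(params):
    # the root_contributions of the sphere function
    return params


def jacobian(params):
    # derivative of the residuals with respect to params
    return np.eye(params.size)


params = np.arange(5, dtype=float)
r, J = residuals(params), jacobian(params)

# Gauss-Newton step: solve the linearized least-squares problem J @ step = -r
step, *_ = np.linalg.lstsq(J, -r, rcond=None)
new_params = params + step
```

Because the residuals are linear in the parameters, one step lands exactly on the minimum. An optimizer that only sees the scalar criterion value cannot use this information.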

## Using and reading persistent logging

For long-running and difficult optimizations, it can be good to store the progress in a persistent log file. You can do this by providing a path as the `logging` argument:

In [12]:
res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
    logging="my_log.db",
)

You can read the entries in the log file (while the optimization is still running or afterwards) as follows:

In [13]:
from estimagic.logging.read_log import read_optimization_iteration

# the second argument works like an index to a list, i.e.
# -1 gives the last entry
# read_optimization_iteration("my_log.db", -1)

The persistent log file is always instantly synchronized when the optimizer tries a new parameter vector. This is very handy if an optimization has to be aborted and you want to extract the current status. It is also used by the [estimagic dashboard](../how_to_guides/optimization/how_to_use_the_dashboard.rst).

## Customize your optimizer

Most algorithms have a few optional arguments. Examples are convergence criteria or tuning parameters. You can find an overview of supported arguments [here](../how_to_guides/optimization/how_to_specify_algorithm_and_algo_options.rst).

In [14]:
algo_options = {
    "convergence.relative_criterion_tolerance": 1e-9,
    "stopping.max_iterations": 100_000,
}

res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
    algo_options=algo_options,
)
res["solution_params"].round(5)

array([ 0., -0., -0., -0., -0.])