# How to do multistart optimizations

Sometimes you want to make sure that your optimization is robust to the initial
parameter values, i.e. that it does not get stuck at a local optimum. This is where
multistart comes in handy.


## What does multistart (not) do

In short, multistart iteratively runs local optimizations from different initial
conditions. Once enough local optimizations have converged to the same best optimum, it
stops. Importantly, it cannot guarantee that the result is the global optimum, but it
can increase your confidence in the result.

## TL;DR

To activate multistart with the default options, pass `multistart=True` to the
`minimize` or `maximize` function, together with finite soft bounds on the parameters
(which are used to sample the initial points). The default options are discussed below.

In [None]:
import numpy as np
import optimagic as om


def fun(x):
    return x @ x


bounds = om.Bounds(
    soft_lower=np.full(5, -5),
    soft_upper=np.full(5, 10),
)


res = om.minimize(
    fun=fun,
    x0=np.arange(5),
    algorithm="scipy_lbfgsb",
    bounds=bounds,
    multistart=True,
)

## What does multistart mean in optimagic?

Our multistart optimizations are inspired by the [TikTak algorithm](https://github.com/serdarozkan/TikTak) and consist of the following steps:

1. Draw a large exploration sample of parameter vectors randomly or using a
   low-discrepancy sequence.
1. Evaluate the objective function in parallel on the exploration sample.
1. Sort the parameter vectors from best to worst according to their objective function
   values. 
1. Run local optimizations iteratively. That is, the first local optimization is started
   from the best parameter vector in the sample. All subsequent ones are started from a
   convex combination of the currently best known parameter vector and the next sample
   point. 
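The four steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not optimagic's implementation: `gradient_descent` is a hypothetical stand-in for a real local optimizer, the exploration sample uses plain uniform draws instead of a low-discrepancy sequence, the evaluation is sequential rather than parallel, and the weighting rule (a clipped square root of the iteration share) is an assumed example of how a mixing weight could grow over iterations. The stopping rule based on rediscoveries is left out.

```python
import random


def sphere(x):
    """Same objective as in this guide: the sum of squares."""
    return sum(v * v for v in x)


def gradient_descent(fun, x, step=0.1, n_steps=200, h=1e-6):
    """Hypothetical stand-in for a local optimizer (forward-difference gradients)."""
    x = list(x)
    for _ in range(n_steps):
        fx = fun(x)
        grad = []
        for i in range(len(x)):
            shifted = x.copy()
            shifted[i] += h
            grad.append((fun(shifted) - fx) / h)
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x, fun(x)


def multistart_sketch(fun, lower, upper, n_samples=50, n_optimizations=5, seed=0):
    rng = random.Random(seed)

    # Step 1: draw the exploration sample inside the (soft) bounds.
    sample = [
        [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        for _ in range(n_samples)
    ]

    # Steps 2 and 3: evaluate the objective and sort from best to worst.
    sample.sort(key=fun)

    # Step 4: the first local optimization starts from the best sample point.
    best_x, best_f = gradient_descent(fun, sample[0])

    for i, point in enumerate(sample[1:n_optimizations], start=1):
        # Assumed weighting rule: later starts lean more on the best known point.
        weight = min(0.995, max(0.1, (i / n_optimizations) ** 0.5))
        # Convex combination of the best known point and the next sample point.
        start = [weight * b + (1 - weight) * p for b, p in zip(best_x, point)]
        x, f = gradient_descent(fun, start)
        if f < best_f:
            best_x, best_f = x, f

    return best_x, best_f


best_x, best_f = multistart_sketch(sphere, lower=[-5.0] * 5, upper=[10.0] * 5)
```

Because each new start point is pulled toward the best known parameter vector, later local optimizations spend less effort in regions that already look unpromising.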

## Visualizing multistart results

To visualize the individual local optimizations, you can call one of optimagic's
visualization functions on the optimization result object.

In [None]:
om.criterion_plot(res)

In the above figure you can see the optimization histories of two local optimizations.
This means that the first two local optimizations were successful, and the multistart
optimization stopped after that.

## How to configure multistart?

You can configure multistart by passing an instance of
`optimagic.MultistartOptions` to `maximize` or `minimize`. Let's look at an extreme
example where we manually set everything to its default value:

In [None]:
options = om.MultistartOptions(
    # n_samples: The number of points at which the objective function is evaluated
    #     during the exploration phase. If None, n_samples is set to 100 times the
    #     number of parameters.
    n_samples=100 * len(res.x),
    # share_optimizations: The fraction of the exploration sample that is used to
    #     run the optimization (relative to n_samples).
    share_optimizations=0.1,
    # sampling_distribution: The distribution from which the exploration sample is
    #     drawn. Allowed are "uniform" and "triangular".
    sampling_distribution="uniform",
    # sampling_method: The method used to draw the exploration sample. Allowed are
    #     "sobol", "random", "halton", and "latin_hypercube".
    sampling_method="sobol",
    # sample: A sequence of PyTrees that are used as the initial parameters for the
    #     optimization. If None, a sample is drawn from the sampling distribution.
    sample=None,
    # mixing_weight_method: The method used to determine the mixing weight, i.e. how
    #     start parameters for local optimizations are calculated. Allowed are
    #     "tiktak" and "linear", or a custom callable.
    mixing_weight_method="tiktak",
    # mixing_weight_bounds: The lower and upper bounds for the mixing weight.
    mixing_weight_bounds=(0.1, 0.995),
    # convergence_max_discoveries: The maximum number of discoveries for convergence.
    #     Determines after how many rediscoveries of the currently best local
    #     optimum the multistart algorithm stops.
    convergence_max_discoveries=2,
    # convergence_relative_params_tolerance: The relative tolerance in parameters
    #     for convergence. Determines the maximum relative distance two parameter
    #     vectors can have to be considered equal.
    convergence_relative_params_tolerance=0.01,
    # n_cores: The number of cores to use for parallelization.
    n_cores=1,
    # batch_evaluator: The evaluator to use for batch evaluation. Allowed are "joblib"
    #     and "pathos", or a custom callable.
    batch_evaluator="joblib",
    # batch_size: The batch size for batch evaluation. Must be larger than n_cores
    #     or None.
    batch_size=None,
    # seed: The seed for the random number generator.
    seed=None,
    # exploration_error_handling: The error handling for exploration errors. Allowed
    #     are "raise" and "continue".
    exploration_error_handling="continue",
    # optimization_error_handling: The error handling for optimization errors. Allowed
    #     are "raise" and "continue".
    optimization_error_handling="continue",
)

res = om.minimize(
    fun=fun,
    x0=np.arange(5),
    algorithm="scipy_lbfgsb",
    bounds=bounds,
    multistart=options,
)
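To make the two convergence options more concrete, here is a rough sketch of how rediscoveries of the best local optimum could be counted. It is not optimagic's code: the function name `count_discoveries`, the distance measure (largest entry-wise gap, scaled by the size of the best parameters with a floor of 0.1), and the choice to count the first optimum as one discovery are all assumptions made for illustration.

```python
def count_discoveries(local_optima, rel_tol=0.01):
    """Count how often the currently best local optimum has been found.

    local_optima is a list of (params, value) tuples in the order in which the
    local optimizations finished. rel_tol plays the role of
    convergence_relative_params_tolerance.
    """
    best_x, best_f = local_optima[0]
    discoveries = 1

    for x, f in local_optima[1:]:
        # Assumed distance measure: largest entry-wise gap relative to the size
        # of the best parameters, with a floor so that zeros do not blow it up.
        distance = max(abs(a - b) / max(abs(b), 0.1) for a, b in zip(x, best_x))
        same_point = distance <= rel_tol

        if f < best_f:
            # A better optimum: keep counting if it is the same point, else restart.
            best_x, best_f = x, f
            discoveries = discoveries + 1 if same_point else 1
        elif same_point:
            discoveries += 1

    return discoveries


history = [
    ([0.0, 0.0], 0.0),  # first local optimum
    ([3.0, 3.0], 18.0),  # a worse local optimum elsewhere
    ([0.0005, -0.0002], 2.9e-7),  # lands on the best one again
]
n_found = count_discoveries(history)
```

In this reading, a multistart run would stop as soon as the count reaches `convergence_max_discoveries`. A looser `convergence_relative_params_tolerance` makes it easier for two optima to count as equal, so the run tends to stop earlier.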

## Understanding multistart results

When activating multistart, the optimization result object has the additional attribute
`multistart_info`. It is a dictionary with the following keys:
    
- `local_optima`: A list with the results from all local optimizations that were performed.
- `start_parameters`: A list with the start parameters from those optimizations.
- `exploration_sample`: A list with parameter vectors at which the objective function was evaluated in an initial exploration phase. 
- `exploration_results`: The corresponding objective values.

### Start parameters

The start parameters are the parameter vectors from which the local optimizations were
started. Since the default value of `convergence_max_discoveries` is 2, and both
local optimizations were successful, the start parameters have 2 rows.

In [None]:
res.multistart_info["start_parameters"]

### Local Optima

The local optima are the results from the local optimizations. Since in this example
only two local optimizations were run, the local optima list has two elements, each of
which is an optimization result object.

In [None]:
len(res.multistart_info["local_optima"])

### Exploration sample

The exploration sample is a list of parameter vectors at which the objective function
was evaluated. Since the parameter dimension is 5, and the default number of samples is
100 times the number of parameters, the exploration sample has 500 elements.

In [None]:
np.vstack(res.multistart_info["exploration_sample"]).shape

### Exploration results

The exploration results are the objective function values at the exploration sample.

In [None]:
len(res.multistart_info["exploration_results"])