# Optimizer Convergence and Comparison

This notebook is dedicated to additional insights into the optimization process, be it convergence properties or comparisons of two different results.

**After this notebook you can...**

- Use visualization methods to compare optimizers in terms of performance
- Check convergence of a multistart optimization
- Get insights into the convergence history
- Restart an optimization from a history file

## Imports and constants

In [None]:
import pypesto.optimize as optimize
import pypesto.visualize as visualize
import petab
import pypesto.petab
import logging
import tempfile
import numpy as np
from IPython.display import Markdown, display

# name of the model that will also be the name of the python module
model_name = "boehm_JProteomeRes2014"

np.random.seed(3142)

## Creating optimization results

For the sake of this notebook, we create two optimization results using different optimizers.

In [None]:
%%capture
# create problem
petab_yaml = f"./{model_name}/{model_name}.yaml"

petab_problem = petab.Problem.from_yaml(petab_yaml)
importer = pypesto.petab.PetabImporter(petab_problem)
problem = importer.create_problem(verbose=False)

In [None]:
# optimizer
scipy_optimizer = optimize.ScipyOptimizer()

In [None]:
# create temporary storage file...
f_scipy = tempfile.NamedTemporaryFile(suffix=".hdf5", delete=False)
fn_scipy = f_scipy.name

# ... and corresponding history option
history_options_scipy = pypesto.HistoryOptions(
    trace_record=True, storage_file=fn_scipy
)

**Note to the above:**

In practice you should not use a temporary file, as it is removed after the run, while still creating overhead. There are two options you might choose from instead:

- If you do not plan to save the optimization result, you can use a `MemoryHistory` by removing the `storage_file` argument. This creates no file overhead but is more demanding on memory.
- If you want to save your results (**recommended**) for any form of reusability, you can remove `f_$optimizer` and replace the `fn_$optimizer` assignment with `fn_$optimizer = "filename_of_choice.hdf5"`.

In [None]:
n_starts = 10

# run optimization
result_scipy = optimize.minimize(
    problem=problem,
    optimizer=scipy_optimizer,
    n_starts=n_starts,
    history_options=history_options_scipy,
    filename=fn_scipy,
)

In a first step we compare the optimizers in terms of final objective function and robustness through a waterfall plot.

## Optimizer convergence

First, we want to check the convergence of a single result. A summary and general visualizations such as waterfall plots can be helpful for this, as can the dedicated `optimizer_convergence` visualization and history tracing.

In [None]:
display(Markdown(result_scipy.summary()))

In [None]:
# waterfall plot
visualize.waterfall(result_scipy);

The waterfall plot gives an overview of the final objective function values, ordered from best to worst. Similar colors indicate similar function values and thus potential local optima or manifolds. In the best case, all values are assigned to a plateau indicating a local optimum, and the best value is found more than once. Additionally, we might want to check whether the gradients converged as well, and whether there is a pattern in the reasons for stopping:

In [None]:
visualize.optimizer_convergence(result_scipy);
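
As a numerical complement to the waterfall and convergence plots, the following is a minimal sketch in plain NumPy. It does not use the pyPESTO API: the arrays `fvals` and `grad_norms` are hypothetical values chosen for illustration, and the helper functions and their tolerances are assumptions rather than recommended defaults.

```python
import numpy as np

# Hypothetical final objective values and gradient norms of six starts
# (illustrative numbers, not output of the optimization above).
fvals = np.array([138.22, 138.22, 138.23, 149.59, 249.74, 138.22])
grad_norms = np.array([1e-6, 3e-5, 2e-2, 4e-6, 1.5, 8e-7])


def count_starts_near_best(fvals, rel_tol=1e-3):
    """Count starts whose final value lies within rel_tol of the best one."""
    fvals = np.sort(np.asarray(fvals, dtype=float))
    best = fvals[0]
    # Scale the tolerance by the best value (at least 1.0 to avoid a zero band).
    return int(np.sum(fvals - best <= rel_tol * max(abs(best), 1.0)))


def converged_mask(grad_norms, threshold=1e-4):
    """Flag starts whose final gradient norm is below the threshold."""
    return np.asarray(grad_norms, dtype=float) < threshold


n_best = count_starts_near_best(fvals)
mask = converged_mask(grad_norms)
print(f"{n_best} of {len(fvals)} starts reached the best plateau")
print(f"{int(mask.sum())} of {len(mask)} starts have a small gradient norm")
```

A start that sits on the best plateau but still has a large gradient norm (the third start in this made-up data) is a candidate for closer inspection.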

We usually want the gradient norms to be very low, to ensure that we are actually in a local optimum. If the results do not look promising, we might want to switch optimizers altogether, as different optimizers can perform better on different problems.

In [None]:
# switch to fides optimizer
fides_optimizer = optimize.FidesOptimizer(
    verbose=logging.WARN
)
f_fides = tempfile.NamedTemporaryFile(suffix=".hdf5", delete=False)
fn_fides = f_fides.name
history_options_fides = pypesto.HistoryOptions(
    trace_record=True, storage_file=fn_fides
)
result_fides = optimize.minimize(
    problem=problem,
    optimizer=fides_optimizer,
    n_starts=n_starts,
    history_options=history_options_fides,
    filename=fn_fides,
)

In [None]:
visualize.waterfall([result_fides, result_scipy], legends=["Fides Optimizer", "Scipy Optimizer"]);

We can also compare various metrics of the results, such as computation time and number of function evaluations:

In [None]:
visualize.optimization_run_properties_per_multistart(
    [result_fides, result_scipy],
    properties_to_plot=["time", "n_fval"],
    legends=["Fides Optimizer", "Scipy Optimizer"],
);
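
The per-start properties can also be condensed into a single number per optimizer. The sketch below uses only the standard library; the dictionary `runs` holds hypothetical timings and evaluation counts (not measured values), and `median_per_optimizer` is an illustrative helper, not a pyPESTO function.

```python
import statistics

# Hypothetical per-start measurements: time in seconds and number of
# objective function evaluations (made-up values for illustration).
runs = {
    "Fides Optimizer": {"time": [1.2, 0.9, 1.5], "n_fval": [40, 35, 52]},
    "Scipy Optimizer": {"time": [2.4, 3.1, 2.0], "n_fval": [120, 150, 98]},
}


def median_per_optimizer(runs, prop):
    """Return the median of one property for every optimizer."""
    return {name: statistics.median(values[prop]) for name, values in runs.items()}


median_time = median_per_optimizer(runs, "time")
median_n_fval = median_per_optimizer(runs, "n_fval")
print(median_time)
print(median_n_fval)
```

The median is used here because a single slow or stuck start can dominate the mean.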

We might also want to check how close the parameter estimates are to each other. For this, we can employ the parameter visualization.

In [None]:
visualize.parameters(result_fides);
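
The closeness of the estimates can also be quantified. The following sketch assumes a hypothetical array `x_hat` with one row per start, sorted from best to worst objective value, and one column per parameter. Both the numbers and the helper `parameter_spread` are illustrative assumptions, not part of pyPESTO.

```python
import numpy as np

# Hypothetical parameter estimates, rows sorted from best to worst start.
x_hat = np.array(
    [
        [-1.57, -5.00, 0.70],
        [-1.56, -4.98, 0.71],
        [-1.57, -5.01, 2.30],
    ]
)


def parameter_spread(x_hat, n_best):
    """Per-parameter range (max - min) across the n_best first rows."""
    top = np.asarray(x_hat, dtype=float)[:n_best]
    return top.max(axis=0) - top.min(axis=0)


# Spread among the two best starts versus among all three starts.
spread_best = parameter_spread(x_hat, n_best=2)
spread_all = parameter_spread(x_hat, n_best=3)
print(spread_best)
print(spread_all)
```

A parameter whose spread grows sharply once more starts are included, like the third one in this made-up data, may point to a different local optimum or a poorly identified parameter.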

We can also check what the optimization trajectory looks like during the different runs. This can reveal further reasons for the optimization to stop, such as very flat landscapes. For this, we use the history:

In [None]:
visualize.optimizer_history(result_fides, trace_y="fval")
visualize.optimizer_history(result_fides, trace_y="gradnorm");

As we can see, the function values are not monotonic. This is because the optimization also traces line search evaluations, which allows us to investigate potential problems. Recurring patterns in the gradient norm together with minuscule or no changes in the function values indicate that the optimizer cannot find a suitable next point or is taking spiraling steps. In both cases, the actual optimum is very hard to pinpoint.
Lowering tolerances, increasing the number of start points (up to a certain point), and switching optimizers are all valid strategies for overcoming such issues. However, there is no single recipe for all models, so it is always important to investigate the optimization in terms of convergence, termination reasons, and function evaluations to get ideas on what to do next.
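
One simple way to read a non-monotonic trace is to look at the best value seen so far. The sketch below works on a hypothetical trace `trace_fval` (made-up numbers, not extracted from the history above) and uses only NumPy.

```python
import numpy as np

# Hypothetical objective trace, including evaluations that were not accepted.
trace_fval = np.array([250.0, 180.0, 210.0, 150.0, 149.0, 160.0, 138.5, 138.4])

# Best value seen so far at each evaluation (monotonically non-increasing).
running_best = np.minimum.accumulate(trace_fval)

# Evaluations that were worse than the best value seen so far.
not_improving = trace_fval > running_best

print(running_best)
print(np.flatnonzero(not_improving))
```

Long stretches without improvement in such a running minimum, combined with a gradient norm that stays high, match the stagnation pattern described above.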

## Reloading from History

Especially when running large models on clusters, the optimization may sometimes stop for unfortunate reasons (e.g. timeouts). In these cases, the history serves yet another purpose: retrieving finished and unfinished optimizations. Sometimes, out of 100 starts, 80 might have already terminated; in this case, investigating those 80 might already yield good results. In other cases, we might want to continue the optimization from where we left off.

In [None]:
# load result from history
result_from_history = optimize