# PyVBMC Example 3: Output diagnostics and saving results

In this notebook, we demonstrate extended usage of PyVBMC. We will take a brief look at PyVBMC's diagnostic output, and show you how to save the results of optimization to disk.

This notebook is Part 3 of a series of notebooks in which we present various example usages for VBMC with the PyVBMC package. The code used in this example is available as a script [here](https://github.com/acerbilab/pyvbmc/blob/main/examples/pyvbmc_example_3_full_code.py).

In [1]:
import numpy as np
import scipy.stats as scs
from scipy.optimize import minimize
from pyvbmc import VBMC
from pyvbmc.formatting import format_dict
import dill

## 1. Model definition and setup

For demonstration purposes, we will run PyVBMC with a restricted budget of function evaluations, insufficient to achieve convergence. Then we will inspect the output diagnostics, and resume optimization.

We use a higher-dimensional analogue of the toy target function from Example 1, a broad [Rosenbrock's banana function](https://en.wikipedia.org/wiki/Rosenbrock_function) in $D = 4$.

In [2]:
D = 4  # A four-dimensional problem
prior_mu = np.zeros(D)
prior_var = 3 * np.ones(D)


def log_prior(theta):
    """Multivariate normal prior on theta."""
    cov = np.diag(prior_var)
    return scs.multivariate_normal(prior_mu, cov).logpdf(theta)

The likelihood function of your model will in general depend on the observed data. This data can be fixed as a global variable, as we did directly above for `prior_mu` and `prior_var`. It can also be defined by a default second argument: to PyVBMC there is no difference so long as the function can be called with only a single argument (the parameters `theta`):

In [3]:
def log_likelihood(theta, data=np.ones(D)):
    """D-dimensional Rosenbrock's banana function."""
    # In this simple demo the data just translates the parameters:
    theta = np.atleast_2d(theta)
    theta = theta + data

    x, y = theta[:, :-1], theta[:, 1:]
    return -np.sum((x**2 - y) ** 2 + (x - 1) ** 2 / 100, axis=1)


def log_joint(theta, data=np.ones(D)):
    """log-density of the joint distribution."""
    return log_likelihood(theta, data) + log_prior(theta)
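
As an alternative to a default argument, the data can be bound explicitly before the function is handed over. The following is a minimal sketch using `functools.partial`; the names `observed` and `log_likelihood_fixed` are illustrative and not part of the notebook, and the likelihood body simply repeats the one defined above with `data` as a required argument.

```python
from functools import partial

import numpy as np


def log_likelihood(theta, data):
    """Rosenbrock log-likelihood from this example, with data required."""
    theta = np.atleast_2d(theta) + data
    x, y = theta[:, :-1], theta[:, 1:]
    return -np.sum((x**2 - y) ** 2 + (x - 1) ** 2 / 100, axis=1)


observed = np.ones(4)  # hypothetical stand-in for real observations
log_likelihood_fixed = partial(log_likelihood, data=observed)

# The bound function can now be called with the parameters alone.
value = log_likelihood_fixed(np.zeros(4))
```

The resulting callable takes a single argument, which is the only requirement stated above.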

In [4]:
LB = np.full(D, -np.inf)  # Lower bounds
UB = np.full(D, np.inf)  # Upper bounds
PLB = np.full(D, prior_mu - np.sqrt(prior_var))  # Plausible lower bounds
PUB = np.full(D, prior_mu + np.sqrt(prior_var))  # Plausible upper bounds

In a typical inference scenario, we recommend starting from a "good" point (i.e., one near the mode). Here we run a quick preliminary optimization, though a more extensive optimization would do no harm.

In [5]:
np.random.seed(41)
x0 = np.random.uniform(PLB, PUB)  # Random point inside plausible box
x0 = minimize(
    lambda t: -log_joint(t),
    x0,
    bounds=[
        (-np.inf, np.inf),
        (-np.inf, np.inf),
        (-np.inf, np.inf),
        (-np.inf, np.inf),
    ],
).x
np.random.seed(42)

In [6]:
# Limit number of function evaluations
options = {
    "max_fun_evals": 10 * D,
}
# We can specify either the log-joint, or the log-likelihood and log-prior.
# In other words, the following lines are equivalent:
vbmc = VBMC(
    log_likelihood,
    x0,
    LB,
    UB,
    PLB,
    PUB,
    options=options,
    log_prior=log_prior,
)
# vbmc = VBMC(
#     log_joint,
#     x0, LB, UB, PLB, PUB, options=options,
# )

Reshaping x0 to row vector.
Reshaping lower bounds to (1, 4).
Reshaping upper bounds to (1, 4).
Reshaping plausible lower bounds to (1, 4).
Reshaping plausible upper bounds to (1, 4).


(PyVBMC expects the bounds to be `(1, D)` row vectors, and the initial point(s) to be of shape `(n, D)`, but it will accept and re-shape vectors of shape `(D,)` as well.)
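
The shapes mentioned above can be illustrated with plain NumPy. This sketch does not call PyVBMC; it only shows how a `(D,)` vector relates to the `(1, D)` row vector and the `(n, D)` array of starting points. The variable names are illustrative.

```python
import numpy as np

D = 4
lb = np.full(D, -np.inf)      # shape (D,), as used in this notebook
lb_row = np.atleast_2d(lb)    # shape (1, D), a row vector

x0 = np.zeros(D)              # a single starting point, shape (D,)
x0_rows = np.atleast_2d(x0)   # shape (1, D), i.e. n = 1 starting point
```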

## 2. Running the model and checking convergence diagnostics

Now we run PyVBMC with a very small budget of 40 function evaluations:

In [7]:
vp, results = vbmc.optimize()

Beginning variational optimization assuming EXACT observations of the log-joint.
 Iteration  f-count    Mean[ELBO]    Std[ELBO]    sKL-iter[q]   K[q]  Convergence  Action
     0         10          -4.28         2.15    275005.97        2        inf     start warm-up
     1         15          -3.02         2.60         4.41        2        inf     
     2         20          -4.39         1.24         8.14        2        144     
     3         25          -4.65         0.62         3.26        2       57.3     
     4         30          -4.79         0.71         1.12        2       21.6     
     5         35          -5.00         0.23         1.33        2       23.7     
     6         40          -4.82         0.13         0.05        2       1.89     
   inf         40          -4.58         0.15         0.14       50       1.89     finalize
Inference terminated: reached maximum number of function evaluations options.max_fun_evals.
Estimated ELBO: -4.581 +/-0.152.
Caution: Re

PyVBMC is warning us that convergence is doubtful. We can look at the output for more information and diagnostics.

In [8]:
print(results["success_flag"])

False


`False` means that PyVBMC has not converged to a stable solution within the given number of function evaluations.

In [9]:
print(format_dict(results))

{
    'function': '<function VBMC.__init__.<locals>.log_joint at 0x7f824704ed30>',
    'problem_type': 'unconstrained',
    'iterations': 6,
    'func_count': 40,
    'best_iter': 6,
    'train_set_size': 40,
    'components': 50,
    'r_index': 1.8901718949158166,
    'convergence_status': 'no',
    'overhead': nan,
    'rng_state': 'rng',
    'algorithm': 'Variational Bayesian Monte Carlo',
    'version': '0.1.0',
    'message': 'Inference terminated: reached maximum number of function evaluations options.max_fun_evals.',
    'elbo': -4.580586512183974,
    'elbo_sd': 0.15249730360797745,
    'success_flag': False,
}


In the `results` dictionary:
- the `convergence_status` field says 'no' (probable lack of convergence);
- the reliability index `r_index` is 1.89 (it should be less than 1).

Our diagnostics tell us that this run has not converged, which suggests increasing the budget.
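
The checks described above can be collected in a small helper. This is a minimal sketch and not part of PyVBMC: the function name `looks_converged` and its threshold argument are illustrative, and the two dictionaries copy only the relevant fields from the outputs of the two runs printed in this notebook.

```python
def looks_converged(results, r_index_threshold=1.0):
    """Return True if the run reports success and a small reliability index."""
    return bool(results["success_flag"]) and (
        results["r_index"] < r_index_threshold
    )


# Relevant fields copied from the outputs shown in this notebook.
first_run = {"success_flag": False, "r_index": 1.8901718949158166}
resumed_run = {"success_flag": True, "r_index": 0.07375842196832269}

first_ok = looks_converged(first_run)
resumed_ok = looks_converged(resumed_run)
```

A passing check of this kind is only a first filter; it does not replace inspecting the variational posterior itself.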

Note that convergence to a solution does not mean that it is a _good_ solution. You should always check the returned variational posteriors, and ideally should compare across multiple runs of PyVBMC.

## 3. Saving results

We can also save the `VBMC` instance to disk and reload it later, in order to check the results and convergence diagnostics, sample from the posterior, or resume the optimization from a checkpoint. If you are only interested in the final (best) variational solution, as opposed to the full iteration history of the optimization, then you may wish to pickle only the final `VariationalPosterior` instead.

In [10]:
with open("vbmc_test_save.pkl", "wb") as f:
    dill.dump(vbmc, f)

with open("vbmc_test_save.pkl", "rb") as f:
    vbmc = dill.load(f)
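
The same save-and-reload pattern applies when storing only a single object such as the final variational posterior. The sketch below wraps it in two helper functions; `save_object` and `load_object` are hypothetical names, the standard-library `pickle` module is used as the default so the snippet is self-contained, and a plain dictionary stands in for the object to be saved. In this notebook one would pass `vp` and, as above, `dill` as the serializer.

```python
import os
import pickle
import tempfile


def save_object(obj, path, serializer=pickle):
    """Serialize obj to path using the given module (e.g. pickle or dill)."""
    with open(path, "wb") as f:
        serializer.dump(obj, f)


def load_object(path, serializer=pickle):
    """Load an object previously written by save_object."""
    with open(path, "rb") as f:
        return serializer.load(f)


# Stand-in for a fitted object; in the notebook this would be `vp`.
stand_in = {"elbo": -4.58, "components": 50}
path = os.path.join(tempfile.gettempdir(), "vp_test_save.pkl")
save_object(stand_in, path)
restored = load_object(path)
```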

## 4. Resume the optimization process

Now we can resume PyVBMC's optimization by increasing the maximum number of function evaluations. The default is $50(D+2)$, where $D$ is the dimension of the parameter space (though this example will not use the full budget). First, we restore the state of the `VBMC` object at the iteration from which we want to continue:

In [11]:
# Continue from the specified iteration (here, the last iteration)
iteration = vbmc.iteration
assert 1 <= iteration <= vbmc.iteration

# Set states for VBMC right before the specified iteration
vbmc.gp = vbmc.iteration_history["gp"][iteration - 1]
vbmc.vp = vbmc.iteration_history["vp"][iteration - 1]
vbmc.function_logger = vbmc.iteration_history["function_logger"][iteration - 1]
vbmc.optim_state = vbmc.iteration_history["optim_state"][iteration - 1]
vbmc.hyp_dict = vbmc.optim_state["hyp_dict"]
vbmc.iteration = iteration - 1

for k, v in vbmc.iteration_history.items():
    try:
        vbmc.iteration_history[k] = vbmc.iteration_history[k][:iteration]
    except TypeError:
        pass
# (Optional) Restore the random state for reproducibility. Note that this
# reproduces exactly the same optimization only if VBMC's options are unchanged.
random_state = vbmc.iteration_history["random_state"][iteration - 1]
np.random.set_state(random_state)
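
The truncation loop in the cell above relies on slicing every sliceable entry and skipping the rest. The following sketch reproduces that logic on a plain dictionary; the keys and values are hypothetical stand-ins and do not reflect the actual contents of `vbmc.iteration_history`.

```python
import numpy as np

# Hypothetical stand-in for an iteration history: per-iteration sequences
# plus one scalar entry that cannot be sliced.
history = {
    "elbo": np.array([-4.28, -3.02, -4.39, -4.65]),
    "vp": ["vp0", "vp1", "vp2", "vp3"],
    "check_interval": 1,  # hypothetical scalar entry
}

iteration = 3  # keep the entries for iterations 0, 1 and 2

for key in history:
    try:
        history[key] = history[key][:iteration]
    except TypeError:
        # Scalars are not subscriptable and are left unchanged.
        pass
```

After the loop, arrays and lists hold only the first `iteration` entries, while the scalar entry is untouched.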

We need to update the options. Here we increase `max_fun_evals` and resume the optimization process.
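
For reference, the two budgets used in this notebook work out as follows for $D = 4$. This is plain arithmetic based on the expressions in the notebook; the variable names are illustrative.

```python
D = 4

initial_budget = 10 * D        # restricted budget of the first run
default_budget = 50 * (D + 2)  # default budget stated in this notebook
```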

In [12]:
options = {
    "max_fun_evals": 50 * (D + 2),
}
# Temporarily set to False so that the options can be updated
vbmc.options.is_initialized = False
vbmc.options.update(options)
vbmc.options.is_initialized = True

vbmc.is_finished = False
vp, results = vbmc.optimize()

Beginning variational optimization assuming EXACT observations of the log-joint.
 Iteration  f-count    Mean[ELBO]    Std[ELBO]    sKL-iter[q]   K[q]  Convergence  Action
     6         40          -4.72         0.12         0.09        2       2.78     
     7         45          -4.70         0.05         0.01        2      0.465     
     8         50          -4.66         0.05         0.05        2       1.09     end warm-up
     9         55          -4.64         0.02         0.01        2      0.306     
    10         60          -4.56         0.01         0.01        2      0.399     
    11         65          -4.55         0.01         0.04        5      0.731     
    12         70          -4.40         0.01         0.06        8       1.48     rotoscale, undo rotoscale
    13         75          -4.33         0.01         0.01        9      0.476     
    14         80          -4.30         0.00         0.00       12      0.172     
    15         85          -4.25     

In [13]:
print(format_dict(results))

{
    'function': '<function VBMC.__init__.<locals>.log_joint at 0x7f824704e700>',
    'problem_type': 'unconstrained',
    'iterations': 18,
    'func_count': 100,
    'best_iter': 18,
    'train_set_size': 99,
    'components': 50,
    'r_index': 0.07375842196832269,
    'convergence_status': 'probable',
    'overhead': nan,
    'rng_state': 'rng',
    'algorithm': 'Variational Bayesian Monte Carlo',
    'version': '0.1.0',
    'message': 'Inference terminated: variational solution stable for options.tol_stable_count fcn evaluations.',
    'elbo': -4.145182203012939,
    'elbo_sd': 0.002220654574979916,
    'success_flag': True,
}


With the default budget of function evaluations, we can see that the `convergence_status` is 'probable' and the `r_index` is much less than 1, suggesting that convergence has been achieved. We can save the result to file and load it at any time for verification, etc.

In [14]:
with open("vbmc_test_save.pkl", "wb") as f:
    dill.dump(vbmc, f)

with open("vbmc_test_save.pkl", "rb") as f:
    vbmc = dill.load(f)

samples, components = vbmc.vp.sample(5)
# `samples` are samples drawn from the variational posterior.
# `components` are the indices of the mixture components each
#  sample was drawn from.
print(samples)
print(components)

[[-0.09373324 -0.20273329 -0.37971214 -0.92123077]
 [ 0.37903486  0.60764692  0.78428635  1.76909208]
 [-0.35520708 -0.21939562  0.06508829 -0.08745725]
 [-1.16850346 -0.30246786 -1.02857926 -0.29353836]
 [-1.55871302 -0.47776638 -0.89148552 -1.31746064]]
[24 23 34  1  1]
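
The component indices can be summarized to see how many samples came from each mixture component. This short sketch uses only NumPy and the indices printed above, copied by hand into an array; it does not call PyVBMC.

```python
import numpy as np

# Component indices copied from the output printed above.
components = np.array([24, 23, 34, 1, 1])

# Count how many of the samples were drawn from each component.
unique, counts = np.unique(components, return_counts=True)
per_component = {int(k): int(c) for k, c in zip(unique, counts)}
```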


## 5. Conclusions

In this notebook, we have given a brief overview of PyVBMC's output diagnostics, and shown how to save and load results and resume optimization from a specific iteration.

In the next notebook, we will illustrate running PyVBMC multiple times in order to validate the results.

## Acknowledgments

Work on the PyVBMC package was funded by the [Finnish Center for Artificial Intelligence FCAI](https://fcai.fi/).