# Example 3: Extended usage and output diagnostics

In this notebook, we demonstrate extended usage of PyVBMC. We will take a brief look at PyVBMC's diagnostic output, and show you how to save the results of optimization to disk.

This notebook is Part 3 of a series of notebooks in which we present various example usages for VBMC with the PyVBMC package.

In [1]:
import numpy as np
import scipy.stats as scs
from scipy.optimize import minimize
from pyvbmc.vbmc import VBMC
from pyvbmc.formatting import format_dict
import dill

## 1. Model definition and setup

For demonstration purposes, we will run PyVBMC with a restricted budget of function evaluations, insufficient to achieve convergence. Then we will inspect the output diagnostics, and resume optimization.

We use a higher-dimensional analogue of the toy target function from Example 1: a broad [Rosenbrock's banana function](https://en.wikipedia.org/wiki/Rosenbrock_function) in $D = 4$ dimensions.

In [2]:
D = 4  # A four-dimensional problem
prior_mu = np.zeros(D)
prior_var = 3 * np.ones(D)


def log_prior(theta):
    """Multivariate normal prior on theta."""
    cov = np.diag(prior_var)
    return scs.multivariate_normal(prior_mu, cov).logpdf(theta)
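
Because the prior covariance is diagonal, the multivariate normal log-density factorizes into a sum of independent univariate terms. The sketch below writes out that closed form with NumPy only. The name `log_prior_diag` is hypothetical (it is not part of PyVBMC), and the function is meant as a cross-check of the formula rather than a replacement for the `scipy.stats` call above.

```python
import numpy as np

D = 4
prior_mu = np.zeros(D)
prior_var = 3 * np.ones(D)


def log_prior_diag(theta):
    """Closed-form log-density of a normal with diagonal covariance (hypothetical helper)."""
    theta = np.atleast_2d(theta)
    # Quadratic term: sum of squared, variance-scaled deviations from the mean.
    quad = np.sum((theta - prior_mu) ** 2 / prior_var, axis=1)
    # Log normalizing constant: sum over dimensions of log(2 * pi * variance).
    log_norm = np.sum(np.log(2 * np.pi * prior_var))
    return -0.5 * (quad + log_norm)


value_at_mean = log_prior_diag(np.zeros(D))[0]
print(value_at_mean)
```

At the prior mean the quadratic term vanishes, so the value reduces to $-\frac{D}{2}\log(2\pi \cdot 3)$.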

The likelihood function of your model will in general depend on the observed data. This data can be fixed as a global variable, as we did directly above for `prior_mu` and `prior_var`. It can also be defined by a default second argument: to PyVBMC there is no difference so long as the function can be called with only a single argument (the parameters `theta`):

In [3]:
def log_likelihood(theta, data=np.ones(D)):
    """D-dimensional Rosenbrock's banana function."""
    # In this simple demo the data just translates the parameters:
    theta = np.atleast_2d(theta)
    theta = theta + data

    x, y = theta[:, :-1], theta[:, 1:]
    return -np.sum((x**2 - y) ** 2 + (x - 1) ** 2 / 100, axis=1)


def log_joint(theta, data=np.ones(D)):
    """log-density of the joint distribution."""
    return log_likelihood(theta, data) + log_prior(theta)
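
Besides a global variable or a default argument, another way to supply the data is to bind it explicitly with `functools.partial`, so that the resulting callable takes `theta` alone. A minimal sketch, assuming a hypothetical `observed` array standing in for real data (the name `log_likelihood_bound` is likewise only illustrative):

```python
from functools import partial

import numpy as np

D = 4


def log_likelihood(theta, data):
    """D-dimensional Rosenbrock's banana function, shifted by `data`."""
    theta = np.atleast_2d(theta) + data
    x, y = theta[:, :-1], theta[:, 1:]
    return -np.sum((x**2 - y) ** 2 + (x - 1) ** 2 / 100, axis=1)


observed = np.ones(D)  # hypothetical stand-in for real observations
log_likelihood_bound = partial(log_likelihood, data=observed)

# The bound function can now be called with the parameters only.
value = log_likelihood_bound(np.zeros(D))
print(value)
```

This keeps the data out of the function signature's defaults, which can be convenient when the same likelihood is reused with several data sets.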

In [4]:
LB = np.full(D, -np.inf)  # Lower bounds
UB = np.full(D, np.inf)  # Upper bounds
PLB = np.full(D, prior_mu - np.sqrt(prior_var))  # Plausible lower bounds
PUB = np.full(D, prior_mu + np.sqrt(prior_var))  # Plausible upper bounds

In a typical inference scenario, we recommend starting from a "good" point (i.e. one near the mode). Here we run a quick preliminary optimization, though a more extensive optimization would not hurt.

In [5]:
np.random.seed(41)
x0 = np.random.uniform(PLB, PUB)  # Random point inside plausible box
x0 = minimize(
    lambda t: -log_joint(t),
    x0,
    bounds=[
        (-np.inf, np.inf),
        (-np.inf, np.inf),
        (-np.inf, np.inf),
        (-np.inf, np.inf),
    ],
).x
np.random.seed(42)
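
One possible form of a more extensive preliminary optimization is a multi-start search: run a local optimizer from several random points in the plausible box and keep the best result. The sketch below is an illustration under our own assumptions, not a PyVBMC recommendation. It re-implements the toy log-joint as a scalar function (dropping the prior's normalizing constant, which does not affect the location of the mode), and the names `neg_log_joint`, `starts`, and `x0_multistart` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

D = 4
prior_var = 3 * np.ones(D)


def neg_log_joint(theta):
    """Negative log-joint of the toy model, up to an additive constant, as a scalar."""
    theta = np.asarray(theta, dtype=float)
    shifted = theta + 1.0  # same translation as data=np.ones(D)
    x, y = shifted[:-1], shifted[1:]
    log_lik = -np.sum((x**2 - y) ** 2 + (x - 1) ** 2 / 100)
    log_pri = -0.5 * np.sum(theta**2 / prior_var)  # zero-mean prior, constant dropped
    return -(log_lik + log_pri)


rng = np.random.default_rng(0)
plb, pub = -np.sqrt(prior_var), np.sqrt(prior_var)

# Run a local optimizer from several random starts and keep the best result.
starts = rng.uniform(plb, pub, size=(10, D))
results = [minimize(neg_log_joint, s) for s in starts]
best = min(results, key=lambda r: r.fun)
x0_multistart = best.x
print(x0_multistart, best.fun)
```

The number of starts (10 here) is arbitrary; for an inexpensive target like this one, more starts cost little.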

In [6]:
# Limit number of function evaluations
options = {
    "max_fun_evals": 10 * D,
}
# We can specify either the log-joint, or the log-likelihood and log-prior.
# In other words, the following lines are equivalent:
vbmc = VBMC(
    log_likelihood,
    x0,
    LB,
    UB,
    PLB,
    PUB,
    user_options=options,
    log_prior=log_prior,
)
# vbmc = VBMC(
#     log_joint,
#     x0, LB, UB, PLB, PUB, user_options=options,
# )

Reshaping x0 to row vector.
Reshaping lower bounds to (1, 4).
Reshaping upper bounds to (1, 4).
Reshaping plausible lower bounds to (1, 4).
Reshaping plausible upper bounds to (1, 4).


(PyVBMC expects the bounds to be `(1, D)` row vectors, and the initial point(s) to be of shape `(n, D)`, but it will accept and re-shape vectors of shape `(D,)` as well.)
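
The shapes mentioned above can be illustrated with plain NumPy. This sketch only shows how a `(D,)` vector relates to a `(1, D)` row vector; it does not call PyVBMC, and we assume (without having verified it here) that passing row vectors directly would make the reshaping messages unnecessary.

```python
import numpy as np

D = 4

x0_flat = np.zeros(D)  # shape (D,)
x0_row = np.atleast_2d(x0_flat)  # shape (1, D)

# Bounds built directly as (1, D) row vectors.
lb_row = np.full((1, D), -np.inf)
ub_row = np.full((1, D), np.inf)

print(x0_flat.shape, x0_row.shape, lb_row.shape, ub_row.shape)
```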

## 2. Running the model and checking convergence diagnostics

Now we run PyVBMC with a very small budget of 40 function evaluations:

In [7]:
vp, elbo, elbo_sd, success_flag, info = vbmc.optimize()

Beginning variational optimization assuming EXACT observations of the log-joint.
 Iteration  f-count    Mean[ELBO]    Std[ELBO]    sKL-iter[q]   K[q]  Convergence  Action
     0         10          -4.30         2.18    261321.64        2        inf     start warm-up
     1         15          -4.75         1.16         0.27        2        inf     
     2         20          -4.71         1.27        22.18        2        374     
     3         25          -5.21         0.38         3.80        2       66.3     
     4         30          -5.24         0.44         0.66        2       12.6     
     5         35          -4.35         2.62         3.98        2       78.1     
     6         40          -4.86         0.12         2.53        2       44.3     
   inf         40          -4.48         0.17         2.07       50       44.3     finalize
Inference terminated: reached maximum number of function evaluations options.max_fun_evals.
Estimated ELBO: -4.479 +/-0.169.
Caution: Re

PyVBMC is warning us that convergence is doubtful. We can look at the output for more information and diagnostics.

In [8]:
print(success_flag)

False


`False` means that PyVBMC has not converged to a stable solution within the given number of function evaluations.

In [9]:
print(format_dict(info))

{
    'function': '<function VBMC.__init__.<locals>.log_joint at 0x7fc7a2277d30>',
    'problem_type': 'unconstrained',
    'iterations': 6,
    'func_count': 40,
    'best_iter': 6,
    'train_set_size': 40,
    'components': 50,
    'r_index': 44.25350292432569,
    'convergence_status': 'no',
    'overhead': nan,
    'rng_state': 'rng',
    'algorithm': 'Variational Bayesian Monte Carlo',
    'version': '0.1.0',
    'message': 'Inference terminated: reached maximum number of function evaluations options.max_fun_evals.',
    'elbo': -4.478762937046817,
    'elbo_sd': 0.16871359698848737,
}


In the `info` dictionary:
- the `convergence_status` field says 'no' (probable lack of convergence);
- the reliability index `r_index` is 44.25 (it should be less than 1).

Our diagnostics tell us that this run has not converged, which suggests increasing the budget.
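
The two checks above can be combined into a small helper. This is a sketch under our own assumptions: `looks_converged` is a hypothetical function (not part of PyVBMC), it treats only the status `'probable'` as a pass, and it uses the threshold of 1 for `r_index` mentioned above. The first dictionary copies the two relevant values from the printed output; the second is an invented example of a passing run.

```python
def looks_converged(info, r_index_threshold=1.0):
    """Heuristic check on an `info`-style dict (hypothetical helper)."""
    status_ok = info.get("convergence_status") == "probable"
    # A missing r_index is treated as a failure by defaulting to infinity.
    r_index_ok = info.get("r_index", float("inf")) < r_index_threshold
    return status_ok and r_index_ok


# Values copied from the output printed above.
info_short_run = {"convergence_status": "no", "r_index": 44.25350292432569}
# Invented values, for illustration only.
info_hypothetical = {"convergence_status": "probable", "r_index": 0.1}

print(looks_converged(info_short_run))
print(looks_converged(info_hypothetical))
```

A check like this is only a first filter; it does not replace inspecting the variational posterior itself.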

Note that convergence to a solution does not mean that it is a _good_ solution. You should always check the returned variational posteriors, and ideally should compare across multiple runs of PyVBMC.

Now we will re-run PyVBMC using the default maximum number of function evaluations. This default is $50(D+2)$, where $D$ is the dimension of the parameter space (though this example will not use the full budget):

In [10]:
np.random.seed(42)
vbmc_2 = VBMC(
    log_likelihood,
    x0,
    LB,
    UB,
    PLB,
    PUB,
    log_prior=log_prior,
)
vp_2, elbo_2, elbo_sd_2, success_flag_2, info_2 = vbmc_2.optimize()

Reshaping x0 to row vector.
Reshaping lower bounds to (1, 4).
Reshaping upper bounds to (1, 4).
Reshaping plausible lower bounds to (1, 4).
Reshaping plausible upper bounds to (1, 4).
Beginning variational optimization assuming EXACT observations of the log-joint.
 Iteration  f-count    Mean[ELBO]    Std[ELBO]    sKL-iter[q]   K[q]  Convergence  Action
     0         10          -4.30         2.18    261321.64        2        inf     start warm-up
     1         15          -5.04         1.13         4.33        2        inf     
     2         20          -5.61         0.70         3.32        2       59.6     
     3         25          -5.01         1.21         0.64        2       16.6     
     4         30          -5.22         0.41         0.65        2       12.9     
     5         35          -4.94         0.38         0.41        2       8.96     
     6         40          -4.79         0.12         0.16        2       3.62     
     7         45          -4.75         0.0

In [11]:
print(format_dict(info_2))

{
    'function': '<function VBMC.__init__.<locals>.log_joint at 0x7fc79df38550>',
    'problem_type': 'unconstrained',
    'iterations': 18,
    'func_count': 100,
    'best_iter': 18,
    'train_set_size': 99,
    'components': 50,
    'r_index': 0.10006016667866248,
    'convergence_status': 'probable',
    'overhead': nan,
    'rng_state': 'rng',
    'algorithm': 'Variational Bayesian Monte Carlo',
    'version': '0.1.0',
    'message': 'Inference terminated: variational solution stable for options.tol_stable_count fcn evaluations.',
    'elbo': -4.156188182738892,
    'elbo_sd': 0.002838216226534726,
}


With the default budget of function evaluations, we can see that the `convergence_status` is 'probable' and the `r_index` is much less than 1, suggesting that convergence has been achieved.

## 3. Saving results

We can also save the `VBMC` instance to disk and reload it later, in order to check the results and convergence diagnostics, sample from the posterior, etc. If you are only interested in the final (best) variational solution, as opposed to the full iteration history of the optimization, then you may wish to pickle only the final `VariationalPosterior` instead.
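
This notebook uses `dill` rather than the standard-library `pickle`. A likely reason (our assumption; the notebook does not state it) is that the `VBMC` object holds a reference to a locally defined function, as the `'function'` field of `info` printed earlier suggests. Standard `pickle` cannot serialize such functions, whereas `dill` is designed to handle them. The sketch below reproduces the limitation with a hypothetical `make_target` factory, using the standard library only.

```python
import pickle


def make_target(offset):
    """Return a locally defined function, similar in spirit to a wrapped log-joint."""

    def target(x):
        return x + offset

    return target


local_fn = make_target(1.0)

try:
    pickle.dumps(local_fn)
    pickled_ok = True
except (pickle.PicklingError, AttributeError) as err:
    # The exception type differs between Python versions, so both are caught.
    pickled_ok = False
    print(f"Standard pickle failed: {err}")
```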

In [12]:
with open("vbmc_test_save.pkl", "wb") as f:
    dill.dump(vbmc_2, f)
    # To pickle just the VP:
    # dill.dump(vp_2, f)

In [13]:
with open("vbmc_test_save.pkl", "rb") as f:
    vbmc_3 = dill.load(f)
samples, components = vbmc_3.vp.sample(5)
# `samples` are samples drawn from the variational posterior.
# `components` are the indices of the mixture components each
#  sample was drawn from.
print(samples)
print(components)

[[-1.28266157 -0.27594897  0.24020049 -0.07582188]
 [ 0.37839969 -1.31150667 -1.53031151 -0.45366665]
 [-0.25085486  0.60567925  0.18645779  0.88174226]
 [-1.58706305 -0.06215183 -0.45124499 -1.44755624]
 [ 0.70967592  0.51295552  0.45439654  1.67490708]]
[44 21  2  0  2]
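
Once samples are available, they can be summarized with ordinary NumPy operations. The sketch below copies the five samples and component indices printed above into arrays and computes per-dimension means and standard deviations, along with how often each mixture component was drawn. With only five draws these numbers are not meaningful estimates; in practice one would draw many more samples.

```python
import numpy as np

# Values copied from the output printed above.
samples = np.array(
    [
        [-1.28266157, -0.27594897, 0.24020049, -0.07582188],
        [0.37839969, -1.31150667, -1.53031151, -0.45366665],
        [-0.25085486, 0.60567925, 0.18645779, 0.88174226],
        [-1.58706305, -0.06215183, -0.45124499, -1.44755624],
        [0.70967592, 0.51295552, 0.45439654, 1.67490708],
    ]
)
components = np.array([44, 21, 2, 0, 2])

# Per-dimension summary statistics of the draws.
sample_mean = samples.mean(axis=0)
sample_std = samples.std(axis=0, ddof=1)

# How many draws came from each mixture component.
used, counts = np.unique(components, return_counts=True)

print(sample_mean, sample_std)
print(dict(zip(used.tolist(), counts.tolist())))
```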


## 4. Conclusions

In this notebook, we have given a brief overview of PyVBMC's output diagnostics, and shown how to save and load results.

In the next notebook, we will illustrate running PyVBMC multiple times in order to validate the results.

## Example 3: full code

The following cell includes in a single place all the code used in Example 3, without the extra fluff.

In [14]:
assert False  # skip this cell

import numpy as np
import scipy.stats as scs
from scipy.optimize import minimize
from pyvbmc.vbmc import VBMC
from pyvbmc.formatting import format_dict
import dill


D = 4  # A four-dimensional problem
prior_mu = np.zeros(D)
prior_var = 3 * np.ones(D)


def log_prior(theta):
    """Multivariate normal prior on theta."""
    cov = np.diag(prior_var)
    return scs.multivariate_normal(prior_mu, cov).logpdf(theta)


def log_likelihood(theta, data=np.ones(D)):
    """D-dimensional Rosenbrock's banana function."""
    # In this simple demo the data just translates the parameters:
    theta = np.atleast_2d(theta)
    theta = theta + data

    x, y = theta[:, :-1], theta[:, 1:]
    return -np.sum((x**2 - y) ** 2 + (x - 1) ** 2 / 100, axis=1)


def log_joint(theta, data=np.ones(D)):
    """log-density of the joint distribution."""
    return log_likelihood(theta, data) + log_prior(theta)


LB = np.full(D, -np.inf)  # Lower bounds
UB = np.full(D, np.inf)  # Upper bounds
PLB = np.full(D, prior_mu - np.sqrt(prior_var))  # Plausible lower bounds
PUB = np.full(D, prior_mu + np.sqrt(prior_var))  # Plausible upper bounds


np.random.seed(41)
x0 = np.random.uniform(PLB, PUB)  # Random point inside plausible box
x0 = minimize(
    lambda t: -log_joint(t),
    x0,
    bounds=[
        (-np.inf, np.inf),
        (-np.inf, np.inf),
        (-np.inf, np.inf),
        (-np.inf, np.inf),
    ],
).x
np.random.seed(42)


# Limit number of function evaluations
options = {
    "max_fun_evals": 10 * D,
}
# We can specify either the log-joint, or the log-likelihood and log-prior.
# In other words, the following lines are equivalent:
vbmc = VBMC(
    log_likelihood,
    x0,
    LB,
    UB,
    PLB,
    PUB,
    user_options=options,
    log_prior=log_prior,
)
# vbmc = VBMC(
#     log_joint,
#     x0, LB, UB, PLB, PUB, user_options=options,
# )


vp, elbo, elbo_sd, success_flag, info = vbmc.optimize()


print(success_flag)


print(format_dict(info))


np.random.seed(42)
vbmc_2 = VBMC(
    log_likelihood,
    x0,
    LB,
    UB,
    PLB,
    PUB,
    log_prior=log_prior,
)
vp_2, elbo_2, elbo_sd_2, success_flag_2, info_2 = vbmc_2.optimize()


print(format_dict(info_2))


with open("vbmc_test_save.pkl", "wb") as f:
    dill.dump(vbmc_2, f)
    # To pickle just the VP:
    # dill.dump(vp_2, f)


with open("vbmc_test_save.pkl", "rb") as f:
    vbmc_3 = dill.load(f)
samples, components = vbmc_3.vp.sample(5)
# `samples` are samples drawn from the variational posterior.
# `components` are the indices of the mixture components each
#  sample was drawn from.
print(samples)
print(components)

AssertionError: 

## Acknowledgments

Work on the PyVBMC package was funded by the [Finnish Center for Artificial Intelligence FCAI](https://fcai.fi/).