# Variational Inference using Pathfinder

Stan supports the Pathfinder algorithm
([Zhang et al., 2022](https://jmlr.org/papers/v23/21-0889.html)).
Pathfinder is a variational method for approximately
sampling from differentiable log densities.  Starting from a random
initialization, Pathfinder locates normal approximations to the target
density along a quasi-Newton optimization path, with local covariance
estimated using the negative inverse Hessian estimates produced by the
L-BFGS optimizer.  Pathfinder returns draws from the Gaussian approximation
with the lowest estimated Kullback-Leibler (KL) divergence to the true
posterior.
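
The selection step can be pictured with a toy example. The following is a minimal sketch, not Stan's implementation: the one-dimensional target, the three candidate normal approximations, and all helper names are assumptions made for illustration. It estimates the evidence lower bound (ELBO) of each candidate by Monte Carlo and keeps the candidate with the highest estimate, which is the one with the lowest estimated KL divergence from the approximation to the target.

```python
import math
import random


def log_target(x):
    # Hypothetical unnormalized log density: a normal with mean 1.0, sd 0.5.
    return -0.5 * ((x - 1.0) / 0.5) ** 2


def log_normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def estimate_elbo(mu, sigma, num_draws, rng):
    # Monte Carlo estimate of E_q[log p(x) - log q(x)], with x drawn from q.
    total = 0.0
    for _ in range(num_draws):
        x = rng.gauss(mu, sigma)
        total += log_target(x) - log_normal_pdf(x, mu, sigma)
    return total / num_draws


rng = random.Random(1234)
# Stand-ins for the normal approximations found along an optimization path.
candidates = [(-1.0, 1.0), (0.5, 0.8), (1.0, 0.5)]
elbos = [estimate_elbo(mu, sigma, 2000, rng) for mu, sigma in candidates]

# Highest ELBO corresponds to lowest estimated KL divergence.
best = max(range(len(candidates)), key=lambda i: elbos[i])
best_mu, best_sigma = candidates[best]

# The returned draws come from the selected approximation.
approx_draws = [rng.gauss(best_mu, best_sigma) for _ in range(1000)]
print(candidates[best])
```

The sketch uses 2000 draws per ELBO estimate only to keep the toy comparison stable; it says nothing about the defaults Stan uses.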

There are two Stan implementations of the Pathfinder algorithm:
single-path Pathfinder and multi-path Pathfinder.
Single-path Pathfinder generates a set of approximate draws from one run of the basic Pathfinder algorithm.
Multi-path Pathfinder uses importance resampling over the draws from multiple runs of Pathfinder.
This better matches non-normal target densities and also mitigates
the problem of L-BFGS getting stuck at local optima or in saddle points on plateaus.
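
The resampling idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Stan's code: the standard normal target and the two "paths" are made up, and the sketch uses plain importance resampling with replacement, omitting the Pareto smoothing of the weights that Pathfinder applies.

```python
import math
import random

rng = random.Random(2024)


def log_target(x):
    # Hypothetical unnormalized target: standard normal log density.
    return -0.5 * x * x


def log_normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


# Two made-up single-path results, each a normal approximation (mu, sigma).
paths = [(-0.5, 1.5), (0.5, 1.5)]

pooled = []       # draws from all paths
log_weights = []  # log p(x) - log q(x), with q the path that produced x
for mu, sigma in paths:
    for _ in range(2500):
        x = rng.gauss(mu, sigma)
        pooled.append(x)
        log_weights.append(log_target(x) - log_normal_pdf(x, mu, sigma))

# Subtract the maximum before exponentiating for numerical stability.
max_lw = max(log_weights)
weights = [math.exp(lw - max_lw) for lw in log_weights]

# Resample with replacement in proportion to the importance weights.
resampled = rng.choices(pooled, weights=weights, k=2000)

mean = math.fsum(resampled) / len(resampled)
var = math.fsum((x - mean) ** 2 for x in resampled) / len(resampled)
print(f"mean={mean:.2f} var={var:.2f}")
```

Neither path is centered on the target, yet the resampled draws should have mean near 0 and variance near 1, because draws that the approximations over-represent receive lower weight.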

### Example: variational inference with Pathfinder for model ``bernoulli.stan``

The [CmdStanModel pathfinder](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanModel.pathfinder) method
wraps the CmdStan [pathfinder](https://mc-stan.org/docs/cmdstan-guide/pathfinder-config.html) method.

By default, CmdStanPy runs multi-path Pathfinder, which returns an importance-resampled set of draws over the outputs of 4 independent single-path Pathfinder runs.

In [1]:
import os
from cmdstanpy import CmdStanModel, cmdstan_path



In [2]:
bernoulli_dir = os.path.join(cmdstan_path(), 'examples', 'bernoulli')
stan_file = os.path.join(bernoulli_dir, 'bernoulli.stan')
data_file = os.path.join(bernoulli_dir, 'bernoulli.data.json')
# instantiate, compile bernoulli model
model = CmdStanModel(stan_file=stan_file)
# run CmdStan's pathfinder method, returns object `CmdStanPathfinder`
pathfinder = model.pathfinder(data=data_file)

14:43:21 - cmdstanpy - INFO - Chain [1] start processing


14:43:21 - cmdstanpy - INFO - Chain [1] done processing


In [3]:
print(pathfinder)
print(pathfinder.metadata)

CmdStanPathfinder: model=bernoulli['method=pathfinder']
 csv_files:
	/tmp/tmpe58v_i7q/bernoullih8v8dm3c/bernoulli-20231003144321.csv
 output_files:
	/tmp/tmpe58v_i7q/bernoullih8v8dm3c/bernoulli-20231003144321_0-stdout.txt
Metadata:
{'stan_version_major': 2, 'stan_version_minor': 33, 'stan_version_patch': 0, 'model': 'bernoulli_model', 'start_datetime': '2023-10-03 14:43:21 UTC', 'method': 'pathfinder', 'init_alpha': 0.001, 'tol_obj': 1e-12, 'tol_rel_obj': 10000, 'tol_grad': 1e-08, 'tol_rel_grad': 10000000, 'tol_param': 1e-08, 'history_size': 5, 'num_psis_draws': 1000, 'num_paths': 4, 'save_single_paths': 0, 'max_lbfgs_iters': 1000, 'num_draws': 1000, 'num_elbo_draws': 25, 'id': 1, 'data_file': '/home/runner/.cmdstan/cmdstan-2.33.1/examples/bernoulli/bernoulli.data.json', 'init': 2, 'seed': 56891, 'diagnostic_file': '', 'refresh': 100, 'sig_figs': -1, 'profile_file': 'profile.csv', 'num_threads': 1, 'raw_header': 'lp_approx__,lp__,theta', 'column_names': ('lp_approx__', 'lp__', 'theta')

The `pathfinder` method returns a [CmdStanPathfinder](https://mc-stan.org/cmdstanpy/api.html#cmdstanpathfinder) object,
which provides access to the disparate information from the Stan CSV files.


- The [stan_variable](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanPathfinder.stan_variable) method
returns a [numpy.ndarray](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html#numpy.ndarray)
containing all draws of a single Stan variable, where the structure of each draw corresponds to the structure of the
Stan variable. The [stan_variables](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanPathfinder.stan_variables) method
returns a Python dict that maps each variable name to such an array.

- The [draws](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanPathfinder.draws) method returns the sample as a numpy.ndarray.

In [4]:
pathfinder.stan_variable("theta").shape

(1000,)

In [5]:
pathfinder.column_names

('lp_approx__', 'lp__', 'theta')

In [6]:
pathfinder.draws().shape

(1000, 3)
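
The shapes above suggest that each column of `draws()` lines up with one entry of `column_names`. A small helper, sketched here under that assumption, pairs the two so that a column can be looked up by name. The helper name `draws_by_column` and the sample rows are hypothetical; the rows are made-up values, not real Pathfinder output.

```python
def draws_by_column(column_names, draws):
    """Map each column name to the list of values in that column."""
    columns = zip(*draws)  # transpose: rows of draws -> columns
    return {name: list(col) for name, col in zip(column_names, columns)}


# Made-up rows standing in for pathfinder.draws(); not real output.
names = ("lp_approx__", "lp__", "theta")
rows = [
    [-1.2, -7.0, 0.20],
    [-0.9, -6.8, 0.25],
    [-1.5, -7.3, 0.15],
]
by_column = draws_by_column(names, rows)
theta_mean = sum(by_column["theta"]) / len(by_column["theta"])
```

With a real run, the call would be `draws_by_column(pathfinder.column_names, pathfinder.draws())`, since iterating over a two-dimensional array yields its rows.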

### Pathfinder as initialization for the MCMC sampler

The method [create_inits](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanPathfinder.create_inits) returns a list of Python dicts, one per chain, each containing initial values for the model parameters. Each set of initializations is a random draw from the Pathfinder sample. These initializations can be used as the initial parameter values for Stan's NUTS-HMC sampler, which can reduce the number of warmup iterations needed.

In [7]:
inits = pathfinder.create_inits()
print(inits)

[{'theta': array(0.161636)}, {'theta': array(0.22604)}, {'theta': array(0.174072)}, {'theta': array(0.222407)}]


The `create_inits` method takes two arguments:

* `seed` - used for random selection.
* `chains` - the number of draws to return, default is 4.  This should match the number of sampler chains to run.

In [8]:
inits = pathfinder.create_inits(chains=3)
print(inits)

[{'theta': array(0.0678106)}, {'theta': array(0.177306)}, {'theta': array(0.151628)}]
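
To connect the two steps, the initializations can be passed to the sampler. The helper below is a minimal sketch: the function name is hypothetical, and it assumes that the installed CmdStanPy version's `CmdStanModel.sample` accepts a list of per-chain dicts for its `inits` argument. It keeps the number of initializations equal to the number of chains.

```python
def sample_with_pathfinder_inits(model, data, chains=4, seed=None):
    """Run Pathfinder, then start each NUTS-HMC chain from one of its draws.

    `model` is expected to behave like a CmdStanModel.
    """
    pathfinder = model.pathfinder(data=data)
    # One set of initial values per sampler chain.
    inits = pathfinder.create_inits(seed=seed, chains=chains)
    return model.sample(data=data, chains=chains, inits=inits)
```

With the objects defined earlier, a call might look like `mcmc = sample_with_pathfinder_inits(model, data_file, chains=4, seed=12345)`.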
