http://pyro.ai/examples/svi_part_i.html

To turn the coin-flip problem into a probabilistic model, we encode heads and tails as 1s and 0s. We encode the fairness of the coin as a real number f, where f ∈ [0.0, 1.0] and f = 0.5 corresponds to a perfectly fair coin. Our prior belief about f is encoded by a beta distribution, specifically Beta(10, 10), which is a symmetric probability distribution on the interval [0.0, 1.0] that is peaked at f = 0.5.
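As a quick sanity check on the prior, the sketch below computes the mean, mode and standard deviation of Beta(10, 10) from the standard closed-form expressions, and evaluates the density at two mirror-image points to illustrate the symmetry. It uses only the Python standard library; the helper names `beta_summary` and `beta_pdf` are illustrative and not part of Pyro.

```python
import math

def beta_summary(alpha, beta):
    """Return (mean, mode, std) of Beta(alpha, beta).

    The mode formula used here is only valid for alpha > 1 and beta > 1.
    """
    total = alpha + beta
    mean = alpha / total
    mode = (alpha - 1.0) / (total - 2.0)
    std = math.sqrt(alpha * beta / (total ** 2 * (total + 1.0)))
    return mean, mode, std

def beta_pdf(x, alpha, beta):
    """Density of Beta(alpha, beta) at x in (0, 1), computed via log-gamma."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    log_density = log_norm + (alpha - 1.0) * math.log(x) + (beta - 1.0) * math.log(1.0 - x)
    return math.exp(log_density)

prior_mean, prior_mode, prior_std = beta_summary(10.0, 10.0)
print(f"mean = {prior_mean:.3f}, mode = {prior_mode:.3f}, std = {prior_std:.3f}")

# symmetry about 0.5: the density at 0.3 matches the density at 0.7
print(beta_pdf(0.3, 10.0, 10.0), beta_pdf(0.7, 10.0, 10.0))
```

The mean and mode both come out at 0.5, and the standard deviation is about 0.109, so the prior expresses a fairly confident belief that the coin is close to fair.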

In [42]:
import numpy as np
import torch
import pyro
import pyro.distributions as dist

def model(data):
    # define the hyperparameters that control the beta prior
    alpha0 = torch.tensor(10.0)
    beta0 = torch.tensor(10.0)
    # sample f from the beta prior
    f = pyro.sample("latent_fairness", dist.Beta(alpha0, beta0))
    # loop over the observed data
    for i in range(len(data)):
        # observe datapoint i using the bernoulli
        # likelihood Bernoulli(f)
        pyro.sample("obs_{}".format(i), dist.Bernoulli(f), obs=data[i])

Here we have a single latent random variable ('latent_fairness'), which is distributed according to Beta(10,10). Conditioned on that random variable, we observe each of the datapoints using a Bernoulli likelihood. Note that each observation is assigned a unique name in Pyro.

Our next task is to define a corresponding guide, i.e. an appropriate variational distribution for the latent random variable f. The only real requirement here is that q(f) should be a probability distribution over the range [0.0,1.0], since f doesn’t make sense outside of that range. A simple choice is to use another beta distribution parameterized by two trainable parameters αq and βq. In fact, in this particular case this is the ‘right’ choice, since conjugacy of the Bernoulli and beta distributions means that the exact posterior is itself a beta distribution.
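Because of this conjugacy, the exact posterior can be written down directly: a Beta(α0, β0) prior combined with h heads and t tails gives Beta(α0 + h, β0 + t). The sketch below applies that update to the Beta(10, 10) prior and to a data set of 10 heads and 5 tails (the same counts as the data used in the inference run later in these notes). The function name `exact_posterior` is illustrative, and the code uses only the standard library.

```python
import math

def exact_posterior(alpha0, beta0, data):
    """Conjugate update: Beta prior + Bernoulli observations -> Beta posterior."""
    heads = sum(1 for x in data if x == 1.0)
    tails = len(data) - heads
    return alpha0 + heads, beta0 + tails

data = [1.0] * 10 + [0.0] * 5
alpha_post, beta_post = exact_posterior(10.0, 10.0, data)

# closed-form mean and standard deviation of the posterior beta distribution
total = alpha_post + beta_post
post_mean = alpha_post / total
post_std = math.sqrt(alpha_post * beta_post / (total ** 2 * (total + 1.0)))

print(f"exact posterior: Beta({alpha_post:.0f}, {beta_post:.0f})")
print(f"posterior mean = {post_mean:.3f}, std = {post_std:.3f}")
```

This gives Beta(20, 15), so a well-converged guide should end up with `alpha_q` near 20 and `beta_q` near 15.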

In [43]:
import torch.distributions.constraints as constraints

def guide(data):
    # register the two variational parameters with Pyro.
    alpha_q = pyro.param("alpha_q", torch.tensor(15.0), constraint=constraints.positive)
    beta_q = pyro.param("beta_q", torch.tensor(15.0), constraint=constraints.positive)
    
    # sample latent_fairness from the distribution Beta(alpha_q, beta_q)
    pyro.sample("latent_fairness", dist.Beta(alpha_q, beta_q))

There are a few things to note here:

- We’ve taken care that the names of the random variables line up exactly between the model and guide.

- `model(data)` and `guide(data)` take the same arguments.

- The variational parameters are `torch.tensor`s. The `requires_grad` flag is automatically set to `True` by `pyro.param`.

- We use `constraint=constraints.positive` to ensure that `alpha_q` and `beta_q` remain strictly positive during optimization, as the beta distribution requires.
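One standard way to enforce a positivity constraint is to optimize an unconstrained real number and map it through a positive-valued transform such as `exp`; we assume, without verifying it here, that Pyro does something along these lines internally. The pure-Python sketch below illustrates only the idea and does not call Pyro. The names `to_positive` and `unconstrained`, the learning rate and the gradient values are all made up for illustration.

```python
import math

def to_positive(unconstrained):
    """Map any real number to a strictly positive one."""
    return math.exp(unconstrained)

# start from alpha_q = 15.0, stored as its logarithm
unconstrained = math.log(15.0)
lr = 0.5  # deliberately large, hypothetical learning rate

# hypothetical gradients w.r.t. the unconstrained value, all pushing alpha_q down
for grad in [4.0, 4.0, 4.0, 4.0]:
    unconstrained -= lr * grad          # plain gradient step, no clipping needed
    alpha_q = to_positive(unconstrained)
    print(f"unconstrained = {unconstrained:+.3f}  alpha_q = {alpha_q:.6f}")
```

Even after the unconstrained value has gone negative, the constrained value stays above zero, which is why no explicit clipping of `alpha_q` or `beta_q` is needed in the guide.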

Now we can proceed to do stochastic variational inference.

In [44]:
from pyro.optim import Adam

# set up the optimizer
adam_params = {"lr": 0.0005, "betas": (0.90, 0.999)}
optimizer = Adam(adam_params)

In [45]:
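def run_and_report(mymodel):
    from pyro.infer import SVI, Trace_ELBO

    # start from fresh variational parameters so that repeated calls do not
    # continue from the values learned in a previous run
    pyro.clear_param_store()

    # set up the inference algorithm
    svi = SVI(mymodel, guide, optimizer, loss=Trace_ELBO())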

    data = torch.tensor([1.]*10 + [0.]*5)
    print(data)

    n_steps = 5000
    # do gradient steps
    for step in range(n_steps):
        svi.step(data)

    # grab the learned variational parameters
    alpha_q = pyro.param("alpha_q").item()
    beta_q = pyro.param("beta_q").item()

    print(f'From SVI: alpha_q = {alpha_q:.2f}  beta_q = {beta_q:.2f}')

    # here we use some facts about the beta distribution
    # compute the inferred mean of the coin's fairness
    inferred_mean = alpha_q / (alpha_q + beta_q)
    # compute inferred standard deviation
    factor = beta_q / (alpha_q * (1.0 + alpha_q + beta_q))
    inferred_std = inferred_mean * np.sqrt(factor)

    print("\nbased on the data and our prior belief, the fairness " +
          "of the coin is %.3f +- %.3f" % (inferred_mean, inferred_std))

In [46]:
run_and_report(model)

tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 0., 0., 0., 0.])
From SVI: alpha_q = 19.01  beta_q = 14.11

based on the data and our prior belief, the fairness of the coin is 0.574 +- 0.085
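The "facts about the beta distribution" used in `run_and_report` are the closed-form mean α/(α+β) and standard deviation sqrt(αβ / ((α+β)²(α+β+1))); the expression `inferred_mean * np.sqrt(factor)` is an algebraic rearrangement of the latter. The sketch below checks numerically that the two forms agree, plugging in the rounded values printed above (19.01 and 14.11). The function names are illustrative, and only the standard library is used.

```python
import math

def std_textbook(alpha, beta):
    """Standard deviation of Beta(alpha, beta) in its usual closed form."""
    total = alpha + beta
    return math.sqrt(alpha * beta / (total ** 2 * (total + 1.0)))

def std_as_in_notes(alpha, beta):
    """Same quantity, written the way run_and_report computes it."""
    mean = alpha / (alpha + beta)
    factor = beta / (alpha * (1.0 + alpha + beta))
    return mean * math.sqrt(factor)

alpha_q, beta_q = 19.01, 14.11  # rounded values from the SVI run
mean_q = alpha_q / (alpha_q + beta_q)

print(f"{mean_q:.3f} +- {std_textbook(alpha_q, beta_q):.3f}")
print(f"{mean_q:.3f} +- {std_as_in_notes(alpha_q, beta_q):.3f}")
```

Both lines reproduce the reported 0.574 +- 0.085.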


---
http://pyro.ai/examples/svi_part_ii.html

Marking Conditional Independence

In [33]:
# first model used above
def model(data):
    # define the hyperparameters that control the beta prior
    alpha0 = torch.tensor(10.0)
    beta0 = torch.tensor(10.0)
    # sample f from the beta prior
    f = pyro.sample("latent_fairness", dist.Beta(alpha0, beta0))
    # loop over the observed data
    for i in range(len(data)):
        # observe datapoint i using the bernoulli
        # likelihood Bernoulli(f)
        pyro.sample("obs_{}".format(i), dist.Bernoulli(f), obs=data[i])

**Sequential plate.**
For this model the observations are conditionally independent given the latent random variable `latent_fairness`. To mark this explicitly in Pyro, we replace the Python builtin `range` with the Pyro construct `plate`.

`pyro.plate` is very similar to `range`, with one main difference: each invocation of `plate` requires the user to provide a unique name as its first argument. The second argument is an integer, just like for `range`. The model below shows the change.

In [53]:
def model(data):
    alpha0, beta0 = torch.tensor(10.0), torch.tensor(10.0)
    # sample f from the beta prior
    f = pyro.sample("latent_fairness", dist.Beta(alpha0, beta0))
    # loop over the observed data [WE ONLY CHANGE THE NEXT LINE]
    for i in pyro.plate("data_loop", len(data)):
        # observe datapoint i using the bernoulli likelihood
        pyro.sample("obs_{}".format(i), dist.Bernoulli(f), obs=data[i])

In [54]:
run_and_report(model)

tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 0., 0., 0., 0.])
From SVI: alpha_q = 19.43  beta_q = 14.38

based on the data and our prior belief, the fairness of the coin is 0.575 +- 0.084
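Conditional independence given `latent_fairness` means the joint likelihood of the flips is a product of per-flip Bernoulli terms, so the log-likelihood is a sum with one term per observation. The pure-Python sketch below, which does not use Pyro, evaluates that sum for a fixed and arbitrarily chosen value f = 0.6 and compares it with the equivalent closed form in terms of head and tail counts. The helper name `bernoulli_log_prob` is illustrative.

```python
import math

def bernoulli_log_prob(x, f):
    """Log-probability of a single coin flip x (1.0 = heads, 0.0 = tails)."""
    return math.log(f) if x == 1.0 else math.log(1.0 - f)

data = [1.0] * 10 + [0.0] * 5
f = 0.6  # arbitrary fixed value of the latent fairness

# one independent term per observation, as in the sequential plate loop
loop_log_lik = sum(bernoulli_log_prob(x, f) for x in data)

# the same quantity written in terms of counts
heads = data.count(1.0)
tails = len(data) - heads
closed_form_log_lik = heads * math.log(f) + tails * math.log(1.0 - f)

print(loop_log_lik, closed_form_log_lik)
```

Since each term depends on the others only through f, the terms can be evaluated in any order or all at once in a single batched operation.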


**Vectorized plate.**
Conceptually, a vectorized plate is the same as a sequential plate, except that it is a vectorized operation (as `torch.arange` is to `range`). As such, it can enable large speed-ups compared to the explicit `for` loop used with a sequential plate.

In [51]:
def model(data):
    alpha0, beta0 = torch.tensor(10.0), torch.tensor(10.0)
    # sample f from the beta prior
    f = pyro.sample("latent_fairness", dist.Beta(alpha0, beta0))
    # observe all datapoints at once [THE LOOP BECOMES A CONTEXT MANAGER]
    with pyro.plate("data_loop"):  # vectorized plate; no indexing is required
        pyro.sample("obs", dist.Bernoulli(f), obs=data)

In [52]:
run_and_report(model)

tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 0., 0., 0., 0.])
From SVI: alpha_q = 19.02  beta_q = 14.37

based on the data and our prior belief, the fairness of the coin is 0.570 +- 0.084
