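Whenever James, Tiger, and Jason eat at a restaurant, they spin a roulette wheel to decide who pays. After 20 meals, James has paid 4 times, Tiger 9 times, and Jason 7 times.

Tiger is now quite angry and suspects that something is wrong, since he has paid for nearly half of the meals (9 of 20). He decides to build a model to check for foul play.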

In [11]:
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import SVI, Trace_ELBO
from pyro.optim import Adam
from pyro.distributions import constraints

In [3]:
# Some good Pyro hygiene

pyro.enable_validation(True)  # checks for NaNs, valid parameter ranges, etc.; may slow things down
pyro.clear_param_store()

In [5]:
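# The data are categorical labels: 0 = Tiger, 1 = Jason, 2 = James
# (2.0 rather than 2 keeps all three pieces the same floating-point dtype)
data = torch.cat((torch.zeros(9), torch.ones(7), torch.full((4,), 2.0)))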
data

tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 1., 1., 1., 2., 2.,
        2., 2.])
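
As a quick sanity check before any modeling, the empirical payment frequencies can be read directly off the labels. The sketch below uses plain Python lists rather than tensors; the names `labels`, `names`, and `freqs` are illustrative and do not appear in the notebook.

```python
from collections import Counter

# Same encoding as the notebook's data: 0 = Tiger, 1 = Jason, 2 = James.
labels = [0] * 9 + [1] * 7 + [2] * 4
names = {0: "Tiger", 1: "Jason", 2: "James"}

counts = Counter(labels)
total = len(labels)

# Fraction of the 20 meals paid for by each person.
freqs = {names[k]: counts[k] / total for k in sorted(counts)}
for name, freq in freqs.items():
    print(f"{name}: {freq:.2f}")
# Tiger: 0.45
# Jason: 0.35
# James: 0.20
```

A fair wheel would give each person a probability of 1/3, so the question for the model is whether 0.45 versus 0.20 is a plausible fluctuation over only 20 meals.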

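We treat the data as draws from a Categorical distribution with three parameters, one per person, which must sum to 1. We place a Beta prior on each of three unnormalized weights and then normalize them to obtain the Categorical probabilities. This gives the following model: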

In [13]:
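def model(data):
    # Shared Beta(6, 10) prior on each person's unnormalized payment weight.
    alpha = torch.tensor(6.0)
    beta = torch.tensor(10.0)
    with pyro.plate("plate1", 3):
        pay_probs = pyro.sample("pay_probs", dist.Beta(alpha, beta))
    # Normalize so the three weights form a valid probability vector.
    normalized_pay_probs = pay_probs / torch.sum(pay_probs)
    # The meals are conditionally independent given the probabilities.
    with pyro.plate("plate2", len(data)):
        pyro.sample("obs", dist.Categorical(probs=normalized_pay_probs), obs=data)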
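
To get a feel for what this prior implies, one can draw three Beta(6, 10) weights, normalize them, and look at the result over many repetitions. The following is a minimal sketch using only the standard library (`random.betavariate`) in place of Pyro; the helper `sample_normalized_prior` is hypothetical and not part of the notebook.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable


def sample_normalized_prior(alpha=6.0, beta=10.0, k=3):
    """Draw k independent Beta(alpha, beta) weights and normalize them to sum to 1."""
    raw = [random.betavariate(alpha, beta) for _ in range(k)]
    total = sum(raw)
    return [r / total for r in raw]


draws = [sample_normalized_prior() for _ in range(20000)]

# Average normalized probability for each of the three people under the prior.
prior_means = [sum(d[i] for d in draws) / len(draws) for i in range(3)]
print([round(m, 3) for m in prior_means])
```

Because all three weights share the same prior, each average should land close to 1/3: before seeing any data, the model treats the wheel as fair on average.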

In [8]:
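def guide(data):
    # One learnable Beta(alpha, beta) per person, initialized at the prior's values.
    alphas = pyro.param("alphas", torch.tensor(6.).expand(3), constraint=constraints.positive)
    betas = pyro.param("betas", torch.tensor(10.).expand(3), constraint=constraints.positive)
    # Use the same plate name as the model for the shared "pay_probs" site.
    with pyro.plate("plate1", 3):
        pyro.sample("pay_probs", dist.Beta(alphas, betas))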

In [9]:
def print_progress():
    # Note: `step` is read from the enclosing training loop (a global variable).
    alphas = pyro.param("alphas")
    betas = pyro.param("betas")

    # Mean of each Beta, renormalized so the three probabilities sum to 1.
    means = alphas / (alphas + betas)
    normalized_means = means / torch.sum(means)
    # betas / (alphas * (1 + alphas + betas)) is the squared relative standard
    # deviation (std / mean) of a Beta distribution. Scaling the normalized mean
    # by its square root gives an approximate spread that ignores the
    # variability of the normalizer.
    factors = betas / (alphas * (1.0 + alphas + betas))
    stdevs = normalized_means * torch.sqrt(factors)
    tiger_pays_string = "probability Tiger pays: {0:.3f} +/- {1:.2f}".format(normalized_means[0], stdevs[0])
    jason_pays_string = "probability Jason pays: {0:.3f} +/- {1:.2f}".format(normalized_means[1], stdevs[1])
    james_pays_string = "probability James pays: {0:.3f} +/- {1:.2f}".format(normalized_means[2], stdevs[2])
    print("[", step, "|", tiger_pays_string, "|", jason_pays_string, "|", james_pays_string, "]")
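
The "+/-" value printed by `print_progress` is an approximation, and it can help to see where it comes from. The sketch below is a standard-library check at the initial parameter values (alpha = 6, beta = 10); it is not part of the notebook. It confirms that `factors` is the squared ratio std / mean of a Beta distribution, then compares the resulting spread with a Monte Carlo estimate for the normalized probability. The variable names `approx_std` and `mc_std` are illustrative.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable
alpha, beta, k = 6.0, 10.0, 3  # initial values used in the notebook's guide

# Closed-form mean and standard deviation of Beta(alpha, beta).
mean = alpha / (alpha + beta)
std = math.sqrt(alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1.0)))

# The notebook's quantity: normalized mean times the relative std of one Beta.
factor = beta / (alpha * (1.0 + alpha + beta))
assert math.isclose(math.sqrt(factor), std / mean)
approx_std = (1.0 / k) * math.sqrt(factor)  # normalized mean is 1/k when all are equal

# Monte Carlo spread of the normalized probability itself.
samples = []
for _ in range(20000):
    raw = [random.betavariate(alpha, beta) for _ in range(k)]
    samples.append(raw[0] / sum(raw))
mc_mean = sum(samples) / len(samples)
mc_std = math.sqrt(sum((s - mc_mean) ** 2 for s in samples) / (len(samples) - 1))

print(f"approximation: {approx_std:.3f}, Monte Carlo: {mc_std:.3f}")
```

The approximation rounds to the 0.10 shown at step 0 of the training output. In this sketch the Monte Carlo estimate comes out somewhat smaller, since normalizing ties the three weights together, so the printed "+/-" is best read as a rough, slightly conservative indication of uncertainty.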


In [14]:
adam_params = {"lr": 0.0005}
optimizer = Adam(adam_params)
svi = SVI(model, guide, optimizer, loss=Trace_ELBO())

n_steps = 2501
for step in range(n_steps):
    svi.step(data)
    if step % 100 == 0:
        print_progress()

[ 0 | probability Tiger pays: 0.333 +/- 0.10 | probability Jason pays: 0.333 +/- 0.10 | probability James pays: 0.333 +/- 0.10 ]
[ 100 | probability Tiger pays: 0.340 +/- 0.10 | probability Jason pays: 0.336 +/- 0.10 | probability James pays: 0.324 +/- 0.10 ]
[ 200 | probability Tiger pays: 0.346 +/- 0.10 | probability Jason pays: 0.338 +/- 0.10 | probability James pays: 0.316 +/- 0.10 ]
[ 300 | probability Tiger pays: 0.353 +/- 0.11 | probability Jason pays: 0.338 +/- 0.10 | probability James pays: 0.309 +/- 0.10 ]
[ 400 | probability Tiger pays: 0.359 +/- 0.11 | probability Jason pays: 0.339 +/- 0.10 | probability James pays: 0.301 +/- 0.10 ]
[ 500 | probability Tiger pays: 0.363 +/- 0.11 | probability Jason pays: 0.336 +/- 0.10 | probability James pays: 0.301 +/- 0.10 ]
[ 600 | probability Tiger pays: 0.364 +/- 0.11 | probability Jason pays: 0.338 +/- 0.10 | probability James pays: 0.299 +/- 0.10 ]
[ 700 | probability Tiger pays: 0.369 +/- 0.11 | probability Jason pays: 0.338 +/- 0.