Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FR] MCMC.marginal() doesn't return joint marginals. #1727

Closed
adam-coogan opened this issue Jan 30, 2019 · 4 comments
Closed

[FR] MCMC.marginal() doesn't return joint marginals. #1727

adam-coogan opened this issue Jan 30, 2019 · 4 comments

Comments

@adam-coogan
Copy link
Contributor

adam-coogan commented Jan 30, 2019

I am confused about how to convert an MCMC run into an empirical distribution for multiple parameters. I would've thought MCMC.marginal() is the way to get this distribution, but the MCMCMarginals it returns does not allow sampling from joint distributions. Consider the following model and MCMC run:

pyro.set_rng_seed(18)
def model(obs=None):
    x = pyro.sample("x", dist.Normal(0., 5.))
    y = pyro.sample("y", dist.Normal(0., 1.))
    pyro.sample("obs", dist.Normal(x-y, 0.1), obs=obs)

obs = torch.tensor(0.)
kernel = NUTS(model)
mcmc_run = MCMC(kernel, num_samples=100, warmup_steps=20).run(obs)

I want to sample from the MCMC posterior for x and y. Running mcmc_run.marginal(["x", "y"]) gives me an MCMCMarginals object for x and y, but I can only sample from the empirical distributions for x and y separately:

posterior_x = mcmc_run.marginal(["x", "y"]).empirical["x"]
posterior_y = mcmc_run.marginal(["x", "y"]).empirical["y"]

This is confusing to me since the MCMCMarginals/Marginals object can take multiple sites as an argument.

I would instead expect Marginals to wrap a call to EmpiricalMarginal(trace_posterior=mcmc_run, sites=["x", "y"]). Or am I misunderstanding the point of MCMCMarginals/Marginals?

@neerajprad
Copy link
Member

MCMCMarginals returns a dictionary with individual marginals for the sample sites. The reason for this restriction is that we are yet to establish what the behavior should be when the sample sites have different shapes.

I would instead expect Marginals to wrap a call to EmpiricalMarginal(trace_posterior=mcmc_run, sites=["x", "y"]).

Currently, the Empirical class takes in a tensor of samples, and this tensor can contain concatenated samples from multiple sites. When you have samples with the same shape, you can call EmpiricalMarginal(trace_posterior=mcmc_run, sites=["x", "y"]) directly and generate samples from the joint. The problem comes when they have different shapes, and that's the reason why we cannot delegate to EmpiricalMarginal in the general case.

That said, this is definitely far from ideal behavior, and we are yet to finalize on a good design for these classes. One design choice could be as follows:

  • Empirical does not take in samples directly but only indices and weights. It scores a sample of indices and returns a set of indices when .sample is called.
  • The EmpiricalMarginal distribution can then use these indices to index into the samples for each of the sites and return a concatenated tensor if sites have same shapes or a dictionary of tensors when sites have different shapes. The only issue with this design would be figuring out how to efficiently score a sample from the joint. As a first pass, we can choose to only support this when sites have the same shape (same as now), and throw an exception otherwise. This basically requires that the user only interact with the EmpiricalMarginal class to access any distribution primitives.

If you have some ideas on improving the current design, we would love to hear your thoughts! In the meantime, we can probably do a better job of documenting these restrictions.

@neerajprad neerajprad changed the title MCMC.marginal() doesn't return joint marginals [feature request] [bug]? [FR] MCMC.marginal() doesn't return joint marginals. Jan 30, 2019
@neerajprad
Copy link
Member

cc. @fehiepsi, @eb8680 in case they have any thoughts on how best to support this.

@eb8680
Copy link
Member

eb8680 commented Jan 31, 2019

For now, it's easy enough to aggregate the values at the sites of interest manually from the joint posterior samples stored in MCMC(...).exec_traces.

In the long run, maybe rather than collecting the values into a new distribution, what we'd want is to have collapse (#1722 ) support MCMC so that we could collapse every random variable in a model except the sites of interest and grab samples by traceing the collapsed model.

@fritzo fritzo added this to the 0.4 release milestone May 6, 2019
@neerajprad
Copy link
Member

I am closing this issue. The MCMCMarginals class is deprecated and the MCMC interface would return a dictionary of samples from the latent sites directly. These can then be combined by the user, stored in Empirical and then sampled from, or simply iterated over to yield samples from the joint distribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants