
Efficient sampling from mixture models #547

Closed
wants to merge 14 commits into from

Conversation

kadeng commented Jun 6, 2014

The changes in this pull request are the following:

Metropolis-Hastings Step Method

A new step method, called MetropolisHastings, which allows arbitrary (non-symmetric) proposal distributions. I implemented wrappers for the existing Proposals as well as explicit support for Categorical proposal distributions. This can easily be extended to cover more proposals (how about HamiltonianMC proposals, or Swendsen-Wang?).

The implementation could be made more efficient (some potentially complex terms get added to the acceptance logp, and then subtracted again later on), but it's hard to do that in a general manner. Maybe one could teach Theano to optimize this out.
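
For reference, here is a minimal sketch of the Metropolis-Hastings acceptance rule with a possibly asymmetric proposal. The names (mh_step, propose, proposal_logp) are illustrative, not this PR's actual API:

import numpy as np

def mh_step(x, logp, propose, proposal_logp, rng=np.random):
    """One Metropolis-Hastings step with a possibly asymmetric proposal.

    logp(x)             -- log density of the target
    propose(x)          -- draw x' ~ q(. | x)
    proposal_logp(a, b) -- log q(a | b)
    """
    x_new = propose(x)
    # Hastings correction: with a non-symmetric q, the forward and
    # backward proposal densities do not cancel in the acceptance ratio.
    log_alpha = (logp(x_new) - logp(x)
                 + proposal_logp(x, x_new)    # log q(x | x')
                 - proposal_logp(x_new, x))   # log q(x' | x)
    if np.log(rng.uniform()) < log_alpha:
        return x_new  # accept
    return x          # reject: the chain stays at x

With a symmetric proposal the two proposal_logp terms cancel and this reduces to plain Metropolis, which is why the existing proposals can be wrapped unchanged.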

Efficient sampling from Mixture Models

Using this Metropolis-Hastings step method, it's easy to create an efficient sampler for mixture models, so I wrote a first implementation of a Multivariate Normal Mixture distribution. It seems to work well, and could easily be extended to allow flexible mixture kernels.
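
To illustrate why a Categorical proposal fits here, a toy NumPy sketch (not the PR's MvGaussianMixture code): conditioned on the component indicator, the mixture collapses to a single Gaussian, and the indicator's conditional posterior is itself categorical.

import numpy as np

weights = np.array([0.3, 0.7])
mus  = [np.zeros(2), np.array([4.0, 4.0])]
taus = [np.eye(2), 2.0 * np.eye(2)]          # precision matrices

def mvn_logp(x, mu, tau):
    # log density of a multivariate normal with precision matrix tau
    d = x - mu
    _, logdet = np.linalg.slogdet(tau)
    return 0.5 * (logdet - len(x) * np.log(2 * np.pi) - d @ tau @ d)

def indicator_posterior(x):
    # p(z = k | x), the conditional posterior over the component indicator
    logp = np.log(weights) + np.array([mvn_logp(x, m, t)
                                       for m, t in zip(mus, taus)])
    p = np.exp(logp - logp.max())
    return p / p.sum()

A Categorical proposal that draws the indicator from (an approximation of) this conditional gets accepted with high probability; drawing from it exactly would make every move accepted, i.e. a Gibbs update.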

Small Examples

For both the new step method and the mixture model, I provided examples.

Small fixes

I noticed that the "bound" function from distributions/dist_math.py fails to work correctly in some cases. In particular, in the logp of the Categorical distribution it failed to prevent evaluation at disallowed values, which could then lead to index-out-of-bounds exceptions. I fixed that by introducing a new "bound_scalar" function which uses theano.ifelse.ifelse(..); unlike theano.switch(..), ifelse does not evaluate inactive parts of the expression graph.
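
A minimal sketch of that idea (simplified, and assuming logp is a scalar; the actual implementation in the PR may differ in detail):

import numpy as np
import theano.tensor as tt
from theano.ifelse import ifelse

impossible = tt.constant(-np.inf, dtype='float64')

def bound_scalar(logp, *conditions):
    # Fold all conditions into one scalar predicate.
    ok = conditions[0]
    for cond in conditions[1:]:
        ok = ok & cond
    # ifelse is lazy: the logp branch is only evaluated when ok holds,
    # so expressions that would raise for disallowed values (such as an
    # out-of-bounds index) are never touched.
    return ifelse(ok, logp, impossible)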

Another small fix had to do with TransformedDistributions. In one example I used a transformed Dirichlet distribution where one of the hyperparameters was 1, which led to an invalid mode for the distribution. That's no problem by itself, but if you transform that Dirichlet, the mode becomes the starting point for the sampler, and since it's invalid (contains NaN values), that leads to problems.
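
To make the failure mode concrete, assuming the usual Dirichlet mode formula (a_i - 1) / (sum(a) - k), which is only well-defined when all a_i > 1:

import numpy as np

a = np.array([1.0, 2.0, 2.0])
mode = (a - 1) / (a.sum() - len(a))   # -> [0.0, 0.5, 0.5]
# The zero entry sits on the boundary of the simplex, so a log-based
# transform of the mode produces -inf / NaN, and using the transformed
# mode as the sampler's starting point fails.
np.log(mode)                          # -> [-inf, -0.693, -0.693]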

Potential uses

The GMMs can be used as flexible prior densities: with enough components, a mixture of Gaussians can approximate essentially any density arbitrarily well.

Another use case for efficient GMMs is to use them to represent posterior densities. Since it's possible to efficiently sample from products of mixtures of Gaussians, these can be used to enable distributed, scalable MCMC, both in the form of embarrassingly parallel MCMC and in distributed Nonparametric Belief Propagation (NBP) inference algorithms. I plan to write more on that later. See also my PyPGMc project on GitHub.

def bound(logp, *conditions):
    ...

impossible = constant(-inf, dtype=float64)  # we re-use this constant

def bound_scalar(logp, *conditions):
    ...
Member

Could we remove bound_scalar and replace it with bound or is there something bound below can't do?

Author

I tried to write a generic version of bound that can deal with all the different kinds of inputs bound is actually fed. Notable variants of arguments to bound include:

logp: can be a Theano scalar or a Theano vector (in practice, not by contract)
conditions: if logp is a vector, these can be (mixed) vectors or scalars; if logp is a scalar, these are scalars too

In the case that logp is a vector, I tried to write a general version using scan and theano.ifelse to handle every case correctly. It's hard to exactly mimic theano.switch behaviour using scan and ifelse, though. So in order not to introduce any new bugs, I fell back to the old solution using switch, except for the case where I know logp to be a scalar, which is the only specific case where I knew switch introduced a bug.
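
For illustration, the elementwise behaviour that makes switch hard to replace (a small standalone snippet, not code from this PR):

import numpy as np
import theano
import theano.tensor as tt

# switch broadcasts elementwise, so each entry of a vector logp is
# masked independently; ifelse can only pick one whole branch, which
# is why mimicking this with scan + ifelse gets complicated.
logp = tt.dvector('logp')
cond = tt.dvector('cond')
f = theano.function([logp, cond], tt.switch(cond, logp, -np.inf))
print(f([-1.0, -2.0], [1.0, 0.0]))   # -> [ -1.  -inf]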

Author

theano.switch evaluates both alternatives it switches between, regardless of the condition. Unlike ifelse, which evaluates only the expression it actually returns.

Consequently, the case that bound using switch doesn't handle correctly is when I feed it a logp which raises an exception if one of the conditions is violated. This can happen in Categorical.logp, since the logp is created by looking up an index in a weights vector; if the index is out of bounds, an exception gets raised. This should be prevented by the conditions passed to bound (which check whether the index is out of bounds), but since switch evaluates the (invalid) logp anyway, an exception gets thrown.

Apart from that, since switch evaluates dead expression branches, it's inefficient as hell.
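
To make this concrete, a small standalone snippet (hypothetical, but mirroring the Categorical.logp situation with a weights vector and an index):

import numpy as np
import theano
import theano.tensor as tt
from theano.ifelse import ifelse

p = tt.dvector('p')    # category weights
k = tt.iscalar('k')    # category index, possibly out of bounds
ok = (k >= 0) & (k < p.shape[0])

# switch builds both branches and evaluates them eagerly ...
f_switch = theano.function([p, k], tt.switch(ok, tt.log(p[k]), -np.inf))
# ... while ifelse evaluates only the branch that is returned.
f_ifelse = theano.function(
    [p, k], ifelse(ok, tt.log(p[k]), tt.constant(-np.inf, dtype='float64')))

w = np.array([0.2, 0.8])
print(f_ifelse(w, 5))  # -inf; the invalid p[5] is never evaluated
# f_switch(w, 5) raises an IndexError: switch evaluates p[5] regardless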

Member

I'm fine with this, since most users will be using Bounded and not bound.


twiecki commented Jun 7, 2014

This looks really promising.

I wonder if this could replace the previous Metropolis sampler? We had some odd problems and I wonder if this might solve it: #358.

Also, could it help here: #443 and #452?

Also, how difficult do you think it would be to write some basic tests? That would definitely go a long way.


kadeng commented Jun 7, 2014

Hi Thomas,

Yes, tests are essential. I don't think they are difficult to write, just time-consuming :) pep8 compliance shouldn't be a problem either. I plan to do that, maybe next week or so.

I think it can replace the previous Metropolis sampler in time, once we copy the Proposal implementations over and have a custom "BinaryProposal" for binary variables (as in BinaryMetropolis).

BTW, I'm not an expert in HMC, but as far as I remember, it involves a Metropolis accept/reject step. Do you think it would be possible to cast Hamiltonian Monte Carlo as a Proposal for a Metropolis-Hastings sampler, for example to combine it with a Categorical proposal? It would be great if it were possible to use HMC or even NUTS to efficiently sample from arbitrary distributions that are conditional on discrete variables with possibly disconnected areas of probability mass. The approach of sampling from all of them and then using a Deterministic to switch on the discrete var is a bit wasteful.

best,

Kai



twiecki commented Jun 7, 2014

That's an interesting idea. I think HMC and NUTS could definitely be converted to be MH proposals.


"""
class MvGaussianMixture(Continuous):

def __init__(self, shape, cat_var, mus, taus, model=None, *args, **kwargs):
Member

I recommend using singular mu and tau rather than plurals. It should be obvious that they are vectors, since it's explicitly a multivariate variable.

@jsalvatier (Member)

Thanks for the addition!


twiecki commented Aug 30, 2014

There are many different parts to this PR. E.g., the MH sampler is pretty straightforward (and required by #599), while we don't seem absolutely clear on some of the others. @kadeng, what are your thoughts on breaking this PR up into more distinct parts?

@fonnesbeck (Member)

Status update? This hasn't been updated in over a year.

@springcoil (Contributor)

@kadeng Is there any chance you could break this into more distinct parts, since the MH sampler is a dependency? Could you create another branch and copy that in? Is the MH sampler ready to go?

@fonnesbeck (Member)

I am going to go ahead and close this. Please open a new request if there is ongoing interest.

@fonnesbeck fonnesbeck closed this Jan 22, 2016