Variance reduction for reparametrized ELBo #63

Closed
ngoodman opened this issue Jul 21, 2017 · 18 comments

ngoodman (Collaborator) commented Jul 21, 2017

For the reparameterizable case (courtesy @eb8680):

ngoodman (Collaborator, Author) commented:

splitting this off from #42 in order to keep issues somewhat approachable.

null-a (Collaborator) commented Aug 7, 2017

Sticking the Landing is also relevant, and might be worth considering. Perhaps best implemented as a control variate.

(I had a quick play with this in WebPPL a while back.)
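For context, the gist of Sticking the Landing is to drop the zero-expectation score-function component of the reparameterized ELBO gradient by evaluating log q with the guide parameters detached. A minimal sketch of the idea (not an existing Pyro implementation), assuming a diagonal Gaussian guide; `model_log_prob` is a placeholder for log p(x, z):

```python
# Minimal sketch of the Sticking the Landing estimator (Roeder et al. 2017),
# assuming a diagonal Gaussian guide; `model_log_prob` is a placeholder for
# log p(x, z), not an existing Pyro function.
import math
import torch

def stl_elbo(mu, log_sigma, model_log_prob):
    # Reparameterized sample: z = mu + sigma * eps, eps ~ N(0, I).
    eps = torch.randn_like(mu)
    z = mu + log_sigma.exp() * eps

    # Evaluate log q(z) with the guide parameters *detached*: gradients still
    # reach (mu, log_sigma) through z, but the direct (score-function) path
    # is blocked, which is exactly the zero-expectation term STL removes.
    mu_d, log_sigma_d = mu.detach(), log_sigma.detach()
    log_q = (-0.5 * ((z - mu_d) / log_sigma_d.exp()) ** 2
             - log_sigma_d - 0.5 * math.log(2 * math.pi)).sum()

    return model_log_prob(z) - log_q  # single-sample ELBO estimate
```

Calling `.backward()` on the negative of this and stepping an optimizer gives the path-derivative-only gradient from the paper; framing it as a control variate instead would add the removed score term back with an appropriate coefficient.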

null-a (Collaborator) commented Aug 16, 2017

I've put some thought into what it would take to support reparameterized accept/reject samplers. (I'm not sure if anyone else has looked at this already?)

I think the key change will be to add support for the idea of a partially reparameterized distribution, i.e. one in which the base distribution retains a dependency on the parameters. (Reparameterizing an accept/reject sampler produces a distribution of this type.)

Inference algorithms that assume distributions are fully reparameterized will need updating to correctly handle the partially reparameterized case. For the ELBO estimator, a partially reparameterized choice will have both reinforce and path-wise terms. The reinforce term requires us to compute the log density of the base sample under the base distribution. (Something we don't have to do for fully reparameterized distributions.) AFAICT this isn't possible with pyro's current distribution interface, so we may need to tweak that.
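To make the shape of that estimator concrete, here is a hedged single-sample surrogate in PyTorch-style Python. The `dist` object and its `sample_reparam` / `base_score` methods are hypothetical (a Python analog of the sampleReparam / baseScore interface described just below), and `downstream_log_weight` stands in for the usual log p - log q term:

```python
# Hedged sketch of a single-sample ELBO surrogate for a partially
# reparameterized choice z = T(eps; theta) with eps ~ base(theta), where the
# base distribution still depends on theta. `dist.sample_reparam()` and
# `dist.base_score()` are hypothetical methods (see the interface sketch
# below); `downstream_log_weight(z)` stands in for log p(x, z) - log q(z | x).
def partially_reparam_surrogate(dist, downstream_log_weight):
    z, eps = dist.sample_reparam()      # z is differentiable w.r.t. theta
    f = downstream_log_weight(z)

    # Path-wise term: gradients flow into theta through z (and through any
    # explicit theta-dependence inside f itself).
    pathwise = f

    # Reinforce term: score of the base sample under the base distribution,
    # scaled by the detached downstream weight. When the base distribution is
    # parameter-free this term contributes no gradient, recovering the fully
    # reparameterized estimator.
    reinforce = f.detach() * dist.base_score(eps.detach())

    return pathwise + reinforce
```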

To help get a feel for this, I made an initial attempt at adding the reparameterized gamma sampler to webppl. The way this works is that I added two new methods to distributions: sampleReparam generates a sample from the base distribution and passes it through the (auto-differentiable) transform (following current pyro), and returns a pair of the transformed sample and the base sample. The method baseScore can then be used to compute the log density of the base sample, if required. The code is here.

This seems to work OK on a super simple model. An obvious next step would be to extend this to other distributions to test the interfaces.
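For reference, a rough Python rendering of that two-method interface (names mirror the WebPPL sampleReparam / baseScore experiment and are not existing Pyro API):

```python
# Rough Python rendering of the interface described above; names mirror the
# WebPPL sampleReparam / baseScore experiment and are not existing Pyro API.
class PartiallyReparameterizedDistribution:
    def sample_reparam(self):
        """Draw eps from the base distribution, push it through the
        auto-differentiable transform, and return (transformed_sample, eps)."""
        raise NotImplementedError

    def base_score(self, eps):
        """Log density of the base sample eps under the base distribution.
        Unlike the fully reparameterized case, this may depend on the
        distribution's parameters, which is what produces the reinforce term."""
        raise NotImplementedError

    def log_pdf(self, x):
        """Ordinary log density of the transformed sample (unchanged)."""
        raise NotImplementedError
```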

null-a (Collaborator) commented Aug 16, 2017

Note: Related comment here.

eb8680 (Member) commented Aug 17, 2017

@null-a Cool stuff! In your opinion how important/valuable is this? Is the variance reduction worth the effort? Is it something we should consider adding to Pyro before release, or before splitting off the distributions library?

null-a (Collaborator) commented Aug 17, 2017

> Is it something we should consider adding to Pyro before release

@eb8680 I guess it depends on our goals, so perhaps we should let the anchor models drive this.

If models with Gamma/Beta/Dirichlet choices end up in that set, then I imagine that having something other than vanilla reinforce for these will be worth the effort.

The two most promising approaches I know of are this and the use of a transformed Gaussian (or other fully reparameterizable distribution) as a guide. I don't know whether one of these is strictly superior or both have their place.

ETA: Section 4 of the paper mentions one reason to think that reparam for accept/reject might be better than the transformation approach, at least in some settings: the transformation approach can't accurately approximate densities with singularities.
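For comparison, the transformed-Gaussian alternative is easy to write down with torch.distributions; a sketch for a positive latent (a log-normal guide standing in for, say, a Gamma-shaped posterior), with the variational parameter names chosen purely for illustration:

```python
# Sketch of the transformed-Gaussian guide alternative: a fully
# reparameterized log-normal used to approximate a positive latent.
# Parameter names (loc, log_scale) are illustrative.
import torch
from torch.distributions import Normal, TransformedDistribution
from torch.distributions.transforms import ExpTransform

loc = torch.zeros(1, requires_grad=True)
log_scale = torch.zeros(1, requires_grad=True)

guide = TransformedDistribution(Normal(loc, log_scale.exp()), [ExpTransform()])

z = guide.rsample()        # fully reparameterized positive sample
log_q = guide.log_prob(z)  # includes the log |d exp / dx| Jacobian term
```

The singularity point from Section 4 shows up here: a log-normal density always vanishes at the origin, so it can't match e.g. a Gamma with shape < 1, whose density blows up at zero.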

null-a (Collaborator) commented Aug 17, 2017

> The two most promising approaches I know of are this and the use of a transformed Gaussian

For reference, The Generalized Reparameterization Gradient is another, but reparameterized accept/reject probably has lower variance.

ngoodman (Collaborator, Author) commented Aug 17, 2017

this is neat. i suspect that partially reparametrized distributions are the more general case, and therefore a good move anyhow. i don't have a super strong opinion about rejection based samplers, but i think it is promising enough to try out.

my suggestion would be for you to add one partially reparametrized dist (eg gamma) and make an extended version of the elbo inference method that uses it. then we can all think through the code and interfaces to see if there are any tweaks we want to make before converting all the dists to this style.

(btw i think there might be a more general idiom for making distributions compositionally out of sampling pieces, deterministic pieces, and scorers. with a contract something like having a complete scorer after every composition step....)

ngoodman (Collaborator, Author) commented:

Note: but in terms of prioritizing this wrt other extensions, I agree that being guided by anchor models is probably best at this point!

null-a (Collaborator) commented Aug 17, 2017

> my suggestion would be for you to add one partially reparametrized dist (eg gamma) and make an extended version of the elbo inference method that uses it

Yeah, I was thinking I might come back and do that once #64 is merged and the anchor models demand it.

> btw i think there might be a more general idiom for making distributions compositionally out of sampling pieces, deterministic pieces, and scorers

I agree. I poked around with this for a while, but didn't arrive at anything satisfactory, so went with what I have here. Implicit models might fit here too.

martinjankowiak (Collaborator) commented:

@null-a cool, interesting stuff. in the context of accept/reject sampling wouldn't it be sufficient to do something like the following?

-- give the distribution class a score_function_term() method
-- by default it reverts to log_pdf
-- for a distribution of the accept/reject type, override score_function_term() with log q - log r
-- when constructing the elbo in the presence of non-reparameterizable distributions (which would include accept/reject distributions) use score_function_term() when constructing the gradient estimator

as far as i can tell, this would construct the right estimator. (one might need to take care that certain gradients are being blocked, but i think this would basically be automatic.)

or am i missing something?

one can imagine that other complex distributions with hidden/unexposed RVs could fit into the same framework, at least in certain cases.
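A minimal sketch of what that might look like (class and method names are hypothetical, not actual Pyro code):

```python
# Minimal sketch of the score_function_term() proposal above; class and
# method names are hypothetical, not actual Pyro code.
class Distribution:
    def log_pdf(self, x):
        raise NotImplementedError

    def score_function_term(self, x, **kwargs):
        # Default: the ordinary reinforce/score-function term.
        return self.log_pdf(x)

class AcceptRejectDistribution(Distribution):
    def proposal_log_pdf(self, eps):
        raise NotImplementedError

    def score_function_term(self, x, eps=None):
        # Override for accept/reject samplers: log q(x) - log r(eps), where
        # eps is the accepted base/proposal sample. Note that eps has to be
        # made available to this method somehow, which is the interface
        # question raised in the reply below.
        return self.log_pdf(x) - self.proposal_log_pdf(eps)

# In the ELBO gradient estimator, a non-reparameterizable choice would then
# contribute downstream_cost.detach() * dist.score_function_term(x, eps=eps)
# to the surrogate objective.
```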

null-a (Collaborator) commented Aug 24, 2017

@martinjankowiak Thanks!

I don't think I fully understand the suggestion, since I don't see where the log q - log r comes from? (I would have expected just log r, perhaps.)

That aside, it seems that score_function_term() would need to take the value sampled from the base distribution in order to compute log r, and adding a way of getting hold of that brings you back to something similar to my approach? (But maybe I'm missing something!)

martinjankowiak (Collaborator) commented Aug 24, 2017

@null-a we're probably ultimately thinking along the same lines (modulo possible difference in interface). when i have a chance i'll see about implementing a v0. it'll be easier to discuss adequacy/shortcomings with something concrete

jpchen (Member) commented Dec 17, 2017

@martinjankowiak what is the status of this? i don't know if you've looked back at this since your fancy variance-reduced estimators... otherwise can we close this in favor of concrete tasks?

martinjankowiak added this to the 0.2 release milestone Jan 10, 2018
cavaunpeu commented:

FWIW, I'm working on a PR for almost this in Edward. Perhaps I can do a port to Pyro when done. Would be 🌴 💯 ☀️ to be back working in PyTorch land...

fritzo (Member) commented Jan 21, 2018

@cavaunpeu we've implemented RSVI in #659 which should be merged within the next week or two (just needs some clean-up and tests).

cavaunpeu commented:

Super!

fritzo (Member) commented Apr 20, 2018

RSVI and Sticking the Landing are already in Pyro. I'm closing this issue in favor of more targeted issues.
