Variance reduction for reparametrized ELBo #63
Splitting this off from #42 in order to keep issues somewhat approachable.
Sticking the Landing is also relevant, and might be worth considering. Perhaps best implemented as a control variate. (I had a quick play with this in WebPPL a while back.)
I've put some thought into what it would take to support reparameterized accept/reject samplers. (I'm not sure if anyone else has looked at this already?) I think the key change will be to add support for the idea of a partially reparameterized distribution, i.e. one in which the base distribution retains a dependency on the parameters. (Reparameterizing an accept/reject sampler produces a distribution of this type.) Inference algorithms that assume distributions are fully reparameterized will need updating to correctly handle the partially reparameterized case.

For the ELBO estimator, a partially reparameterized choice will have both reinforce and pathwise terms. The reinforce term requires us to compute the log density of the base sample under the base distribution. (Something we don't have to do for fully reparameterized distributions.) AFAICT this isn't possible with Pyro's current distribution interface, so we may need to tweak that.

To help get a feel for this, I made an initial attempt at adding the reparameterized gamma sampler to WebPPL. The way this works is that I added two new methods to distributions. This seems to work OK on a super simple model. An obvious next step would be to extend this to other distributions to test the interfaces.
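To make the reinforce-plus-pathwise decomposition concrete, here is a minimal PyTorch sketch of the surrogate objective for a single partially reparameterized choice. This is my own illustration, not Pyro's or WebPPL's actual interface; `base_dist`, `transform`, `guide_log_prob`, and `model_log_joint` are hypothetical placeholders.

```python
import torch

def surrogate_elbo_term(base_dist, transform, guide_log_prob, model_log_joint):
    # Accepted base sample; .sample() blocks gradient flow through the draw itself.
    eps = base_dist.sample()
    # The deterministic transform carries the parameter dependence: the pathwise term.
    z = transform(eps)
    elbo_term = model_log_joint(z) - guide_log_prob(z)
    # Reinforce term: log density of the base sample under the (still
    # parameter-dependent) base distribution, weighted by the detached cost.
    reinforce = base_dist.log_prob(eps) * elbo_term.detach()
    # Value equals elbo_term; the gradient picks up both contributions.
    return elbo_term + reinforce - reinforce.detach()
```

The `reinforce - reinforce.detach()` trick leaves the surrogate's value equal to the ELBO term while its gradient is the sum of the pathwise and reinforce terms.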
Note: Related comment here.
@null-a Cool stuff! In your opinion how important/valuable is this? Is the variance reduction worth the effort? Is it something we should consider adding to Pyro before release, or before splitting off the distributions library?
@eb8680 I guess it depends on our goals, so perhaps we should let the anchor models drive this. If models with Gamma/Beta/Dirichlet choices end up in that set, then I imagine that having something other than vanilla reinforce for these will be worth the effort. The two most promising approaches I know of are this and the use of a transformed Gaussian (or other fully reparameterizable distribution) as a guide. I don't know whether one of these is strictly superior or whether both have their place. ETA: Section 4 of the paper mentions one reason to think that reparam for accept/reject might be better than the transformation approach, at least in some settings: the transformation approach can't accurately approximate densities with singularities.
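For reference, a minimal sketch of the transformed-Gaussian alternative (my own illustration, not code from this thread): a log-normal guide built as `exp` of a reparameterized Gaussian, which could stand in for a Gamma-style guide.

```python
import torch
import torch.distributions as dist

loc = torch.tensor(0.0, requires_grad=True)
log_scale = torch.tensor(0.0, requires_grad=True)

# Fully reparameterizable base pushed through a deterministic transform.
guide = dist.TransformedDistribution(
    dist.Normal(loc, log_scale.exp()),
    [dist.transforms.ExpTransform()],  # maps onto (0, inf), giving a log-normal
)

z = guide.rsample()        # pathwise gradients flow back to loc and log_scale
log_q = guide.log_prob(z)  # density includes the Jacobian of the transform
```

A fixed smooth transform like this can't place a singularity at zero the way a Gamma with shape < 1 does, which is presumably the limitation Section 4 points at.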
For reference, The Generalized Reparameterization Gradient is another approach, but reparameterized accept/reject probably has lower variance.
this is neat. i suspect that partially reparametrized distributions are the more general case, and therefore a good move anyhow. i don't have a super strong opinion about rejection-based samplers, but i think it is promising enough to try out. my suggestion would be for you to add one partially reparametrized dist (eg gamma) and make an extended version of the elbo inference method that uses it. then we can all think through the code and interfaces to see if there are any tweaks we want to make before converting all the dists to this style. (btw i think there might be a more general idiom for making distributions compositionally out of sampling pieces, deterministic pieces, and scorers. with a contract something like having a complete scorer after every composition step....)
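A purely speculative reading of that compositional idiom, for the sake of discussion (all names here are hypothetical):

```python
class Composed:
    """A distribution built from a sampling piece, a deterministic (invertible)
    piece, and a scorer, keeping a complete scorer at every composition step."""

    def __init__(self, base, forward, inverse, log_abs_det_jacobian):
        self.base = base                # sampling piece: anything with sample/log_prob
        self.forward = forward          # deterministic piece
        self.inverse = inverse
        self.log_abs_det_jacobian = log_abs_det_jacobian

    def sample(self):
        return self.forward(self.base.sample())

    def log_prob(self, value):
        # The "complete scorer" contract: score under the base plus the
        # change-of-variables correction for the deterministic piece.
        eps = self.inverse(value)
        return self.base.log_prob(eps) - self.log_abs_det_jacobian(eps)
```

Composing again would just wrap a `Composed` instance as the new base, and the contract holds because `log_prob` stays complete at every step.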
Note: but in terms of prioritizing this wrt other extensions, I agree that being guided by anchor models is probably best at this point!
Yeah, I was thinking I might come back and do that once #64 is merged and the anchor models demand it.
I agree. I poked around with this for a while, but didn't arrive at anything satisfactory, so went with what I have here. Implicit models might fit here too.
@null-a cool, interesting stuff. in the context of accept/reject sampling wouldn't it be sufficient to do something like the following? -- give the distribution class a […]. as far as i can tell, this would construct the right estimator. (although one might need to take care that certain gradients are being blocked, but i think this would basically be automatic.) or am i missing something? one can imagine that other complex distributions with hidden/unexposed RVs could fit into the same framework, at least in certain cases.
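For concreteness, here's a rough sketch of the kind of reparameterized accept/reject sampler under discussion, using the Marsaglia-Tsang proposal for `Gamma(alpha, 1)` with `alpha >= 1`. This is my own illustration, not the interface being proposed above.

```python
import torch

def gamma_rsample(alpha, max_tries=100):
    # Marsaglia-Tsang: propose eps ~ N(0, 1), transform, then accept/reject.
    d = alpha - 1.0 / 3.0
    c = 1.0 / torch.sqrt(9.0 * d)
    for _ in range(max_tries):
        eps = torch.randn(())        # reparameterizable proposal noise
        v = (1.0 + c * eps) ** 3
        if v <= 0:
            continue
        u = torch.rand(())
        # The accept/reject decision itself is non-differentiable; u and the
        # comparison are constants as far as gradients are concerned.
        if torch.log(u) < 0.5 * eps ** 2 + d - d * v + d * torch.log(v):
            # Pathwise gradients flow through d and c into the accepted sample.
            return d * v
    raise RuntimeError("rejection sampler failed to accept a sample")
```

Because the accepted `eps` has a marginal distribution that still depends on `alpha`, an unbiased gradient also needs the score-function correction discussed earlier in the thread; the blocked-gradient bookkeeping is exactly where care is needed.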
@martinjankowiak Thanks! I don't think I fully understand the suggestion, since I don't see where […]. That aside, it seems that […].
@null-a we're probably ultimately thinking along the same lines (modulo possible differences in interface). when i have a chance i'll see about implementing a v0. it'll be easier to discuss adequacy/shortcomings with something concrete.
@martinjankowiak what is the status of this? i don't know if you've looked back at this since your fancy variance-reduced estimators... otherwise can we close this in favor of concrete tasks?
FWIW, I'm working on a PR for almost this in Edward. Perhaps I can do a port to Pyro when done. Would be 🌴 💯 ☀️ to be back working in PyTorch land...
@cavaunpeu we've implemented RSVI in #659, which should be merged within the next week or two (just needs some clean-up and tests).
Super!
RSVI and Sticking the Landing are already in Pyro. I'm closing this issue in favor of more targeted issues.
For the reparameterizable case (courtesy @eb8680):
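Presumably the standard pathwise identity was intended here; reconstructing it, with $z = g(\epsilon, \phi)$ and base noise $\epsilon \sim q_0$ that does not depend on the parameters $\phi$:

$$
\nabla_\phi\, \mathbb{E}_{q_\phi(z)}\big[\log p(x, z) - \log q_\phi(z)\big] \;=\; \mathbb{E}_{\epsilon \sim q_0}\Big[\nabla_\phi\big(\log p(x, g(\epsilon, \phi)) - \log q_\phi(g(\epsilon, \phi))\big)\Big]
$$

The partially reparameterized case discussed above adds a score-function term because the base distribution itself retains a dependence on $\phi$.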