Variational Inference for TuringLang #775
Comments
Regarding point 2, @rpinsler gave a very interesting presentation on recent developments in this area, which we should maybe look at to decide what to target.
I suspect that mean-field VI is probably going to be the easiest thing to implement first, and from there it'll be easier to build up the infrastructure to support the more complex VI stuff like ADVI and stochastic VI. I think the infrastructure part will be very important to get right, because there seems to be a fair amount of heterogeneity in VI methods. We should highlight this to the JSoC people: @sharanry and @torfjelde. Welcome to both of you! I really enjoyed reading Variational Inference: A Review for Statisticians for an overview of the VI-verse. Figured I should throw this paper in here while we're tossing knowledge around.
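For anyone new to this, the mean-field approach I'm referring to is just the fully factorised approximation, fitted by maximising the ELBO; a minimal statement of the objective (nothing Turing-specific) is:

```latex
% Mean-field factorisation and the ELBO it maximises
q(z) = \prod_{i=1}^{m} q_i(z_i), \qquad
\log p(x) \;\ge\; \mathcal{L}(q)
  = \mathbb{E}_{q}\!\left[\log p(x, z) - \log q(z)\right]
  = \log p(x) - \mathrm{KL}\!\left(q(z)\,\|\,p(z \mid x)\right).
```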
Thank you @cpfiffer! I would also like to add Operator Variational Inference (OPVI) [1] to the bunch. It's basically a more general description of VI of which a lot of standard VI methods are instantiations, e.g. ADVI and Stein Variational Gradient Descent. This seems like a promising design, and is also the approach taken by [...]. Worth noting that ADVI and normalizing flows will depend heavily on [...]. I'll be done with exams on Friday, so then I'll start looking more into this.
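As an aside, since SVGD is one of the instantiations mentioned above, here is a minimal sketch of an SVGD particle update in plain Julia; `grad_logp`, the RBF bandwidth `h`, and the step size `ϵ` are placeholders, and none of this assumes any Turing interface:

```julia
# Minimal Stein Variational Gradient Descent (SVGD) sketch in plain Julia.
# Assumes `grad_logp(x)` returns ∇ log p(x) for a single particle x::Vector.
using LinearAlgebra

rbf(x, y; h=1.0) = exp(-norm(x - y)^2 / (2h^2))

# One SVGD update applied in place to a vector of particles.
function svgd_step!(particles::Vector{Vector{Float64}}, grad_logp; h=1.0, ϵ=0.1)
    n = length(particles)
    grads = [grad_logp(x) for x in particles]
    ϕ = [zeros(length(particles[1])) for _ in 1:n]
    for i in 1:n, j in 1:n
        k = rbf(particles[j], particles[i]; h=h)
        # kernelised gradient term + repulsive term ∇_{x_j} k(x_j, x_i)
        ϕ[i] .+= k .* grads[j] .+ k .* (particles[i] .- particles[j]) ./ h^2
    end
    for i in 1:n
        particles[i] .+= ϵ .* ϕ[i] ./ n
    end
    return particles
end
```

The first term drags particles towards high-density regions (weighted by the kernel), while the second acts as a repulsive force that keeps the particle cloud from collapsing onto a single mode.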
Thanks for the welcome @cpfiffer! As discussed in last week's meeting, I believe we need a lot of discussion on the design of this project, especially in order to have a common framework for these methods that will let us add new methods/techniques in the future with relative ease. @xukai92 suggested Mnih and Rezende (2016) as a good starting point for designing an abstraction of VI that would support the commonly used ELBO and IWAE objectives. Implementing mean-field VI while keeping the abstractions for other techniques in mind might be a good starting point? I have exams till Tuesday the 14th, after which I will start on the suggested literature and hopefully contribute to this discussion.
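For concreteness, the two objectives being referred to, written side by side (the IWAE bound reduces to the ELBO at K = 1, so an abstraction that covers the second automatically covers the first):

```latex
% Single-sample ELBO and the K-sample IWAE bound
\mathcal{L}_{\mathrm{ELBO}} = \mathbb{E}_{z \sim q}\!\left[\log \frac{p(x, z)}{q(z)}\right],
\qquad
\mathcal{L}_{K} = \mathbb{E}_{z_1,\dots,z_K \sim q}\!\left[\log \frac{1}{K}\sum_{k=1}^{K} \frac{p(x, z_k)}{q(z_k)}\right],
\qquad
\mathcal{L}_{1} = \mathcal{L}_{\mathrm{ELBO}} \le \mathcal{L}_{K} \le \log p(x).
```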
Is there still demand for [...]? Also, I think having the Kernel Stein Discrepancy as an MCMC diagnostic would be useful.
That's great, and there is indeed! Feel free to give it a go and just ask me if there's anything (you can find me on the Julia Slack as torfjelde). As VI is still heavily under development, the interface is not yet documented. If you want, I can point you to a blog post of mine where I go through a rather contrived example of implementing CAVI in this interface. Or you can just check out the implementation of [...].
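In the meantime, here's a self-contained (and, fittingly, rather contrived) CAVI sketch that does not touch the Turing interface at all: the classic Normal-Gamma toy model from Bishop §10.1.3, with all hyperparameter names below being placeholders:

```julia
# CAVI for: x_i ~ Normal(μ, precision τ), μ | τ ~ Normal(μ0, precision λ0·τ),
#           τ ~ Gamma(shape a0, rate b0).
# Mean-field approximation q(μ, τ) = q(μ) q(τ); plain Julia, no Turing interface.
using Statistics

function cavi_normal_gamma(x; μ0=0.0, λ0=1.0, a0=1.0, b0=1.0, iters=50)
    N, x̄ = length(x), mean(x)
    a_N = a0 + (N + 1) / 2                  # fixed across iterations
    b_N = b0                                # initial guess
    μ_N, λ_N = x̄, 1.0
    for _ in 1:iters
        Eτ = a_N / b_N                      # E_q[τ]
        # update q(μ) = Normal(μ_N, 1/λ_N)
        μ_N = (λ0 * μ0 + N * x̄) / (λ0 + N)
        λ_N = (λ0 + N) * Eτ
        # update q(τ) = Gamma(a_N, rate b_N)
        Esq = sum(abs2, x .- μ_N) + N / λ_N # E_q[Σ (x_i - μ)²]
        b_N = b0 + 0.5 * (Esq + λ0 * ((μ_N - μ0)^2 + 1 / λ_N))
    end
    return (; μ_N, λ_N, a_N, b_N)
end

# Example: q(μ) should concentrate near the sample mean.
post = cavi_normal_gamma(randn(100) .* 2 .+ 3)
```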
Funny you mention that; I'm fairly familiar with the KSD and have implemented some related stuff over at https://github.com/torfjelde/KernelGoodnessOfFit.jl. For example, I have an implementation of the KSD as a goodness-of-fit test. Seems like what you're looking for? I wasn't sure if people in the Bayesian community were actually using the KSD as a diagnostic, though I recall it being mentioned as a use case. Development of that package has halted for a bit as I've been occupied with Turing for the moment, but my intention is to develop it further at some later point. And if there is interest and it intersects with Turing.jl's interests as an MCMC diagnostic, the "later" quickly becomes "soon" :)
I'm personally not a statistician, and indeed KSD methods don't seem to be a popular tool at the moment. However, I think that if the KSD were right at the fingertips of Turing.jl users, we might see a gain in popularity? Especially since there seems to be some distrust of Effective Sample Size (ESS) as a goodness-of-fit metric. It could also be used to quantify the goodness of fit of various variational approximations beyond the good old KL divergence.
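To make the diagnostic idea concrete, here's a rough sketch of an empirical (V-statistic) KSD with an RBF kernel in plain Julia, independent of KernelGoodnessOfFit.jl; `grad_logp` (the score function ∇ log p) and the bandwidth `h` are assumed to be supplied by the user:

```julia
# Empirical kernelised Stein discrepancy (V-statistic) with an RBF kernel.
# `samples` is a vector of points; `grad_logp(x)` returns ∇ log p(x).
using LinearAlgebra

function ksd(samples::Vector{Vector{Float64}}, grad_logp; h=1.0)
    n, d = length(samples), length(samples[1])
    total = 0.0
    for x in samples, y in samples
        sx, sy = grad_logp(x), grad_logp(y)
        δ = x .- y
        k = exp(-dot(δ, δ) / (2h^2))
        ∇xk = -k .* δ ./ h^2                           # ∇_x k(x, y)
        ∇yk =  k .* δ ./ h^2                           # ∇_y k(x, y)
        trace_term = k * (d / h^2 - dot(δ, δ) / h^4)   # Σ_i ∂²k/∂x_i∂y_i
        total += dot(sx, sy) * k + dot(sx, ∇yk) + dot(sy, ∇xk) + trace_term
    end
    return sqrt(max(total / n^2, 0.0))
end
```

For an actual test one would compare this statistic against a (wild-)bootstrap threshold, which is the part a package like KernelGoodnessOfFit.jl would need to provide.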
Great! I'll get in touch once my current work is done. (I think it will take a few months, though.)
We can consider adding support for the following variational inference methods to Turing:
Comment: Project 3 should be interesting and also relatively easy since we have HMC support already.
Related projects:
[1]: Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., & Blei, D. M. (2017). Automatic differentiation variational inference. The Journal of Machine Learning Research, 18(1), 430-474.
[2]: Rezende, D. J., & Mohamed, S. (2015). Variational inference with normalizing flows. arXiv preprint arXiv:1505.05770.
[3]: Salimans, T., Kingma, D., & Welling, M. (2015, June). Markov chain Monte Carlo and variational inference: Bridging the gap. In International Conference on Machine Learning (pp. 1218-1226).
[4]: Kucukelbir, A., Ranganath, R., Gelman, A., & Blei, D. (2015). Automatic variational inference in Stan. In Advances in neural information processing systems (pp. 568-576).
[5]: Hernández-Lobato, J. M., Li, Y., Rowland, M., Hernández-Lobato, D., Bui, T., & Turner, R. (2016). Black-box α-divergence minimization. In International Conference on Machine Learning.
cc @cpfiffer @xukai92 @willtebbutt @mohamed82008
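To make the ADVI direction [1, 4] a bit more concrete, here is a rough sketch of the core computation (a mean-field Gaussian over an unconstrained space, with a reparameterisation-trick gradient of a Monte Carlo ELBO estimate). The `logjoint` argument, the use of ForwardDiff, and all parameter names are illustrative assumptions, not a proposed Turing API:

```julia
# Minimal mean-field ADVI sketch: maximise a Monte Carlo ELBO estimate by
# stochastic gradient ascent on (μ, log σ), using the reparameterisation trick.
# `logjoint(z)` is a placeholder for log p(x, z) on the unconstrained space.
using ForwardDiff

function advi(logjoint, dim; iters=2_000, nsamples=10, η=0.01)
    μ, logσ = zeros(dim), zeros(dim)
    for _ in 1:iters
        ε = randn(dim, nsamples)
        function neg_elbo(θ)
            m, ls = θ[1:dim], θ[dim+1:end]
            σ = exp.(ls)
            # ELBO ≈ mean_k[ log p(x, m + σ .* ε_k) ] + entropy of q
            elbo = sum(logjoint(m .+ σ .* ε[:, k]) for k in 1:nsamples) / nsamples
            elbo += sum(ls)              # Gaussian entropy, up to an additive constant
            return -elbo
        end
        g = ForwardDiff.gradient(neg_elbo, vcat(μ, logσ))
        μ    .-= η .* g[1:dim]
        logσ .-= η .* g[dim+1:end]
    end
    return μ, exp.(logσ)
end

# Example: a 2-D standard normal target, so q should recover μ ≈ 0, σ ≈ 1.
μ̂, σ̂ = advi(z -> -0.5 * sum(abs2, z), 2)
```

In a real implementation the constrained-to-unconstrained transform (and its log-Jacobian) would be handled automatically, which is exactly where the dependency noted above comes in.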