Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DynamicHMC posteriors fail tests #63

Closed
ChrisRackauckas opened this issue Dec 1, 2018 · 14 comments
Closed

DynamicHMC posteriors fail tests #63

ChrisRackauckas opened this issue Dec 1, 2018 · 14 comments

Comments

@ChrisRackauckas
Copy link
Member

As discussed extensively in #60 with @tpapp , DynamicHMC's sampler seems to be getting stuck in odd areas of parameter space and giving incorrect posteriors. For now the tests are set as @test_broken for this.

@tpapp
Copy link

tpapp commented Dec 1, 2018

I have not forgotten about this, but understanding why the sampler fails will require a major rewrite of DynamicHMC's core (to obtain more information) and currently I have no time for this.

@ChrisRackauckas
Copy link
Member Author

No worries. I just wanted to make sure it was kept documented so that users who run into it can find the relevant information.

@simonbyrne
Copy link

I just tried running the test cases, but it is way too slow (~9.3 seconds per step) to even figure out what is going on.

@tpapp
Copy link

tpapp commented Feb 10, 2019

@simonbyrne: I suspect that either

  1. the adaptation (for the stepsize ϵ),
  2. the rejection of divergent paths

is not robust enough, or possibly has a bug. Comparing

  1. adapted stepsize,
  2. usual tree depth & number of leapfrog steps
  3. average acceptance ratio

to output from Stan would be helpful.

The upcoming rewrite (tpapp/DynamicHMC.jl#30) of DynamicHMC will make things more modular, have a lot more validation and tests (incl comparisons with Stan), and should address this issue. Unfortunately, it is a major undertaking (I estimate at least 2 weeks) and I am not sure when I will have time for it.

@tpapp
Copy link

tpapp commented Aug 13, 2019

Thanks for your patience. The new (WIP) API of DynamicHMC contains much better diagnostic tools and I have been investigating this.

First, I created an MWE with the updated syntax and my current preferred coding style (using NamedTuples etc) for DynamicHMC problems:

https://gist.github.com/tpapp/569bdef475c0a44c46d5c9e6d5e9e493

IMPORTANT you need DynamicHMC#master and LogDensityProblems 0.9 for this to work.

Eventually, once this is issue is resolved, I will make a PR to this repo updating all the examples.

Back to the problem: the leapfrog integrator is encountering divergence, so it is adapting a tiny, tiny stepsize. This is while it stays in one place for most of the time. This is usually caused by incorrect gradients (or high curvature), so I investigated this and found that compared to numerical finite differences, the AD result is incorrect. Plots below, code can generate these.

ad_vs_numerical_rel
ad_vs_numerical

In the previous discussion of #60, I remember a comparison of the derivatives, and then they were not problematic. So I am not sure if this is coming from an incorrect application of the DiffEq tools.

If someone familiar with DiffEq could investigate just the correctness of AD derivatives, it would be appreciated.

@ChrisRackauckas
Copy link
Member Author

Are you sure it's a correctness issue with AD? It looks like it could be a correctness issue with finite differences, which would be unstable near the bifurcation. Is there an easy way to have it kick out what parameters are used at the trouble point?

@tpapp
Copy link

tpapp commented Aug 13, 2019

I will make a repository with the data and the observables saved into some externalized format.

@tpapp
Copy link

tpapp commented Aug 14, 2019

Update: the derivatives were a red herring, they are fine. The problem was that NUTS is not really good at coping with high curvature and low dimensions (specifically, 1), and while all the other backends (Stan, Turing) examples were also estimating the sigmas at the same time, the DynamicHMC test wasn't, just a.

This is now fixed and an MWE is available at https://github.com/tpapp/DiffEqBayesDHMC

I will make a PR against this repository once DynamicHMC 2.0 is released, fixing all the tests and adapting to the new interface.

@ChrisRackauckas
Copy link
Member Author

The problem was that NUTS is not really good at coping with high curvature and low dimensions (specifically, 1), and while all the other backends (Stan, Turing) examples were also estimating the sigmas at the same time, the DynamicHMC test wasn't, just a.

🤦‍♂

I will make a PR against this repository once DynamicHMC 2.0 is released, fixing all the tests and adapting to the new interface.

Thanks in advance!

@Vaibhavdixit02
Copy link
Member

That sounds great. Thanks @tpapp
I am excited to see the performance gains that DynamicHMC would give over Turing :)

@tpapp
Copy link

tpapp commented Sep 6, 2019

I ended up not making a PR since the Turing code in DiffEqBayes uses the old DynamicHMC interface, and got some precompilation problems with the dependency chain of DiffEqBayes that I have no time to investigate at the moment. Instead, I made a self-contained demo package (code and tests)

https://github.com/tpapp/DiffEqDynamicHMC.jl

and documented everything in detail. The code is written in a style I consider idiomatic for using DynamicHMC. I think it is a good starting point for a PR, ie essentially one would just need to copy-paste it as is. Please feel free to use it in any way you prefer.

I am using ForwardDiff for this since that is robust on CI, but plugging in Flux and other AD packages is of course allowed via an optional argument.

The code is in fact embarrassingly simple --- defining a callable is useful, but the dynamichmc_inference wrapper is not much simpler than just using the building blocks directly.

This makes me wonder whether for someone using DynamicHMC, a worked example of how to estimate a differential equation using the DiffEq ecosystem would be more useful than a packaged version: the wrapper is so thin that it may not be worth it, and it would allow more intuitive tweaking which I am sure is necessary for complicated models. Personally, that is what I would prefer, but YMMV.

@Vaibhavdixit02
Copy link
Member

Thanks @tpapp

@ChrisRackauckas
Copy link
Member Author

This makes me wonder whether for someone using DynamicHMC, a worked example of how to estimate a differential equation using the DiffEq ecosystem would be more useful than a packaged version: the wrapper is so thin that it may not be worth it, and it would allow more intuitive tweaking which I am sure is necessary for complicated models. Personally, that is what I would prefer, but YMMV.

This repo kind of is a bunch of worked examples with CI. We do use this for documenting how to interface with the packages (pointing people to the code if they want to do more) and as an easy way to benchmark the different inference methods. So I think it would be good to just replace our current implementation with this whenever that issue with Turing is fixed.

@ChrisRackauckas
Copy link
Member Author

Thanks! The packaged version was updated to this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants