`MCMCSamples` methods for burn-in removal and Gelman--Rubin statistic #258
For the upcoming addition of a `remove_burn_in` and a `Gelman_Rubin` method to the `MCMCSamples` class, we need good example data that visibly require burn-in removal. The new example data were generated with an artificially low proposal scale, which ensures a slow and hence visible burn-in process.
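To illustrate why a low proposal scale produces a visible burn-in, here is a minimal toy sketch. It is not the script used to generate the example data in this PR: the function name `metropolis_chain`, the standard-normal target, the starting point and all settings are assumptions chosen purely for illustration.

```python
import math
import random


def metropolis_chain(n_steps, start, proposal_scale, seed):
    """Random-walk Metropolis sampler targeting a standard normal (toy example)."""
    rng = random.Random(seed)
    x = start
    log_p = -0.5 * x * x  # log-density of N(0, 1) up to a constant
    chain = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, proposal_scale)
        log_p_new = -0.5 * proposal * proposal
        # Accept with probability min(1, p_new / p_old).
        # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        chain.append(x)
    return chain


# Hypothetical settings: start far from the mode and take very small steps,
# so the chain needs many iterations to reach the bulk of the target.
chain = metropolis_chain(n_steps=5000, start=10.0, proposal_scale=0.1, seed=0)
early_mean = sum(chain[:100]) / 100
late_mean = sum(chain[-1000:]) / 1000
print(f"mean of first 100 samples: {early_mean:.2f}")
print(f"mean of last 1000 samples: {late_mean:.2f}")
```

The early samples still sit near the starting point while the late samples scatter around the mode, which is the kind of trace that makes burn-in removal visible in a plot.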
Codecov Report
@@ Coverage Diff @@
## master #258 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 29 28 -1
Lines 2422 2442 +20
=========================================
+ Hits 2422 2442 +20
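With the new `remove_burn_in` method, we could now ask ourselves whether we would like to make this the only way of getting rid of burn-in samples, i.e. remove the `burn_in` kwargs in the readers and remove the `read/utils.py` file. This would be a major change, so it should be part of the 2.0.0 milestone. I'd be happy to handle it that way. That makes it quite clear that the full data is always read in, and that getting rid of burn-in samples should be an active part of post-processing. It can even be done on the same line as the chain reading: `mcmc = read_chain("root").remove_burn_in(100)`. Thoughts, @williamjameshandley?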
I'm running into new CI issues (unrelated to the changes in this PR), which makes me think that it might be better to turn off parallel tests again.
In the interests of sanity I have turned this off in e9b0749.
Excellent work! I presume this is functionality you've tested in the wild a little?
With the new `remove_burn_in` method, we could now ask ourselves whether we would like to make this the only way of getting rid of burn-in samples, i.e. remove the `burn_in` kwargs in the readers and remove the `read/utils.py` file. This would be a major change, so it should be part of the 2.0.0 milestone.
I'd be happy to handle it that way. That makes it quite clear that the full data is always read in, and that getting rid of burn-in samples should be an active part of post-processing. It can even be done on the same line as the chain reading: `mcmc = read_chain("root").remove_burn_in(100)`.
Provided this is clear in the examples, this is indeed conceptually neater. We would need to put a temporary error into the initial release that tells users how to update their code.
You should go ahead and remove the corresponding code for burn-in at read time.
* Allow negative `burn_in` inputs, which specify the last samples to keep, as opposed to positive `burn_in` inputs, which specify the first samples to remove.
* Fix list/array input to `burn_in`, to work for both `0<abs(burn_in)<1` and `1<abs(burn_in)`, matching the behaviour with scalar input.
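The `burn_in` conventions listed above can be sketched in plain Python. This is only an illustration of the semantics as described in the commit message, assuming that a value with `0<abs(burn_in)<1` is read as a fraction of the chain length. The helper names `trim_chain` and `trim_chains` are hypothetical and are not part of anesthetic.

```python
def trim_chain(chain, burn_in):
    """Return the part of `chain` that survives burn-in removal."""
    if 0 < abs(burn_in) < 1:
        # Assumed convention: a fraction of the chain length.
        burn_in = burn_in * len(chain)
    # Positive: drop the first `burn_in` samples.
    # Negative: keep the last `abs(burn_in)` samples.
    # A single slice covers both cases; int() truncates toward zero.
    return chain[int(burn_in):]


def trim_chains(chains, burn_in):
    """Apply one scalar, or one value per chain, to a list of chains."""
    if isinstance(burn_in, (int, float)):
        burn_in = [burn_in] * len(chains)
    if len(burn_in) != len(chains):
        raise ValueError("need one burn_in value per chain")
    return [trim_chain(c, b) for c, b in zip(chains, burn_in)]


samples = list(range(10))
print(trim_chain(samples, 3))     # drops the first 3 samples
print(trim_chain(samples, -3))    # keeps the last 3 samples
print(trim_chain(samples, 0.2))   # drops the first 20 percent
print(trim_chain(samples, -0.2))  # keeps the last 20 percent
```

Using the same slice for positive and negative values keeps scalar and per-chain input on one code path, which is one way to get the "matching the behaviour with scalar input" property mentioned above.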
Yes, I played a bit with this for the BeyondPlanck stuff.
Done in 0c65b44.
Looks good -- please squash and merge!
This PR introduces new `MCMCSamples` methods that are particular to MCMC chains:
* `remove_burn_in`: discards the first few samples
* `Gelman_Rubin`: computes the Gelman--Rubin `Rminus1` convergence statistic

Fixes #219
Checklist:
* `flake8 anesthetic tests`
* `pydocstyle --convention=numpy anesthetic`
* `python -m pytest`
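For readers unfamiliar with the `Rminus1` statistic named in the description, the following is a minimal sketch of the textbook univariate Gelman--Rubin diagnostic, reported as R - 1. It is not taken from this PR and may differ from what `Gelman_Rubin` actually computes (for example in how several parameters are combined). The function name `gelman_rubin_minus_one` and the synthetic chains are assumptions for illustration.

```python
import math
import random
import statistics


def gelman_rubin_minus_one(chains):
    """Textbook univariate Gelman--Rubin R - 1 for equal-length chains."""
    n = len(chains[0])
    if len(chains) < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length")
    means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain sample variances.
    w = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: sample variance of the chain means.
    b_over_n = statistics.variance(means)
    # Pooled estimate of the target variance.
    var_hat = (n - 1) / n * w + b_over_n
    return math.sqrt(var_hat / w) - 1.0


rng = random.Random(1)
# Four synthetic chains drawn from the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
# Four synthetic chains stuck around different means (not converged).
stuck = [[rng.gauss(3.0 * k, 1.0) for _ in range(2000)] for k in range(4)]

r_mixed = gelman_rubin_minus_one(mixed)
r_stuck = gelman_rubin_minus_one(stuck)
print(f"R - 1, well-mixed chains: {r_mixed:.4f}")
print(f"R - 1, separated chains:  {r_stuck:.4f}")
```

Values close to zero indicate that the between-chain scatter is small compared with the within-chain variance, while chains that have not yet left their starting regions give a clearly larger value.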