`MCMCSamples` methods for burn-in removal and Gelman--Rubin statistic #258

lukashergt · 2023-02-08T01:42:28Z

This PR introduces new `MCMCSamples` methods that are particular to MCMC chains:

* `remove_burn_in`: discards the first few samples
* `Gelman_Rubin`: computes the Gelman--Rubin `Rminus1` convergence statistic
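
For readers unfamiliar with the statistic, here is a minimal sketch of the textbook single-parameter Gelman--Rubin diagnostic in plain NumPy. It is not the code added in this PR: the function name `gelman_rubin_rminus1` and the `(m, n)` array layout are assumptions made for illustration, and the method in this PR may use a different (e.g. multivariate) definition.

```python
import numpy as np


def gelman_rubin_rminus1(chains):
    """Textbook R-1 for one parameter (hypothetical helper, not the PR's code).

    chains: array of shape (m, n) holding m chains of n samples each.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    # W: mean of the within-chain variances
    within = chains.var(axis=1, ddof=1).mean()
    # B/n: variance of the chain means
    between_over_n = chains.mean(axis=1).var(ddof=1)
    # Pooled estimate of the posterior variance
    var_hat = (n - 1) / n * within + between_over_n
    # R = sqrt(var_hat / W); values of R-1 close to 0 suggest convergence
    return np.sqrt(var_hat / within) - 1


rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 2000))             # four chains, same target
shifted = mixed + 5 * np.arange(4)[:, None]    # chains stuck in different places
print(gelman_rubin_rminus1(mixed), gelman_rubin_rminus1(shifted))
```

Chains that sample the same distribution give a value near zero, whereas chains that have not yet found each other give a large value.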

Fixes #219

Checklist:

I have performed a self-review of my own code
My code is PEP8 compliant (flake8 anesthetic tests)
My code contains compliant docstrings (pydocstyle --convention=numpy anesthetic)
New and existing unit tests pass locally with my changes (python -m pytest)
I have added tests that prove my fix is effective or that my feature works

For the upcoming additions to the `MCMCSamples` class of a `remove_burn_in` and a `Gelman_Rubin` method, we need good example data that visibly require the removal of burn-in. The new example data were generated with an artificially low proposal scale, which ensures a slow and hence visible burn-in process.
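
To illustrate why a low proposal scale makes burn-in visible, here is a toy random-walk Metropolis sampler. This is only a sketch: the actual example data come from the cobaya and getdist chain files touched by this PR, and the target, starting point and proposal scale below are made up for illustration.

```python
import numpy as np


def metropolis(log_prob, start, proposal_scale, n_steps, seed=0):
    """Toy 1D random-walk Metropolis sampler (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = float(start)
    logp = log_prob(x)
    chain = np.empty(n_steps)
    for i in range(n_steps):
        proposal = x + proposal_scale * rng.normal()
        logp_new = log_prob(proposal)
        # Accept with probability min(1, p_new / p_old)
        if np.log(rng.uniform()) < logp_new - logp:
            x, logp = proposal, logp_new
        chain[i] = x
    return chain


def log_gaussian(x):
    """Unnormalised log-density of a standard normal target."""
    return -0.5 * x**2


# Start far from the mode with a small step size: the chain creeps towards
# the bulk of the distribution, so the early samples are clearly burn-in.
chain = metropolis(log_gaussian, start=10.0, proposal_scale=0.1, n_steps=5000)
print(chain[:100].mean(), chain[-1000:].mean())
```

The early samples sit near the starting point, while the late samples scatter around the mode at zero.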

codecov · 2023-02-08T02:06:50Z

Codecov Report

Merging #258 (7ef7fba) into master (0aeaded) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master      #258   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           29        28    -1     
  Lines         2422      2442   +20     
=========================================
+ Hits          2422      2442   +20

Impacted Files	Coverage Δ
anesthetic/read/chain.py	`100.00% <100.00%> (ø)`
anesthetic/read/cobaya.py	`100.00% <100.00%> (ø)`
anesthetic/read/getdist.py	`100.00% <100.00%> (ø)`
anesthetic/samples.py	`100.00% <100.00%> (ø)`
anesthetic/weighted_pandas.py	`100.00% <100.00%> (ø)`

lukashergt · 2023-02-08T03:18:39Z

With the new `remove_burn_in` method, we could now ask ourselves whether we would like to make this the only way of getting rid of burn-in samples, i.e. remove the `burn_in` kwargs in the readers and remove the `read/utils.py` file. This would be a major change, so it should be part of the 2.0.0 milestone.

I'd be happy to handle it that way. That makes it quite clear that the full data is always read in. If you would like to get rid of burn-in samples, then that should be an active part of post-processing. It can even be done on the same line as the chain reading: `mcmc = read_chains("root").remove_burn_in(100)`.

Thoughts, @williamjameshandley?

lukashergt · 2023-02-08T03:20:44Z

I'm running into new CI issues (unrelated to the changes of this PR...), making me think that it might be better to turn off parallel tests again...

williamjameshandley · 2023-02-08T09:48:31Z

> I'm running into new CI issues (unrelated to the changes of this PR...), making me think that it might be better to turn off parallel tests again...

In the interests of sanity, I have turned this off in e9b0749.

williamjameshandley

Excellent work! I presume this is functionality you've tested in the wild a little?

> With the new `remove_burn_in` method, we could now ask ourselves whether we would like to make this the only way of getting rid of burn-in samples, i.e. remove the `burn_in` kwargs in the readers and remove the `read/utils.py` file. This would be a major change, so it should be part of the 2.0.0 milestone.
>
> I'd be happy to handle it that way. That makes it quite clear that the full data is always read in. If you would like to get rid of burn-in samples, then that should be an active part of post-processing. It can even be done on the same line as the chain reading: `mcmc = read_chains("root").remove_burn_in(100)`.

Provided this is made clear in the examples, this is indeed conceptually neater. In the initial release we would need a temporary error that tells users how to update their code.

You should go ahead and remove the corresponding code for burn-in at read time.
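
As a rough sketch of what such a temporary error could look like: the function below is a hypothetical stand-in for the reader, not the code merged in this PR. The name `read_chains_sketch`, the exception type and the message wording are all assumptions.

```python
def read_chains_sketch(root, **kwargs):
    """Hypothetical stand-in for a chain reader that no longer accepts burn_in."""
    if "burn_in" in kwargs:
        # Fail loudly and show the replacement call, rather than silently
        # ignoring a keyword that used to change the result.
        raise TypeError(
            "The `burn_in` keyword is no longer supported at read time. "
            "Read the full chain and call `remove_burn_in` afterwards, e.g. "
            f"`read_chains({root!r}).remove_burn_in({kwargs['burn_in']})`."
        )
    # Placeholder return value; a real reader would load the files at `root`.
    return {"root": root, "options": kwargs}


try:
    read_chains_sketch("root", burn_in=100)
except TypeError as err:
    print(err)
```

Raising rather than warning means that old scripts stop immediately with instructions, instead of running on with burn-in samples unexpectedly included.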

* Allow negative `burn_in` inputs, which specify the last samples to keep, as opposed to positive `burn_in` inputs, which specify the first samples to remove.
* Fix list/array input to `burn_in`, to work for both `0<abs(burn_in)<1` and `1<abs(burn_in)`, matching the behaviour with scalar input.
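
The `burn_in` conventions described in this commit message can be sketched as follows. This is an illustration under assumptions, not the implementation in `anesthetic/samples.py`: the helper names are hypothetical, chains are plain NumPy arrays, and rounding fractional inputs by truncation is a choice made here.

```python
import numpy as np


def burn_in_start(burn_in, n_samples):
    """Index of the first sample to keep (hypothetical helper)."""
    if 0 < abs(burn_in) < 1:
        # Fractional input: interpret as a fraction of the chain length
        burn_in = int(burn_in * n_samples)
    else:
        burn_in = int(burn_in)
    if burn_in < 0:
        # Negative input: keep only the last |burn_in| samples
        return max(n_samples + burn_in, 0)
    # Positive input: remove the first burn_in samples
    return min(burn_in, n_samples)


def remove_burn_in_sketch(chains, burn_in):
    """Apply a scalar burn_in, or one burn_in per chain, to a list of chains."""
    burn_ins = np.broadcast_to(burn_in, (len(chains),))
    return [np.asarray(c)[burn_in_start(b, len(c)):]
            for c, b in zip(chains, burn_ins)]


chain = np.arange(8)
print(remove_burn_in_sketch([chain], 3)[0])       # first 3 samples removed
print(remove_burn_in_sketch([chain], -3)[0])      # last 3 samples kept
print(remove_burn_in_sketch([chain], 0.25)[0])    # first quarter removed
print(remove_burn_in_sketch([chain, chain], [2, -2]))
```

Broadcasting the input means that scalar and list/array inputs go through the same per-chain logic, which is one way to keep their behaviour consistent.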

lukashergt · 2023-02-09T07:32:31Z

> Excellent work! I presume this is functionality you've tested in the wild a little?

Yes, I played a bit with this for the BeyondPlanck stuff.

> Provided this is made clear in the examples, this is indeed conceptually neater. In the initial release we would need a temporary error that tells users how to update their code.
>
> You should go ahead and remove the corresponding code for burn-in at read time.

Done in 0c65b44.

williamjameshandley

Looks good -- please squash and merge!

lukashergt added 7 commits February 7, 2023 17:01

raise error if unused kwargs are being passed to cov method

1925f3c

add test to cover not passed kwargs to cov when weighted

81e0c23

add remove_burn_in and Gelman_Rubin methods for MCMCSamples class

d158266

fix flake8 white space issues

0a54d15

remove debugging print statement

84eec60

add tests for burn-in removal and Gelman--Rubin statistic

fb61f42

lukashergt added the enhancement New feature or request label Feb 8, 2023

lukashergt self-assigned this Feb 8, 2023

lukashergt added 3 commits February 7, 2023 17:45

fix flake8 issue

7a0a5d0

change test for cobaya data because new chains are a little shorter

36ef9ca

add latex labels to cb_single_chain.*.yaml files

3b1f552

lukashergt added 6 commits February 7, 2023 18:22

add option to reset the index to remove_burn_in method

a40bdf3

add tests for reset_index=True and for inplace=True in `remove_burn_in`

563098e

relax burn_in checks to be more inclusive

03d0cfd

turn off parallel testing for now

e792868

turn on parallel testing again

d840ff9

fix tests for coverage

36cbc95

lukashergt requested a review from williamjameshandley February 8, 2023 03:20

lukashergt commented Feb 8, 2023

Turned of parallel testing

e9b0749

williamjameshandley requested changes Feb 8, 2023

williamjameshandley and others added 4 commits February 8, 2023 13:25

Removed all parallel testing

6f8222e

simplify syntax

7b7af3d

extend tests to cover all cases of possible burn_in inputs

95b9026

lukashergt added 2 commits February 8, 2023 23:19

remove burn_in from read_chains, replace with Exception pointing to new usage

0c65b44

adapt tests to removed burn_in keyword in read_chains, test for Exception instead

7ef7fba

lukashergt requested a review from williamjameshandley February 9, 2023 07:38

williamjameshandley approved these changes Feb 9, 2023

lukashergt merged commit 56cc1c4 into handley-lab:master Feb 9, 2023

lukashergt mentioned this pull request Feb 9, 2023

WeightedDataFrame.groupby.mean does not produce weighted means #260

Closed

williamjameshandley mentioned this pull request Apr 5, 2023

MCMCSamples methods #277

Open

lukashergt deleted the mcmc_stats branch April 14, 2023 21:33

lukashergt restored the mcmc_stats branch April 14, 2023 21:33

lukashergt deleted the mcmc_stats branch April 14, 2023 21:33

williamjameshandley mentioned this pull request Jan 10, 2024

tracking convergence of the chain PolyChord/PolyChordLite#91

Open
