You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Have noticed recently that my sampling efficiency values obtained from my SampleContainer instances (ones without bug #441) using samples.sampling_efficiency()
is quite small. The sampling_efficiency (in smol.moca.sampler.container.SampleContainer) is defined as "total_accepted" divided by ("total MC steps" - "discarded samples"). Would it be more accurate to define sampling efficiency as "total_accepted" divided by ("number of samples" - "discarded samples") instead? The difference is that "total MC steps" includes the samples that were not saved from sample thinning during the MC run, so we do not know the acceptance rate of those discarded samples. What we only know for certain is the number of acceptances from the number of saved samples.
Code snippet
samples.sampling_efficiency()
Log output
No response
Code of Conduct
I agree to follow this project's Code of Conduct
The text was updated successfully, but these errors were encountered:
Yes, you are right! The current implementation is not correct when you thin the sampling by > 1. What you are proposing makes more sense, however, it is an "estimated" sampling efficiency for thin > 1.
If you get a chance can you please update this in a PR? and perhaps make a note of the above comment on estimation in the docstring.
Email (Optional)
kamronald@berkeley.edu
Version
v0.5.3
Which OS(es) are you using?
What happened?
Have noticed recently that my sampling efficiency values obtained from my SampleContainer instances (ones without bug #441) using
samples.sampling_efficiency()
is quite small. The sampling_efficiency (in smol.moca.sampler.container.SampleContainer) is defined as "total_accepted" divided by ("total MC steps" - "discarded samples"). Would it be more accurate to define sampling efficiency as "total_accepted" divided by ("number of samples" - "discarded samples") instead? The difference is that "total MC steps" includes the samples that were not saved from sample thinning during the MC run, so we do not know the acceptance rate of those discarded samples. What we only know for certain is the number of acceptances from the number of saved samples.
Code snippet
Log output
No response
Code of Conduct
The text was updated successfully, but these errors were encountered: