Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Defining sampling efficiency: normalized by total steps or number of saved samples? #442

Closed
3 of 4 tasks
kamronald opened this issue Nov 27, 2023 · 2 comments
Closed
3 of 4 tasks
Assignees
Labels
bug Something isn't working triage

Comments

@kamronald
Copy link
Collaborator

Email (Optional)

kamronald@berkeley.edu

Version

v0.5.3

Which OS(es) are you using?

  • MacOS
  • Windows
  • Linux

What happened?

Have noticed recently that my sampling efficiency values obtained from my SampleContainer instances (ones without bug #441) using
samples.sampling_efficiency()
is quite small. The sampling_efficiency (in smol.moca.sampler.container.SampleContainer) is defined as "total_accepted" divided by ("total MC steps" - "discarded samples"). Would it be more accurate to define sampling efficiency as "total_accepted" divided by ("number of samples" - "discarded samples") instead? The difference is that "total MC steps" includes the samples that were not saved from sample thinning during the MC run, so we do not know the acceptance rate of those discarded samples. What we only know for certain is the number of acceptances from the number of saved samples.

Code snippet

samples.sampling_efficiency()

Log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@kamronald kamronald added bug Something isn't working triage labels Nov 27, 2023
@lbluque
Copy link
Collaborator

lbluque commented Dec 1, 2023

Yes, you are right! The current implementation is not correct when you thin the sampling by > 1. What you are proposing makes more sense, however, it is an "estimated" sampling efficiency for thin > 1.

If you get a chance can you please update this in a PR? and perhaps make a note of the above comment on estimation in the docstring.

Thanks a lot @kamronald !

@lbluque
Copy link
Collaborator

lbluque commented Dec 18, 2023

addressed in #443

@lbluque lbluque closed this as completed Dec 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage
Projects
None yet
Development

No branches or pull requests

2 participants