Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Returning velocities in sampler state when propagating replicas #602

Merged
merged 6 commits into from Jul 14, 2022

Conversation

ijpulidos
Copy link
Contributor

@ijpulidos ijpulidos commented Jul 13, 2022

Description

Velocities were not being passed when returning the sampler state when propagating replicas.

Resolves #531

Note that the bug is only present when MPI (multiple GPUs) are being used. We need to test our software using MPI as well (issue #603 )

Todos

  • Implement feature / fix bug
  • Add tests
  • Update documentation as needed
  • Update changelogNotable points that this PR has either accomplished or will accomplish.

Status

  • Ready to go

@ijpulidos ijpulidos added this to the 0.21.5 milestone Jul 13, 2022
@ijpulidos
Copy link
Contributor Author

I was also missing to change the same option when the replicas get updated inside the same loop. Should be ready to go now.

@zhang-ivy I think it makes sense if you can test this with your system before we merge this. Can you please do that? Thanks!

@zhang-ivy
Copy link
Contributor

@ijpulidos : Just reviewed, looks good! I will test the branch out now.

@zhang-ivy
Copy link
Contributor

zhang-ivy commented Jul 13, 2022

I do not see the issue anymore! :) Resumed after 200 iterations.

I also checked the checkpoint velocities directly, and I'm no longer seeing that they are zero for every other replica.

image

Copy link
Contributor

@mikemhenry mikemhenry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mikemhenry mikemhenry merged commit 478cc7b into main Jul 14, 2022
@mikemhenry mikemhenry deleted the fix-broadcast-velocities branch July 14, 2022 13:02
@jfennick
Copy link
Contributor

Oops, sorry about this. Looks like ignore_velocities=False slipped past me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Augment checkpoint storage to save velocities
4 participants