Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring of MC moves #21

Merged
merged 47 commits into from
Feb 27, 2024
Merged

Refactoring of MC moves #21

merged 47 commits into from
Feb 27, 2024

Conversation

wiederm
Copy link
Member

@wiederm wiederm commented Jan 18, 2024

Description

Currently, we have MC moves designed for displacement proposals, implementing the proposal in the MetropolisedDisplacementMove class that inherits from MetropolizedMove, which implements the computation for the acceptance criteria (proposal probability).
This lacks flexibility for other types of MC moves with different calculations for the acceptance criteria (like a MC Barostate).

Here, we want to consolidate the different approaches

Todos

  • [ ]

Status

  • Ready to go

@wiederm wiederm changed the base branch from main to multistage January 18, 2024 15:59
@wiederm wiederm linked an issue Jan 18, 2024 that may be closed by this pull request
wiederm and others added 9 commits January 18, 2024 22:36
# Conflicts:
#	chiron/mcmc.py
#	chiron/tests/test_multistate.py
… reporters and random number scheme. This primarily sketches out the new MCMove class (updates to the actual moves is forthcoming).
# Conflicts:
#	chiron/mcmc.py
#	chiron/tests/test_multistate.py
… takes variable to allow for velocity initialization. The Langevin ntegrator now returns a sampler state rather tha updating. States now has a velocity setter.
… uses this if users request velocities to be regenerated each time. The reaosn to split was to make it easy to define velocities outside of the langevin integrator. the option flag to generate each time we call the integrator because in a hybrid workflow, there would effectively be no connection between subsequent langevin moves separated by many MC transformations. furthermore, MC moves that changes thermodynamic states would require regeneration.
@chrisiacovella chrisiacovella self-assigned this Jan 25, 2024
…ach move to have custom/appropriate logging for what is being changed).
…e a subset of atoms: tests need to be updated/implemented still.
…led. This is different than reinitialize which will call reinit every time run is called. The velocity init function was spun off into the utils file. velocities can be set in the sampler state; code will throw and error if initialize_velocities or reinitialize_velocities are false, and velocities are not set in the sampler state.
…ne, since this PR was focused on revamping the MC moves. That can be tackled separately in the multistate PR.
…e with the functional programming model (passing and return a neighborlist).
…in to take into account the iteration not just the current step; reporters now also log the elapsed_step.
@chrisiacovella
Copy link
Member

chrisiacovella commented Jan 31, 2024

I think this is ready to go. The meat of this has been discussed in issue #20. I'll note that following #23 we no longer modify the sampler_state, thermodynamic_state and neighbor_list in place. These are now returned by functions.

This should make it much easier to keep track of current/proposed states, rather than trying reset/rollback changes made to a sampler_state. The one caveat is having to make sure we update the current_PRNG_key if we reject a state so we don't end up just proposing the same move over and over again. This is done in code that is in the base class, so users extending won't have to think about that.

For book keeping, note that this incorporates the MC barostat that was implemented in #14 , but uses the modified API.

@chrisiacovella
Copy link
Member

@wiederm Since you opened this, I can't suggest you as a reviewer, but could you review it?

…in to take into account the iteration not just the current step; reporters now also log the elapsed_step. Updated the examples
…hey were marked to be skipped because they take too long.
Copy link
Member Author

@wiederm wiederm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very happy with these changes! I left a view comments, but I think we can merge this!

.github/workflows/CI.yaml Show resolved Hide resolved
Examples/Idealgas.py Show resolved Hide resolved
n_particles = 216
temperature = 298 * unit.kelvin
pressure = 1 * unit.atmosphere
mass = unit.Quantity(39.9, unit.gram / unit.mole)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to obtain the mass from the IdealGas object to minimize our responsibility for ensuring that the mass is set to the same value?

It seems the topology instance of the ideal_gas instance should have this information.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Good catch.

I'm wondering if it would be good to have examples be self contained (i.e., not require openmmtools). I think the openmmtools dependence is definitely good for tests/unit tests. It would definitely not be hard to have some simple routines for initializing particles. I don't think we need to do that at this time.

Examples/Idealgas.py Show resolved Hide resolved
Examples/Idealgas.py Outdated Show resolved Hide resolved
chiron/mcmc.py Outdated Show resolved Hide resolved
chiron/mcmc.py Outdated
proposed_sampler_state, proposed_nbr_list
)

# ⎡−β (ΔU + PΔV ) + N ln(V new /V old )⎤
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might want to give a reference for this, e.g.
eq.33, http://dx.doi.org/10.1155/2015/183918

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we could go back to the original source: https://www.tandfonline.com/doi/abs/10.1080/00268977200100031

chiron/mcmc.py Outdated Show resolved Hide resolved
chiron/mcmc.py Outdated
Comment on lines 666 to 667
current_reduced_pot : float, required
Current reduced potential.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be easier if we are explicit here and add energy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my prior comment. I don't think we should add energy to the name.

chiron/mcmc.py Outdated
Comment on lines 849 to 859
def _update_stepsize(self):
"""
Update the volume_max_scale parameter to ensure our acceptance probability is within the range of 0.25 to 0.75.
The maximum volume_max_scale will be capped at 0.3.
"""
acceptance_ratio = self.n_accepted / self.n_proposed
if acceptance_ratio < 0.25:
self.volume_max_scale /= 1.1
elif acceptance_ratio > 0.75:
self.volume_max_scale = min(self.volume_max_scale * 1.1, 0.3)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to refactor this into one of the base classes? This seems to be super helpful for general MC moves. But, it might also be specific enough so that abstracting it becomes difficult, then we can just leave it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I defined it in the base class as an abstract method. I don’t think this will be general enough to put the algorithm in the base class (this is pretty specific to simple displacement and volume changes moves…would not apply to a insertion/deletion move). We could make this a class as I suggested in an issue #26 , that we pass to the init function to set up the protocol (and nothing is adjusted if not set).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #26, we discussed creating something like an Autotune object that handles auto-tuning behavior. That could provide an API for auto-tuning operations that adapt parameters like these, and could eventually implement asymptotically consistent approaches to make sure that we don't cause issues with sampling.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think creating an Autotune class will be a good idea, especially for anything more advanced. I think that will be easy to implement in a separate PR so we can better discuss the API we want.

@wiederm wiederm mentioned this pull request Feb 8, 2024
3 tasks
Copy link
Member

@chrisiacovella chrisiacovella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think addressed all of the comments. I think we are ready to merge after @wiederm you have another look through.

@chrisiacovella
Copy link
Member

The Multistate reporter is failing after syncing with the multistate branch. I'm ok with merging without fixing that here (and punting to the main multistate branch), but I'll try to look at it more closely tomorrow.

Copy link
Member

@jchodera jchodera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great progress, folks!

Here are some comments on API consistency that I identified in reading across the examples and implementations.

Examples/Idealgas.py Show resolved Hide resolved
# initialize the displacement move
mc_barostat_move = MonteCarloBarostatMove(
volume_max_scale=0.2,
nr_of_moves=10,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we standardize abbreviations or avoid them altogether, such as n_moves or number_of_moves? Or maybe we can even say n_attempts_per_step?

Copy link
Member

@chrisiacovella chrisiacovella Feb 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what if we run out of bytes with such long names?! I think that is a great point; n_attempts_per_step would probably be very clear.

Edit: Since this will also apply to the langevin integrator move, we might not want to use the word "attempts" in the variable. number_of_moves might be more generally applicable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Langevin move is a special case of un-Metropolized move, where every step is accepted.
These moves are also distinct in that the Langevin move has "number of steps to take per move application" and a Monte Carlo barostat has a "number of volume adjustments to attempt per move application". In either case, something like n_steps and n_attempts is probably more appropriate for those move types.

volume_max_scale=0.2,
nr_of_moves=10,
reporter=reporter,
update_stepsize=True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of update, we might want to refer to this as autotune.
frequency would refer to number of events per step, but interval or period refers to the number of steps between events.

Suggest:

  • update_stepsize=True -> autotune=True
  • update_stepsize_frequency=100 -> autotune_interval=100

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be much clearer. I was struggling with coming up with a clear/general name for this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also updated the name of the internal class function to be self._autotune

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to make self.autotune be public, since it is reasonable to allow a user to change the autotune behavior on the fly.

chiron/mcmc.py Show resolved Hide resolved
chiron/mcmc.py Outdated Show resolved Hide resolved
chiron/mcmc.py Outdated

def compute_acceptance_probability(
elapsed_step = i + self._move_iteration * self.nr_of_moves
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a really bad idea, since self.nr_of_moves can be changed on the fly.
If we need a step accumulator for statistics or auto-tuning, let's actually use a step accumulator like self._number_of_attempts_made and simply increment it with each move attempt.

chiron/mcmc.py Show resolved Hide resolved
chiron/mcmc.py Show resolved Hide resolved
chiron/mcmc.py Show resolved Hide resolved
chiron/mcmc.py Outdated
Comment on lines 849 to 859
def _update_stepsize(self):
"""
Update the volume_max_scale parameter to ensure our acceptance probability is within the range of 0.25 to 0.75.
The maximum volume_max_scale will be capped at 0.3.
"""
acceptance_ratio = self.n_accepted / self.n_proposed
if acceptance_ratio < 0.25:
self.volume_max_scale /= 1.1
elif acceptance_ratio > 0.75:
self.volume_max_scale = min(self.volume_max_scale * 1.1, 0.3)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #26, we discussed creating something like an Autotune object that handles auto-tuning behavior. That could provide an API for auto-tuning operations that adapt parameters like these, and could eventually implement asymptotically consistent approaches to make sure that we don't cause issues with sampling.

… be consistent with samplerstate variable name change.
…use no cutoff (i.e. all pairs interact). Revamped some internal tooling regarding how units are treated internally. Space was revampped such that it takes box_vectors as a argument rather than storing them internally (fewer copies of this floating around), and will help to ensure treating the class as static will not mean using the wrong box vectors accidentally.
Copy link
Member

@chrisiacovella chrisiacovella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I have addressed all of the comments or made issues capturing the comments/discussion for instances where I believe these would be better addressed in separate PRs, as this is already getting quite large. These issues I created are #31, #32, #33, #34, #35.

…. (I missed this comment in the PR comments)
Copy link
Member

@chrisiacovella chrisiacovella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is all ready to go; I had missed one the comments regarding the elapsed steps (got marked as outdated).

@wiederm wiederm merged commit 124e519 into multistage Feb 27, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants