-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Splitting out SetupUnit from SimulationUnit in NonEquilibriumCyclingProtocol #3
Conversation
Pausing on this until #4 is in. |
This avoids: - serlializing/deserializing/remaking the HybridTopologyFactory in the SimulationUnit - makes it easier to make a Folding@Home-compatible version, if we want to hold on to trajectory data in a similar way
Codecov ReportAll modified lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3 +/- ##
=======================================
Coverage ? 92.12%
=======================================
Files ? 7
Lines ? 419
Branches ? 0
=======================================
Hits ? 386
Misses ? 33
Partials ? 0 ☔ View full report in Codecov by Sentry. |
def extract_positions(context, initial_atom_indices, final_atom_indices): | ||
""" | ||
Extract positions from initial and final systems based from the hybrid topology. | ||
|
||
Parameters | ||
---------- | ||
context: openmm.Context | ||
Current simulation context where from extract positions. | ||
hybrid_topology_factory: perses.annihilation.relative.HybridTopologyFactory | ||
Hybrid topology factory where to extract positions and mapping information | ||
|
||
Returns | ||
------- | ||
|
||
Notes | ||
----- | ||
It achieves this by taking the positions and indices from the initial and final states of | ||
the transformation, and computing the overlap of these with the indices of the complete | ||
hybrid topology, filtered by some mdtraj selection expression. | ||
|
||
1. Get positions from context | ||
2. Get topology from HTF (already mdtraj topology) | ||
3. Merge that information into mdtraj.Trajectory | ||
4. Filter positions for initial/final according to selection string | ||
""" | ||
import numpy as np | ||
|
||
# Get positions from current openmm context | ||
positions = context.getState(getPositions=True).getPositions(asNumpy=True) | ||
|
||
# Get indices for initial and final topologies in hybrid topology | ||
initial_indices = np.asarray(initial_atom_indices) | ||
final_indices = np.asarray(final_atom_indices) | ||
|
||
initial_positions = positions[initial_indices, :] | ||
final_positions = positions[final_indices, :] | ||
|
||
return initial_positions, final_positions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to maintain the capability of filtering the positions using an mdtraj
selection, as it originally was in https://github.com/choderalab/feflow/blob/81e1b80ea762e0ea78bbde6fc5e68d774ceadb2b/feflow/protocols/nonequilibrium_cycling.py#L100 . Maybe we want to change the default to all
? This does mean we would depend on mdtraj
(maybe we can use mdanalysis
for this if desired).
This should help reducing the storage used for trajectories when needed and might be useful when debugging/analyzing results, though I agree there are probably better ways to achieve this. One of them is creating the sampler (and reporter?) object in openmmtools
for NEq cycling (analogue to what we do with MultiStateSampler
for repex), and have that one handling all the trajectory/topologies serialization.
Are there any specific reasons other than not wanting to depend on mdtraj
/mdanalysis
that motivates this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the selection capabilities are not being tested in CI. This is something that we would like to be testing if we decide to include it back again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mainly I was motivated to not have to serialize the hybrid_topology_factory
in the SetupUnit
and pass it to (or re-create it in) the SimulationUnit
. So instead of passing that object around and into this method, we pass in just the positions. And since we don't have hybridt_topology_factory
, we don't have hybrid_topology
, so it's not so easy to do mdtraj
manipulations anymore. But they seemed largely unnecessary?
I don't believe we use this method anywhere for trajectory storage or filtering. Isn't this method just used here, in this protocol? My view is that if we need do something different later, then we make it do that, but not pre-emptively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just seemed like a feature that we weren't even using here, and its effect under default usage ("not water") was really nothing at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this make sense? It's possible I've missed some important detail, so please correct me if so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way this was being used is that when serializing trajectories we were extracting the positions and velocities of all things not water
. It was just a way to not having to store them for everything and make the storage requirements a bit lighter, but it was used when storing the trajectories.
That said, I do think that the advantage of not having to pass the HTF object around might be a better compromise in this case, since the stored trajectories are not things we were using anyway. I think it's okay as it is now. If we end up needing to store only selections of atoms in the future we can give this another try.
@ijpulidos any additional questions on this PR? I may want to start working on #2 soon, and we'd want to build on this one in building that out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. I think we can revisit the atom selection storage if needed in the future.
def extract_positions(context, initial_atom_indices, final_atom_indices): | ||
""" | ||
Extract positions from initial and final systems based from the hybrid topology. | ||
|
||
Parameters | ||
---------- | ||
context: openmm.Context | ||
Current simulation context where from extract positions. | ||
hybrid_topology_factory: perses.annihilation.relative.HybridTopologyFactory | ||
Hybrid topology factory where to extract positions and mapping information | ||
|
||
Returns | ||
------- | ||
|
||
Notes | ||
----- | ||
It achieves this by taking the positions and indices from the initial and final states of | ||
the transformation, and computing the overlap of these with the indices of the complete | ||
hybrid topology, filtered by some mdtraj selection expression. | ||
|
||
1. Get positions from context | ||
2. Get topology from HTF (already mdtraj topology) | ||
3. Merge that information into mdtraj.Trajectory | ||
4. Filter positions for initial/final according to selection string | ||
""" | ||
import numpy as np | ||
|
||
# Get positions from current openmm context | ||
positions = context.getState(getPositions=True).getPositions(asNumpy=True) | ||
|
||
# Get indices for initial and final topologies in hybrid topology | ||
initial_indices = np.asarray(initial_atom_indices) | ||
final_indices = np.asarray(final_atom_indices) | ||
|
||
initial_positions = positions[initial_indices, :] | ||
final_positions = positions[final_indices, :] | ||
|
||
return initial_positions, final_positions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way this was being used is that when serializing trajectories we were extracting the positions and velocities of all things not water
. It was just a way to not having to store them for everything and make the storage requirements a bit lighter, but it was used when storing the trajectories.
That said, I do think that the advantage of not having to pass the HTF object around might be a better compromise in this case, since the stored trajectories are not things we were using anyway. I think it's okay as it is now. If we end up needing to store only selections of atoms in the future we can give this another try.
I think we're ready to merge this! |
We want to separate out the creation of
system
,state
, andintegrator
from theSimulationUnit
in theNonEquilibriumCyclingProtocol
in order to enable the swapping in of alternateProtocolUnit
s for theSimulationUnit
, for example for use with Folding@Home viaalchemiscale-fah
.This PR aims to accomplish this.