Add initial implementations of Transformation, AlchemicalNetwork, Protocol; stubs for Executor components #13
Conversation
Hello @dotsdl! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2022-05-25 02:15:44 UTC
So, we have objects for nodes and edges, but now we need to decide how state is structured in
This is basically what we did for the OpenFE ligand network. The advantage is that we can force things to be immutable, and we return a frozen NetworkX graph. It also gives us access to custom serialization for our objects (EDIT: by serializing our nodes and edges to strings and making that into a NetworkX graph, which is then serialized using NetworkX.)
Actually, I guess we sort of do 3, but only because we cache the nx graph after first creation.
Working notes from today, with some questions I could use some help resolving:
On this one point (too late at night for a more complete response -- and this is before even reviewing the code in detail), I would say the following: GraphML is a good serialization format for graphs. Its benefits are wide support, with the ability to recognize the same, repeated node as the same object. That isn't guaranteed in naive JSON representations. One of the advantages of GraphML is that it allows arbitrary node/edge attributes. (Personally, not a fan of XML, but GraphML is well supported by other programs.) GraphML attributes can be stored as arbitrary strings. My recommendation would be for these strings to be JSON, which gives a good ability to serialize node/edge information. It becomes a kind of weird mix of XML and JSON, but there are good tools to handle each in most programming languages, so I think client code has the ability to work freely and serialize objects in this format.
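The GraphML-with-JSON-attribute-strings idea can be sketched directly with NetworkX; the node and edge payloads here are made up purely for illustration:

```python
# Sketch of the suggestion above: store each node/edge's data as a JSON
# string inside a GraphML attribute. Payloads are hypothetical examples.
import json

import networkx as nx

g = nx.DiGraph()
g.add_node("stateA", data=json.dumps({"components": ["ligand", "solvent"]}))
g.add_node("stateB", data=json.dumps({"components": ["ligand2", "solvent"]}))
g.add_edge("stateA", "stateB", data=json.dumps({"protocol": "relative"}))

# serialize the whole graph to a GraphML string...
graphml = "\n".join(nx.generate_graphml(g))

# ...and round-trip it: repeated node ids resolve to the same node, and the
# JSON payloads come back intact
g2 = nx.parse_graphml(graphml)
payload = json.loads(g2.nodes["stateA"]["data"])
```

The JSON-in-strings layer keeps the GraphML schema trivial while still letting each node/edge carry arbitrary structured data.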
Adopted your approach; agree it is quite good. 😁
gufe/protocols/protocol_settings.py (Outdated)

```python
from openff.units import unit

class SystemSettings(BaseModel):
```
It will be critical to sync up with the Open Force Field Initiative folks to come up with a self-consistent division between:
- Choices that impact the fully interacting potential energy function (which OpenFF is aiming to fully encode in the force field definition)
- Choices that impact the free energy difference computed for a transformation (atom mapping, alchemical modification functional form, thermodynamic state at which the calculation is performed)
- Choices that impact the efficiency of the calculation (integrators, barostats, collision rates, alchemical protocols, alchemical methods, alchemical intermediates, hydrogen mass)
Right now, you're mixing several of these together. This is highly problematic, and will lead directly to an inability to fulfill the primary mission of OpenFE (enabling free energy calculations with OpenFF force fields) because OpenFF is encoding all options that impact the fully interacting system potential function within the force field definition. Therefore, enabling options that impact the potential energy definition (such as `nonbonded_method`, `nonbonded_cutoff`, `constraints`, `rigid_water`) will mean you are failing to actually perform free energy calculations with the specified OpenFF force fields.
I'd be happy to help clarify these issues, but I think we need a design document that clearly separates these items.
Yep, we had this discussion internally and had a good argument over whether e.g. the cutoff is free from the force field etc.; there's room for improvement here.
While we need to support OpenFF force fields, I'm not sure we can support them to the exclusion of other force fields, so there might have to be variations on these settings where the parameters are stated separately (as is happening here).
There is nothing in my argument that says you should support OpenFF to the exclusion of other force fields.
My only point is that you must support OpenFF force fields, which means that you will need to bundle all of the parameters that impact the potential energy for a fully interacting system together, separated from other parameters that impact simulation efficiency.
One way to do this would be to recognize that an OpenFF force field will bundle all of the parameters that impact the potential energy of a fully-interacting system into a single object. For other force fields, you can use a container object that similarly bundles everything together, explicitly specifying all the parameters left undefined by the choice of force field. These objects can then be used interchangeably to specify which "complete" force field should be used, and the choices of auxiliary parameters for traditional force fields like GAFF can subsequently be optimized to define best practices.
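That container idea might be sketched like this; every name and default below is purely illustrative (not gufe or OpenFF API):

```python
# Hypothetical container bundling a traditional force field together with the
# parameters it leaves undefined, so that the pair fully determines the
# potential energy function, as a self-contained OpenFF force field would.
from dataclasses import dataclass

@dataclass(frozen=True)
class CompleteForceField:
    force_field: str            # e.g. "gaff-2.11"; illustrative only
    nonbonded_cutoff_nm: float  # explicit, since GAFF leaves this open
    constraints: str            # e.g. "h-bonds"
    rigid_water: bool

# a "complete" GAFF specification, interchangeable with an OpenFF one
gaff = CompleteForceField("gaff-2.11", nonbonded_cutoff_nm=0.9,
                          constraints="h-bonds", rigid_water=True)
```

Freezing the dataclass mirrors the idea that once the "complete" force field is chosen, the fully interacting potential is fixed; only efficiency settings remain free.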
> Yep, we had this discussion internally and had a good argument over whether e.g. the cutoff is free from the force field etc, there's room for improvement here.
Might I also suggest that internal discussions about how to represent force fields should likely also include a representative from OpenFF? There seems to be a great deal of accumulated knowledge within OpenFF that is not making its way into these discussions at the moment.
> There is nothing in my argument that says you should support OpenFF to the exclusion of other force fields. My only point is that you must support OpenFF force fields, which means that you will need to bundle all of the parameters that impact the potential energy for a fully interacting system together, separated from other parameters that impact simulation efficiency.
Yep, sorry, I misread this a bit at first.
Another great cross-cutting discussion to have might be "which force fields should be supported?" and "how can we unify force field support across simulation engines?". There are a number of tools that have been developed that attempt to band-aid this situation for now, like `simtk.openmm.app.ForceField`, `openmmforcefields.generators.SystemGenerator`, OpenFF Interchange, and now espaloma. If we know what you'd like to support, there are likely some things that can be done to make this easy, so that OpenFE need not support it alone.
gufe/protocols/protocol_settings.py (Outdated)

```python
class TopologySettings(BaseModel):
    """Settings for creating Topologies for each component
```
Can this be more clearly defined? I'm not clear on what this class is supposed to represent from the brief docstring.
gufe/protocols/protocol_settings.py (Outdated)

```python
class Config:
    arbitrary_types_allowed = True
```
Just a note: this is required by pydantic to allow arbitrary types. In the future I think we could write validators for `openff.units` objects, then register these with pydantic somehow and not need this setting.
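The kind of check such a validator would perform can be sketched in plain Python (no pydantic, no openff.units); `Quantity` here is a hypothetical stand-in for a unit-bearing object:

```python
# Hedged sketch: the validation a registered pydantic validator for
# unit-bearing quantities could perform. `Quantity` is a toy stand-in
# for openff.units objects, not their real API.
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    value: float
    unit: str

def validate_quantity(v, expected_unit: str) -> Quantity:
    """Accept a value only if it is a Quantity carrying the expected unit."""
    if not isinstance(v, Quantity):
        raise TypeError(f"expected a Quantity, got {type(v).__name__}")
    if v.unit != expected_unit:
        raise ValueError(f"expected unit {expected_unit!r}, got {v.unit!r}")
    return v

cutoff = validate_quantity(Quantity(1.0, "nanometer"), "nanometer")
```

Registering something like this with pydantic would let settings classes drop `arbitrary_types_allowed` while still rejecting wrong-unit inputs early.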
gufe/protocols/protocol_settings.py (Outdated)

```python
solvent_model = 'tip3p'

class AlchemicalSettings(BaseModel):
```
It looks like this is a dumping ground for random `perses` alchemical configuration parameters. Perhaps we could meet specifically to discuss how to come up with a better object model to more flexibly represent alchemical protocols, alchemical intermediates, and the like? We've been meaning to do this for a while, and I think it can be done flexibly, extensibly, and elegantly. I fear that just dumping everything into an object like this with attributes that will need to be supported is going to be problematic in the medium and long terms.
gufe/protocols/protocol_settings.py (Outdated)

```python
class EngineSettings(BaseModel):
    """OpenMM MD engine settings
```
This is general MM engine settings, not OpenMM specific?
gufe/protocols/protocol_settings.py (Outdated)

```python
return v

class BarostatSettings(BaseModel):
```
I think this is a problematic design, since it convolves parameters that define the alchemical state and hence impact the free energy (`pressure`) with parameters that simply control the efficiency of sampling (`frequency`).
I highly recommend instead following something like what `openmmtools` does and defining a `ThermodynamicState` that encodes the information about thermodynamic parameters that will impact the thermodynamic ensemble sampled (and hence the free energy)---such as `temperature` and optionally `pressure`, `pH`, etc.
This should be separated from settings that simply impact the efficiency of the sampling.
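A minimal sketch of that separation, with illustrative names only (this is not the gufe or openmmtools API):

```python
# Hedged sketch: thermodynamic parameters that define the sampled ensemble
# live in one immutable object, efficiency knobs in another. Names and
# defaults are illustrative, not any real API.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ThermodynamicState:
    """Parameters that affect the free energy itself."""
    temperature_kelvin: float
    pressure_atm: Optional[float] = None  # None => no barostat (NVT)
    pH: Optional[float] = None

@dataclass
class BarostatEfficiencySettings:
    """Parameters that only affect sampling efficiency."""
    frequency: int = 25  # MC barostat attempt interval

state = ThermodynamicState(temperature_kelvin=298.15, pressure_atm=1.0)
knobs = BarostatEfficiencySettings(frequency=50)
```

With this split, tuning `frequency` can never silently change which ensemble (and hence which free energy) is being computed.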
Starting with a point of curiosity - I realise that in the limit of infinite sampling the MC frequency has no impact on the thermodynamic state of the system, but given we are likely running short simulations, and often non-equilibrium ones, does the change in relaxation time not affect the sampled free energy?
A thought on the splitting of parameters - I can understand & support the idea of separating out the "things that affect the ensemble", but at what limit can we differentiate between algorithmic details that can have an impact and broad ensemble parameters? (e.g. the choice of `gamma_ln` for Langevin integrators, or the splitting sequence, etc.)
There will be many things that impact the finite-time behavior of simulations---up to and including the exact choice of random numbers you generate. But it simply isn't sensible to convolve (1) parameters that impact the infinite-time limit of the calculation (thermodynamic state, potential function) with (2) parameters that impact the rate at which bias and variance (and hence total error to the asymptotic limit) diminish. There are very good reasons for this:
First, it's possible to independently optimize and tune (or even auto-tune adaptively on the fly) the simulation strategy to improve the rate of convergence to the asymptotic limit, and hence make the calculation more efficient. That makes for a very useful separation of effort, since work can be carried out independently on optimizing the efficiency of protocols, alchemical methods, engines, etc.
Second, if you attempt to optimize solely for accuracy of finite-time calculations to experiment directly, you will never get anywhere. Any choice you make (force field, calculation strategy, collision rate) will impact these finite-time calculations in uncontrollable and unknowable ways. We will never make any progress in improving either the force field, the protocol, or the calculation strategy because there is absolutely no way to separate any of these effects. Progress becomes completely impossible because all modifications are a random walk through calculation space.
gufe/protocols/protocol_settings.py (Outdated)

```python
arbitrary_types_allowed = True

timestep = 2 * unit.femtosecond
temperature = 298.15 * unit.kelvin
```
As with `BarostatSettings`, this convolutes thermodynamic parameters that impact the free energy of interest (`temperature`) with a mess of settings that impact the efficiency and stability of sampling. These should be separated, grouping temperature, pressure, etc. into the thermodynamic state information.
I'd also highly recommend considering we adopt some flexible definition of how the sampling is done, perhaps in a hierarchical manner: It's most important to specify first (1) what calculation do you want to perform (openff-2.0.0 at 300 K and 1 atm), then (2) how you computed it, at a high level (e.g. replica exchange, nonequilibrium switching), then (3) how exactly it is performed can be left to best practices defaults, auto-tuned, or specified explicitly if this specification can actually be implemented by the engine. But I imagine a great deal of work will be trying to automate best practices and auto-tuning for each calculation type.
More broadly, in openmmtools, for the lowest level (3), we have embraced a general MCMC framework that allows us to compose or chain together MCMC samplers that sample from the same alchemical state in order to efficiently and flexibly sample from within an alchemical state. This is a much more general design than specifying all the details of a single integrator, since a calculation may want to compose compatible types of sampling strategies.
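The composable-sampler idea can be illustrated with a toy example; these classes are stand-ins for the design pattern, not the openmmtools API:

```python
# Toy illustration of composing MCMC moves into one sampler, as described
# above. Classes are hypothetical stand-ins; real moves would propagate a
# simulation state rather than append labels.
from typing import List, Protocol

class MCMCMove(Protocol):
    def apply(self, state: dict) -> dict: ...

class RecordMove:
    """Toy move that just records its name on the state."""
    def __init__(self, name: str):
        self.name = name

    def apply(self, state: dict) -> dict:
        return {**state, "history": state.get("history", []) + [self.name]}

class SequenceMove:
    """Compose moves by applying them in order, like a chained sampler."""
    def __init__(self, moves: List[MCMCMove]):
        self.moves = moves

    def apply(self, state: dict) -> dict:
        for move in self.moves:
            state = move.apply(state)
        return state

sampler = SequenceMove([RecordMove("langevin"), RecordMove("mc_barostat")])
final_state = sampler.apply({})
```

Because every move shares one interface, a calculation can swap or reorder sampling strategies without touching the rest of the protocol.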
gufe/protocols/protocol.py (Outdated)

```python
def __init__(
    self,
    *,
    settings: ProtocolSettings = None
):
    """
```
If the two `ChemicalSystem`s aren't here, then this class becomes a factory for simulations. I think being able to serialize the factory separately is a good idea for extending a network with the same protocol (else where is the factory stored?).
In which case, I think `run()` isn't run, but instead `prepare(Transformation(initial, final, mapping)) -> PreparedProtocol`, and `PreparedProtocol` has the `run()` and `is_complete()` etc.
edit: I think this is also what John is saying on the run method below.
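The factory split being proposed could look roughly like this; the names (`prepare`, `PreparedProtocol`, etc.) come from the comment above, and the bodies are toy stand-ins, not gufe code:

```python
# Hedged sketch of the Protocol-as-factory split. All classes here are toy
# stand-ins: Transformation stands in for the real node/edge objects, and
# run() does no actual simulation work.
from dataclasses import dataclass

@dataclass(frozen=True)
class Transformation:
    initial: str   # stand-in for a ChemicalSystem
    final: str     # stand-in for a ChemicalSystem
    mapping: tuple # stand-in for an AtomMapping

class PreparedProtocol:
    """A Protocol bound to one Transformation: a job ready to execute."""
    def __init__(self, settings: dict, transformation: Transformation):
        self.settings = settings
        self.transformation = transformation
        self._complete = False

    def run(self):
        self._complete = True  # real simulation work would happen here

    def is_complete(self) -> bool:
        return self._complete

class Protocol:
    """Serializable factory: holds only settings, no chemical systems."""
    def __init__(self, settings: dict):
        self.settings = settings

    def prepare(self, transformation: Transformation) -> PreparedProtocol:
        return PreparedProtocol(self.settings, transformation)

job = Protocol({"timestep_fs": 2}).prepare(
    Transformation("stateA", "stateB", mapping=()))
job.run()
```

The factory stays cheap to serialize and reuse across many edges, while each `PreparedProtocol` is the self-contained unit shipped to a worker.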
Indeed, here are my notes when I separated things out this way:

- Benefits of separating `Transformation` from `Protocol`:
  - can have a single `Protocol` object instance serving many `Transformation`s
    - can save on memory for very large networks, serialization cost
    - may be worthwhile making them function as singletons when the hash is identical
  - only two types of `Transformation`: `Transformation` and `NonTransformation`
    - we allow for a zoo of `Protocol`s
  - can execute a `Protocol` in isolation, without any graph/network machinery at all
    - define `ChemicalSystem`s, `AtomMapping` if appropriate, settings, and fire it off with `Protocol(settings).run(start, end, mapping)`
    - very similar to `sklearn`-style estimator classes
    - should we adopt a similar approach to `sklearn` for results?
Will have to think some more on `Protocol` structure to accommodate @jchodera's points here. I think you're right that a `Protocol` with an `initial`, `final`, and `mapping` at this point is a job that's ready to be executed, and it is what would need to be serializable to be shipped to a worker process, local or remote.
`PreparedProtocol.run` could have an `async=True` kwarg that proceeds with execution in a separate thread or process, with its methods able to yield result status from file outputs or stream buffers to the main thread/process. We'll need to determine what operations we support (stop, start, extend?).
gufe/protocols/protocol.py (Outdated)

```python
...

@abc.abstractmethod
def run(self,
```
If this is intended to be the actual class that can be `run()` to accumulate data from which a free energy difference can be estimated, it would be useful to structure this class so that it can meet the needs we've defined:

- It must be serializable so we can store it to disk, transmit it over the wire, or restore it later
- It must enable us to assess how complete the calculation is, rather than just check if it is fully complete
- It must enable us to assess the current free energy estimate and the estimates of statistical fluctuation if we want to be able to adaptively run calculations
- It must enable us to extend the calculation

I honestly don't think this structure makes sense as-is, though. You probably need to generate some kind of simulation object that is constructed from asking to perform a particular kind of transformation that would then provide an API for all of the above queries.
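An object meeting those four requirements might look like this toy sketch; the class name, storage format, and estimator are all illustrative (not gufe code, and real estimates would come from MBAR or similar, not a sample mean):

```python
# Toy simulation object covering the four requirements listed above:
# serializable, reports fractional progress, gives a running estimate with
# uncertainty, and can be extended. Entirely illustrative.
import json
import statistics

class FreeEnergySimulation:
    def __init__(self, target_samples: int, samples=None):
        self.target_samples = target_samples
        self.samples = list(samples or [])

    def extend(self, new_samples):            # requirement: extendable
        self.samples.extend(new_samples)

    def progress(self) -> float:              # requirement: how complete?
        return min(1.0, len(self.samples) / self.target_samples)

    def estimate(self):                       # requirement: running estimate
        mean = statistics.mean(self.samples)
        stderr = statistics.stdev(self.samples) / len(self.samples) ** 0.5
        return mean, stderr

    def to_json(self) -> str:                 # requirement: serializable
        return json.dumps({"target": self.target_samples,
                           "samples": self.samples})

    @classmethod
    def from_json(cls, s: str):
        d = json.loads(s)
        return cls(d["target"], d["samples"])

sim = FreeEnergySimulation(target_samples=4)
sim.extend([-1.0, -1.2, -0.8])
restored = FreeEnergySimulation.from_json(sim.to_json())
```

An adaptive scheduler can then poll `progress()` and `estimate()` to decide where to `extend()` next, and checkpoint via `to_json()` at any point.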
@dotsdl I think adding the settings definition in here is maybe a step too far and/or removed from the API for Protocol. We probably need to hash out the taxonomy of settings (which @jchodera has started) with a wider audience in something like a Google doc, and then try to encode this in nested pydantic structures later.
Changes made elsewhere to make this work reasonably well.
Ready to go @richardjgowers! Got tests passing again after DAG restructuring from yesterday.
@IAlibay blackified changes in the PR against |
```python
Attributes
----------
initial : ChemicalSystem
```
didn't we converge on stateA and stateB for these?
I'm not married to `initial` and `final`, but I believe I'm going from @jchodera's suggestion from earlier on this PR.
We'll keep A/B and have a great renaming once we're happy with the objects. I think we're still determining how many objects/layers there are right now.
```python
# we subclass `Transformation` here for typing simplicity
class NonTransformation(Transformation):
```
So, throwing my 2 cents in here: I don't really like the `Transformation` and `NonTransformation` naming (especially the latter); it's strangely non-intuitive (especially if you have a nontransformation sitting in transformations; I can honestly see users sitting there being confused).
Is there an alternative we could use here? The concept of AlchemicalPaths (through states), or just calling them Experiments? Even being more descriptive with AlchemicalTransform(ation) would probably ease some of the likely pain points users are going to feel here.
Experiment sounds like a bench thing.
A `Transformation` is the application of a `Protocol` onto specific chemical systems. It is also used as the edge on alchemical graphs.
You could call it `ParametrisedProtocol`, but I feel like we've already got too many names revolving around Protocol.
I don't hate `AlchemicalTransformation`.
> Experiment sounds like a bench thing.

That's big bench biasing you to feel inferior about your computational work 🙃

> You could call it ParametrisedProtocol but I feel like we've already got too many names revolving around Protocol

Yeah, it ends up getting confusing (I would expect ParametrisedProtocol to be a child of a Protocol class, etc.)
Maybe something to throw in for a quick view at today's meeting?
Agree `NonTransformation` is a bit obtuse. I think it's also totally descriptive without stacking in any additional, unintended meaning. One could even imagine it as a lambda protocol that changes a system to itself, effectively just doing dynamics and no alchemistry.
I'm fine with `Transformation` being renamed to `AlchemicalTransformation`, which goes with our `AlchemicalNetwork`.
I'll kick this down the road to the "great renaming" event we can have once we've got all the objects.
```python
...

class TestNonTransformation:
```
From a cursory look these two classes seem like they could be squashed to the one base class with just ntnf renamed to something that's declared as a class attribute?
Yes, there is some potential for consolidation, but there are some differences in `test_equality`.
gufe/protocols/protocoldag.py (Outdated)

```python
completed: Set[ProtocolUnit] = set()

while len(completed) != len(self._graph.nodes):
    for pu in graph.nodes:
```
Pedantic, but for the sake of sanity can we make `pu` a bit more explicitly named here? Someone is eventually going to mess up when looking at `pu` and `pur`.
reworked this loop to be much simpler
```python
self._kwargs = kwargs
self._name = name

def __hash__(self):
```
Should there be an `__eq__` to deduplicate units? (Or maybe deduplication isn't happening at this level?)
Deduplication has to happen within the `Scheduler` with the way we've defined state for these objects. A `ProtocolUnit` should be deduplicated on the basis of its settings and the inputs to its `execute` method, so it's not possible to do all of that in `__eq__`.
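Scheduler-side deduplication on settings plus inputs could be sketched as below; the names (`unit_key`, `schedule`) and the JSON-digest keying are illustrative assumptions, not gufe code:

```python
# Hedged sketch: deduplicate units by a content digest over their settings
# and execute() inputs, instead of relying on __eq__. Assumes inputs are
# JSON-serializable for illustration.
import hashlib
import json

def unit_key(settings: dict, inputs: dict) -> str:
    """Stable content key over a unit's settings and inputs."""
    blob = json.dumps({"settings": settings, "inputs": inputs},
                      sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()

scheduled: dict = {}

def schedule(settings: dict, inputs: dict, unit: object) -> object:
    """Return an already-scheduled equivalent unit if one exists."""
    key = unit_key(settings, inputs)
    return scheduled.setdefault(key, unit)

u1 = schedule({"timestep": 2}, {"system": "A"}, object())
u2 = schedule({"timestep": 2}, {"system": "A"}, object())  # deduplicated
```

Keying on content rather than identity means two independently constructed but equivalent units collapse to one scheduled execution.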
@dotsdl I pushed some stuff up
Co-authored-by: Irfan Alibay <IAlibay@users.noreply.github.com>
```python
graph = self._graph.copy(as_view=False)

# iterate in DAG order
for unit in reversed(list(nx.topological_sort(self._graph))):
```
So we're having to reverse the provided iterator for DAGs from Networkx. Not a drama, but it might be worth revisiting if we've got our DAG definition the right way round.
I've always found that the DAG order in NetworkX is backwards. For example, if you're recreating a tree of objects that depend on other objects, you need to create dependents first. That requires reversing the nx iterator.
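This can be seen with a small NetworkX example, using the "unit points to its dependencies" convention from this thread (the node names are just placeholders):

```python
# Under edges that point from a unit to its dependencies ("B requires A"),
# topological_sort yields dependents before their dependencies, so the
# execution order (dependencies first) is the reverse.
import networkx as nx

dag = nx.DiGraph()
dag.add_edge("B", "A")  # B requires A
dag.add_edge("C", "B")  # C requires B

execution_order = list(reversed(list(nx.topological_sort(dag))))
```

Flipping the edge direction ("A leads to B") would make `topological_sort` produce execution order directly, with no reversal.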
Apologies @richardjgowers, from our discussion Friday we had decided to encode DAGs as `ProtocolUnit`s pointing to their dependents, but traversal appeared easier going the other direction (`ProtocolUnit`s point to their dependencies).
I should have mentioned in a comment here that I reversed it in the tests, but I think you noticed this.
Either is fine, you're either saying "A leads to B" or "B requires A", I just don't want to be going against the grain with how everyone else seems to phrase it. Can leave this as-is, but if we keep finding counter examples we might have to switch polarity.
I've got an off by one error on the dependencies in one of the tests. I think I'll change the test to be more meaningful (sum of integers) so it'll be easier to debug.
The problem appears to be that
We don't want *all* dependencies, recursively.
Fixed tests with change to use only immediate successors instead of all descendants.
Ok looks good enough to start trying it out now, I think it'll be easier to see how it fits once we try it on. @IAlibay want to merge?
lgtm!
@richardjgowers please go ahead with the merge, I'm on the move right now.
Thanks all! Very excited to build on this!
Adds the following components:

- `Transformation`
- `Protocol`
- `AlchemicalNetwork`

Also adds stubs for `Executor` components; these were necessary to reason about `Protocol` structure.

Checklist

- `AlchemicalNetwork.__init__` with state held in a `networkx` graph