-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft of proposed spec #29
Conversation
spec/The-OpenFF-System.md
Outdated
| Bonds | Bond constraints | ||
| | Fractional bond orders | ||
| Iterators for other valences | Residue/chain/segments | ||
| Periodicity boolean | Box vectors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's communicate that we're flexible on whether the box vectors could be part of the coordinates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see them as separate silos of data (them being in the same row here is just a coincidence)
independent_variables={"x"}) potential_handler.expression | ||
``` | ||
|
||
| MUST | MAY | MUST NOT | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this style of table.
|
||
### User control over system combination | ||
|
||
A plug-in architecture will expose settings for the level of rigour executed in the internal consistency checks. The default will be strict, resulting in errors if there is any ambiguity in the physical description of each system (i.e. different non-bonded cutoff treatments). A small number (1-2) of other settings with more granularity will be exposed, i.e. one that is problematively permissive and another that implements some amount of "reasonable" discrepancies to be fudged together. This will all be constructed in a way that enables users to define their own sets of "knobs" for when system combination can and cannot be allowed, i.e. if aromaticity models need not match but cut-off treatments must. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is scope creep so early in the process, but I think the +
operator is going to be a critically important part of the implementation, since we'll be able to reuse it during system combination, modification, and splitting/subtraction (if we support it). So I wonder if we want more depth about its behavior. Like
Use of the +
operator...
MUST result in a physically valid system (able to consume coordinates and calculate energies)
MUST preserve the order of particles in both systems
MUST attempt to use all SystemCombiners registered to the System, in the order they were registered, unless one raises a specific subclass of Exception that calls for an early halt.
MAY use multiple SystemCombiners in order to fully cover the different components of the input Systems
MAY provide non-commutative behavior (nonbonded cutoff might be taken from the system on the left)
MAY accept objects other than OpenFF systems (like ParmEd Structures)
MAY raise an error on atom typing collisions (maybe have a separate registry of TopologyCombiners?)
MUST NOT, if both input Systems are OpenFF Systems, produce a new Topology that contains different chemical species than were in the original inputs
MUST NOT perform in-place modifications on either input System
MUST NOT (?) "add" information that wasn't present in either System (for example, adding a cutoff where neither input specified one)
|
||
System parameters (force field parameters applied to a chemical topology) are represented as the sum of individual components (`PotentialHandler`s). Teach term in a potential energy function is expected to be captured by a `PotentialHandler` or combination thereof. These closely mirror the `ParameterHandler`s in OpenFF Toolkit, and may merge in the future. | ||
|
||
Each `PotentialHandler` subclass must specify an string-like `expression` that encodes the algebra of its energy evaluation and a collection of `independent_variables` that specify which variables in the expression do not need to be specified by system parameters. The remaining variables are then expected to be specified in a sequence of `Potential` objects stored in the handler. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure that everything can be expressed as string-like, rather than a function that gets evaluated a number of times? "PME electrostatics short range" is technically a very complicated function, and expressing it as a string could be problematic. Machine learning potentials are going to be super hard if not impossible to express as strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have string constants, that define the algebra elsewhere? Is that a good or bad idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think we'll need to have a few different flavors of this class - also for tabulated potentials that do not care about any algebraic form
|
||
System parameters (force field parameters applied to a chemical topology) are represented as the sum of individual components (`PotentialHandler`s). Teach term in a potential energy function is expected to be captured by a `PotentialHandler` or combination thereof. These closely mirror the `ParameterHandler`s in OpenFF Toolkit, and may merge in the future. | ||
|
||
Each `PotentialHandler` subclass must specify an string-like `expression` that encodes the algebra of its energy evaluation and a collection of `independent_variables` that specify which variables in the expression do not need to be specified by system parameters. The remaining variables are then expected to be specified in a sequence of `Potential` objects stored in the handler. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have string constants, that define the algebra elsewhere? Is that a good or bad idea?
|
||
## Features | ||
|
||
### Tracking parameter sources |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you say a little more about what the data structures are that store/represent provenance?
|
||
### User control over system combination | ||
|
||
A plug-in architecture will expose settings for the level of rigour executed in the internal consistency checks. The default will be strict, resulting in errors if there is any ambiguity in the physical description of each system (i.e. different non-bonded cutoff treatments). A small number (1-2) of other settings with more granularity will be exposed, i.e. one that is problematively permissive and another that implements some amount of "reasonable" discrepancies to be fudged together. This will all be constructed in a way that enables users to define their own sets of "knobs" for when system combination can and cannot be allowed, i.e. if aromaticity models need not match but cut-off treatments must. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe there can be compositions of functions? It would be a pain to write out different van der Waals, PME electrostatic short range, etc. that each had to reimplement a tapered cutoff term to multiply by each term - seems like should only be written once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is it possible to recognize when two functions are equivalent (At least 2 different ways of doing LJ, harmonic force constants can be defined with or without 1/2, RB and periodic sums of torsion can be equivalent).
## Relevant edge cases | ||
|
||
* Conception of "pre-" and "post-typing" topologies | ||
* Do virtual sites go in the topology, a special "post-typing" topology, or should they be computed on-the-fly as needed? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My instinct is part of the topology, it defines the geometry, is affected by parameterization.
* Allowing/forbidding/tracking/fixing "dirty" states | ||
* Dealing with mal-formed files or those that play fast-and-loose with specifications (MOL2, PDB, etc.) | ||
* Safely supporting alchemical mutations | ||
* Tracking alchemical mutations (safely storing a diff?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could literally be arrays of parameters at each site?
* Dealing with mal-formed files or those that play fast-and-loose with specifications (MOL2, PDB, etc.) | ||
* Safely supporting alchemical mutations | ||
* Tracking alchemical mutations (safely storing a diff?) | ||
* How to handle polarizability |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it is just a functional form.
Description
This PR aims to materialize a specification document for this project.