-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BondCharge virtual sites are not overriding correctly #1206
Comments
This works as intended other than the issue you raised in point 2 (see paragraph 2). I didn't have the same assumption in [1] during the implementation. I didn't think at the time of implementation that tags infer uniqueness, just like in other things like bonds, angles, and the like. Overcoming this was solved with the use of names, so if you want two different parameters on the same set of atoms, you use different names. The result was one parameter because a bond charge with the same name was specified twice in the parameter list. It works just as a standard bond parameter in this way. I think you'll get in trouble if you start using tags to infer uniqueness for e.g. Trivalent, because With all this said, assuming
Remember we took the stance that floats were never to be used to establish uniqueness :) Hope this helps. |
So we have agreement on what should happen. It might be harder to modify things in a way that realizes that. @j-wags and I hacked on this for a while today. We successfully identified several places in the code that do not fix it and at least one place that might be the best place to attack. Below is an ugly diff tracking some places in the code we poked around in - right now we think the failure is a result of diff --git a/openff/toolkit/typing/engines/smirnoff/parameters.py b/openff/toolkit/typing/engines/smirnoff/parameters.py
index e170bb89..98b33eda 100644
--- a/openff/toolkit/typing/engines/smirnoff/parameters.py
+++ b/openff/toolkit/typing/engines/smirnoff/parameters.py
@@ -5273,7 +5273,9 @@ class VirtualSiteHandler(_NonbondedHandler):
# has the match setting, which ultimately decides which orientations
# to include.
if self.match == "once":
+ # print(f"{orientations=}")
key = self.transformed_dict_cls.key_transform(orientations[0])
+ # print(f"{key=}")
orientations = [key]
# else all matches wanted, so keep whatever was matched.
@@ -5349,6 +5351,7 @@ class VirtualSiteHandler(_NonbondedHandler):
ref_key = self.transformed_dict_cls.key_transform(orientations[0])
atoms = list([molecule.atoms[i] for i in ref_key])
args = (atoms, orientations)
+ # _add_virtual_site will ONLY PROCESS the FIRST value in the list `orientations`.
off_idx = super()._add_virtual_site(fn, *args, replace=replace)
return off_idx
@@ -5799,11 +5802,25 @@ class VirtualSiteHandler(_NonbondedHandler):
)
diff_keys = key not in vsite_struct[KEY_LIST]
same_vsite = self._same_virtual_site_type(vs_i, vs_j)
+ # print(f"before, {combined_orientations=}")
+ # import ipdb; ipdb.set_trace()
+ # This conditional should probably not evaluate to True while reproducing issue #1206. Each
+ # individual thing seems reasonable enough but something about this doesn't capture the breaking
+ # case, and it's not clear how all of the necessary information can be gathered and processed
+ # right here in a way that fixes #1206 _and_ doesn't break other stuff.
if same_atoms and same_vsite and diff_keys:
- combined_orientations[i][KEY_LIST].append(key)
+ # print(f"{combined_orientations=}")
+ # combined_orientations[i][VSITE_TYPE].append(vs_i)
+ # The last match in the atom_matches iterator takes precendence according to SMIRNOFF rules;
+ # since only the first orientation is processed by `_add_virtual_site` later, one idea is to
+ # simply override the other orientations with the one that' scurrently found. This might need
+ # a match="once" conditional.
+ combined_orientations[i][KEY_LIST] = [key]
+ combined_orientations[i][VSITE_TYPE] = vs_i
found = True
+ # print(f"after, {combined_orientations=}")
# Skip out early since there is no reason to keep
# searching since we will never add the same
# particle twice
@@ -5850,13 +5867,19 @@ class VirtualSiteHandler(_NonbondedHandler):
top_mol = Topology.from_molecules([molecule])
matches = self.find_matches(top_mol, expand_permutations=True)
-
+ # print("[k for k in matches.keys()]")
+ # print([k for k in matches.keys()])
+ # In reproducing issue #1206, this method call is where information about the " 'distance': [0.220]" virtual site
+ # type is lost.
virtual_sites = self._reduce_virtual_particles_to_sites(matches)
+ # print("[v[1] for v in virtual_sites]")
+ # print([v[1] for v in virtual_sites])
# Now handle the vsites for this molecule
# This call batches the key tuples into a single list, in order
# for the virtual site to represent multiple particles
for vsite_type, orientations in virtual_sites:
+ # orientations = [orientations[0]] # import ipdb; ipdb.set_trace()
vsite_type.add_virtual_site(molecule, orientations, replace=True)
def _create_openmm_virtual_sites(self, system, force, topology, molecule): |
I'll head to starbucks and take a look tonight :) My initial reaction is that the issue is before this because this code shouldn't see the first site as only the second should have been applied. If anything, it seems the intention from your eyes would be |
Thanks for the feedback. I'd love your eyes on it, but you're not on the hook to fix this, @trevorgokey (and even if you were, you shouldn't work on a Friday night!) :-) I'm just in the middle of rolling out a comprehensive vsite test suite, so I'll be opening issues to keep tabs on things like these. Hopefully they'll culminate in a big PR that closes several issues at once. |
I understand that the implementation is complex and contains lots of invisible comments that only I can see, so I am happy to to help. |
Just started this, and I'm glad I went through the trouble to get the biopolymer branch up. My first note is that:
So it appears to work whenever Na is the "first" atom. The assignments correspond to the order of the atoms in the parameter smirks, which seems to align with assumption [1]. The first parameter has Cl first, and so the rows above which match this results in d=-.101, and when Na is first we have d=-.22. |
Ok, this was my mistake for letting this razor sharp case fall through the cracks. I think I can at least offer a potential solution that solves the above case, and should work for others. The issue is that there was implicit assumption that a permutation of two keys e.g. 0,1 and 1,0 with the same name and the same type were the same virtual site. This is clearly a bad assumption since this issue exists. To get around this, I modified the equality operator to also check for smirks equivalence as the final tie breaker here which will separate distinguish the two parameters. With the code now able to distinguish 0,1 and 1,0, the only remaining task was to clear out any previous matches if there is a new match as a new parameter. This approach should also work for Here is my proposed fix: diff --git a/openff/toolkit/typing/engines/smirnoff/parameters.py b/openff/toolkit/typing/engines/smirnoff/parameters.py
index 1dad135f..ddcc3d38 100644
--- a/openff/toolkit/typing/engines/smirnoff/parameters.py
+++ b/openff/toolkit/typing/engines/smirnoff/parameters.py
@@ -5228,7 +5228,7 @@ class VirtualSiteHandler(_NonbondedHandler):
def __eq__(self, obj):
if type(self) != type(obj):
return False
- A = ["name"]
+ A = ["name", "smirks"]
are_equal = [getattr(self, a) == getattr(obj, a) for a in A]
return all(are_equal)
@@ -5626,23 +5626,41 @@ class VirtualSiteHandler(_NonbondedHandler):
for new_match in v:
unique = True
new_item = new_match._parameter_type
- for idx, (name, existing_match) in enumerate(
- matches[k].items()
- ):
- existing_item = existing_match._parameter_type
- same_parameter = False
-
- same_type = type(existing_item) == type(new_item)
- if same_type:
- same_parameter = existing_item == new_item
-
- # same, so replace it to have a FIFO priority
- # and the last parameter matching wins
- if same_parameter:
- matches[k][new_item.name] = new_match
- unique = False
+ for existing_indices in matches:
+ # only go over existing indices which are a
+ # permutation of our current match
+ if set(k).difference(existing_indices):
+ continue
+ # the goal here is to go over all current match
+ # indices, and potentially clear them out if
+ # our new parameter is a match
+ for idx, (name, existing_match) in enumerate(
+ list(matches[existing_indices].items())
+ ):
+ existing_item = existing_match._parameter_type
+ same_parameter = False
+
+ same_type = type(existing_item) == type(new_item)
+ if same_type:
+ same_parameter = existing_item == new_item
+
+ # same, so replace it to have a FIFO priority
+ # and the last parameter matching wins
+ if same_parameter:
+ matches[k][new_item.name] = new_match
+ unique = False
+ else:
+ # If we are here, we have the same
+ # set of indices as the new match,
+ # but the parameter is not the same.
+ # In this case, we need to clear out
+ # any existing vsites, since this
+ # new incoming vsite needs to replace
+ # them
+ matches[existing_indices].pop(existing_item.name)
if unique:
marginal_matches.append(new_match)
+
matches[k].update(
{p._parameter_type.name: p for p in marginal_matches}
) and here is a modification of the above test to make sure from openff.toolkit.typing.engines.smirnoff import ForceField
from openff.toolkit.topology import Molecule
from openff.units import unit
from openmm import unit as omm_unit
import pdb
#smis = ['[Na:2][Cl:1]', '[Na:1][Cl:2]', '[Cl:2][Na:1]', '[Cl:1][Na:2]', '[Na][Cl]', '[Cl][Na]']
smis = ['[N:3]=[C:2]=[O:1]', '[N:1]=[C:2]=[O:3]', '[N]=[C]=[O]', '[O:3]=[C:2]=[N:1]', '[O:1]=[C:2]=[N:3]', '[O]=[C]=[N]']
for smi in smis:
if ":1" in smi:
mol = Molecule.from_mapped_smiles(smi)
else:
mol = Molecule.from_smiles(smi)
ff = ForceField()
vsh = ff.get_parameter_handler('VirtualSites', {"version":0.3})
# Define two vsites that apply in opposite directions to the same atoms of the test molecule.
# The vsites may collide (since they have the same name), but they should be
# distinguishable in the output based on their `distance` values
vsh.add_parameter({'type':'BondCharge',
'smirks':'[Cl:1][Na:2]',
'match':'once',
'charge_increment':[0.100, -0.100] * unit.elementary_charge,
'distance': [0.1010] * unit.angstrom,
'name':'aaa'})
vsh.add_parameter({'type':'BondCharge',
'smirks':'[Na:1][Cl:2]',
'match':'once',
'charge_increment':[0.100, -0.100] * unit.elementary_charge,
'distance': [0.220] * unit.angstrom,
'name':'aaa'})
vsh.add_parameter({'type':'DivalentLonePair',
'smirks':'[*:1]~[#6:2]~[*:3]',
'match':'all_permutations',
'charge_increment':[0.100, -0.100, 0.100] * unit.elementary_charge,
'distance': [0.30] * unit.angstrom,
'outOfPlaneAngle': [45.0] * unit.degrees,
'name':'aaa'})
vsh.add_parameter({'type':'DivalentLonePair',
'smirks':'[*:3]~[#6:2]~[*:1]',
'match':'all_permutations',
'charge_increment':[0.100, -0.100, 0.100] * unit.elementary_charge,
'distance': [0.40] * unit.angstrom,
'outOfPlaneAngle': [45.0] * unit.degrees,
'name':'aaa'})
sys = ff.create_openmm_system(mol.to_topology())
# For each vsite, print out which particles are involved, and at which distance.
#import pdb; pdb.set_trace()
for vsite_idx in range(sys.getNumParticles()):
if not(sys.isVirtualSite(vsite_idx)):
continue
vsite = sys.getVirtualSite(vsite_idx)
vsite_parents = []
for vsite_parent_idx in range(vsite.getNumParticles()):
vsite_parent = vsite.getParticle(vsite_parent_idx)
vsite_parents.append(vsite_parent)
distance = vsite.getLocalPosition()[0].value_in_unit(omm_unit.angstrom)
print(smi, vsite_parents, distance) I can certainly make a PR, but I figured you would want to just apply the patch by hand and play around. |
Please do make a PR - it'll help me understand the changes if I can quickly pull them down locally |
Sorry for the delay; PR is up. |
Ha! 1.1 cups of coffee well spent :-) The PR is showing some new issues, I think this may turn into whack-a-mole but I'll comment more there. I'm working on a more comprehensive vsite test suite that could help us ensure we fix this and don't lose ground somewhere that we're not aware of. |
Yeah, when I was looking at that failing test I was scratching my head at first. There are likely a few more tests than can hammer things out. The tests use a very symmetric problem, and I wouldn't be surprised if your clever NaCl representation tricks show up in other places. In any case, I am glad you are sitting down and really trying to hammer out a thorough test suite. As you can see, I only wrote the "does it work, in theory" tests, and you are making sure it works in practice. Much appreciated! |
Co-authored-by: Jeff Wagner <jwagnerjpl@gmail.com>
Describe the bug
SMIRNOFF hierarchy rules are somewhat complicated for virtualsites. A virtualsite should only override another virtualsite if they match the same atoms[1], and have the same name and type.
To Reproduce
This code makes two vsites with the same name and type, which should apply to the same atoms, but in a different order:
The specific problems shown in this code are:
0.220
) MIGHT overwrite the first (with distance0.101
), then I expect that the output should EITHER have one vsite with each distance, OR one vsite with0.220
. Instead we get just one vsite with0.101
.[1] It's not specifically mentioned in the SMIRNOFF spec, but I assume this means "one vsite will overwrite another if they match the same atoms in the same order"
Computing environment (please complete the following information):
The text was updated successfully, but these errors were encountered: