-
Notifications
You must be signed in to change notification settings - Fork 86
Description
Discovered some very strange behavior with the molecule cache that might be undesirable; documenting here for discussion.
Consider the following setup:
from openff.toolkit import Molecule
from openmm.app import ForceField
from openmmforcefields.generators import SMIRNOFFTemplateGenerator
def try_make_system(cache, molecules_in, molecule_wanted):
ff = ForceField()
generator = SMIRNOFFTemplateGenerator(cache=cache, molecules=molecules_in)
ff.registerTemplateGenerator(generator.generator)
ff.createSystem(molecule_wanted.to_topology().to_openmm())
molecule_1 = Molecule.from_smiles("C")
molecule_2 = Molecule.from_smiles("CC")Then the following behaves as expected:
try_make_system(None, [molecule_1], molecule_1) # Succeeds
try_make_system(None, [], molecule_1) # Fails because molecule_1 is not in the listHowever, the following succeeds:
try_make_system("test_a.cache", [molecule_1], molecule_1)
try_make_system("test_a.cache", [], molecule_1)It turns out that the "cache" behaves as more than just a parameter cache that can speed up template generation. Loading a cache actually causes the template generator to behave as if those molecules exist in the list.
This is questionable behavior, but the following could be even more confusing:
try_make_system("test_b.cache", [molecule_1, molecule_2], molecule_1) # Succeeds
try_make_system("test_b.cache", [], molecule_1) # Succeeds, molecule_1 in the cache
try_make_system("test_b.cache", [], molecule_2) # FailsIt turns out that because molecule_2 wasn't in the topology given when the template generator had both molecules in the molecules list, it isn't cached. If between the second and third lines of the above example, you run
try_make_system("test_b.cache", [molecule_2], molecule_2)then all lines will execute successfully because molecule_2 will have made it into the cache.
This behavior means that when a "cache" is used, instead of having molecules determine what molecules are allowed to get templates generated for them, it could depend on the past history of calls to other template generators, that may not be obvious without inspection of the topologies that were provided to them. I would assume that all of this is an outright bug except that the current implementation explicitly supports it and there is even a test1 for it. If we don't want this behavior, then it should probably be removed; if we still do, it should be documented very explicitly that the cache is more than a cache and can affect the behavior of parameterization.
Footnotes
-
This test is in
test_system_generator.pybut this behavior is not related toSystemGenerator, which just passes anycacheargument on to the template generator. This behavior is fully implemented withinSmallMoleculeTemplateGenerator. ↩