Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

constraint = allbonds Causes Crashes with OpenForceFields #47

Closed
fjclark opened this issue Apr 20, 2023 · 16 comments
Closed

constraint = allbonds Causes Crashes with OpenForceFields #47

fjclark opened this issue Apr 20, 2023 · 16 comments
Labels
bug Something isn't working somd2 Issues that relate to things that should be fixed in somd2

Comments

@fjclark
Copy link
Collaborator

fjclark commented Apr 20, 2023

Hello,

I've been attempting to run ABFE calculations with OFF1 / OFF2. These run without crashing when constraint = hbonds but crash immediately after minimisation upon changing to constraint = allbonds (NaN or Inf has been generated along the simulation). When using GAFF2, there are no issues with constraint = allbonds.

Please see all input here. The system is a ligand in a box of water, and I've included the sdf file I used to parameterise the ligand with OFF2 (through BSS).

  • OS: Ubuntu 22.04
  • Version of Python: 3.9.16
  • Version of sire: 2023.2.1

Thanks.

@fjclark fjclark added the bug Something isn't working label Apr 20, 2023
@lohedges
Copy link
Contributor

I imagine the system is overconstrained when using AllBonds. If I run your system with OpenMM directly, e.g.:

from openmm import *
from openmm.app import *
from openmm.unit import *

# Load the topology and coordinate files.
prmtop = AmberPrmtopFile("somd.prm7")
inpcrd = AmberInpcrdFile("somd.rst7")

# Initialise the molecular system.
system = prmtop.createSystem(
    nonbondedMethod=PME, nonbondedCutoff=1 * nanometer, constraints=AllBonds
)

# Define the integrator.
integrator = LangevinMiddleIntegrator(
    300.0 * kelvin, 1.00000 / picosecond, 0.002 * picoseconds
)

# Set the simulation platform.
platform = Platform.getPlatformByName("CPU")
properties = {}

# Initialise and configure the simulation object.
simulation = Simulation(prmtop.topology, system, integrator, platform, properties)
simulation.context.setPositions(inpcrd.positions)
if inpcrd.boxVectors is not None:
    simulation.context.setPeriodicBoxVectors(*inpcrd.boxVectors)

# Setting initial system velocities.
simulation.context.setVelocitiesToTemperature(300.0)

# Run the simulation.
simulation.step(1000)

I get:

Traceback (most recent call last):
  File "/tmp/tmpdykhrg2n/test.py", line 33, in <module>
    simulation.step(1000)
  File "/home/lester/.conda/envs/openbiosim/lib/python3.10/site-packages/openmm/app/simulation.py", line 141, in step
    self._simulate(endStep=self.currentStep+steps)
  File "/home/lester/.conda/envs/openbiosim/lib/python3.10/site-packages/openmm/app/simulation.py", line 206, in _simulate
    self.integrator.step(10) # Only take 10 steps at a time, to give Python more chances to respond to a control-c.
  File "/home/lester/.conda/envs/openbiosim/lib/python3.10/site-packages/openmm/openmm.py", line 2089, in step
    return _openmm.LangevinMiddleIntegrator_step(self, steps)
openmm.OpenMMException: Particle coordinate is NaN.  For more information, see https://github.com/openmm/openmm/wiki/Frequently-Asked-Questions#nan

The CCMA constraint solver in OpenMM will blow up if the constraint matrix is singular, or if the constraint is ill defined, e.g. for planar molecules. I think you can get back information from the context as to which atoms are constrained. You can also remove constraints one-by-one to try to figure out which constraint(s) is/are causing the issue. (There's a good thread about constraint issues with OpenMM here.)

Sire now has a direct to-OpenMM converter, so it might be worth trying this, although I'm not sure if it supports AllBonds at present. Working on BioSimSpace, we've recently seen situations where going from OpenFF interchange directly to OpenMM format works, whereas going via an intermediate AMBER file causes constraint blow ups. (You could try using interchange too.)

@lohedges
Copy link
Contributor

lohedges commented Apr 20, 2023

Your input also blows up during minimisation using constraint=AllBonds. Looking here you can see that constraints are turned off for minimisation with SOMD, hence why it only crashes after the minimisation.

@jmichel80
Copy link
Contributor

Hi @fjclark - could you cross reference this issue on the openff-toolkit github issues tracker ?

@lohedges
Copy link
Contributor

lohedges commented Apr 21, 2023

Actually, looking at the OpenFF code for Interchange and Forcefield there doesn't appear to be any way of specifying the constraint when converting to an OpenMM system, which would probably explain why the user wasn't seeing crashes when using this approach. (Perhaps it does something internally to add some constraints.)

@chryswoods
Copy link
Contributor

chryswoods commented Apr 21, 2023

I can simulate the ligand with no issue in sire with all-bonds-h-angles or all-bonds constraint, using a timestep of 4 femtoseconds, so I don't think the issue is over-constraining of the ligand.

I do see an error when trying to run the entire system in water.

import sire as sr
mols = sr.load("somd.prm7", "somd.rst7")

# just run the first molecule (ligand)
mol = mols[0]

# need to tell it to use a Cartesian space when it is in a vacuum
mol = mol.minimisation({"space": sr.vol.Cartesian()}).run().commit()

d = mol.dynamics(timestep=4*sr.units.femtosecond, 
                 map={"space": sr.vol.Cartesian(), 
                      "cutoff": None, 
                      "constraint": "bonds"})

# print the constraint (it is 'bonds' - equivalent to AllBonds
print(d.constraint())

# run 10 ps of dynamics saving a snapshot every 1 ps
d.run(10*sr.units.picosecond, 1*sr.units.picosecond)

# it runs ok :-)
mol = d.commit()

# try with full constraints
d = mol.dynamics(timestep=4*sr.units.femtosecond, 
                 map={"space": sr.vol.Cartesian(), 
                      "cutoff": None})

# print the constraint (it is 'bonds-h-angles' 
# - equivalent to AllBonds and all angles with H
print(d.constraint())

# run 10 ps of dynamics saving a snapshot every 1 ps
d.run(10*sr.units.picosecond, 1*sr.units.picosecond)

# it runs ok :-)
mol = d.commit()

# Ok, let's try running all the molecules
mols = mols.minimisation(map={"cutoff": 7*sr.units.angstrom}).commit()

# That worked! Now try dynamics
d = mols.dynamics(timestep=4*sr.units.femtosecond, 
                  map={"cutoff": 7*sr.units.angstrom})

d.run(10*sr.units.picosecond, 1*sr.units.picosecond)

# sire detects something went wrong, so it tries to minimise
# the system again...

# after minimising it still gives a NaN error

# Trying again with a small step size
d = mols.dynamics(timestep=1*sr.units.femtosecond, 
                  map={"cutoff": 7*sr.units.angstrom,
                       "constraint": "bonds"})

d.run(10*sr.units.picosecond, 1*sr.units.picosecond)

# sire still sees an error, tries to minimise again, but fails...

# Trying again with just a h-bonds constraint
d = mols.dynamics(timestep=1*sr.units.femtosecond,
                       map={"cutoff": 7*sr.units.angstrom,
                            "constraint": "h-bonds"})

d.run(10*sr.units.picosecond, 1*sr.units.picosecond)

# This now runs ok!
# prints as output 'Dynamics(completed=10 ps, energy=-18894.6 kcal mol-1, speed=38.0 ns day-1)'

# Trying with a larger timestep
d = mols.dynamics(timestep=2*sr.units.femtosecond,
                       map={"cutoff": 7*sr.units.angstrom,
                            "constraint": "h-bonds"})

d.run(10*sr.units.picosecond, 1*sr.units.picosecond)

# This also works
# Dynamics(completed=10 ps, energy=-18889.1 kcal mol-1, speed=102.2 ns day-1)

# Can we push to 4 fs...
d = mols.dynamics(timestep=4*sr.units.femtosecond,
                       map={"cutoff": 7*sr.units.angstrom,
                            "constraint": "h-bonds"})

d.run(10*sr.units.picosecond, 1*sr.units.picosecond)

# Yes we can!
# Dynamics(completed=10 ps, energy=-18863.7 kcal mol-1, speed=192.6 ns day-1)

My conclusion is that you can't constrain all bonds for this system. It over-constrains things. I recommend only constraining bonds involving hydrogen.

Strangely, we don't see this issue with the gas-phase ligand. That blows my idea of using short gas-phase simulations to test for over-constraining...

@chryswoods
Copy link
Contributor

chryswoods commented Apr 21, 2023

The issue is using the constraint in the TriclinicBox. I've just re-run the ligand only in the box and see a NaN.

import sire as sr

mols = sr.load("somd.prm7", "somd.rst7")

# just run the first molecule (ligand)
mol = mols[0]

mol = mol.minimisation({"cutoff": 7*sr.units.angstrom}).run().commit()

d = mol.dynamics(timestep=4*sr.units.femtosecond, 
                 map={"cutoff": 7*sr.units.angstrom})

d.run(10*sr.units.picosecond, 1*sr.units.picosecond)

# gives a NaN

It may be something weird in OpenMM not liking too many constraints with a Triclinic space?

@chryswoods
Copy link
Contributor

These two pages in the sire docs show how you can convert to/from openmm and run minimisation and dynamics simulations:

https://sire.openbiosim.org/tutorial/part05/04_dynamics.html
https://sire.openbiosim.org/cheatsheet/openmm.html

@jmichel80
Copy link
Contributor

To add to the mystery I understand @fjclark can simulate with all bond constraints the exact same system if parameterised with GAFF2.

@chryswoods
Copy link
Contributor

I've just checked and get a NaN in a periodic box too

d = mol.dynamics(timestep=2*sr.units.femtosecond, map={"cutoff": 7*sr.units.angstrom, "space": sr.vol.PeriodicBox(sr.maths.Vector(20)), "constraint": "bonds"})

I think it is something about the use of Ewald?

@chryswoods
Copy link
Contributor

Nope - even see this with reaction field

d = mol.dynamics(timestep=2*sr.units.femtosecond, map={"cutoff": 7*sr.units.angstrom, "space": sr.vol.PeriodicBox(sr.maths.Vector(20)), "constraint": "bonds", "cutoff_type":"RF"})

@chryswoods
Copy link
Contributor

I get this in a Cartesian space with a 7 A cutoff, but don't get a crash when I have no cutoff. It looks like it is related to the cutoff used, not the space.

@chryswoods
Copy link
Contributor

(the with no cutoff is a red herring as it doesn't crash in this case, but completely blows up without error...)

@chryswoods
Copy link
Contributor

Looking at a video of the trajectory, it is a very planar molecule that oscillates about the plane. I think it is best to not use all-bonds for this molecule. I get stable dynamics with a 5 fs timestep using

d = mol.dynamics(timestep=5*sr.units.femtosecond,
                 map={"space": sr.vol.Cartesian(),
                      "cutoff_type": "RF",
                      "cutoff": 10*sr.units.angstrom,
                      "constraint": "h-bonds-h-angles"})

@fjclark
Copy link
Collaborator Author

fjclark commented Apr 24, 2023

Thanks very much for the comments.

Looking at a video of the trajectory, it is a very planar molecule that oscillates about the plane. I think it is best to not use all-bonds for this molecule. I get stable dynamics with a 5 fs timestep using...

Thanks - I don't plan to run my current simulations with all-bonds. However:

To add to the mystery I understand @fjclark can simulate with all bond constraints the exact same system if parameterised with GAFF2.

Yes, I have run many simulations with the same molecule in the past, only parameterised with GAFF2. I have never observed instabilities related to the ligand in this case with all-bonds constraints and identical / slight variants of the other simulation parameters given above. Please see input here, for example.

I had forgotten that there was some discussion of this issue previously. The same issue affected a variety of systems supplied by Bayer (already parameterised) for ABFE benchmarking. Changing the constraint settings consistently caused/ prevented crashes.

@chryswoods
Copy link
Contributor

This issue is really helpful, particularly as it gives concrete examples that will let us test different schemes for detecting and correcting over-constrained systems. I am not sure that there is anything that we can do to fix this though for the current version of somd? I think it is sampling issue, whereby the OFF1 / OFF2 parameters are causing this quite planar molecule to undergo motions that cause instability in the constraint solver for all-bonds.

As this isn't a "bug", would it be ok to close this issue? We will refer back to it when we start ramping up development of somd2, and developing algorithms for detecting and correcting simulations that are over-constrained.

@chryswoods chryswoods added the somd2 Issues that relate to things that should be fixed in somd2 label Apr 28, 2023
@fjclark
Copy link
Collaborator Author

fjclark commented Apr 29, 2023

Yes, no problem. Thanks for looking into this.

Just to point out that this doesn't seem to be specific to this system, as I've tried a few ligands with OFF1/2 and all seem to be affected by this issue (mentioned here).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working somd2 Issues that relate to things that should be fixed in somd2
Projects
None yet
Development

No branches or pull requests

4 participants