Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request : implementing charge group for GROMOS force field incorporation #1999

Open
hjuinj opened this issue Feb 23, 2018 · 15 comments

Comments

@hjuinj
Copy link

hjuinj commented Feb 23, 2018

We would be interested in incorporating and running the GROMOS force field in OpenMM. The GROMOS force field was parametrized using the reaction-field (RF) method to account for the electrostatic interactions beyond the cutoff. To avoid artefacts with the RF method, a charge-group based cutoff is necessary. It would be great if this functionality could be added in OpenMM, such that the GROMOS force field can be used correctly.

The charge groups are only necessary to generate the pairlist (and can even be used to speed up the latter process). Once the pairlist is generated the pairwise interactions can be calculated in the same way as with an atom-based cutoff. For example, in the GROMACS software a charge-group based pairlist generation is used even for simulations with PME for efficiency reasons (see e.g. J. Chem. Theory Comput. 4, 435, 2008 II. Domain Decomposition).

The use of reaction fields is in general interesting for free-energy calculation with charge changes or when a detailed energy decomposition into energy (or force) groups is desired.

@jchodera
Copy link
Member

I think it would be great to be able to support group based cutoffs in OpenMM. We've had tons of trouble getting anything sensible out of reaction field electrostatics without them, and the ability to reproduce forcefields like gromos that are designed for charge groups would be very useful.

@peastman
Copy link
Member

Could you explain what group based cutoffs mean in this context? With reaction field, the interaction goes smoothly to zero at the cutoff, so groups don't make sense. Gromacs no longer uses them at all. See http://manual.gromacs.org/documentation/current/user-guide/cutoff-schemes.html which states:

Before version 4.6, GROMACS always used pair-lists based on groups of particles. These groups of particles were originally charge-groups, which were necessary with plain cut-off electrostatics. With the use of PME (or reaction-field with a buffer), charge groups are no longer necessary (and are ignored in the Verlet scheme).

@jchodera
Copy link
Member

I can dig into this more shortly, but the switching and cutoff is applied to the whole group simultaneously to avoid splitting charge groups and creating artificial dipoles near the boundary. Even with atomic reaction field, this effect can lead to large problems in the representation of solvent screening.

A simple illustration of this issue is sampling equilibrium solvent configurations for one cutoff and reweighting to slightly expanded cutoffs; with atomic based cutoffs, the configurations for one cutoff have little bearing on the configurations for another cutoff because of this boundary effect. This effect disappears if group based switch/cutoff is used. We can provide some example code if this is helpful.

@peastman
Copy link
Member

That isn't how it works. The concept of a "group based cutoff" is meaningless when the energy goes smoothly to zero at the cutoff. By definition atoms beyond the cutoff have 0 interaction energy, regardless of where any other atom is. Again, see the Gromacs documentation.

@jchodera
Copy link
Member

jchodera commented Feb 24, 2018

@peastman : I don't think we're talking about the same things.

Firstly, reaction field does not go to zero smoothly at the cutoff unless a switching function is applied. An infinite dielectric forces the potential to zero (except for the constant applied) but not the derivatives.

Secondly, gromacs is not GROMOS. Gromacs is a program that previously had trouble with ensuring the potential it was computing was always accurate because it would only build the pairlist at fixed intervals; that problem is fixed now by adding support for automatic pairlist rebuilds based on atomic displacement. GROMOS is a forcefield parameterized to use group-based distance criteria to determine whether to compute or switch an interaction. There are implementations of GROMOS available for gromacs, but that's not where they come from.

Reaction field electrostatics has significant artifacts near the cutoff if atomic-based cutoffs are used. You can see this yourself by the following simple script which computes the stddev of the potential energy change upon enlarging the cutoff by less than 1A:

#!/bin/env python                                                                                                                                                                                                                             

from simtk import unit, openmm
from simtk.openmm import app
from openmmtools import testsystems
import numpy as np

nsteps = 50
niterations = 20

default_cutoff = 9.0 * unit.angstroms
waterbox = testsystems.WaterBox(nonbondedMethod=app.CutoffPeriodic, nonbondedCutoff=default_cutoff)
system, positions = waterbox.system, waterbox.positions
forces = { force.__class__.__name__ : force for force in system.getForces() }
nparticles = system.getNumParticles()

from openmmtools.integrators import LangevinIntegrator
temperature = 300 * unit.kelvin
collision_rate = 1.0 / unit.picoseconds
timestep = 2.0 * unit.femtoseconds
forces['NonbondedForce'].setCutoffDistance(default_cutoff)
integrator = LangevinIntegrator(temperature, collision_rate, timestep)
platform = openmm.Platform.getPlatformByName('OpenCL')
platform.setPropertyDefaultValue('Precision', 'mixed')
context = openmm.Context(system, integrator, platform)
context.setPositions(positions)

# Collect data at fixed cutoff                                                                                                                                                                                                                
stored_positions = unit.Quantity(np.zeros([niterations,nparticles,3], np.float32), unit.angstroms)
stored_potentials = unit.Quantity(np.zeros([niterations], np.float64), unit.kilocalories_per_mole)
for iteration in range(niterations):
    integrator.step(nsteps)
    state = context.getState(getPositions=True, getEnergy=True)
    stored_positions[iteration,:,:] = state.getPositions(asNumpy=True)
    stored_potentials[iteration] = state.getPotentialEnergy()

del context, integrator

# Reprocess data                                                                                                                                                                                                                              
for cutoff_value in np.linspace(9, 9 + 1.0, 20):
    cutoff = cutoff_value * unit.angstroms
    forces['NonbondedForce'].setCutoffDistance(cutoff)
    integrator = LangevinIntegrator(temperature, collision_rate, timestep)
    context = openmm.Context(system, integrator, platform)
    perturbed_potentials = unit.Quantity(np.zeros([niterations], np.float64), unit.kilocalories_per_mole)
    for iteration in range(niterations):
        context.setPositions(stored_positions[iteration,:,:])
        perturbed_potentials[iteration] = context.getState(getEnergy=True).getPotentialEnergy()
    dU = (perturbed_potentials - stored_potentials) / unit.kilocalories_per_mole
    print('%8.3f A -> %8.3f A : stddev %12.3f kcal/mol' % (default_cutoff/unit.angstroms, cutoff/unit.angstroms, dU.std()))
    del context, integrator

to see that atom-based reaction field is so sensitive to the boundary as to be useless

-bash-4.1$ python reaction-field.py
   9.000 A ->    9.000 A : stddev        0.001 kcal/mol
   9.000 A ->    9.053 A : stddev        0.621 kcal/mol
   9.000 A ->    9.105 A : stddev        1.178 kcal/mol
   9.000 A ->    9.158 A : stddev        1.638 kcal/mol
   9.000 A ->    9.211 A : stddev        1.993 kcal/mol
   9.000 A ->    9.263 A : stddev        2.294 kcal/mol
   9.000 A ->    9.316 A : stddev        2.543 kcal/mol
   9.000 A ->    9.368 A : stddev        2.791 kcal/mol
   9.000 A ->    9.421 A : stddev        3.091 kcal/mol
   9.000 A ->    9.474 A : stddev        3.488 kcal/mol
   9.000 A ->    9.526 A : stddev        3.975 kcal/mol
   9.000 A ->    9.579 A : stddev        4.544 kcal/mol
   9.000 A ->    9.632 A : stddev        5.105 kcal/mol
   9.000 A ->    9.684 A : stddev        5.657 kcal/mol
   9.000 A ->    9.737 A : stddev        6.193 kcal/mol
   9.000 A ->    9.789 A : stddev        6.685 kcal/mol
   9.000 A ->    9.842 A : stddev        7.178 kcal/mol
   9.000 A ->    9.895 A : stddev        7.627 kcal/mol
   9.000 A ->    9.947 A : stddev        8.019 kcal/mol
   9.000 A ->   10.000 A : stddev        8.356 kcal/mol

By contrast, using group-based cutoffs gives very very low stddevs:
#1606 (comment)

See also this thread.

@hjuinj
Copy link
Author

hjuinj commented Feb 26, 2018

Thank you John and Peter for the responses.

Peter, just to reinforce the point John made about smaller stddev in energy when charge group is employed:

A charge group contains a (small) set of atoms where their charges sum to zero. When reaction field is used, only particles within the cutoff is included for the electrostatic calculation. However, one can imagine at the cutoff boundary only some of the particles in certain charge groups are within the cutoff. An atom-based reaction field implementation would just include those within the cutoff, resulting in partial inclusion of dipoles during the electrostatic estimation. This leads to larger energy fluctuation along a trajectory and more so when the cutoff distance is changed. For group-based cutoff, all atoms in a given charge group will be either included or excluded for the electrostatic calculation.

As John said, reaction field does not go to zero smoothly at the cutoff unless a switching function is applied. The GROMOS force field was parameterised with group-based cutoff and without a switching function. The correct usage of GROMOS will hence require these.

@peastman
Copy link
Member

Firstly, reaction field does not go to zero smoothly at the cutoff unless a switching function is applied.

That is false. You can find the equations at http://docs.openmm.org/latest/userguide/theory.html#coulomb-interaction-with-cutoff. The values of the c and k coefficients to chosen to ensure E(r_cutoff) = 0. You can use reaction field, or you can use group based cutoffs, but it's impossible to use the two together. The two models simply aren't compatible with each other.

@sriniker
Copy link

As mentioned in that link (http://docs.openmm.org/latest/userguide/theory.html#coulomb-interaction-with-cutoff), the force goes only to zero at the cutoff in the limit epsilon_solvent >> 1. With water (epsilon ~80), it is close to zero but not exactly, which leads to the artefacts without a group-based pairlist. For simulations in chloroform (epsilon ~4), this is an even bigger problem. The c and k coefficients are not chosen, but are given by the permittivity of the solvent and the cutoff radius.

Reaction field and group-based cutoff are compatible and should indeed be used together. As mentioned by Shuzhe, the group-based cutoff just ensures that atoms, which form a dipole (e.g. the C and O in a carbonyl group), are included or excluded together in the pairlist.

@peastman
Copy link
Member

It's the energy that goes smoothly to zero, not the force. This is true independent of what dielectric constant you use. That means any atom beyond the cutoff distance has an interaction energy that is identically zero. Also, if you tried to ignore the cutoff and extend the same functional form out beyond the cutoff distance, you would get totally absurd behavior: the energy would actually increase as r^2 at longer distances.

@jchodera
Copy link
Member

jchodera commented Mar 1, 2018

@peastman : Your point about the c and k coefficients being selected to ensure E(r_cutoff) = 0 got me thinking that the artifacts we are observing with the OpenMM reaction-field---especially the crazy behavior of energies when enlarging the cutoff---may be due to the inclusion of different numbers of atom pairs as the cutoff is varied, which could give rise to the energy variations we're seeing.

The OpenMM reaction field definition uses

U = a * \sum_{i < j} q_i * q_j * S(r_ij) * (r_ij^(-1) + k*r_ij^2 - c)

where S(r) is the cutoff or switch function. This means that the additive constant will be

- a * \sum_{i < j} q_i * q_j * S(r_ij) * c

which changes in a discontinuous manner as the cutoff is enlarged because more terms are added.

I've been able to verify this by modifying the reaction field to a CustomNonbondedForce that removes the c term:

from openmmtools import forcefactories
system = forcefactories.replace_reaction_field(system, switch_width=1.0*unit.angstrom, return_copy=True, shifted=False)

and minimizing the system before simulation:

[chodera@lilac:chodera]$ python reaction-field-2.py 
   9.000 A ->    9.000 A : stddev        0.004 kcal/mol
   9.000 A ->    9.053 A : stddev        0.004 kcal/mol
   9.000 A ->    9.105 A : stddev        0.004 kcal/mol
   9.000 A ->    9.158 A : stddev        0.004 kcal/mol
   9.000 A ->    9.211 A : stddev        0.004 kcal/mol
   9.000 A ->    9.263 A : stddev        0.004 kcal/mol
   9.000 A ->    9.316 A : stddev        0.004 kcal/mol
   9.000 A ->    9.368 A : stddev        0.004 kcal/mol
   9.000 A ->    9.421 A : stddev        0.004 kcal/mol
   9.000 A ->    9.474 A : stddev        0.004 kcal/mol
   9.000 A ->    9.526 A : stddev        0.004 kcal/mol
   9.000 A ->    9.579 A : stddev        0.004 kcal/mol
   9.000 A ->    9.632 A : stddev        0.004 kcal/mol
   9.000 A ->    9.684 A : stddev        0.004 kcal/mol
   9.000 A ->    9.737 A : stddev        0.004 kcal/mol
   9.000 A ->    9.789 A : stddev        0.004 kcal/mol
   9.000 A ->    9.842 A : stddev        0.004 kcal/mol
   9.000 A ->    9.895 A : stddev        0.004 kcal/mol
   9.000 A ->    9.947 A : stddev        0.004 kcal/mol
   9.000 A ->   10.000 A : stddev        0.004 kcal/mol

As @peastman notes, the force is unchanged by the c constant term, but the energies can appear totally wonky.

@peastman
Copy link
Member

peastman commented Mar 1, 2018

Since combining charge groups with reaction field doesn't make sense, I looked up the GROMOS paper (https://www-ncbi-nlm-nih-gov.stanford.idm.oclc.org/pubmed/21533652) to try to figure out what it's really doing. Here's the relevant passage:

Non-bonded interactions were calculated using a triple-range cutoff scheme. The interactions within a cutoff distance of 0.8 nm were calculated at every step from a pair list which was updated every fifth time step. At this point, interactions between atoms (of charge groups) within 1.4 nm were also calculated and were kept constant between updates. To account for the influence of the dielectric medium outside the cutoff sphere of 1.4 nm, a reaction-field force based on a relative dielectric permittivity ϵ of 61 (Heinz et al. 2001) was added.

That's not entirely clear, but here's my interpretation of it. They're using reaction field with a cutoff of 1.4 nm. The charge groups are being used as an optimization to reduce the number of interactions they compute. At each step they compute every interaction that's within 0.8 nm. They then find every other atom that is a) in the same charge group as one of those atoms, and b) within 1.4 nm. They also compute those interactions (again, based on reaction field with a 1.4 nm cutoff), but only update them every fifth step.

So the fundamental potential function is reaction field with a 1.4 nm cutoff. Any interaction that gets computed, that's what they use to compute it. But interactions are divided into three classes: those which are computed every step (anything within 0.8 nm), those which are computed every five steps (beyond 0.8 nm, but another atom in the same group is within 0.8 nm), and those which are ignored (no atom in the group is within 0.8 nm).

@peastman
Copy link
Member

peastman commented Mar 1, 2018

As @peastman notes, the force is unchanged by the c constant term, but the energies can appear totally wonky.

That's a good point. The purpose of the c term is to make the energy go to zero at the cutoff. If you omit that, you won't conserve energy because you'll get discontinuous changes in energy at the cutoff. But it does introduce a shift in the total energy, and the magnitude of that shift depends on cutoff. (Of course, the same is true for a sharp cutoff. Omitting interactions beyond the cutoff changes the total energy. But this changes it more, because it also shifts the energy of interactions within the cutoff.)

@sriniker
Copy link

sriniker commented Mar 1, 2018

Yes, the energy is zero at the cutoff (only there), but the forces are not and that is important.
It is also correct that an error is made for atoms, which are within a charge group that is included but have a distance beyond the cutoff. However, for solvents with low permittivity (and sufficiently small charge groups), this error is smaller than the one that is generated by cutting through dipoles at the cutoff and the resulting charge fluctuations.

The combination of reaction field and charge-group based cutoff has been used for the last ~25 years in the GROMOS software (and in all parametrizations of the force fields). We usually use the triple-range scheme in order to reduce the computational cost that comes with the larger cutoff of 1.4 nm (however, this triple-range scheme is not very important, also a single cutoff at 1.4 nm can be used). For both the short-range pairlist (where the forces are calculated every step) and the long-range pairlist (where the forces are calculated every 5th step), a charge-group based cutoff is used. This means that all atoms of a charge group whose COG is within the cutoff distance of the COG of the central charge group are added to the corresponding pairlist.

@peastman
Copy link
Member

peastman commented Mar 1, 2018

My first suggestion would be to try just running it with the 1.4 nm cutoff in OpenMM. That involves more interactions than the optimization used in GROMOS, but that optimization also seems to be very tightly coupled to the particular way it implements nonbonded interactions. I don't think it would be as effective in OpenMM, due to the way it computes nonbonded interactions on the GPU. It's based around groups of 32 atoms, so if you're computing an interaction with even one atom in a charge group, chances are high that you're also computing interactions with most of the others (since they'll be in the same group). You could throw away the result of the ones beyond 0.8 nm, but you would still pay the cost of computing them either way.

Do you use a multiple time step integrator to combine the forces that are computed at 1 step and 5 step intervals? What are the inner and outer step sizes?

@sriniker
Copy link

sriniker commented Mar 6, 2018

The triple-range scheme is a type of multi-time stepping algorithm. The inner step size is 2 fs (if all bonds are constrained) and the outer time step is typically 10 fs (i.e. recalculation of the mid-range forces every 5th step). This scheme is not necessary to run the GROMOS force field properly. One gets the same properties when a single cutoff of 1.4 nm is used.

However, the charge-group based cutoff is necessary to run systems using a low permittivity of the reaction-field continuum properly. To illustrate this, I attach the RDF between the carbon atoms in CCl4 with a charge-group (CG) based cutoff and with an atom-based (AT) cutoff. There is a clear artificial peak just at the cutoff distance of 1.4 nm with the AT cutoff.
rdf_ccl4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants