Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overhaul AMOEBA code #2700

Closed
peastman opened this issue May 26, 2020 · 12 comments
Closed

Overhaul AMOEBA code #2700

peastman opened this issue May 26, 2020 · 12 comments

Comments

@peastman
Copy link
Member

cc @jayponder @jppiquem @pren

The AMOEBA plugin has some of the oldest, messiest code in OpenMM. The whole architecture is badly out of date, working quite differently from how the rest of OpenMM does. This causes many problems. For one thing, it's just really hard to maintain. I dread having to make any changes to it (such as support for splitting the multipole interactions into short and long range parts). It also is slower than it needs to be, since it has missed out an many optimizations made to the rest of the library over the years. Some of the convoluted architecture was required to work around limitations of circa-2008 GPUs. We've never supported AMOEBA on the OpenCL platform, because the thought of cloning all that code, converting it to OpenCL, and maintaining two copies of it was just too daunting.

I think the time has come for a major cleanup. I suggest proceeding through the following steps.

First, eliminate most of the specialized bonded forces. These classes only exist because AMOEBA predates custom forces. That isn't how we implement new force fields anymore. Most of them could be replaced by just a few lines of code using a CustomCompoundBondForce. Eliminating these classes would remove thousands of lines of unnecessary code.

For backward compatibility, ForceField would still accept the old tags in XML files. It would just create appropriate custom forces instead. We could even do the same with serialization. If you had previously saved a System to a file, and it included an AmoebaStretchBendForce, it could still be loaded. That force would just get replaced by an equivalent custom force.

Second, I want to completely rework the code for AmoebaMultipoleForce, especially the CUDA implementation. This should both simplify it and improve the performance. I saw the effect of that while I was implementing HIPPO. Instead of starting from the existing CUDA code I mostly wrote it from scratch in a more modern way, and that led to a substantial performance gain. I believe the same should be possible with AMOEBA.

Third, I'll convert the CUDA platform to be based on the common compute framework. This will finally allow us to support OpenCL, so AMOEBA simulations can be run on AMD and Intel GPUs.

@peastman
Copy link
Member Author

I'd like to start moving forward on this. Before I do, it would be really helpful to get feedback, especially from @jayponder, @jppiquem, and @pren. Doing this will lead to a big improvement in code maintainability and hopefully also a significant gain in speed. Does it sound to you like a good idea?

@jayponder
Copy link

jayponder commented Feb 16, 2021

Hi Peter, Sure. The AMOEBA code in OpenMM does probably need an overhaul. I don't really have time/effort to actually do any of this, but would be happy to consult. My first goal is just to make sure AMOEBA is correctly implemented in OpenMM, so people get correct results with all the AMOEBA parameter sets.

In addition, I've recently been interacting with a group at McGill in Montreal that wants to run AMOEBA nucleic acid simulations using OpenMM. You will remember that we put the stretch-torsion and angle-torsion cross terms needed by the nucleic acid force field into the Tinker-OpenMM interface as custom forces. I think this means those terms will not be available in OpenMM from the standard Python interface.

Also there is a possible issue with what AMOEBA calls the "angle" and "anglep" bond angle parameters. "angle" parameters are the usual standard bond angle term, thouch with cubic through sextic terms in addition to harmonic. The "anglep" parameters are for projected in-plane angles and are used at "planar" trigonal sites. Again, I have some question as to whether this mechanism is fully/correctly implemented through the Python interface (it may be fine, but I'm not sure..). This use of angle/anglep is very basic to AMOEBA, and is used for proteins, nucleic acids, and really any AMOEBA parameters.

@peastman
Copy link
Member Author

Thanks! I've just been emailing with someone from the McGill group about converting the latest parameter set. Hopefully he'll want to take on that task.

Can you describe what your concerns are about the bond angle terms? All energy terms have been checked against Tinker to make sure the energies match, so I'd be surprised if there were any major differences in how the parameters are interpreted. Is there anything particular that makes you think it might not be correct?

@jayponder
Copy link

jayponder commented Feb 16, 2021

Yes, I'm sure we've been talking to the same guy, Michael Kilgour. Michael used an old script you must have given him to convert the most recent biomolecule AMOEBA parameter set (amoebabio18.prm) to the OpenMM XML-like format. The script does not recognize or handle the "anglep" parameters. So Michael just changed all the "anglep" parameters in the amoebabio18.prm file to "angle" before using the script. That's seriously broken in a bad way.. The resulting XML force field file will likely "run", but it will give incorrect and probably poor results. I've warned Michael about this, but I think he has no real idea how the angle parameters differ or what is going on-- so it's kind of dangerous.

I just now checked Tinker-OpenMM and the angle/anglep stuff is handled correctly. But that's because Tinker-OpenMM is using the "Tinker" version of the parameters as in our *.prm files. For OpenMM via the Python interface you use the XML-like versions, and I'm not sure any of those parameter files do the angle/anglep mechanism, which has been in Tinker for some time now but was initiated after you folks translated all the Tinker parameters. Does this make sense?

@peastman
Copy link
Member Author

I'm not quite sure I understand. Are you saying that older versions of the force field had only an angle parameter, but the newer version adds a separate anglep parameter? If so, we'll need to add support for it. Can you point me to documentation that describes the correct functional form and interpretation of the two parameters?

@jayponder
Copy link

jayponder commented Feb 16, 2021

So, AMOEBA has always (for roughly 20 years..) used the angle scheme first developed by Lou Allinger in his "MM" force fields. For most angles, we just use a "standard" angle parameter, augmented in AMOEBA by small cubic through sextic terms in addition to the usual harmonic.

But at planar trigonal sites, AMOEBA often does not use the standard bond angle. Instead it projects the central trigonal atom into the plane of its three attached atoms, and then uses the projected in-plane angle in the energy function instead of the actual angle. For many years, the Tinker code did this "automatically", meaning that the code decided whether an atom was supposed to be trigonal planar based on the presence of out-of-plane parameters centered on that atom. If out-of-plane parameters were present, then all of the angle terms centered on the atom used the in-plane projection mechanism. At that time all of the bond angle parameters in the *.prm files were of type "angle".

Some time ago (a few years now?), we decided to make the angle parameters themselves determine whether a given angle energy term is "standard" or "in-plane projected". This has two advantages: (1) it keeps the Tinker code from implementing this behind the user's back, and (2) it allows complete, explicit control over which type of angle term is used for a given angle.

In order to make this change we essentially created a new parameter type, "anglep". So the way the Tinker parameter files work now is that an "angle" parameter is for a "standard" angle, and contains the usual force constant and ideal angle value. And an "anglep" parameter also contains a force constant and ideal angle, only in this case the "ideal" angle value is for the ideal in-plane projected angle. All the current Tinker parameter files have been converted to contain both "angle" and "anglep" parameters, as appropriate. After conversion, the older AMOEBA force fields all give exactly the same energies and forces for any particular molecule that they have always given-- so the underlying force field models are not changed. But the parameter files do need to be updated.

We are actually making use of the increased flexibility of this angle/anglep mechanism. For example, In the automated parameterization methods being developed, we sometimes using standard "angle" parameters at what are normally thought of as planar sites but where there is actually weak nonplanarity (aniline nitrogens, for example).

As per my comments above, Tinker-OpenMM seems to handle this correctly. At least it gives the right energy for systems that contain both angle and anglep parameters. This means that the actual energy function in OpenMM must be capable of handling both angle parameter types. And Tinker-OpenMM is then OK since it is getting the parameters from our Tinker *.prm files that contain both angle and anglep parameter records. The concern is for the OpenMM XML-based parameter files. I think they (possibly/probably) do not contain or handle the angle vs. anglep parameters.

@peastman
Copy link
Member Author

I think I understand. OpenMM has two different classes, AmoebaAngleForce and AmoebaInPlaneAngleForce, corresponding to these two different functional forms. The Python code looks at the topology of the atoms involved and decides which class to use for each angle:

# if middle atom has covalency of 3 and
# the types of the middle atom and the partner atom (atom bonded to
# middle atom, but not in angle) match types1 and types2, then
# three out-of-plane bend angles are generated. Three in-plane angle
# are also generated. If the conditions are not satisfied, the angle is marked as 'generic' angle (not a in-plane angle)
if (middleAtom > -1 and middleCovalency == 3 and middleAtom not in skipAtoms):

I assume this matches what Tinker used to do. But the new behavior is that there are two different types of angle definitions. Each definition explicitly says whether it's a standard angle or an in-plane angle?

Rather than regenerating our existing force field files, it might be simplest to keep the existing XML tag working the same way it already does. New force fields could use new tags that are specific to the particular type of angle.

@jayponder
Copy link

jayponder commented Feb 16, 2021

Peter, Your description and analysis are correct (and a lot more concise than mine :)

The current OpenMM will work correctly with your current versions of the AMOEBA force field parameter files (for amoebabio13, etc.). Just note that going forward, at some point, you will need to be able to deal with the "angle" & "anglep" parameters. And the Python code you cite above will be broken because we are now starting to use standard "angle" parameters at some sites where the Python code would think an in-plane angle should be used.

Also, don't forget the note in my first comment of today about the stretch-torsion and angle-torsion terms needing to be made available from the Python interface. This is necessary before the McGill folks, or anyone else, can correctly run the current AMOEBA nucleic acid force field. We should test against Tinker to make sure it's correct.

Hope you are doing well... Cheers, Jay

@jchodera
Copy link
Member

@peastman I think you've addressed this with #3046 and #3120, so I'm closing this now. Please reopen if this was closed by mistake!

@peastman
Copy link
Member Author

Not ready to close! There's still one piece of this I haven't implemented yet (porting our more modern optimizations over to AMOEBA).

@peastman peastman reopened this May 31, 2021
@jchodera
Copy link
Member

My apologies! I hadn't realized you still had remaining work in this issue.

In future, you can add a checklist to make it easier for us to track progress, like this:

  • item 1
  • item 2
  • item 3

@peastman
Copy link
Member Author

I modified AMOEBA to use shuffle in place of shared memory, which was the main optimization I wanted to try. It worked, but it ended up being slower. So I'll go ahead and close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants