
Add support for AMBER as a free-energy perturbation engine #272

Merged
merged 42 commits into from Apr 12, 2024

Conversation

lohedges
Contributor

This PR adds support for using AMBER as an FEP engine, using either pmemd or pmemd.cuda. This ports the code from the Exscientia sandpit into the core, making sure that it is as interoperable as possible. Thankfully this wasn't too painful since I had already done the work of writing all of the AMBER output parsing in the sandpit, so could copy most of it straight across.

The major changes are:

  • Added support for using a reference system for position restraints to all engines. (Previously, the system that is passed to the process was used as the reference.)
  • Added a custom find_exe function to the AMBER process to aid finding a supported AMBER executable for the requested simulation protocol. This can also handle different variants of the pmemd executables via globbing.
  • Improved AMBER configuration options for pmemd and pmemd.cuda. There are now tests that I can run locally in order to validate that single-point energies agree for sander, pmemd, and pmemd.cuda. (More on this later.)
  • Improved support for passing additional kwargs through to the Process objects when setting up FEP simulations.
  • Added a hidden somd1_compatibility option to AMBER and GROMACS for FEP comparisons.
  • Added a test of AMBER FEP analysis to the test suite. (The code for the analysis was already written; I just needed to add some data to test that it works reproducibly.)
  • Updated the hydration free-energy tutorial to also use AMBER as an example backend.
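To illustrate the executable lookup mentioned above, here is a minimal sketch of what a find_exe helper could look like. This is not the actual BioSimSpace implementation; the function name is taken from the description above, and the glob-pattern approach is an assumption based on the note about handling pmemd variants (e.g. pmemd.cuda_SPFP):

```python
import glob
import os


def find_exe(bin_dir, patterns):
    """Return the first supported AMBER executable found in bin_dir.

    Hypothetical sketch: each entry in 'patterns' is a glob pattern
    (e.g. "pmemd.cuda*"), so precision variants such as pmemd.cuda_SPFP
    or pmemd.cuda_DPFP are matched too. Patterns are tried in order of
    preference.
    """
    for pattern in patterns:
        for match in sorted(glob.glob(os.path.join(bin_dir, pattern))):
            # Only accept regular files that are executable.
            if os.path.isfile(match) and os.access(match, os.X_OK):
                return match
    raise FileNotFoundError(
        f"No supported AMBER executable found in '{bin_dir}' matching {patterns}"
    )
```

A caller would pass patterns ordered by preference for the requested protocol, e.g. `["pmemd.cuda*", "pmemd", "sander"]` for a free-energy protocol on a GPU node.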

The main challenge with the PR was getting things to work reliably with pmemd and pmemd.cuda, in particular for vacuum simulations. One thing that I hadn't realised was that the Exscientia sandpit code is actually overloaded by private internal functionality that I don't have access to, hence some configuration options (on which I was basing my port) are not what is used in practice. The code also doesn't work as is, i.e. in the way in which a regular BioSimSpace user would interface with it (how the other engines work).

With regards to differences between pmemd and pmemd.cuda, the main pain is that pmemd.cuda doesn't support implicit solvent for vacuum simulations so you need to make sure to add a simulation box and run with a cutoff instead. (pmemd can run with no box and an "infinite" cutoff.) From my single-point energy tests I found that I can get essentially perfect agreement in energy between pmemd and pmemd.cuda, except for vacuum FEP simulations. For my test system I find that the energy components are:

pmemd:

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1       3.0582E+00     1.1813E+01     2.4354E+01     C           1

 BOND    =        0.0328  ANGLE   =        2.5203  DIHED      =        0.1513
 VDWAALS =        0.0000  EEL     =        0.0000  HBOND      =        0.0000
 1-4 VDW =        0.0426  1-4 EEL =        0.3112  RESTRAINT  =        0.0000
 DV/DL  =        12.8088

pmemd.cuda:

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1       3.7337E+00     1.1813E+01     2.4354E+01     C           1

 BOND    =        0.0328  ANGLE   =        2.5203  DIHED      =        0.1513
 VDWAALS =       -0.0010  EEL     =       -0.0000  HBOND      =        0.0000
 1-4 VDW =        0.1185  1-4 EEL =        0.9118  RESTRAINT  =        0.0000
 DV/DL  =        12.1272

From this it is clear that the only difference is in non-bonded 1-4 VDW and 1-4 EEL terms.
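As a quick sanity check (all values copied from the two tables above), the bonded terms are identical and the differences in the non-bonded components sum to the difference in total energy, so the discrepancy is wholly accounted for by the non-bonded terms:

```python
# Non-bonded components from the two energy tables above (kcal/mol).
pmemd = {"VDWAALS": 0.0000, "EEL": 0.0000,
         "1-4 VDW": 0.0426, "1-4 EEL": 0.3112}
cuda = {"VDWAALS": -0.0010, "EEL": -0.0000,
        "1-4 VDW": 0.1185, "1-4 EEL": 0.9118}

# Sum of per-component differences versus difference in total ENERGY.
component_diff = sum(cuda[k] - pmemd[k] for k in pmemd)
total_diff = 3.7337e00 - 3.0582e00  # ENERGY row of each table

print(round(component_diff, 4), round(total_diff, 4))  # 0.6755 0.6755
```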

At first I thought that this difference was because of performing one simulation using a periodic box with a cutoff, and the other with no box and no cutoff. However, I then checked single-point energies for an ethane-->methanol merged molecule using two approaches:

  1. Perform a regular single-point calculation using the extracted lambda=0 state via BioSimSpace.Protocol.Minimisation(steps=1). For non-FEP simulations, both pmemd and pmemd.cuda support use of implicit solvent with no box. Here the results are in agreement:

pmemd:

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1       5.0487E+00     1.3658E+01     4.3557E+01     C1          1

 BOND    =        0.3937  ANGLE   =        3.4642  DIHED      =        0.2291
 VDWAALS =        0.0000  EEL     =        0.0000  HBOND      =        0.0000
 1-4 VDW =        0.0521  1-4 EEL =        0.9096  RESTRAINT  =        0.0000

pmemd.cuda:

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1       5.0487E+00     1.3658E+01     4.3557E+01     C1          1

 BOND    =        0.3937  ANGLE   =        3.4642  DIHED      =        0.2291
 VDWAALS =        0.0000  EEL     =        0.0000  EGB        =        0.0000
 1-4 VDW =        0.0521  1-4 EEL =        0.9096  RESTRAINT  =        0.0000
  2. Do the same using the AMBER FEP code, via BioSimSpace.Protocol.FreeEnergyMinimisation(steps=1). Here the results differ (as we saw earlier), but it is clear that despite pmemd.cuda having the different setup (now using a box with a cutoff) it's actually the pmemd result that is the outlier, i.e. pmemd.cuda does agree with the non-FEP result.

pmemd:

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1       2.5251E+00     1.8820E+01     4.3557E+01     C1          1

 BOND    =        0.3937  ANGLE   =        1.7375  DIHED      =        0.0514
 VDWAALS =        0.0000  EEL     =        0.0000  HBOND      =        0.0000
 1-4 VDW =        0.0338  1-4 EEL =        0.3088  RESTRAINT  =        0.0000
 DV/DL  =        10.4548

pmemd.cuda:

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1       3.1434E+00     1.8820E+01     4.3557E+01     C1          1

 BOND    =        0.3937  ANGLE   =        1.7375  DIHED      =        0.0514
 VDWAALS =       -0.0010  EEL     =        0.0001  HBOND      =        0.0000
 1-4 VDW =        0.0521  1-4 EEL =        0.9096  RESTRAINT  =        0.0000
 DV/DL  =         9.8312

Clearly something in the non-bonded calculation differs between pmemd and pmemd.cuda, but only for vacuum FEP simulations. Not being an expert here, I'm wondering if there is some additional configuration option that I need to set. (I've looked around, but can't see anything.) I believe that pmemd was used for previous hydration free-energy calculations, e.g. some of @jmichel80's, so it would be good to know whether this was an issue then. It could be that the code is broken, in which case we should probably only use pmemd.cuda for FEP in vacuum when it's available. (I have also checked that this discrepancy gives a difference in dG when performing a vacuum hydration free-energy leg, i.e. it doesn't somehow cancel out.) Let me know what you think.

Checklist:

  • I confirm that I have merged the latest version of devel into this branch before issuing this pull request (e.g. by running git pull origin devel): [y]
  • I confirm that I have added a test for any new functionality in this pull request: [y]
  • I confirm that I have added documentation (e.g. a new tutorial page or detailed guide) for any new functionality in this pull request: [y]
  • I confirm that I have permission to release this code under the GPL3 license: [y]

Suggested reviewers:

@chryswoods

@lohedges
Contributor Author

It appears that all OpenMM tests are failing on Windows. Will debug tomorrow.

@lohedges
Contributor Author

Okay, it seems that I can get the correct electrostatics for pmemd by using gti_add_sc=1. According to the manual, this should be the default, but clearly that isn't the case for pmemd. (Wouldn't be the first thing that was wrong.) I can get the correct angle and dihedral energies using gti_bat_sc=1.
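For reference, a minimal sketch of an mdin &cntrl block for a vacuum FEP single point with these settings. The mask and lambda values are placeholders for illustration only; the two gti_* lines are the point:

```
 &cntrl
   imin=1, maxcyc=1,            ! single-point minimisation
   icfe=1, ifsc=1,              ! free energy with softcore
   clambda=0.0,
   timask1=':1', timask2=':2',  ! placeholder perturbed-region masks
   scmask1=':1', scmask2=':2',
   gti_add_sc=1,                ! documented default, but must be set explicitly
   gti_bat_sc=1,                ! needed for correct angle/dihedral energies
 /
```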

No idea about the Windows failure. It passes on all other platforms. All I can think is that os.path.splitext isn't doing what I think, although it's used elsewhere in code that's also tested on Windows 🤷‍♂️

@lohedges
Contributor Author

lohedges commented Apr 10, 2024

Okay, the Windows error with the OpenMM tests is:

---------------------------- Captured stdout call -----------------------------
Starting %PREFIX%\Library\bin\sire_python.exe: number of threads equals 4
None
  File "C:\Users\RUNNER~1\AppData\Local\Temp\tmp0j8ahz0t/test_script.py", line 21
    prm = parmed.load_file('C:\Users\RUNNER~1\AppData\Local\Temp\tmp0j8ahz0t/test.prm7', 'C:\Users\RUNNER~1\AppData\Local\Temp\tmp0j8ahz0t/test.rst7')
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

The tests in the Exscientia sandpit pass since the code doesn't use the full path to the file, so doesn't run into the unicode issue. I made some modifications to handle the new restraint files, so I'll need to check the logic for the other engines too and avoid using the absolute path.
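The failure can be reproduced without AMBER at all: in a normal (non-raw) Python string literal, `\U` begins a `\UXXXXXXXX` unicode escape, so a Windows path embedded verbatim in a generated script fails to even compile. A minimal illustration (the path here mimics the one in the traceback):

```python
# A Windows path pasted into a generated script as a plain string
# literal: "\U" in 'C:\Users' is parsed as a truncated unicode escape.
bad = r"path = 'C:\Users\RUNNER~1\test.prm7'"
try:
    compile(bad, "<generated>", "exec")
except SyntaxError as exc:
    print("SyntaxError:", exc.msg)

# Emitting the path with repr() escapes the backslashes, so the
# generated script compiles cleanly.
good = "path = " + repr("C:\\Users\\RUNNER~1\\test.prm7")
compile(good, "<generated>", "exec")
```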

Contributor

@chryswoods left a comment


The changes are good, and can be approved.

However, I think the root cause of your unicode issue is that you are constructing file paths manually. I used to do this as well, but it ended up really biting me on Windows because my file paths ended up with a mixture of forward and backward slashes (as your paths do in the error). This really confuses Windows, and led to lots of random behaviour, especially when sire was launched from different shells (i.e. cmd versus PowerShell versus whatever JupyterLab provided).

The (annoying) solution is to replace all manual path construction with calls to os.path.join. For example, see src/sire/_load.py (or search for this in other files).

@lohedges
Copy link
Contributor Author

Yes, I've used os.path.join elsewhere where other issues were found. It's been on my TODO list to go through and update everywhere for consistency. I'll see how much more needs changing.

@lohedges
Copy link
Contributor Author

lohedges commented Apr 11, 2024

In reality, it would probably be best to use pathlib throughout, but it didn't exist when we started.
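For illustration, pathlib composes paths with the correct separator for the host platform, which sidesteps the mixed-slash problem entirely (the file names below are made up):

```python
from pathlib import Path

# Path objects are joined with "/" and render with the platform's
# native separator, so no hand-built strings mixing "/" and "\".
work_dir = Path("tmp") / "process"
prm7 = work_dir / "test.prm7"

print(prm7.name)    # test.prm7
print(prm7.suffix)  # .prm7
```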

@lohedges lohedges merged commit fdff786 into devel Apr 12, 2024
5 checks passed
@lohedges lohedges deleted the feature_amber_fep branch April 12, 2024 12:08
Labels: enhancement (New feature or request)