
inconsistent results for ACEMD 2 and ACEMD 3 #1088

Open
smar966 opened this issue Apr 22, 2024 · 19 comments

@smar966

smar966 commented Apr 22, 2024

Dear support team,

A few months ago I moved from HTMD2 to HTMD3 (finally). However, I was getting surprising and inconsistent results, which made me suspect that something is running differently. To confirm this, I ran the adaptive sampling on the very same system (I used the exact same topology and adaptive script), only changing the input and the implementation. The behavior was radically different in the two calculations.

This makes me think that something in the default settings for running the MDs must have changed.

For the sake of comparing different systems in my project, I need to run the MDs under the exact same conditions, but I would like to use the more recent HTMD3! Could you help me understand what changed, and how I can make HTMD3 run the same way HTMD2 did previously?

NOTE: I prepared the adaptive MD using the same python script (with production_v6). The respective input files for HTMD2 and HTMD3 look very different, and you can find them at the link:
https://drive.google.com/drive/folders/1gxjq9OjqbCE86EI4PvhqCZfXUXzbQC49?usp=drive_link

Thank you in advance.
Sergio

@stefdoerr changed the title from "inconsistent results for HTMD2 and HTMD3" to "inconsistent results for ACEMD 2 and ACEMD 3" on Apr 22, 2024
@stefdoerr
Contributor

You mean ACEMD, not HTMD. I looked at the input files and they seem consistent with each other: same settings in both.
ACEMD3 uses OpenMM as the backend for the calculations, while the previous version used a different codebase. I don't believe there should be any difference between them in production runs, though, other than luck in sampling. The two implementations were tested against each other and were equivalent.

@smar966
Author

smar966 commented Apr 22, 2024

Hi Stefan.
I was also very surprised. But trust me, the results I'm getting are suspiciously different, at least for my systems. I am trying to observe the dissociation of a dimer. With HTMD2 (or ACEMD2), the dimer dissociated after ~1 us, while with HTMD3 it remained stable for the full 10 us of simulation. The exact same system.
I got suspicious after my first trial with HTMD3 on a different (but rather similar) system, which seemed overly stable - but for that one I have no HTMD2 comparison.
Is there any possibility that not everything is running in exactly the same way?

@stefdoerr
Contributor

The ACEMD version should not be the cause of this. Are we talking about a single dissociation event or multiple in different simulations?

Did you use the exact same parameters, or did you re-build your system? Do a diff of the structure.prmtop files to be sure, because the AMBER version may have changed.
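
If you prefer a parameter-level check over a plain text diff, something along these lines would also work (ParmEd is just one option here, and the file paths are placeholders):

# Compare the parameter sets of the two builds beyond a plain text diff
# (illustrative sketch; paths are placeholders).
import parmed

old = parmed.load_file('old_build/structure.prmtop')
new = parmed.load_file('new_build/structure.prmtop')

print('same atom count:     ', len(old.atoms) == len(new.atoms))
print('same bond types:     ', old.bond_types == new.bond_types)
print('same angle types:    ', old.angle_types == new.angle_types)
print('same dihedral types: ', old.dihedral_types == new.dihedral_types)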

Also, are you using the same adaptive sampling method? If you updated HTMD together with ACEMD, there might have been changes in the algorithm compared to the old setup.

@smar966
Author

smar966 commented May 2, 2024

Dear Stefan,

We are talking about the single dissociation of a dimer (which is a bit stochastic, true).

However, I have already compared the behavior with AceMD2 and AceMD3 for 2 systems (different protein variants). The outcome was the same in both cases: the dimers DID dissociate quite early with AceMD2 but NOT at all with AceMD3, even over much longer simulation times.

The facts:

  • I used the exact same topologies (as suggested, I double-checked using diff - no difference).
  • The adaptive.py scripts are slightly different but the adaptive settings should be exactly the same.
  • Only the 'input' files for the production MDs look different for AceMD2 and AceMD3. But they were generated using the same settings in protocol 6 ('from htmd.protocols.production_v6 import Production') under the respective AceMD version (a rough sketch of the generation step is shown below).
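
For reference, the generation step looks roughly like this (placeholder values, not my exact settings); the same script was run under each HTMD/ACEMD version:

# Rough sketch of how the production inputs were generated
# (placeholder run length and directories, not my exact settings).
from htmd.protocols.production_v6 import Production

md = Production()
md.runtime = 50            # placeholder run length
md.timeunits = 'ns'
md.temperature = 310
md.write('./build/dimer/', './generators/dimer/')   # writes the ACEMD 'input' file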

I have prepared a folder online for the two systems and for acemd2 and acemd3, containing: the inputs ('generators' folder), the adaptive.py script, and plots with the respective RMSDs vs. time. You can compare them directly here:
https://drive.google.com/drive/folders/1gxjq9OjqbCE86EI4PvhqCZfXUXzbQC49?usp=sharing

I really don't know what else to compare and how to explain the different behaviors.
The approximate dissociation time is a crucial piece of information in this study, and I cannot proceed with AceMD3 without making sure that the differences found are due only to the systems and not to the methods.

Thank you so much in advance for your help.

@giadefa
Contributor

giadefa commented May 2, 2024 via email

@smar966
Author

smar966 commented May 3, 2024

Hi Gianni.
We are using a quite recent implementation, installed last October.
I understand what you mean. The only problem is that this is a long project, and most of my data were generated with the old version. However, it has been deprecated and can no longer run on the newer GPUs, so we are becoming more and more limited. This is why I'm trying to move to the newer version, but I'm struggling with these issues...

@giadefa
Contributor

giadefa commented May 6, 2024 via email

@smar966
Author

smar966 commented May 6, 2024

Oh, really?! Would it be different enough to produce incorrect MD simulations? This is important, because most of my colleagues are currently running their adaptive simulations with 'acemd3'.

@giadefa
Contributor

giadefa commented May 6, 2024 via email

@stefdoerr
Contributor

stefdoerr commented May 6, 2024

I will insist, though, that the difference is not in ACEMD. The force-field energies and forces have been compared between ACEMD2/3 and they were consistent, and as far as I know no major bugs were fixed; only new features were added.
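
If you want to double-check on your own system, a quick single-point energy evaluation directly in OpenMM (the backend of ACEMD3) is one way to do it. The sketch below uses placeholder file names and generic settings, not your actual inputs:

# Single-point potential energy straight from OpenMM (placeholder files and settings);
# comparing this value between the two builds is a quick force-field sanity check.
import openmm
import openmm.app as app
import openmm.unit as unit

prmtop = app.AmberPrmtopFile('structure.prmtop')
inpcrd = app.AmberInpcrdFile('structure.inpcrd')   # placeholder coordinate file

system = prmtop.createSystem(nonbondedMethod=app.PME,
                             nonbondedCutoff=0.9 * unit.nanometer,
                             constraints=app.HBonds)
integrator = openmm.LangevinMiddleIntegrator(310 * unit.kelvin,
                                             1.0 / unit.picosecond,
                                             0.004 * unit.picoseconds)
sim = app.Simulation(prmtop.topology, system, integrator)
sim.context.setPositions(inpcrd.positions)
if inpcrd.boxVectors is not None:
    sim.context.setPeriodicBoxVectors(*inpcrd.boxVectors)
print(sim.context.getState(getEnergy=True).getPotentialEnergy())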

What is more likely to have changed is the adaptive sampling algorithm. If you could send us the code you used, and also the HTMD versions from the conda environments, it would be helpful.

To get the HTMD versions in the two conda environments do

conda activate env1
conda list htmd
conda list acemd
conda list acemd3

And the same for env2. Then I can check whether anything significant has changed in the Adaptive classes.

@smar966
Author

smar966 commented May 9, 2024

Hi Stefan,
Thanks for your insights. If the adaptive protocol has changed, it could explain the differences I'm observing.

Unfortunately, because our group uses a grid environment, I am not using the regular conda installation but a Singularity installation instead. Therefore, the commands 'conda activate env1' and 'conda activate env2' do not work for me. The other 'conda list xx' commands you requested produce:

conda list htmd
# Name       Version  Build            Channel
htmd         2.3.2    py310_0          acellera
htmd-deps    2.3.2    py310_0          acellera

conda list acemd
# Name       Version  Build            Channel
acemd        3.7.2    cuda1122py310_0  acellera
acemd3       3.7.2    0                acellera

conda list acemd3
# Name       Version  Build            Channel
acemd3       3.7.2    0                acellera

If you want to test my singularity installation, I have uploaded it to the following link:
https://filesender.cesnet.cz/?s=download&token=5a346f1e-837b-4ec9-a3fd-ffc8151b87c6

To run the adaptive script I'm using the following commands:
HTMD3=$LOCATION/htmd_2023_09.sif
singularity exec -H `pwd` $HTMD3 python adaptive_meta.py > adaptive.log

And to run each MD, the command:
singularity exec --nv -H `pwd` $HTMD3 acemd --ncpus 1 input >log.txt

Meanwhile, I am rerunning the adaptive sampling of one of the dimers I mentioned above, using the 'acemd' command to run the MDs instead of 'acemd3'. But for the moment, I do not see any difference (no dissociation has happened yet).

@stefdoerr
Contributor

Can you run the same commands in the old container, so that I can compare the versions?

@smar966
Author

smar966 commented May 9, 2024

Sure. For the old container, the results are:

conda list htmd
# Name          Version          Build           Channel
htmd            1.13.10          py36_0          acellera
htmd-data       0.1.hash23fb208  py_0            acellera
htmd-deps       1.13.10          py36_0          acellera
htmd-pdb2pqr    2.1.1+htmd.3     pyh13f2e89_0    acellera

conda list acemd
# Name          Version          Build           Channel
acemd           2019.01.24       0               acellera
acemd-examples  2016.5.12        1               acellera

conda list acemd3
(empty output)

@stefdoerr
Contributor

Can you also send me the adaptive_meta.py, so I can see what settings and algorithm you are using? Thanks.

@smar966
Author

smar966 commented May 10, 2024

Sure. Please look into the two respective folders for acemd2 and acemd3 here:
https://drive.google.com/drive/folders/1nsYxrwVTDWJVSWmaD0H1Cx8FkyMUaMiq

@stefdoerr
Contributor

I did a diff of the two codes. The only things that changed in the Adaptive code were a fix for NPT simulations (maintaining their box size), which I assume does not apply since your production runs are NVT, and the switch of the MSM code from pyemma to deeptime.

There was no dramatic change in the algorithm as far as I can tell, unless the deeptime switch somehow produced Markov models different enough to affect the results; but I have not had that experience when moving between the two libraries, at least for MSM analysis.
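
Just to illustrate what that switch amounts to on the analysis side, here is a generic sketch of the same MSM estimation in both libraries (made-up discretized trajectories, not the actual code in the Adaptive classes):

# Generic comparison of the two MSM backends on the same discretized trajectories
# (made-up cluster labels; not the exact code used inside Adaptive).
import numpy as np

rng = np.random.default_rng(0)
dtrajs = [rng.integers(0, 10, size=2000) for _ in range(5)]   # fake cluster labels
lag = 25

# old backend (pyemma)
# import pyemma
# msm_old = pyemma.msm.estimate_markov_model(dtrajs, lag=lag)

# new backend (deeptime)
from deeptime.markov import TransitionCountEstimator
from deeptime.markov.msm import MaximumLikelihoodMSM

counts = TransitionCountEstimator(lagtime=lag, count_mode='sliding').fit(dtrajs).fetch_model()
msm_new = MaximumLikelihoodMSM(reversible=True).fit(counts).fetch_model()
print(msm_new.transition_matrix.shape)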

> Meanwhile, I am rerunning the adaptive sampling of one of the dimers I mentioned above, using the 'acemd' command to run the MDs instead of 'acemd3'. But for the moment, I do not see any difference (no dissociation has happened yet).

Do you mean you are trying the latest acemd version, or that you are rerunning the experiment with the old container? I would be interested to know whether you can replicate the old results with the old container. Alternatively, send us the old inputs and the old container and we can run it and see if we get the dissociation.

@smar966
Author

smar966 commented May 14, 2024

> I did a diff of the two codes. The only things that changed in the Adaptive code were a fix for NPT simulations (maintaining their box size), which I assume does not apply since your production runs are NVT, and the switch of the MSM code from pyemma to deeptime.

Actually, I am (or at least I should be) running NPT simulations. Could those changes have an impact on the ensemble I obtain? Was it buggy before? Among the many differences in the 'input' files, I can see that 'exclude scaled1-4' disappeared in the newer version, and the same goes for the several 'langevin' parameters that specify and regulate the Langevin thermostat. Could this be the issue? You can compare the input files in the online folders I mentioned in my previous comments.

> Meanwhile, I am rerunning the adaptive sampling of one of the dimers I mentioned above, using the 'acemd' command to run the MDs instead of 'acemd3'. But for the moment, I do not see any difference (no dissociation has happened yet).

> Do you mean you are trying the latest acemd version, or that you are rerunning the experiment with the old container? I would be interested to know whether you can replicate the old results with the old container. Alternatively, send us the old inputs and the old container and we can run it and see if we get the dissociation.

No, I meant that I tried to run the MDs with the new code using the 'acemd' command instead of 'acemd3'.
But I am also trying to replicate the old results with the old container, as you suggest. Unfortunately, I cannot share the old implementation with you, since we have it installed as a "module", not as a Singularity image like the new one.

@stefdoerr
Contributor

No, both simulations are NVT, since there is no barostat in either input file; the barostat is off by default. Dropping 'exclude scaled1-4' is fine; it is simply not needed in the new version. The langevin parameters are just renamed in the new input file, from langevinxxx to thermostatxxx, and they have the same values as in the old file:

# old
langevin                        on
langevindamping                 0.1
langevintemp                    310
# new
thermostat                      on
thermostatdamping               0.1
thermostattemperature           310

@smar966
Author

smar966 commented May 16, 2024

Ohhh, I thought the barostat was turned on. My bad. Then the MD settings are not likely the problem.
If the only thing that changed in the code was the MSM method in the adaptive sampling, I don't see how it could affect my results so much.
