
Forcefield molecular dynamics and forcefield refactor #722

Merged
64 commits merged from the mlff_md branch into materialsproject:main on Apr 3, 2024

Conversation

esoteric-ephemera
Contributor

@esoteric-ephemera esoteric-ephemera commented Feb 16, 2024

Completed:

  • Add a forcefield maker to perform molecular dynamics with forcefields in ASE, atomate2.forcefields.md.ForceFieldMDMaker
    • The defaults for this class (starting and ending temperature, pressure, Langevin damping, ensemble, thermostat) are matched as closely as possible to the VASP MD maker
    • Units for parameters are the same between both implementations; see the docstring
    • There's some duplication of the code in atomate2.forcefields.utils.Relaxer, but I don't see a better way to merge those approaches for trajectory observation without a messier implementation
  • Specific subclasses of ForceFieldMDMaker for CHGNet, M3GNet, and MACE
  • Update the matgl dependency to 1.0.0 (needed to use the ASE calculator wrapper for M3GNet)
  • MD trajectory stored in GridFS. Options to write it to disk either as pymatgen.core.trajectory.Trajectory or ase.io.Trajectory objects
  • Refactor forcefield relax and static jobs to use a common atomate2.forcefields.utils.ase_calculator object (see the sketch below this list). One can either:
    • load a predefined ASE calculator using one of the atomate2.forcefields.MLFF members,
    • or specify a dict with "@module" and "@callable" keys to load a calculator via MontyDecoder
  • Add a force convergence check in atomate2.forcefields.utils.Relaxer, which is later passed to ForceFieldTaskDocument to resolve #753 (Convergence check for relaxation with forcefields).
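A minimal usage sketch of the pieces above; the temperature keyword is taken from the to-do item below, and other keyword and enum member names (e.g. force_field_name, MLFF.CHGNet) are illustrative assumptions rather than the exact merged signature:

from pymatgen.core import Structure
from atomate2.forcefields import MLFF
from atomate2.forcefields.md import ForceFieldMDMaker

structure = Structure.from_file("POSCAR")  # any pymatgen Structure (placeholder file)

# MD with a predefined ASE calculator selected via an MLFF member
# (force_field_name is an assumed keyword name, used here for illustration)
md_job = ForceFieldMDMaker(force_field_name=MLFF.CHGNet, temperature=300).make(structure)

# A calculator can also be specified as a dict decoded by MontyDecoder,
# e.g. pointing at chgnet's ASE calculator class:
calc_spec = {"@module": "chgnet.model.dynamics", "@callable": "CHGNetCalculator"}

The resulting job can then be run with jobflow, e.g. jobflow.run_locally(md_job).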

To Do's:

  • Decide on the structure of the kwargs entering MDMaker vs. its make function. For example, it may be easier to set the temperature and pressure via MDMaker().make than via MDMaker(temperature=...); see the sketch below.
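For concreteness, the two calling conventions being weighed (the per-call temperature keyword is the proposal, not current behavior):

# configure at construction (current pattern)
md_job = ForceFieldMDMaker(temperature=300).make(structure)

# configure per call (proposed alternative; the keyword on make() is hypothetical)
md_job = ForceFieldMDMaker().make(structure, temperature=300)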


codecov bot commented Feb 16, 2024

Codecov Report

Attention: Patch coverage is 89.42598%, with 35 lines in your changes missing coverage. Please review.

Project coverage is 76.81%. Comparing base (9f64381) to head (4a88d7b).
Report is 12 commits behind head on main.

❗ Current head 4a88d7b differs from pull request most recent head 78ad999. Consider uploading reports for the commit 78ad999 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #722      +/-   ##
==========================================
+ Coverage   76.64%   76.81%   +0.17%     
==========================================
  Files         114      115       +1     
  Lines        8506     8723     +217     
  Branches     1275     1344      +69     
==========================================
+ Hits         6519     6701     +182     
- Misses       1600     1620      +20     
- Partials      387      402      +15     
Files                                      Coverage Δ
src/atomate2/common/jobs/utils.py          45.45% <ø> (-33.34%) ⬇️
src/atomate2/common/schemas/phonons.py     94.07% <ø> (ø)
src/atomate2/forcefields/__init__.py      100.00% <100.00%> (ø)
src/atomate2/forcefields/schemas.py        96.42% <97.22%> (+4.53%) ⬆️
src/atomate2/forcefields/jobs.py           92.80% <86.20%> (-1.62%) ⬇️
src/atomate2/forcefields/utils.py          90.50% <94.28%> (+6.16%) ⬆️
src/atomate2/forcefields/md.py             85.00% <85.00%> (ø)

@JaGeo
Member

JaGeo commented Feb 16, 2024

@esoteric-ephemera great that you started this. One thing that will likely be really brittle is the trajectories. We need to find a solution for this in the task docs before merging this PR. This is already a problem for normal optimizations (okay, our structures are large, but still).
We had some discussions on this already in #515

@utf
Member

utf commented Feb 16, 2024

Firstly, thanks very much for this @esoteric-ephemera. It looks fantastic.

As I just commented in #515, we have the ability to store trajectories in the datastore for VASP workflows (see materialsproject/emmet#886 for implementation details). I think we should add a similar option for the force field workflow and probably use the datastore by default since ML MD runs are likely to be much longer than AIMD.

@esoteric-ephemera
Contributor Author

Thanks @JaGeo and @utf! I tried to mimic the store_trajectory functionality: by default, only energies, forces, stresses, and magmoms are stored in the TaskDoc ionic steps for FF MD. I can try to add that as a tag to the ForcefieldTaskDocument for consistency.

On this point: for FF MD, one only needs the structure to reconstruct those other four quantities, so maybe storing just the structure for FF MD makes more sense (probably less data-efficient)
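A rough sketch of the recomputation idea, assuming CHGNet's ASE calculator as the example forcefield; the trajectory and frame here are placeholders, not objects from this PR:

from chgnet.model.dynamics import CHGNetCalculator
from pymatgen.io.ase import AseAtomsAdaptor

structure = trajectory[-1]  # one frame of a stored pymatgen Trajectory (placeholder)
atoms = AseAtomsAdaptor.get_atoms(structure)
atoms.calc = CHGNetCalculator()
energy = atoms.get_potential_energy()   # recomputed on demand from the structure
forces = atoms.get_forces()
stress = atoms.get_stress()
magmoms = atoms.get_magnetic_moments()  # if the calculator exposes magmoms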

@chiang-yuan
Contributor

chiang-yuan commented Feb 22, 2024

One thing worth mentioning: when I trained MACE MP on MPtrj, the original file compiled by Bowen in JSON format was around 9 times bigger than extxyz, and even after compression the size difference is still three-fold. The pymatgen trajectory has too many repetitive keys and brackets. The ASE trajectory is written in binary, so it should be even more efficient than extxyz. We may want to have a separate MDTaskDoc supporting the ASE Trajectory binary format.
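A quick way to compare the formats for an existing trajectory, with hypothetical file names; ASE reads extxyz and writes its binary .traj format directly:

from ase.io import read, write

frames = read("md_run.extxyz", index=":")  # list of Atoms frames (placeholder file)
write("md_run.traj", frames)               # ASE binary trajectory, typically far smaller than JSON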

@chiang-yuan
Contributor

@esoteric-ephemera temperature and pressure schedule tests are working now. Not sure if we need to test all the calculators; it seems redundant to me if we only want to make sure the schedule feature is working.

Comment on lines 194 to 197
assert all(
    output["from_str"].output.__getattribute__(attr)
    == output["from_dyn"].output.__getattribute__(attr)
    for attr in ("energy", "forces", "stress", "structure")
)
Member


this looks like a float comparison (except for structure)? if so, should this use pytest.approx (and getattr)?

Contributor Author


Changed this to use approx when appropriate. The intent was that two runs should be identical when run either explicitly specifying the dynamics object, or implicitly via str, but that is maybe too optimistic

I tried using getattr, but the forcefield task doc has no defined getattr method, hence the dunder

I can write these out explicitly if that's preferable to the dunder, or try to add a __getattr__ method.

I see what you mean, replaced the __getattribute__ calls with getattr
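For context, roughly what the updated comparison looks like with getattr and pytest.approx; attribute names are taken from the snippet above, and the exact test code in the PR may differ:

from numpy import asarray
from pytest import approx

doc_a, doc_b = output["from_str"].output, output["from_dyn"].output
assert doc_a.energy == approx(doc_b.energy)
for attr in ("forces", "stress"):
    # cast to numpy arrays so approx compares array-valued fields elementwise
    assert asarray(getattr(doc_a, attr)) == approx(asarray(getattr(doc_b, attr)))
assert doc_a.structure == doc_b.structure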

Member

@janosh janosh left a comment


code-wise this looks great! thanks again @esoteric-ephemera 👍
only remaining concern is test time. tests on the mlff_md branch take 17 min vs 12 on the main branch. can we reduce the number of steps or system size in tests?

@esoteric-ephemera
Contributor Author

esoteric-ephemera commented Apr 1, 2024

@janosh:

only remaining concern is test time. tests on the mlff_md branch take 17 min vs 12 on the main branch. can we reduce the number of steps or system size in tests?

Tried to reduce the time for this by reducing the number of steps in the temperature / pressure MD tests, as well as only testing CHGNet (MACE is 3x slower for these tests)

@chiang-yuan: the temperature and pressure scheduling didn't seem right to me. When I ran the tests, the temperature schedule gave me [300, 3000, 3000,...,3000], when it should've been a "smooth" (linear) ramp from 300 to 3000 K, no?

I've modified the logic for the scheduling in a new function ForceFieldMDMaker._interpolate_quantity, can you take a look and let me know if this is what you had in mind?
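For reference, the intended behavior, in the spirit of the new _interpolate_quantity helper: a linear ramp built with numpy.interp (a sketch only; the actual implementation may differ):

import numpy as np

n_steps = 10
schedule = [300, 3000]  # temperature schedule endpoints in K

temps = np.interp(
    np.arange(n_steps + 1),                  # one value per MD step
    np.linspace(0, n_steps, len(schedule)),  # where the schedule points sit along the run
    schedule,
)
# temps ramps linearly from 300 K to 3000 K instead of jumping to 3000 K after one step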

@chiang-yuan
Contributor

Thanks @esoteric-ephemera!! Sorry, I confused the behavior of numpy interp and scipy interp1d. Thanks for the nice catch! I slightly changed the syntax for the schedule interpolation to make it (at least to me) more readable.

@esoteric-ephemera
Contributor Author

esoteric-ephemera commented Apr 2, 2024

Thanks @esoteric-ephemera!! Sorry, I confused the behavior of numpy interp and scipy interp1d. Thanks for the nice catch! I slightly changed the syntax for the schedule interpolation to make it (at least to me) more readable.

Perfect thanks! This looks much cleaner than my edit

@utf
Member

utf commented Apr 3, 2024

Hi @esoteric-ephemera. Thanks again for all the edits. Is this good to go now?

I'll go through and see if there are any incompatibilities. QuantumChemist seems to have found a missing dir_name which they patched in #791, so there may be other missing attrs

Did you have a chance to do this?

@esoteric-ephemera
Contributor Author

esoteric-ephemera commented Apr 3, 2024

Hi @esoteric-ephemera. Thanks again for all the edits. Is this good to go now?

I'll go through and see if there are any incompatibilities. QuantumChemist seems to have found a missing dir_name which they patched in #791, so there may be other missing attrs

Did you have a chance to do this?

Confirmed that the following works without issue:

from emmet.core.tasks import TaskDoc
# ForceFieldTaskDocument lives in atomate2.forcefields.schemas
from atomate2.forcefields.schemas import ForceFieldTaskDocument

forcefield_taskdoc: ForceFieldTaskDocument = ...  # output of some forcefield job
vasp_style_taskdoc = TaskDoc(**forcefield_taskdoc.model_dump())

There are some optional fields that aren't populated (calcs_reversed, dir_name, etc.) but those aren't so important given that MD trajectories are being stored in GridFS.

For relax / static jobs, is there interest in having a calcs_reversed field on the forcefield task doc?

@utf
Member

utf commented Apr 3, 2024

Thanks for checking - calcs_reversed is specific to the calculator used and won't be part of the universal task document (when we get to addressing #741). As long as the other fields match we should be good to go.

@esoteric-ephemera
Contributor Author

Then yes, I think we're good to go!

Member

@janosh janosh left a comment


i'll go ahead and merge since everyone seems happy with this.

thanks so much @esoteric-ephemera, very excited to have MLFF MD makers in atomate2! 🎉

@janosh janosh merged commit 2687443 into materialsproject:main Apr 3, 2024
6 checks passed
@utf
Member

utf commented Apr 3, 2024

Fantastic, thanks everyone who contributed to this PR and especially @esoteric-ephemera and @chiang-yuan!

@JaGeo
Member

JaGeo commented Apr 3, 2024

Great!

Short question to the ML potential crowd: wouldn't a LAMMPS interface be even more beneficial, especially for MD? I am aware of the atomate2-lammps add-on, but it currently does not look ready to use (and is even archived: https://github.com/Matgenix/atomate2-lammps).

@janosh
Member

janosh commented Apr 3, 2024

never used LAMMPS myself; I think OpenMM might be the better fit for GPU-accelerated ML potentials, based on what I heard from @orionarcher

@orionarcher

I've found OpenMM to be MUCH easier to use than LAMMPS and have seen it run much faster on GPUs. I don't know how many popular ML potentials have been implemented, but OpenMM has a plugin for integration with PyTorch.

Caveat: I have only used OpenMM for classical MD, not ML potentials
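For anyone curious, a minimal sketch of the PyTorch plugin mentioned above (the openmm-torch package); "model.pt" is a placeholder TorchScript model, not anything from this PR:

import openmm
from openmmtorch import TorchForce

system = openmm.System()           # in practice built from a Topology / force field
ml_force = TorchForce("model.pt")  # TorchScript module that returns the potential energy
system.addForce(ml_force)          # the ML potential then participates in the MD integration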

@JaGeo
Member

JaGeo commented Apr 4, 2024

@orionarcher thanks! From the ML community side, LAMMPS is the interface I always hear requested. Just wanted to see if there are plans or opinions.

(Cc @ml-evs?)

Labels: feature (a new feature being added), forcefields (forcefield related), md (molecular dynamics)
Projects: none yet
Successfully merging this pull request may close: Convergence check for relaxation with forcefields (#753)
9 participants