
Add forcefield schemas/makers to atomate2 #322

Merged
merged 35 commits into from
May 23, 2023

Conversation

matthewkuner
Collaborator

@matthewkuner matthewkuner commented May 3, 2023

Summary

  • Add the ability to use forcefields (e.g. CHGNet) to relax structures in Atomate2.

Additional dependencies introduced

  • chgnet, which depends on pytorch (relatively large)

TODO

  • Merge with main

Comments

Based on what Shyue Ping told me here, I think we should wait until M3GNet is fully moved to the matgl repository, because that will be the location for future development of M3GNet. This also means users will only need pytorch for both CHGNet and M3GNet (as opposed to needing TensorFlow too, as TensorFlow powers the older M3GNet implementation).

EDIT: m3gnet support was later added to Atomate2 in #380

@matthewkuner
Collaborator Author

matthewkuner commented May 3, 2023

I tested this as 1) a standalone job, and 2) as a two-part flow with the CHGNet prerelax job feeding into a VASP job. Both worked.

@matthewkuner
Collaborator Author

People may want to use CHGNet makers as an input in the VASP flows that currently require inputted variables to be of type BaseVaspMaker. Would it be reasonable to change this to allow CHGNetRelaxMakers as well? (or perhaps some base FFRelaxMaker class that I could make)

@codecov

codecov bot commented May 4, 2023

Codecov Report

Merging #322 (872c780) into main (18582f4) will increase coverage by 0.40%.
The diff coverage is 95.78%.

❗ Current head 872c780 differs from pull request most recent head f6a51a9. Consider uploading reports for the commit f6a51a9 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #322      +/-   ##
==========================================
+ Coverage   65.57%   65.97%   +0.40%     
==========================================
  Files          72       74       +2     
  Lines        7018     7113      +95     
  Branches      896      904       +8     
==========================================
+ Hits         4602     4693      +91     
- Misses       2150     2152       +2     
- Partials      266      268       +2     
Impacted Files Coverage Δ
src/atomate2/forcefields/jobs.py 93.75% <93.75%> (ø)
src/atomate2/forcefields/schemas.py 96.82% <96.82%> (ø)

@matthewkuner
Collaborator Author

matthewkuner commented May 4, 2023

I did not add the new dependency for chgnet/pytorch--unsure where in the pyproject.toml that should be done / how the version restrictions should work. @utf any help?

Member

@utf utf left a comment

Hi @matthewkuner, thanks very much for this! It is in a great state for an initial PR.

I've added a bunch of comments. They are mainly style changes, but I had some more thoughts about the task document (I've been thinking about this more since we spoke on slack).

I think it would be nice to add:

  1. A CHGNetStaticMaker that will just calculate forces but not do a relaxation. After you've made the changes I suggested this should be very quick to implement by subclassing CHGNetRelaxMaker. Just something like:
@dataclass
class CHGNetStaticMaker(CHGNetRelaxMaker):
    """
    Maker to calculate forces and stresses using the CHGNet force field.

    Parameters
    ----------
    name: str
        The job name.
    relax_kwargs : dict
        Keyword arguments that will get passed to :obj:`StructOptimizer.relax`.
    optimizer_kwargs : dict
        Keyword arguments that will get passed to :obj:`StructOptimizer()`.
    task_document_kwargs : dict
        Keyword arguments that will get passed to :obj:`ForceFieldTaskDocument()`.
    """

    name: str = "CHGNet static"
    relax_kwargs: dict = field(default_factory=dict)
    optimizer_kwargs: dict = field(
        default_factory=lambda: {"relax_cell": False, "steps": 1}
    )
    task_document_kwargs: dict = field(default_factory=dict)
  2. Can you add tests? CHGNet should be fast enough to run in the testing environment. Just use silicon and limit the number of steps to 10.
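
The subclassing pattern suggested for CHGNetStaticMaker can be sketched with plain dataclasses. This is only an illustration: RelaxMakerSketch and StaticMakerSketch are hypothetical stand-ins, not the atomate2 classes, and they show nothing more than how a subclass overrides defaults and why default_factory needs a callable rather than a dict.

```python
from dataclasses import dataclass, field


@dataclass
class RelaxMakerSketch:
    """Hypothetical stand-in for CHGNetRelaxMaker (not the atomate2 class)."""

    name: str = "relax"
    relax_kwargs: dict = field(default_factory=dict)
    optimizer_kwargs: dict = field(default_factory=dict)


@dataclass
class StaticMakerSketch(RelaxMakerSketch):
    """Subclass that only overrides defaults, as suggested for a static maker."""

    name: str = "static"
    # default_factory must be a callable that returns a fresh dict per instance.
    optimizer_kwargs: dict = field(
        default_factory=lambda: {"relax_cell": False, "steps": 1}
    )


first = StaticMakerSketch()
second = StaticMakerSketch()
first.optimizer_kwargs["steps"] = 5  # changing one instance leaves the other alone
```

Because the factory runs once per instance, the two makers above do not share a dict, which is the reason dataclasses require a factory for mutable defaults.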

As for where to add the dependency: you should add it to the strict set of dependencies with a specified version number (the latest one). You can also add a new set of dependencies called chgnet which isn't pinned to a specific version. E.g., see how this is done for the lobster set:

lobster = ["lobsterpy"]

Thanks again, and please get in touch if you have any questions.

src/atomate2/forcefields/flows.py Outdated Show resolved Hide resolved
src/atomate2/forcefields/schemas.py Outdated Show resolved Hide resolved
Comment on lines 53 to 55
trajectory: dict = Field(
    None, description="Step-by-step trajectory of the structural relaxation."
)
Member

Can we try and emulate the ionic_steps document in the VASP Task document. See https://github.com/materialsproject/emmet/blob/2289affb31d9b18aa82e49484ca772a2ce80e934/emmet-core/emmet/core/vasp/calculation.py#LL305C5-L305C114

and

https://github.com/materialsproject/emmet/blob/2289affb31d9b18aa82e49484ca772a2ce80e934/emmet-core/emmet/core/vasp/calculation.py#L262

So basically, it is a list of dicts rather than multiple dicts of lists. Rather than the funky e_fr_energy etc in the IonicStep linked, you can just have energy. You can also remove electronic_steps and add magmoms.

This will involve reorganising the output of CHGNet somewhat but will bring the outputs into line with VASP, CP2K, etc.

Note that we should also store the structure at each step. You can convert an ase atoms object using:

from pymatgen.io.ase import AseAtomsAdaptor
structure = AseAtomsAdaptor.get_structure(atoms)
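
The "list of dicts rather than multiple dicts of lists" reorganisation could look roughly like the sketch below. The helper name to_ionic_steps and the key names in the toy data are assumptions made for illustration; they are not CHGNet's actual output keys or the final schema.

```python
def to_ionic_steps(trajectory: dict) -> list:
    """Turn a dict of per-step lists into a list of per-step dicts."""
    keys = list(trajectory)
    lengths = {len(trajectory[key]) for key in keys}
    if len(lengths) > 1:
        raise ValueError("all trajectory entries must have the same number of steps")
    # zip(*values) walks the lists in lockstep, giving one tuple per step.
    return [dict(zip(keys, values)) for values in zip(*trajectory.values())]


# Toy data; the key names are illustrative only.
trajectory = {
    "energy": [-10.0, -10.4, -10.5],
    "magmoms": [[0.1, 0.2], [0.1, 0.1], [0.0, 0.1]],
}
ionic_steps = to_ionic_steps(trajectory)
```

Each entry of ionic_steps then holds everything for a single step, so a per-step structure could be added as one more key on the same dict.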

Collaborator Author

So you'd want one of the output fields to be ionic_steps, which is a list. That list will contain one dict per step, with each dict including things like energy, structure, etc.? Sure, I can do that.

Separately--considering that the linked VASP Task Document doesn't even let users specify which parts of the trajectory to keep, I don't think we should either. I'll just replace the tuple/list keep_info with a bool keep_trajectory. Let me know if you have any objections.

Collaborator Author

Actually, it appears that users don't even have the option to not store the trajectory for MD jobs. Seems pointless to exclude it for FF relaxations in that case. I'll just remove the tag entirely.

Collaborator Author

@matthewkuner matthewkuner May 5, 2023

RE: converting the Atoms objects to structures. It appears that CHGNet only outputs a single Atoms object, rather than one for each step.

update: I've been trying to convert the 'atom_positions' and 'cells' outputs from results['trajectory'], but am not having success.

Member

I think having the keep_info tag is actually useful, and just because it's not there in VASP doesn't mean we can't add it here. So I would keep that in and forget about keep_trajectory for now. However, once we add MD through force fields, we will want to add trajectory support (in a later PR).

You can generate a new structure from the positions and cells along the lines of:

from pymatgen.core import Structure

species = atoms.get_chemical_symbols()
for pos, cell in zip(atom_positions, cells):
    structure = Structure(cell, species, pos, coords_are_cartesian=False)

Double check whether the atom_positions are in cartesian coordinates and update coords_are_cartesian accordingly.
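
To see why the coords_are_cartesian flag matters, here is a small pure-Python sketch of the fractional-to-Cartesian conversion. The helper frac_to_cart and the 4 Å cubic cell are made up for illustration; the only convention assumed is that lattice vectors are stored as rows, as pymatgen does.

```python
def frac_to_cart(frac, lattice):
    """Convert one fractional coordinate to Cartesian using row lattice vectors."""
    return [sum(frac[i] * lattice[i][j] for i in range(3)) for j in range(3)]


# A 4 Å cubic cell, with lattice vectors as rows.
lattice = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]

# Fractional (0.5, 0.5, 0.5) is the cell centre, i.e. Cartesian (2, 2, 2) Å.
cart = frac_to_cart([0.5, 0.5, 0.5], lattice)

# Reading those Cartesian values as if they were fractional lands far outside the cell.
misread = frac_to_cart(cart, lattice)
```

If positions that are already Cartesian are passed with coords_are_cartesian=False, every site is effectively scaled by the lattice a second time, which is the kind of mismatch worth ruling out first.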

Collaborator Author

Ah, I didn't know about the coords_are_cartesian input, and missed it when looking through the documentation. That fixed it.

src/atomate2/forcefields/schemas.py Outdated Show resolved Hide resolved
matthewkuner and others added 12 commits May 4, 2023 15:14
Co-authored-by: Alex Ganose <utf@users.noreply.github.com>
@mkhorton
Member

mkhorton commented May 5, 2023

Adding a brief note that pymatgen has started offering an API that may make the code a bit simpler:

https://pymatgen.org/pymatgen.core.structure.html#pymatgen.core.structure.Structure.relax

I believe there’s an open issue to add chgnet as a relaxer too, I think @janosh is involved.

@janosh
Member

janosh commented May 5, 2023

I believe there’s an open issue to add chgnet as a relaxer too, I think @janosh is involved.

Yes, I think CHGNet would make a great addition to Structure.relax() and should be straightforward to add. Last I checked with @BowenD-UCB he wanted to wait a bit to make sure the CHGNet API has stabilized. I think that might be the case now?

@matthewkuner
Collaborator Author

@mkhorton would implementing CHGNet into pymatgen's structure objects make this Flow redundant? Or are they distinct use-cases?

@utf
Member

utf commented May 5, 2023

I don't think that function makes anything redundant. I think we could just use it inside your job rather than constructing the relaxer manually (once chgnet has been implemented).

@matthewkuner
Collaborator Author

What do you mean by

we could just it inside your job rather than constructing the relaxer manually (once chgnet has been implemented)
? I'm not following

@janosh
Member

janosh commented May 5, 2023

Or are they distinct use-cases?

@matthewkuner Both are useful! Structure.relax for one-off analysis and this maker for when you want to integrate CHGNet into a HT wf. But Structure.relax could slightly simplify this maker's implementation. No need to wait though. Can be a non-breaking change later.

@utf
Member

utf commented May 5, 2023

@matthewkuner missed a "use" from my message. But effectively your job would simplify from

     @job(output_schema=FFStructureRelaxDocument)
     def make(self, structure: Structure):
         """
         Perform a relaxation of a structure using CHGNet.

         Parameters
         ----------
         structure : .Structure
             A pymatgen structure.

         """
         from chgnet.model import StructOptimizer

         relaxer = StructOptimizer(**self.optimizer_kwargs)
         result = relaxer.relax(
             structure, relax_cell=self.relax_cell, **self.relax_kwargs
         )

         ff_structure_relax_doc = FFStructureRelaxDocument.from_chgnet_result(
             structure,
             self.relax_cell,
             self.relax_kwargs,
             self.optimizer_kwargs,
             result,
             self.keep_info,
         )

         return ff_structure_relax_doc

To

     @job(output_schema=FFStructureRelaxDocument)
     def make(self, structure: Structure):
         """
         Perform a relaxation of a structure using CHGNet.

         Parameters
         ----------
         structure : .Structure
             A pymatgen structure.

         """
         result = structure.relax(
             relax_cell=self.relax_cell, **self.relax_kwargs
         )

         ff_structure_relax_doc = FFStructureRelaxDocument.from_chgnet_result(
             structure,
             self.relax_cell,
             self.relax_kwargs,
             self.optimizer_kwargs,
             result,
             self.keep_info,
         )

         return ff_structure_relax_doc

@utf
Member

utf commented May 8, 2023

@matthewkuner are you using the latest jobflow version? I should have fixed this in v0.1.11

@matthewkuner
Collaborator Author

@utf I was using v0.1.9. I'll check it again with the latest version of jobflow soon.

@matthewkuner
Collaborator Author

@utf I think this is getting very close to complete. Let me know if you have any other things you'd like me to add/change.

@Andrew-S-Rosen
Member

This is very cool! Looking forward to using this in my own work! @utf, could we get a new version release after this is merged if you don't mind?

@matthewkuner matthewkuner changed the title [WIP] add forcefield schemas/makers to atomate2 Add forcefield schemas/makers to atomate2 May 17, 2023
@utf
Member

utf commented May 23, 2023

Thanks very much for this!

@utf utf merged commit 1e0a4aa into materialsproject:main May 23, 2023
1 of 5 checks passed
@janosh
Member

janosh commented Jun 3, 2023

I just read through some of the changes here as part of #362 and want to say this is a really excellent contribution! Thanks for the work you put into this @matthewkuner! 🙏

@rkingsbury
Contributor

Looking great to me so far! Are you planning to replicate the unit tests we have in pymatgen / atomate for the legacy workflow?

@matthewkuner
Collaborator Author

@rkingsbury I wrote my own few tests (as I didn't know pymatgen had a m3gnet method /tests until I had written my own tests), so I wasn't planning to. Is it important?

@rkingsbury
Contributor

@rkingsbury I wrote my own few tests (as I didn't know pymatgen had a m3gnet method /tests until I had written my own tests), so I wasn't planning to. Is it important?

Oops, sorry @matthewkuner, I posted that on the wrong PR! Please disregard. (Although I am excited to see this, too.)

@matthewkuner matthewkuner added enhancement Improvements to existing features good first issue Good for newcomers labels Aug 21, 2023
@utf utf added feature A new feature being added and removed enhancement Improvements to existing features labels Sep 1, 2023
Labels
feature A new feature being added good first issue Good for newcomers

7 participants