Transition to pymatgen VASP input sets #854

Merged
merged 73 commits into from
Jul 30, 2024

Conversation

esoteric-ephemera
Contributor

@esoteric-ephemera esoteric-ephemera commented May 16, 2024

Summary

This PR moves all VASP input sets to the pymatgen input sets, and is dependent on pymatgen PR #3835 - now merged and released in pymatgen==2024.5.31. Merging this PR would close issues #844 and #724 (@Andrew-S-Rosen).

This large change could have breaking consequences outside the VASP library. I've checked that the VASP and LOBSTER code bases are still working as intended, so no breaking changes are expected there. The fact that we've already had incompatibilities between the pymatgen and atomate2 implementations of the same input sets is a strong motivation for this change.

Class duplication has been reduced significantly between pymatgen and atomate2. atomate2.vasp.sets.base.VaspInputSet has been removed, with any features that were not included in its pymatgen counterpart (pymatgen.io.vasp.sets.VaspInputSet) moved to atomate2's VaspInputGenerator.

VaspInputGenerator now inherits from pymatgen's DictSet class. Some features formerly in VaspInputGenerator are now included in DictSet as part of that pymatgen PR. I've kept VaspInputGenerator after some discussions with @utf and @janosh: it still adds two methods that are not in DictSet and are not necessarily useful to include there. The config_dict of each inherited class is now set by its pymatgen equivalent when appropriate (e.g., MPRelaxSet for the MP GGA workflow). Some other changes to the file handling were needed because of this change.

I need to update doc strings, but I wanted to wait on more cosmetic changes until we have consensus on code structure.

Other changes:

  • Removed POTCARs and replaced them with spec files where appropriate.
  • Removed unused vaspout.h5 files, which also contain the full POTCAR (see this pymatgen PR for more context)
  • Updated pymatgen>=2024.6.4 (thanks @janosh!) and ase>=3.23.0 (many many thanks @Andrew-S-Rosen!)

Request for comments / feedback:

This PR is pretty much complete, but could use some input from various stakeholders:

  • Lobster (@naik-aakash and @JaGeo): Do you foresee any breaking changes here?
  • MatPES set (@janosh and @shyuep): The MatPES magmoms were missing from the atomate2 set previously, but are present in their pymatgen set. For now, the atomate2 MatPES set removes the magmoms for consistency with previous versions, but that may have been a bug. This PR now correctly sets the magmoms for MatPES.
  • Structure (@utf, @JaGeo, @janosh): I can completely eliminate VaspInputGenerator by moving its features to RelaxSetGenerator. Right now, RelaxSetGenerator effectively aliases VaspInputGenerator. Any thoughts?

@esoteric-ephemera esoteric-ephemera changed the title from "Transition VASP to use pymatgen for input sets" to "Transition VASP to pymatgen input sets" May 16, 2024
@esoteric-ephemera esoteric-ephemera changed the title from "Transition VASP to pymatgen input sets" to "Transition to pymatgen VASP input sets" May 16, 2024
@JaGeo
Member

JaGeo commented May 16, 2024

@esoteric-ephemera thanks. I am currently not sure at all, as I cannot foresee the side effects of the changes.

@shyuep
Member

shyuep commented May 16, 2024

I can foresee that the longer this is not done, the more side effects there will be. I have said it before - it is pointless to do coding in a different code base when it is something that is supposed to be another code base. Moving stuff later always creates more problems.

@JaGeo
Member

JaGeo commented May 16, 2024

@shyuep I am not opposing this. I just need to check if there might be something 😅.

@esoteric-ephemera I guess we got our answer. lobster tests are breaking. 😬

@esoteric-ephemera
Contributor Author

@esoteric-ephemera I guess we got our answer. lobster tests are breaking. 😬

Yeah I need to dig into this a bit - some of the tests here will fail because of the simultaneous update in pymatgen, but I need to see if that's why first

Comment on lines 166 to 169
def from_directory(directory: str | Path, optional_files: dict = None) -> VaspInput:
    """Load a set of VASP inputs from a directory.

    Note that only the standard INCAR, POSCAR, POTCAR and KPOINTS files are read
Member

any reason to leave from_directory and _get_nedos here? could both be moved to pymatgen as well, which would allow us to remove the VaspInputGenerator class from atomate2 entirely.

Member

Agreed this should be added to pymatgen.

Contributor Author

Moved to PMG

-        outcar: Outcar = None,
-    ) -> dict:
+    @property
+    def kpoints_updates(self) -> dict | Kpoints:
Member

why dict | Kpoints and not just dict? if it's for parent class compat, prob the parent class needs to change

Member

Being able to return a Kpoints object directly is a nice feature! I think this might be in the pymatgen sets but worth checking.

Member

@janosh janosh May 16, 2024

i strongly recommend homogeneous return type. just asking for trouble calling a Kpoints method on kpoints_updates that fails because it's a dict.
i'd advise to always return either dict or Kpoints. given incar_updates is dict, would minimize user surprise to return dict here as well

Member

Just noting that this is used in some places in pymatgen already:

I think having flexibility can be useful. That said, it wouldn't be much work to allow specifying a kpoints file within a dictionary format, but it does seem like extra boilerplate.

given incar_updates is dict, would minimize user surprise to return dict here as well

Incar is already a dict so you could conceivably return Incar too.
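
The trade-off in this thread can be illustrated with a small sketch. It uses a stand-in `Kpoints` class and a hypothetical `normalize_kpoints_updates` helper (neither is pymatgen's API, and the example keys are only illustrative) to show the single branch a caller would need in order to cope with a `dict | Kpoints` return type. A homogeneous return type would make this branch unnecessary.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Kpoints:
    """Stand-in for a k-point object (hypothetical, not pymatgen's class)."""

    grid: tuple

    def as_dict(self) -> dict:
        return {"grid": list(self.grid)}


def normalize_kpoints_updates(updates: Union[dict, Kpoints]) -> dict:
    """Return a plain dict regardless of which type the set produced."""
    if isinstance(updates, Kpoints):
        # Explicit object: convert it so callers can rely on dict methods.
        return updates.as_dict()
    # Already a dict of settings: copy to avoid mutating the caller's data.
    return dict(updates)


from_dict = normalize_kpoints_updates({"reciprocal_density": 100})
from_obj = normalize_kpoints_updates(Kpoints(grid=(4, 4, 4)))
print(from_dict, from_obj)
```

Normalizing at one boundary keeps the flexibility of returning an explicit object while giving downstream code a single type to work with.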

Comment on lines 16 to 17
_matpes_config_no_magmom["INCAR"].pop("MAGMOM")

Member

MatPES set (@janosh and @shyuep): The MatPES magmoms were missing from the atomate2 set previously, but are present in their pymatgen set. For now, the atomate2 MatPES set removes the magmoms for consistency with previous versions, but that may have been a bug

that sounds like a bug! thanks for catching this! let's remove the .pop("MAGMOM")

Contributor Author

Done!
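
For context, the workaround discussed in this thread derived a MAGMOM-free variant of a shared config. The sketch below is an assumption-laden illustration, not the atomate2 code: `BASE_CONFIG` is a hypothetical, heavily trimmed stand-in for the upstream config, and it shows why such a variant has to be built from a deep copy.

```python
from copy import deepcopy

# Hypothetical, trimmed stand-in for an upstream config dictionary.
BASE_CONFIG = {"INCAR": {"ENCUT": 680, "MAGMOM": {"Fe": 5, "Co": 0.6}}}

# Deep-copy first: a shallow copy would share the nested "INCAR" dict, so
# popping from it would also strip MAGMOM from the upstream defaults.
config_no_magmom = deepcopy(BASE_CONFIG)
config_no_magmom["INCAR"].pop("MAGMOM")

print("MAGMOM" in BASE_CONFIG["INCAR"])
print("MAGMOM" in config_no_magmom["INCAR"])
```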

@@ -192,7 +192,7 @@ def write_vasp_input_set(
     )

     if apply_incar_updates:
-        vis.incar.update(SETTINGS.VASP_INCAR_UPDATES)
+        vis["INCAR"].update(SETTINGS.VASP_INCAR_UPDATES)
Member

might make sense to add

@property
def incar(self) -> Incar:
    """INCAR object."""
    return self["INCAR"]

and co for POTCAR, POSCAR, ... to pymatgen DictSet so we can keep vis.incar here

Contributor Author

Done on the pmg side

@janosh
Member

janosh commented May 16, 2024

Yeah I need to dig into this a bit - some of the tests here will fail because of the simultaneous update in pymatgen, but I need to see if that's why first

@esoteric-ephemera you could temp modify this step to install your pmg PR to test against it until both are ready to go

- name: Install pymatgen from master if triggered by pymatgen repo dispatch
  if: github.event_name == 'repository_dispatch' && github.event.action == 'pymatgen-ci-trigger'
  run: pip install --upgrade 'git+https://github.com/materialsproject/pymatgen@${{ github.event.client_payload.pymatgen_ref }}'

      - name: Install pymatgen from master if triggered by pymatgen repo dispatch
-       if: github.event_name == 'repository_dispatch' && github.event.action == 'pymatgen-ci-trigger'
-       run: pip install --upgrade 'git+https://github.com/materialsproject/pymatgen@${{ github.event.client_payload.pymatgen_ref }}'
+       run: pip install --upgrade 'git+https://github.com/materialsproject/pymatgen@refs/pull/854/head'

@esoteric-ephemera
Contributor Author

esoteric-ephemera commented May 17, 2024

Only the lobster flows (and not the jobs) seemed to be impacted by this change - these just needed a small modification to work correctly

@JaGeo
Member

JaGeo commented May 19, 2024

@esoteric-ephemera we are happy to look at it once the tests are passing.

Why did the Lobster test file have to be removed? Thanks in advance.

@esoteric-ephemera
Contributor Author

Why did the Lobster test file have to be removed? Thanks in advance.

I don't see a lobster test file removed; the vasp.test_base file was renamed to vasp.test_run because it no longer contains tests of the base classes. The remaining test in that file is for atomate2.vasp.run.

@JaGeo
Member

JaGeo commented May 20, 2024

@esoteric-ephemera I am referring to your last commit.

Is it ready for us to take a look? Are the tests passing?

@esoteric-ephemera
Contributor Author

esoteric-ephemera commented May 20, 2024

Ok I see - not all of the lobster tests are run in a temporary directory, so if a test fails, there might be intermediate files written to disk. That happened, and the "remove lobster test file" commit was really removing an intermediate file.

Let me know if I should change that behavior for the tests
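
One way to keep failing tests from leaving intermediate files behind is to run each test inside a temporary working directory. The sketch below uses only the standard library; `run_in_tmp_dir` and the file name are hypothetical and are not atomate2's test helpers.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def run_in_tmp_dir():
    """Switch to a fresh temporary directory and always switch back."""
    original = Path.cwd()
    with tempfile.TemporaryDirectory() as tmp:
        os.chdir(tmp)
        try:
            yield Path(tmp)
        finally:
            # Restore the working directory even if the body raises, so the
            # temporary directory (and anything written to it) can be deleted.
            os.chdir(original)


repo_dir = Path.cwd()
try:
    with run_in_tmp_dir() as tmp_dir:
        # Written relative to the temporary working directory.
        Path("intermediate.out").write_text("partial result")
        raise RuntimeError("simulated test failure")
except RuntimeError:
    pass

leftover = (repo_dir / "intermediate.out").exists()
print(leftover)
```

In a pytest suite, the built-in `tmp_path` fixture combined with `monkeypatch.chdir` should achieve the same effect without a custom helper.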

The tests are passing on my side - I can't get pmg to build from git either in the CI or locally (some issues with C dependencies), but I'll look at that today.

@JaGeo
Member

JaGeo commented May 20, 2024

Ok I see - not all of the lobster tests are run in a temporary directory, so if a test fails, there might be intermediate files written to disk. That happened, and the "remove lobster test file" commit was really removing an intermediate file.

Let me know if I should change that behavior for the tests

The tests are passing on my side - I can't get pmg to build from git either in the CI or locally (some issues with C dependencies), but I'll look at that today.

Ah, I see! Thanks! We might want to modify the Lobster tests in a separate PR then. This will otherwise again lead to many changes not directly related to this PR. Thanks!

I will test the workflows with the latest pymatgen PR tonight.

@JaGeo
Member

JaGeo commented May 21, 2024

@esoteric-ephemera I currently get very weird errors when using these versions of atomate2 and pymatgen to run the Lobster workflow. ("free(): invalid pointer").

@naik-aakash do you have a chance to quickly install both this branch of atomate2 and the linked pymatgen branch and check the Lobster workflow? It might be something else off with my settings...

@esoteric-ephemera
Contributor Author

esoteric-ephemera commented May 21, 2024

@JaGeo which Python did you build your env with? I'm using 3.11, but the cython libraries were reorganized between 3.10 and 3.11, which might lead to the C issues you're seeing. It's the same reason the CI can't build an env for 3.11: it recompiles the C libraries for Python, and because pymatgen explicitly includes headers like longintrepr.h, which were moved, the build fails.

@JaGeo
Member

JaGeo commented May 21, 2024

@esoteric-ephemera 3.10, I think. Currently on a train (at least not stuck anymore) and cannot definitely confirm

@esoteric-ephemera
Contributor Author

No worries @JaGeo. I rebuilt my python env with 3.10 and the tests in tests/vasp/lobster still pass for me. No rush on this

@DanielYang59

DanielYang59 commented Jul 21, 2024

Thanks for pinging me for diagnosing @janosh. I believe materialsvirtuallab/monty#699 would fix this issue @esoteric-ephemera, please let me know if you have further comments, thanks!

With:

@deprecated(replacement=MatPESStaticSet, deadline=(2025, 1, 1))
@dataclass
class MatPesGGAStaticSetGenerator(MatPESStaticSet):
    """Class to generate MP-compatible VASP GGA static input sets."""

    xc_functional: Literal["R2SCAN", "PBE", "PBE+U"] = "PBE"
    auto_ismear: bool = False
    auto_kspacing: bool = False

We now have:

from atomate2.vasp.sets.matpes import MatPesGGAStaticSetGenerator

generator = MatPesGGAStaticSetGenerator()

assert generator.xc_functional == "PBE"
assert not generator.auto_ismear
print(generator.as_dict())

Output:

{'@module': 'atomate2.vasp.sets.matpes', '@class': 'MatPesGGAStaticSetGenerator', '@version': '0.0.14.post173+g56d77311', 'structure': None, 'config_dict': {'PARENT': 'PBE64Base', 'INCAR': {'ALGO': 'Normal', 'EDIFF': 1e-05, 'ENAUG': 1360, 'ENCUT': 680, 'GGA': 'PE', 'ISMEAR': 0, 'ISPIN': 2, 'KSPACING': 0.22, 'LAECHG': True, 'LASPH': True, 'LCHARG': True, 'LMIXTAU': True, 'LORBIT': 11, 'LREAL': False, 'LWAVE': False, 'NELM': 200, 'NSW': 0, 'PREC': 'Accurate', 'SIGMA': 0.05, 'LDAU': False, 'LMAXMIX': 6, 'LDAUJ': {'F': {'Co': 0, 'Cr': 0, 'Fe': 0, 'Mn': 0, 'Mo': 0, 'Ni': 0, 'V': 0, 'W': 0}, 'O': {'Co': 0, 'Cr': 0, 'Fe': 0, 'Mn': 0, 'Mo': 0, 'Ni': 0, 'V': 0, 'W': 0}}, 'LDAUL': {'F': {'Co': 2, 'Cr': 2, 'Fe': 2, 'Mn': 2, 'Mo': 2, 'Ni': 2, 'V': 2, 'W': 2}, 'O': {'Co': 2, 'Cr': 2, 'Fe': 2, 'Mn': 2, 'Mo': 2, 'Ni': 2, 'V': 2, 'W': 2}}, 'LDAUTYPE': 2, 'LDAUU': {'F': {'Co': 3.32, 'Cr': 3.7, 'Fe': 5.3, 'Mn': 3.9, 'Mo': 4.38, 'Ni': 6.2, 'V': 3.25, 'W': 6.2}, 'O': {'Co': 3.32, 'Cr': 3.7, 'Fe': 5.3, 'Mn': 3.9, 'Mo': 4.38, 'Ni': 6.2, 'V': 3.25, 'W': 6.2}}, 'MAGMOM': {'Ce': 5, 'Ce3+': 1, 'Co': 0.6, 'Co3+': 0.6, 'Co4+': 1, 'Cr': 5, 'Dy3+': 5, 'Er3+': 3, 'Eu': 10, 'Eu2+': 7, 'Eu3+': 6, 'Fe': 5, 'Gd3+': 7, 'Ho3+': 4, 'La3+': 0.6, 'Lu3+': 0.6, 'Mn': 5, 'Mn3+': 4, 'Mn4+': 3, 'Mo': 5, 'Nd3+': 3, 'Ni': 5, 'Pm3+': 4, 'Pr3+': 2, 'Sm3+': 5, 'Tb3+': 6, 'Tm3+': 2, 'V': 5, 'W': 5, 'Yb3+': 1}}, 'POTCAR_FUNCTIONAL': 'PBE_64', 'POTCAR': {'Ac': 'Ac', 'Ag': 'Ag', 'Al': 'Al', 'Am': 'Am', 'Ar': 'Ar', 'As': 'As', 'At': 'At', 'Au': 'Au', 'B': 'B', 'Ba': 'Ba_sv', 'Be': 'Be_sv', 'Bi': 'Bi', 'Br': 'Br', 'C': 'C', 'Ca': 'Ca_sv', 'Cd': 'Cd', 'Ce': 'Ce', 'Cf': 'Cf', 'Cl': 'Cl', 'Cm': 'Cm', 'Co': 'Co', 'Cr': 'Cr_pv', 'Cs': 'Cs_sv', 'Cu': 'Cu_pv', 'Dy': 'Dy_3', 'Er': 'Er_3', 'Eu': 'Eu', 'F': 'F', 'Fe': 'Fe_pv', 'Fr': 'Fr_sv', 'Ga': 'Ga_d', 'Gd': 'Gd', 'Ge': 'Ge_d', 'H': 'H', 'He': 'He', 'Hf': 'Hf_pv', 'Hg': 'Hg', 'Ho': 'Ho_3', 'I': 'I', 'In': 'In_d', 'Ir': 'Ir', 'K': 'K_sv', 'Kr': 'Kr', 'La': 'La', 'Li': 'Li_sv', 
'Lu': 'Lu_3', 'Mg': 'Mg_pv', 'Mn': 'Mn_pv', 'Mo': 'Mo_pv', 'N': 'N', 'Na': 'Na_pv', 'Nb': 'Nb_pv', 'Nd': 'Nd_3', 'Ne': 'Ne', 'Ni': 'Ni_pv', 'Np': 'Np', 'O': 'O', 'Os': 'Os_pv', 'P': 'P', 'Pa': 'Pa', 'Pb': 'Pb_d', 'Pd': 'Pd', 'Pm': 'Pm_3', 'Po': 'Po_d', 'Pr': 'Pr_3', 'Pt': 'Pt', 'Pu': 'Pu', 'Ra': 'Ra_sv', 'Rb': 'Rb_sv', 'Re': 'Re_pv', 'Rh': 'Rh_pv', 'Rn': 'Rn', 'Ru': 'Ru_pv', 'S': 'S', 'Sb': 'Sb', 'Sc': 'Sc_sv', 'Se': 'Se', 'Si': 'Si', 'Sm': 'Sm_3', 'Sn': 'Sn_d', 'Sr': 'Sr_sv', 'Ta': 'Ta_pv', 'Tb': 'Tb_3', 'Tc': 'Tc_pv', 'Te': 'Te', 'Th': 'Th', 'Ti': 'Ti_pv', 'Tl': 'Tl_d', 'Tm': 'Tm_3', 'U': 'U', 'V': 'V_pv', 'W': 'W_sv', 'Xe': 'Xe', 'Y': 'Y_sv', 'Yb': 'Yb_3', 'Zn': 'Zn', 'Zr': 'Zr_sv'}}, 'files_to_transfer': {}, 'user_incar_settings': {}, 'user_kpoints_settings': {}, 'user_potcar_settings': {}, 'constrain_total_magmom': False, 'sort_structure': True, 'user_potcar_functional': 'PBE_64', 'force_gamma': False, 'reduce_structure': None, 'vdw': None, 'use_structure_charge': False, 'standardize': False, 'sym_prec': 0.1, 'international_monoclinic': True, 'validate_magmom': True, 'inherit_incar': ['LPEAD', 'NGX', 'NGY', 'NGZ', 'SYMPREC', 'IMIX', 'LMAXMIX', 'KGAMMA', 'ISYM', 'NCORE', 'NPAR', 'NELMIN', 'IOPT', 'NBANDS', 'KPAR', 'AMIN', 'NELMDL', 'BMIX', 'AMIX_MAG', 'BMIX_MAG'], 'auto_kspacing': False, 'auto_ismear': False, 'auto_ispin': False, 'auto_lreal': False, 'auto_metal_kpoints': False, 'bandgap_tol': 0.0001, 'bandgap': None, 'prev_incar': None, 'prev_kpoints': None, '_valid_potcars': None, 'xc_functional': 'PBE'}

@esoteric-ephemera
Contributor Author

Thanks @DanielYang59! That looks more like what I expect to see from the as_dict()


codecov bot commented Jul 22, 2024

Codecov Report

Attention: Patch coverage is 88.77005% with 21 lines in your changes missing coverage. Please review.

Project coverage is 75.13%. Comparing base (29a5731) to head (bd98d1b).
Report is 67 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #854      +/-   ##
==========================================
+ Coverage   74.94%   75.13%   +0.19%     
==========================================
  Files         136      142       +6     
  Lines       10513    10614     +101     
  Branches     1643     1580      -63     
==========================================
+ Hits         7879     7975      +96     
- Misses       2143     2167      +24     
+ Partials      491      472      -19     
Files                                   Coverage Δ
src/atomate2/abinit/sets/base.py        76.25% <ø> (ø)
src/atomate2/common/files.py            80.39% <ø> (ø)
src/atomate2/common/jobs/defect.py      81.81% <ø> (+0.13%) ⬆️
src/atomate2/cp2k/jobs/base.py          89.23% <ø> (ø)
src/atomate2/vasp/files.py              85.96% <100.00%> (ø)
src/atomate2/vasp/flows/defect.py       83.58% <100.00%> (ø)
src/atomate2/vasp/flows/matpes.py       72.41% <100.00%> (ø)
src/atomate2/vasp/flows/mp.py           90.32% <100.00%> (ø)
src/atomate2/vasp/jobs/lobster.py       91.93% <100.00%> (+0.26%) ⬆️
src/atomate2/vasp/jobs/matpes.py        86.66% <100.00%> (ø)
... and 8 more

... and 6 files with indirect coverage changes

@@ -35,11 +35,11 @@ def copy_files(
         either "username@host" or just "host" in which case the username
         will be inferred from the current user. If ``None``, the local filesystem will
         be used as the source.
-    include_files : None or list of (str or .Path)
+    include_files : None | list[str | Path]
Member

Is there a reason you changed the doc strings? I think the correct style is to use "or" in the string since it is human readable. E.g. see here https://numpydoc.readthedocs.io/en/latest/format.html#parameters

Contributor Author

I didn't make these changes, they're from here. But I'll revert them

Contributor Author

OK reverted!

Member

@utf utf left a comment

Hi @esoteric-ephemera just noticed one thing about the doc strings. After that, is this ready to merge or do we need to wait for a new monty release?

@esoteric-ephemera
Contributor Author

esoteric-ephemera commented Jul 24, 2024

Hi @esoteric-ephemera just noticed one thing about the doc strings. After that, is this ready to merge or do we need to wait for a new monty release?

Just made the docstring changes. We could wait for monty; right now the deprecation warnings print something like:

<string>:31: FutureWarning: __post_init__ is deprecated, and will be removed on 2025-01-01
Use MPRelaxSet in pymatgen.io.vasp.sets instead.

which isn't very instructive because of the name used. Happy to change it if preferred

@utf
Member

utf commented Jul 25, 2024

Ok, once a new version of monty is released we can merge this. Thanks very much for your work getting this over the line.

@esoteric-ephemera
Contributor Author

Sounds good and thanks!

warning before:
<string>:31: FutureWarning: __post_init__ is deprecated, and will be removed on 2025-01-01
Use MPRelaxSet in pymatgen.io.vasp.sets instead.

warning after:
<string>:31: FutureWarning: MPGGARelaxSetGenerator is deprecated, and will be removed on 2025-01-01
Use MPRelaxSet in pymatgen.io.vasp.sets instead.
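
The difference between the two messages comes down to which name the deprecation wrapper reports. The following is a minimal sketch of that idea and not monty's implementation: a hypothetical `deprecated_class` decorator wraps `__init__` and builds the message from the class name rather than from the name of the wrapped method. `OldSetGenerator` and `NewSet` are placeholder names.

```python
import functools
import warnings
from dataclasses import dataclass


def deprecated_class(replacement: str, deadline: str):
    """Hypothetical class decorator that warns on instantiation."""

    def decorate(cls):
        original_init = cls.__init__

        @functools.wraps(original_init)
        def wrapped_init(self, *args, **kwargs):
            # Use cls.__name__, not original_init.__name__, so the message
            # names the class the user actually instantiated.
            warnings.warn(
                f"{cls.__name__} is deprecated, and will be removed on "
                f"{deadline}\nUse {replacement} instead.",
                FutureWarning,
                stacklevel=2,
            )
            original_init(self, *args, **kwargs)

        cls.__init__ = wrapped_init
        return cls

    return decorate


@deprecated_class(replacement="NewSet", deadline="2025-01-01")
@dataclass
class OldSetGenerator:
    """Stand-in for a deprecated generator class."""

    value: int = 0


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    OldSetGenerator(value=1)

message = str(caught[0].message)
print(message)
```

Because the decorator returns the original class object with only `__init__` replaced, dataclass fields and any serialization built on them are left untouched in this sketch.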
@janosh
Member

janosh commented Jul 29, 2024

monty==v2024.7.29 was released 8 hours ago so i'll set this to auto-merge. thanks @esoteric-ephemera for the massive amount of work that went into this PR! 🙏

auto-merge was automatically disabled July 29, 2024 21:19

Head branch was pushed to by a user without write access

@esoteric-ephemera
Contributor Author

Fantastic @janosh! Added a little fix for the Lobster schemas to work with the newer version of monty, pending your PR getting released

@utf utf enabled auto-merge (squash) July 30, 2024 13:50
@utf
Member

utf commented Jul 30, 2024

Thanks @esoteric-ephemera this is fantastic.

@utf utf merged commit fad9396 into materialsproject:main Jul 30, 2024
6 checks passed
@utf utf added the feature A new feature being added label Aug 20, 2024