Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cp2k 2.0 #2672

Merged
merged 284 commits into from Jan 3, 2023
Merged

Cp2k 2.0 #2672

merged 284 commits into from Jan 3, 2023

Conversation

nwinner
Copy link
Contributor

@nwinner nwinner commented Sep 29, 2022

Summary

PR for major updates to CP2K functionality in pmg, plus a minor edit to compatibility.py. PR has been geared towards simplifying certain features, while making others more compatible with the new atomate2 package, which will use this version.

inputs.py

  • New Kpoints section objects
  • New bandstructure objects
  • dataclass representations of basis sets and pseudopotentials

sets.py

  • Simplified number of sets
  • DftSet now supports new basis/potential scheme
  • DftSet now handles more printing/features
  • DftSet now includes a simple validator for non-compatible features

outputs.py:

  • Now supports parsing some of the CP2K properties, such as the hyperfine_coupling tensor. Support for all properties not available
  • New Dos parsing

utils.py

  • Now supports fewer functions. Some of the messy basis utility functions are now handled using the new objects in inputs.py

pmg_config.py

  • In order to make code similar to VASP module, and to allow for improved handling of pseudopotentials, a config option has been added to pmg_config that is similar to the potcar configuration for VASP. This will now replace the settings.yaml file, which is a large file to move through github.

…pdates to the corrections module, which should include some simpler corrections like point charge. Added a WIP update to the thermodynamics module, which is going to include support for predomenance diagrams in the near future
…into CP2K

# Conflicts:
#	pymatgen/command_line/bader_caller.py
#	pymatgen/io/cp2k/inputs.py
#	pymatgen/io/cp2k/outputs.py
#	pymatgen/io/cp2k/sets.py
#	pymatgen/io/cp2k/tests/test_sets.py
#	pymatgen/io/cp2k/utils.py
…into CP2K

# Conflicts:
#	pymatgen/command_line/bader_caller.py
#	pymatgen/io/cube.py
ordering of the sites, except when you use the get_unique_site_indices
function to get aliases such as {'Mg_1': [...], 'Mg_2': [...], 'O_1': [...]}.

Then, the sites are given by a alias: index dictionary that doesn't preserve
order, and so sites can get jumbled when this happens. It doesn't always happen, so
it was hard to identify. Now, when aliasing is used for the Coord section,
the keywords are created in order of their indices. To further protect against
these sorts of issues for unforseen problems, I'm changing "self.keywords"
to be stored as an OrderedDict once they are provided to the Section. Since
Ordered Dicts inherit from Dicts, this shouldn't change anything.

Lastly, I have applied black formatting to everything to keep consistent.
aliases.get because SupportsLessThan is not part of it, even though
it worked in practice. While this new method is less elegant, it still
works well enough.
…into CP2K

# Conflicts:
#	pymatgen/command_line/bader_caller.py
…into defects

# Conflicts:
#	pymatgen/analysis/defects/corrections.py
#	pymatgen/analysis/defects/thermodynamics.py
transformation history of a parent and child task.
the pseudopotential of the atom is turned off, but the basis functions are
left behind.
@nwinner
Copy link
Contributor Author

nwinner commented Dec 16, 2022

@shyuep @janosh (anybody else who should get @ for this?)

After a very long time, I finally finished this PR. Could I get a review?

Two tests failing on split 1 are related to chemenv tests, which are not part of this PR and are an issue elsewhere in pymatgen.

nwinner and others added 5 commits December 16, 2022 15:44
@janosh
Copy link
Member

janosh commented Dec 17, 2022

Sorry, currently traveling and not in front of a large enough screen to review such a significant PR. But looks like a lot of great work! 🥇

nwinner and others added 5 commits December 17, 2022 12:35
rename single-letter vars (discouraged by flake8)
…to ValueError

complete pymatgen/io/cp2k/utils.py _(pre|post)processor doc strs
add type anno
pymatgen/io/cp2k/utils.py Outdated Show resolved Hide resolved
nwinner and others added 3 commits January 3, 2023 11:13
required for mypy to analyze functions, found potential dict unpacking error in DftSet subclasses
fix typos
@janosh
Copy link
Member

janosh commented Jan 3, 2023

Thanks @nwinner! This looks like a ton of work! 👍

@shyuep @mkhorton Merging this as test and lint errors appear unrelated.

@janosh janosh merged commit f0209c9 into materialsproject:master Jan 3, 2023
lbluque pushed a commit to lbluque/pymatgen that referenced this pull request Jan 5, 2023
* Added a property assignment structure for testing. Tests not yet written.

* Modified test file. Removed NLCC_POTENTIALS.yaml, which has no use at the moment.

* Commits to this branch will be updates to defects module. Added WIP updates to the corrections module, which should include some simpler corrections like point charge. Added a WIP update to the thermodynamics module, which is going to include support for predomenance diagrams in the near future

* Defect module WIP

* Try else got tripped unecessarily. Fixed with conditional.

Also removed a print statement.

* Updated import statement.

* Updated import statement.

* Updated import statement.

* Fixed charge transfer, which was not working.

* Slightly better version of last commit

* Plotting updates.

* A very subtle error... Everything in the sets/inputs preserves the
ordering of the sites, except when you use the get_unique_site_indices
function to get aliases such as {'Mg_1': [...], 'Mg_2': [...], 'O_1': [...]}.

Then, the sites are given by a alias: index dictionary that doesn't preserve
order, and so sites can get jumbled when this happens. It doesn't always happen, so
it was hard to identify. Now, when aliasing is used for the Coord section,
the keywords are created in order of their indices. To further protect against
these sorts of issues for unforseen problems, I'm changing "self.keywords"
to be stored as an OrderedDict once they are provided to the Section. Since
Ordered Dicts inherit from Dicts, this shouldn't change anything.

Lastly, I have applied black formatting to everything to keep consistent.

* mypy on github (but not on my machine, oddly) didn't like the use of
aliases.get because SupportsLessThan is not part of it, even though
it worked in practice. While this new method is less elegant, it still
works well enough.

* Reorder import. Couple of formatting corrections from pylint.

* Update plotter.

* Created a polaron class as a child of Substitution.

* Trying out a "RelaxationTransformation" to help stitch together the
transformation history of a parent and child task.

* Tweak as_dict() summary.

* The GhostVacancy is a special vacancy object for CP2K calculations where
the pseudopotential of the atom is turned off, but the basis functions are
left behind.

* added dtype=object to np.array

* Update import statement

* Assembled basis sets, pseudopotential, and DFT+U settings into a single
settings file, which should be cleaner than using separate ones.

* Updated in accordance with new settings.yaml file.

* Update in accordance with new settings.yaml file.

* Update to one of the hubbard values.

* Fixed the getting of hubbard values.

* Updated predominance diagrams

* A temporary measure on my dev branch for avoid Potcar hashing with CP2K calculations

* These files are no longer necessary, as settings.yaml will hold all
this info.

* Update attribute.

* Update attribute

* Update the plot with "combine_charges" to make cleaner figures

* Keywords must have strings for names. Numbers will cause problems.

* Update to the mo_eigenvalues parsing for non-OT compatability. Parsing a raw
text file is incredibly sloppy.

* Update to the file reading. Subsections can have "subsection params",
some of which are used to alias the subsection, but some of which
are variables like "on" or "off". They are problematic from the
point of view of the Section objects. This will fix the issue of cp2k
restart files automatically assigning some of these superfluous
params, but is not a totally robust solution.

* Change to default diagonalization params.

* Update dict band gap.

* Updating mo_eigenvalues parsing, which is getting troublesome mixing
OT and diagonalization FWs now.

* Modifications to the DOS parser so that its more integrated with the
Dos module. Also made the DOS parser the default method for extracting
gaps, vbm, and cbm, which works for either OT or diagonalization.

* Update for aux_basis set. "match" keyword will now use primary basis, which
can be used for defect sites. Starting to get lots of spaghetti code here.
Also fixed indentation error.

* Added aux_basis argument

* Don't assume initial structure was parsed.

* Need hubbard > 0

* Update to diagonalization settings.

* Section Lists

* Reject if sizes don't match

* Testing new plot.

* Possible hash for defects.

* Nevermind.

* Remove warning.

* Update plus_u and xc_functional

* fix

* Update for functional processing. Do not like this as it relies on
having the input file present, but the text in the output file is too
variable to parse reliably.

* Still need to figure out how to get mos automatically. Upped it for
now. Elec temp of 500 maybe have been a little too high.

* Change how output parser gets the functional.

* Use the getattr operation to be case insensitive.

* Update section descriptions.

* So bader can use reference charges with cubes. Will submit patch and merge.

* This function should not default to cm^-3 units. Most/all of pymatgen does things per formula unit / composition object. So that's how this should be as well. The exponential is for an individual atom, and the multiplicity attribute will convert it to a per f.u. basis.

* Dos objects

* Use structure matcher instead of exact equality to deal with empty
site properties.

* fixing cube

* Fix for getting items. Now there is a seperate get function for keyword
and section, so internal methods can be more robust.

* Update to docstring and import.

* Adsorbent class.

* Concentration should be in formula units by default.

* (1) prettier plots

(2) get_doping() added since n/p are reused.

* Docstring.

* Add rotations

* forgot to make msonable

* Kpoints keyword

Scheme keyword was missing. Lead to errors.

* Updates for functionals

Some updates for functionals to make LibXC slightly easier.

* Parse initial structure

For some reason, one example did not show the default blank line. Not sure if it was the new version of cp2k or something else, but make the blank line a conditional now.

* �Parse initial structure

Last commit did not cover edge case. This should work.

* Updates.

Fix readline for both cases of coord table. Modify sets for hybrids. New utility function for truncation radius.

* optFIT basis

Opt fit basis sets should be preferred for C and B. These are not part of standard dist. and should be added to settings as well.

* Convergence criteria

Update the force convergence criteria to roughly match the MPRelaxSet

* Add option for strict updates.

Update now supports optional "strict" keyword to avoid inserting new keywords and sections.

* Update hybrid

Slight tweaks to logic for hybrid sets.

* Update test

Nonsense structure cannot use dummy element. Sets check validity of element inputs now.

* Update functionals

Hybrid was not being registered for PBE0 since only one functional is used.

* Add cutoff radius

Include cutoff radius even if not used in order to suppress warnings.

* Update core defects

Adsorbate is that species that adsorbs onto a surface, adsorbent is surface doing the adsorbing. Also the default defect concentration is back to cm-3. This is not ultimately desirable, but its causing too many problems.

* Updates

* Cutoff radius

Do not override. Just provide a warning. Also make the long range correction default to provide better energies when cutoff is very small.

* Fix cutoff

* Changes and Fixes

Newline in inputs.py

Adjust stride in sets.py

Read energies in outputs.py

* Update args

* Fix

* stress tensor

* Allow positive

* Updates

* New Thermo

* Remove temp

Temporary things were saved during a laptop crash. Now removed.

* SectionList

Modify the file reader with SectionList

* Double loop

* Drop undeveloped corrections and work on 2d correction

* Get section explicit

* Dont check if not running freysoldt

* Rename

Predominance diagram may confuse people. Will use Brouwer diagram instead.

DFT plus U is very finicky in cp2k due to basis sets. It only really works with LOWDIN scheme, which is only available in newer versions of the code, and even then it is prone to instabilities. For these reasons, plus u should not be the default.

* Formatting updates

* Deprecation upcoming

* Formatting

* Formatting

* Formatting

* Defect Complex and Linting

* Typing

* Emergency push

Laptop malfunctioning need to sync

* Update to correction

* Bandstructures

Input objects for band structure and ability to parse the band structure text files generated by cp2k.

* Revisions and Bandstructure support

Some formatting has been updated for new linting. Bandstructure support has been added to inputs and output parsers. Some updates to DOS parsers as well.

* Kpoints

* Format

* Revert to master

* Last minute fix. Needs to access num_kpts

* Needs to access num_kpts

* Old test and type check

* Some more lint

* Unused import

* Lint

* override_default_params

must not be none

* None->{}

* Add self

* Loops

* remove sigma

* A few changes to tests and test files

* fix assert

* AssertEqual

* Kpoints

* Dict already preserves ordering since py3.7

* use .items()

* EPS_PGF_ORB

* Gamma kpoint

* Basis and Potential

* None

* Dict

* startswith

* get element

* kpts

* Only use trunc if periodic

* Only use trunc if periodic

* ADMM Keywords

* Activate motion

* Delete

* runtype check

* molecules

* processor

* dft set

* multiplicity

* Exchange correction function

* pmg config

* Test set

* Basis and Potentials

* pmg config

* Molecule

* kpts

* Some tests

* Test sets

* Lintings

* Test files and tests

* Lint

* More lint

* Don't use built-in name

* list->List, dict->Dict

* Refactor pydantic to msonable

* Delete settings.yaml

* mypy 0.991 upgrade

* Util types and lint

* Uncomment a line

* Missing arg

* type

* arg

* Dftd3

Might as well add this

* Don't use literal

* Json

A hack so that jsanitized objects still have int keys

* Test for aux

* Tiny fixes

* rename some vars in pymatgen/cli/pmg_config.py to snake case

rename single-letter vars (discouraged by flake8)

* breaking: change CP2K _postprocessor parsing error type from OSError to ValueError

complete pymatgen/io/cp2k/utils.py _(pre|post)processor doc strs
add type anno

* Pre/postprocessor

* add return types in pymatgen/io/cp2k/sets.py

required for mypy to analyze functions, found potential dict unpacking error in DftSet subclasses
fix typos

Co-authored-by: Janosh Riebesell <janosh.riebesell@gmail.com>
lbluque pushed a commit to lbluque/pymatgen that referenced this pull request May 23, 2023
* Added a property assignment structure for testing. Tests not yet written.

* Modified test file. Removed NLCC_POTENTIALS.yaml, which has no use at the moment.

* Commits to this branch will be updates to defects module. Added WIP updates to the corrections module, which should include some simpler corrections like point charge. Added a WIP update to the thermodynamics module, which is going to include support for predomenance diagrams in the near future

* Defect module WIP

* Try else got tripped unecessarily. Fixed with conditional.

Also removed a print statement.

* Updated import statement.

* Updated import statement.

* Updated import statement.

* Fixed charge transfer, which was not working.

* Slightly better version of last commit

* Plotting updates.

* A very subtle error... Everything in the sets/inputs preserves the
ordering of the sites, except when you use the get_unique_site_indices
function to get aliases such as {'Mg_1': [...], 'Mg_2': [...], 'O_1': [...]}.

Then, the sites are given by a alias: index dictionary that doesn't preserve
order, and so sites can get jumbled when this happens. It doesn't always happen, so
it was hard to identify. Now, when aliasing is used for the Coord section,
the keywords are created in order of their indices. To further protect against
these sorts of issues for unforseen problems, I'm changing "self.keywords"
to be stored as an OrderedDict once they are provided to the Section. Since
Ordered Dicts inherit from Dicts, this shouldn't change anything.

Lastly, I have applied black formatting to everything to keep consistent.

* mypy on github (but not on my machine, oddly) didn't like the use of
aliases.get because SupportsLessThan is not part of it, even though
it worked in practice. While this new method is less elegant, it still
works well enough.

* Reorder import. Couple of formatting corrections from pylint.

* Update plotter.

* Created a polaron class as a child of Substitution.

* Trying out a "RelaxationTransformation" to help stitch together the
transformation history of a parent and child task.

* Tweak as_dict() summary.

* The GhostVacancy is a special vacancy object for CP2K calculations where
the pseudopotential of the atom is turned off, but the basis functions are
left behind.

* added dtype=object to np.array

* Update import statement

* Assembled basis sets, pseudopotential, and DFT+U settings into a single
settings file, which should be cleaner than using separate ones.

* Updated in accordance with new settings.yaml file.

* Update in accordance with new settings.yaml file.

* Update to one of the hubbard values.

* Fixed the getting of hubbard values.

* Updated predominance diagrams

* A temporary measure on my dev branch for avoid Potcar hashing with CP2K calculations

* These files are no longer necessary, as settings.yaml will hold all
this info.

* Update attribute.

* Update attribute

* Update the plot with "combine_charges" to make cleaner figures

* Keywords must have strings for names. Numbers will cause problems.

* Update to the mo_eigenvalues parsing for non-OT compatability. Parsing a raw
text file is incredibly sloppy.

* Update to the file reading. Subsections can have "subsection params",
some of which are used to alias the subsection, but some of which
are variables like "on" or "off". They are problematic from the
point of view of the Section objects. This will fix the issue of cp2k
restart files automatically assigning some of these superfluous
params, but is not a totally robust solution.

* Change to default diagonalization params.

* Update dict band gap.

* Updating mo_eigenvalues parsing, which is getting troublesome mixing
OT and diagonalization FWs now.

* Modifications to the DOS parser so that its more integrated with the
Dos module. Also made the DOS parser the default method for extracting
gaps, vbm, and cbm, which works for either OT or diagonalization.

* Update for aux_basis set. "match" keyword will now use primary basis, which
can be used for defect sites. Starting to get lots of spaghetti code here.
Also fixed indentation error.

* Added aux_basis argument

* Don't assume initial structure was parsed.

* Need hubbard > 0

* Update to diagonalization settings.

* Section Lists

* Reject if sizes don't match

* Testing new plot.

* Possible hash for defects.

* Nevermind.

* Remove warning.

* Update plus_u and xc_functional

* fix

* Update for functional processing. Do not like this as it relies on
having the input file present, but the text in the output file is too
variable to parse reliably.

* Still need to figure out how to get mos automatically. Upped it for
now. Elec temp of 500 maybe have been a little too high.

* Change how output parser gets the functional.

* Use the getattr operation to be case insensitive.

* Update section descriptions.

* So bader can use reference charges with cubes. Will submit patch and merge.

* This function should not default to cm^-3 units. Most/all of pymatgen does things per formula unit / composition object. So that's how this should be as well. The exponential is for an individual atom, and the multiplicity attribute will convert it to a per f.u. basis.

* Dos objects

* Use structure matcher instead of exact equality to deal with empty
site properties.

* fixing cube

* Fix for getting items. Now there is a seperate get function for keyword
and section, so internal methods can be more robust.

* Update to docstring and import.

* Adsorbent class.

* Concentration should be in formula units by default.

* (1) prettier plots

(2) get_doping() added since n/p are reused.

* Docstring.

* Add rotations

* forgot to make msonable

* Kpoints keyword

Scheme keyword was missing. Lead to errors.

* Updates for functionals

Some updates for functionals to make LibXC slightly easier.

* Parse initial structure

For some reason, one example did not show the default blank line. Not sure if it was the new version of cp2k or something else, but make the blank line a conditional now.

* �Parse initial structure

Last commit did not cover edge case. This should work.

* Updates.

Fix readline for both cases of coord table. Modify sets for hybrids. New utility function for truncation radius.

* optFIT basis

Opt fit basis sets should be preferred for C and B. These are not part of standard dist. and should be added to settings as well.

* Convergence criteria

Update the force convergence criteria to roughly match the MPRelaxSet

* Add option for strict updates.

Update now supports optional "strict" keyword to avoid inserting new keywords and sections.

* Update hybrid

Slight tweaks to logic for hybrid sets.

* Update test

Nonsense structure cannot use dummy element. Sets check validity of element inputs now.

* Update functionals

Hybrid was not being registered for PBE0 since only one functional is used.

* Add cutoff radius

Include cutoff radius even if not used in order to suppress warnings.

* Update core defects

Adsorbate is that species that adsorbs onto a surface, adsorbent is surface doing the adsorbing. Also the default defect concentration is back to cm-3. This is not ultimately desirable, but its causing too many problems.

* Updates

* Cutoff radius

Do not override. Just provide a warning. Also make the long range correction default to provide better energies when cutoff is very small.

* Fix cutoff

* Changes and Fixes

Newline in inputs.py

Adjust stride in sets.py

Read energies in outputs.py

* Update args

* Fix

* stress tensor

* Allow positive

* Updates

* New Thermo

* Remove temp

Temporary things were saved during a laptop crash. Now removed.

* SectionList

Modify the file reader with SectionList

* Double loop

* Drop undeveloped corrections and work on 2d correction

* Get section explicit

* Dont check if not running freysoldt

* Rename

Predominance diagram may confuse people. Will use Brouwer diagram instead.

DFT plus U is very finicky in cp2k due to basis sets. It only really works with LOWDIN scheme, which is only available in newer versions of the code, and even then it is prone to instabilities. For these reasons, plus u should not be the default.

* Formatting updates

* Deprecation upcoming

* Formatting

* Formatting

* Formatting

* Defect Complex and Linting

* Typing

* Emergency push

Laptop malfunctioning need to sync

* Update to correction

* Bandstructures

Input objects for band structure and ability to parse the band structure text files generated by cp2k.

* Revisions and Bandstructure support

Some formatting has been updated for new linting. Bandstructure support has been added to inputs and output parsers. Some updates to DOS parsers as well.

* Kpoints

* Format

* Revert to master

* Last minute fix. Needs to access num_kpts

* Needs to access num_kpts

* Old test and type check

* Some more lint

* Unused import

* Lint

* override_default_params

must not be none

* None->{}

* Add self

* Loops

* remove sigma

* A few changes to tests and test files

* fix assert

* AssertEqual

* Kpoints

* Dict already preserves ordering since py3.7

* use .items()

* EPS_PGF_ORB

* Gamma kpoint

* Basis and Potential

* None

* Dict

* startswith

* get element

* kpts

* Only use trunc if periodic

* Only use trunc if periodic

* ADMM Keywords

* Activate motion

* Delete

* runtype check

* molecules

* processor

* dft set

* multiplicity

* Exchange correction function

* pmg config

* Test set

* Basis and Potentials

* pmg config

* Molecule

* kpts

* Some tests

* Test sets

* Lintings

* Test files and tests

* Lint

* More lint

* Don't use built-in name

* list->List, dict->Dict

* Refactor pydantic to msonable

* Delete settings.yaml

* mypy 0.991 upgrade

* Util types and lint

* Uncomment a line

* Missing arg

* type

* arg

* Dftd3

Might as well add this

* Don't use literal

* Json

A hack so that jsanitized objects still have int keys

* Test for aux

* Tiny fixes

* rename some vars in pymatgen/cli/pmg_config.py to snake case

rename single-letter vars (discouraged by flake8)

* breaking: change CP2K _postprocessor parsing error type from OSError to ValueError

complete pymatgen/io/cp2k/utils.py _(pre|post)processor doc strs
add type anno

* Pre/postprocessor

* add return types in pymatgen/io/cp2k/sets.py

required for mypy to analyze functions, found potential dict unpacking error in DftSet subclasses
fix typos

Co-authored-by: Janosh Riebesell <janosh.riebesell@gmail.com>
lbluque pushed a commit to lbluque/pymatgen that referenced this pull request May 25, 2023
* Added a property assignment structure for testing. Tests not yet written.

* Modified test file. Removed NLCC_POTENTIALS.yaml, which has no use at the moment.

* Commits to this branch will be updates to defects module. Added WIP updates to the corrections module, which should include some simpler corrections like point charge. Added a WIP update to the thermodynamics module, which is going to include support for predomenance diagrams in the near future

* Defect module WIP

* Try else got tripped unecessarily. Fixed with conditional.

Also removed a print statement.

* Updated import statement.

* Updated import statement.

* Updated import statement.

* Fixed charge transfer, which was not working.

* Slightly better version of last commit

* Plotting updates.

* A very subtle error... Everything in the sets/inputs preserves the
ordering of the sites, except when you use the get_unique_site_indices
function to get aliases such as {'Mg_1': [...], 'Mg_2': [...], 'O_1': [...]}.

Then, the sites are given by a alias: index dictionary that doesn't preserve
order, and so sites can get jumbled when this happens. It doesn't always happen, so
it was hard to identify. Now, when aliasing is used for the Coord section,
the keywords are created in order of their indices. To further protect against
these sorts of issues for unforseen problems, I'm changing "self.keywords"
to be stored as an OrderedDict once they are provided to the Section. Since
Ordered Dicts inherit from Dicts, this shouldn't change anything.

Lastly, I have applied black formatting to everything to keep consistent.

* mypy on github (but not on my machine, oddly) didn't like the use of
aliases.get because SupportsLessThan is not part of it, even though
it worked in practice. While this new method is less elegant, it still
works well enough.

* Reorder import. Couple of formatting corrections from pylint.

* Update plotter.

* Created a polaron class as a child of Substitution.

* Trying out a "RelaxationTransformation" to help stitch together the
transformation history of a parent and child task.

* Tweak as_dict() summary.

* The GhostVacancy is a special vacancy object for CP2K calculations where
the pseudopotential of the atom is turned off, but the basis functions are
left behind.

* added dtype=object to np.array

* Update import statement

* Assembled basis sets, pseudopotential, and DFT+U settings into a single
settings file, which should be cleaner than using separate ones.

* Updated in accordance with new settings.yaml file.

* Update in accordance with new settings.yaml file.

* Update to one of the hubbard values.

* Fixed the getting of hubbard values.

* Updated predominance diagrams

* A temporary measure on my dev branch for avoid Potcar hashing with CP2K calculations

* These files are no longer necessary, as settings.yaml will hold all
this info.

* Update attribute.

* Update attribute

* Update the plot with "combine_charges" to make cleaner figures

* Keywords must have strings for names. Numbers will cause problems.

* Update to the mo_eigenvalues parsing for non-OT compatability. Parsing a raw
text file is incredibly sloppy.

* Update to the file reading. Subsections can have "subsection params",
some of which are used to alias the subsection, but some of which
are variables like "on" or "off". They are problematic from the
point of view of the Section objects. This will fix the issue of cp2k
restart files automatically assigning some of these superfluous
params, but is not a totally robust solution.

* Change to default diagonalization params.

* Update dict band gap.

* Updating mo_eigenvalues parsing, which is getting troublesome mixing
OT and diagonalization FWs now.

* Modifications to the DOS parser so that its more integrated with the
Dos module. Also made the DOS parser the default method for extracting
gaps, vbm, and cbm, which works for either OT or diagonalization.

* Update for aux_basis set. "match" keyword will now use primary basis, which
can be used for defect sites. Starting to get lots of spaghetti code here.
Also fixed indentation error.

* Added aux_basis argument

* Don't assume initial structure was parsed.

* Need hubbard > 0

* Update to diagonalization settings.

* Section Lists

* Reject if sizes don't match

* Testing new plot.

* Possible hash for defects.

* Nevermind.

* Remove warning.

* Update plus_u and xc_functional

* fix

* Update for functional processing. Do not like this as it relies on
having the input file present, but the text in the output file is too
variable to parse reliably.

* Still need to figure out how to get mos automatically. Upped it for
now. Elec temp of 500 maybe have been a little too high.

* Change how output parser gets the functional.

* Use the getattr operation to be case insensitive.

* Update section descriptions.

* So bader can use reference charges with cubes. Will submit patch and merge.

* This function should not default to cm^-3 units. Most/all of pymatgen does things per formula unit / composition object. So that's how this should be as well. The exponential is for an individual atom, and the multiplicity attribute will convert it to a per f.u. basis.

* Dos objects

* Use structure matcher instead of exact equality to deal with empty
site properties.

* fixing cube

* Fix for getting items. Now there is a seperate get function for keyword
and section, so internal methods can be more robust.

* Update to docstring and import.

* Adsorbent class.

* Concentration should be in formula units by default.

* (1) prettier plots

(2) get_doping() added since n/p are reused.

* Docstring.

* Add rotations

* forgot to make msonable

* Kpoints keyword

Scheme keyword was missing. Lead to errors.

* Updates for functionals

Some updates for functionals to make LibXC slightly easier.

* Parse initial structure

For some reason, one example did not show the default blank line. Not sure if it was the new version of cp2k or something else, but make the blank line a conditional now.

* �Parse initial structure

Last commit did not cover edge case. This should work.

* Updates.

Fix readline for both cases of coord table. Modify sets for hybrids. New utility function for truncation radius.

* optFIT basis

Opt fit basis sets should be preferred for C and B. These are not part of standard dist. and should be added to settings as well.

* Convergence criteria

Update the force convergence criteria to roughly match the MPRelaxSet

* Add option for strict updates.

Update now supports optional "strict" keyword to avoid inserting new keywords and sections.

* Update hybrid

Slight tweaks to logic for hybrid sets.

* Update test

Nonsense structure cannot use dummy element. Sets check validity of element inputs now.

* Update functionals

Hybrid was not being registered for PBE0 since only one functional is used.

* Add cutoff radius

Include cutoff radius even if not used in order to suppress warnings.

* Update core defects

Adsorbate is that species that adsorbs onto a surface, adsorbent is surface doing the adsorbing. Also the default defect concentration is back to cm-3. This is not ultimately desirable, but its causing too many problems.

* Updates

* Cutoff radius

Do not override. Just provide a warning. Also make the long range correction default to provide better energies when cutoff is very small.

* Fix cutoff

* Changes and Fixes

Newline in inputs.py

Adjust stride in sets.py

Read energies in outputs.py

* Update args

* Fix

* stress tensor

* Allow positive

* Updates

* New Thermo

* Remove temp

Temporary things were saved during a laptop crash. Now removed.

* SectionList

Modify the file reader with SectionList

* Double loop

* Drop undeveloped corrections and work on 2d correction

* Get section explicit

* Dont check if not running freysoldt

* Rename

Predominance diagram may confuse people. Will use Brouwer diagram instead.

DFT plus U is very finicky in cp2k due to basis sets. It only really works with LOWDIN scheme, which is only available in newer versions of the code, and even then it is prone to instabilities. For these reasons, plus u should not be the default.

* Formatting updates

* Deprecation upcoming

* Formatting

* Formatting

* Formatting

* Defect Complex and Linting

* Typing

* Emergency push

Laptop malfunctioning need to sync

* Update to correction

* Bandstructures

Input objects for band structure and ability to parse the band structure text files generated by cp2k.

* Revisions and Bandstructure support

Some formatting has been updated for new linting. Bandstructure support has been added to inputs and output parsers. Some updates to DOS parsers as well.

* Kpoints

* Format

* Revert to master

* Last minute fix. Needs to access num_kpts

* Needs to access num_kpts

* Old test and type check

* Some more lint

* Unused import

* Lint

* override_default_params

must not be none

* None->{}

* Add self

* Loops

* remove sigma

* A few changes to tests and test files

* fix assert

* AssertEqual

* Kpoints

* Dict already preserves ordering since py3.7

* use .items()

* EPS_PGF_ORB

* Gamma kpoint

* Basis and Potential

* None

* Dict

* startswith

* get element

* kpts

* Only use trunc if periodic

* Only use trunc if periodic

* ADMM Keywords

* Activate motion

* Delete

* runtype check

* molecules

* processor

* dft set

* multiplicity

* Exchange correction function

* pmg config

* Test set

* Basis and Potentials

* pmg config

* Molecule

* kpts

* Some tests

* Test sets

* Lintings

* Test files and tests

* Lint

* More lint

* Don't use built-in name

* list->List, dict->Dict

* Refactor pydantic to msonable

* Delete settings.yaml

* mypy 0.991 upgrade

* Util types and lint

* Uncomment a line

* Missing arg

* type

* arg

* Dftd3

Might as well add this

* Don't use literal

* Json

A hack so that jsanitized objects still have int keys

* Test for aux

* Tiny fixes

* rename some vars in pymatgen/cli/pmg_config.py to snake case

rename single-letter vars (discouraged by flake8)

* breaking: change CP2K _postprocessor parsing error type from OSError to ValueError

complete pymatgen/io/cp2k/utils.py _(pre|post)processor doc strs
add type anno

* Pre/postprocessor

* add return types in pymatgen/io/cp2k/sets.py

required for mypy to analyze functions, found potential dict unpacking error in DftSet subclasses
fix typos

Co-authored-by: Janosh Riebesell <janosh.riebesell@gmail.com>
janosh added a commit that referenced this pull request May 26, 2023
* Added a property assignment structure for testing. Tests not yet written.

* Modified test file. Removed NLCC_POTENTIALS.yaml, which has no use at the moment.

* Commits to this branch will be updates to defects module. Added WIP updates to the corrections module, which should include some simpler corrections like point charge. Added a WIP update to the thermodynamics module, which is going to include support for predomenance diagrams in the near future

* Defect module WIP

* Try else got tripped unecessarily. Fixed with conditional.

Also removed a print statement.

* Updated import statement.

* Updated import statement.

* Updated import statement.

* Fixed charge transfer, which was not working.

* Slightly better version of last commit

* Plotting updates.

* A very subtle error... Everything in the sets/inputs preserves the
ordering of the sites, except when you use the get_unique_site_indices
function to get aliases such as {'Mg_1': [...], 'Mg_2': [...], 'O_1': [...]}.

Then, the sites are given by a alias: index dictionary that doesn't preserve
order, and so sites can get jumbled when this happens. It doesn't always happen, so
it was hard to identify. Now, when aliasing is used for the Coord section,
the keywords are created in order of their indices. To further protect against
these sorts of issues for unforseen problems, I'm changing "self.keywords"
to be stored as an OrderedDict once they are provided to the Section. Since
Ordered Dicts inherit from Dicts, this shouldn't change anything.

Lastly, I have applied black formatting to everything to keep consistent.

* mypy on github (but not on my machine, oddly) didn't like the use of
aliases.get because SupportsLessThan is not part of it, even though
it worked in practice. While this new method is less elegant, it still
works well enough.

* Reorder import. Couple of formatting corrections from pylint.

* Update plotter.

* Created a polaron class as a child of Substitution.

* Trying out a "RelaxationTransformation" to help stitch together the
transformation history of a parent and child task.

* Tweak as_dict() summary.

* The GhostVacancy is a special vacancy object for CP2K calculations where
the pseudopotential of the atom is turned off, but the basis functions are
left behind.

* added dtype=object to np.array

* Update import statement

* Assembled basis sets, pseudopotential, and DFT+U settings into a single
settings file, which should be cleaner than using separate ones.

* Updated in accordance with new settings.yaml file.

* Update in accordance with new settings.yaml file.

* Update to one of the hubbard values.

* Fixed the getting of hubbard values.

* Updated predominance diagrams

* A temporary measure on my dev branch for avoid Potcar hashing with CP2K calculations

* These files are no longer necessary, as settings.yaml will hold all
this info.

* Update attribute.

* Update attribute

* Update the plot with "combine_charges" to make cleaner figures

* Keywords must have strings for names. Numbers will cause problems.

* Update to the mo_eigenvalues parsing for non-OT compatability. Parsing a raw
text file is incredibly sloppy.

* Update to the file reading. Subsections can have "subsection params",
some of which are used to alias the subsection, but some of which
are variables like "on" or "off". They are problematic from the
point of view of the Section objects. This will fix the issue of cp2k
restart files automatically assigning some of these superfluous
params, but is not a totally robust solution.

* Change to default diagonalization params.

* Update dict band gap.

* Updating mo_eigenvalues parsing, which is getting troublesome mixing
OT and diagonalization FWs now.

* Modifications to the DOS parser so that its more integrated with the
Dos module. Also made the DOS parser the default method for extracting
gaps, vbm, and cbm, which works for either OT or diagonalization.

* Update for aux_basis set. "match" keyword will now use primary basis, which
can be used for defect sites. Starting to get lots of spaghetti code here.
Also fixed indentation error.

* Added aux_basis argument

* Don't assume initial structure was parsed.

* Need hubbard > 0

* Update to diagonalization settings.

* Section Lists

* Reject if sizes don't match

* Testing new plot.

* Possible hash for defects.

* Nevermind.

* Remove warning.

* Update plus_u and xc_functional

* fix

* Update for functional processing. Do not like this as it relies on
having the input file present, but the text in the output file is too
variable to parse reliably.

* Still need to figure out how to get mos automatically. Upped it for
now. Elec temp of 500 maybe have been a little too high.

* Change how output parser gets the functional.

* Use the getattr operation to be case insensitive.

* Update section descriptions.

* So bader can use reference charges with cubes. Will submit patch and merge.

* This function should not default to cm^-3 units. Most/all of pymatgen does things per formula unit / composition object. So that's how this should be as well. The exponential is for an individual atom, and the multiplicity attribute will convert it to a per f.u. basis.

* Dos objects

* Use structure matcher instead of exact equality to deal with empty
site properties.

* fixing cube

* Fix for getting items. Now there is a seperate get function for keyword
and section, so internal methods can be more robust.

* Update to docstring and import.

* Adsorbent class.

* Concentration should be in formula units by default.

* (1) prettier plots

(2) get_doping() added since n/p are reused.

* Docstring.

* Add rotations

* forgot to make msonable

* Kpoints keyword

Scheme keyword was missing. Lead to errors.

* Updates for functionals

Some updates for functionals to make LibXC slightly easier.

* Parse initial structure

For some reason, one example did not show the default blank line. Not sure if it was the new version of cp2k or something else, but make the blank line a conditional now.

* �Parse initial structure

Last commit did not cover edge case. This should work.

* Updates.

Fix readline for both cases of coord table. Modify sets for hybrids. New utility function for truncation radius.

* optFIT basis

Opt fit basis sets should be preferred for C and B. These are not part of standard dist. and should be added to settings as well.

* Convergence criteria

Update the force convergence criteria to roughly match the MPRelaxSet

* Add option for strict updates.

Update now supports optional "strict" keyword to avoid inserting new keywords and sections.

* Update hybrid

Slight tweaks to logic for hybrid sets.

* Update test

Nonsense structure cannot use dummy element. Sets check validity of element inputs now.

* Update functionals

Hybrid was not being registered for PBE0 since only one functional is used.

* Add cutoff radius

Include cutoff radius even if not used in order to suppress warnings.

* Update core defects

Adsorbate is that species that adsorbs onto a surface, adsorbent is surface doing the adsorbing. Also the default defect concentration is back to cm-3. This is not ultimately desirable, but its causing too many problems.

* Updates

* Cutoff radius

Do not override. Just provide a warning. Also make the long range correction default to provide better energies when cutoff is very small.

* Fix cutoff

* Changes and Fixes

Newline in inputs.py

Adjust stride in sets.py

Read energies in outputs.py

* Update args

* Fix

* stress tensor

* Allow positive

* Updates

* New Thermo

* Remove temp

Temporary things were saved during a laptop crash. Now removed.

* SectionList

Modify the file reader with SectionList

* Double loop

* Drop undeveloped corrections and work on 2d correction

* Get section explicit

* Dont check if not running freysoldt

* Rename

Predominance diagram may confuse people. Will use Brouwer diagram instead.

DFT plus U is very finicky in cp2k due to basis sets. It only really works with LOWDIN scheme, which is only available in newer versions of the code, and even then it is prone to instabilities. For these reasons, plus u should not be the default.

* Formatting updates

* Deprecation upcoming

* Formatting

* Formatting

* Formatting

* Defect Complex and Linting

* Typing

* Emergency push

Laptop malfunctioning need to sync

* Update to correction

* Bandstructures

Input objects for band structure and ability to parse the band structure text files generated by cp2k.

* Revisions and Bandstructure support

Some formatting has been updated for new linting. Bandstructure support has been added to inputs and output parsers. Some updates to DOS parsers as well.

* Kpoints

* Format

* Revert to master

* Last minute fix. Needs to access num_kpts

* Needs to access num_kpts

* Old test and type check

* Some more lint

* Unused import

* Lint

* override_default_params

must not be none

* None->{}

* Add self

* Loops

* remove sigma

* A few changes to tests and test files

* fix assert

* AssertEqual

* Kpoints

* Dict already preserves ordering since py3.7

* use .items()

* EPS_PGF_ORB

* Gamma kpoint

* Basis and Potential

* None

* Dict

* startswith

* get element

* kpts

* Only use trunc if periodic

* Only use trunc if periodic

* ADMM Keywords

* Activate motion

* Delete

* runtype check

* molecules

* processor

* dft set

* multiplicity

* Exchange correction function

* pmg config

* Test set

* Basis and Potentials

* pmg config

* Molecule

* kpts

* Some tests

* Test sets

* Lintings

* Test files and tests

* Lint

* More lint

* Don't use built-in name

* list->List, dict->Dict

* Refactor pydantic to msonable

* Delete settings.yaml

* mypy 0.991 upgrade

* Util types and lint

* Uncomment a line

* Missing arg

* type

* arg

* Dftd3

Might as well add this

* Don't use literal

* Json

A hack so that jsanitized objects still have int keys

* Test for aux

* Tiny fixes

* rename some vars in pymatgen/cli/pmg_config.py to snake case

rename single-letter vars (discouraged by flake8)

* breaking: change CP2K _postprocessor parsing error type from OSError to ValueError

complete pymatgen/io/cp2k/utils.py _(pre|post)processor doc strs
add type anno

* Pre/postprocessor

* add return types in pymatgen/io/cp2k/sets.py

required for mypy to analyze functions, found potential dict unpacking error in DftSet subclasses
fix typos

Co-authored-by: Janosh Riebesell <janosh.riebesell@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants