Vasp options #799

liamhuber · 2022-10-06T17:28:23Z

Amendments to #769.

A new field on vasp input is created, the DataContainer code_specific_options, which in this case stores the flag allowing the structure order to preserved (which is the meat of #769).

Still needs tests.

I just copied the pull sudarshan did for direct positioning. In particular, I am not sure why there are two different places where he needed to place the same setting. I also don't understand the logic of "allow_reordering=not self.allow_structure_reordering" And also why in the second block of code, he adjusts the setter to be _write_in_direct_coordinates opposed to the first one where he writes write_in_direct_coordinates

Tests in next commit

…? Is this appropriate?

coveralls · 2022-10-06T17:39:51Z

Pull Request Test Coverage Report for Build 3226717749

26 of 34 (76.47%) changed or added relevant lines in 3 files are covered.
63 unchanged lines in 2 files lost coverage.
Overall coverage increased (+0.04%) to 68.635%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
pyiron_atomistics/vasp/potential.py	5	8	62.5%
pyiron_atomistics/vasp/structure.py	13	18	72.22%

Files with Coverage Reduction	New Missed Lines	%
pyiron_atomistics/interactive/quasi_newton.py	14	77.9%
pyiron_atomistics/atomistics/structure/atoms.py	49	75.88%

Totals
Change from base Build 3190143705:	0.04%
Covered Lines:	12077
Relevant Lines:	17596

💛 - Coveralls

ligerzero-ai · 2022-10-06T21:37:47Z

Upon closer inspection, it looks like this problem runs much deeper than I (we) thought, I did some initial scripting for tests and found that the structure was "sorting" itself even in the branch in which it shouldn't be. I found that this was due to the following lines:

in lines 102-104 the search for species strings to be written to the POSCAR relies on the sorting behaviour.

atom_numbers = structure.get_number_species_atoms()
if write_species:
    f.write(" ".join(atom_numbers.keys()) + endline)

So the writing behaviour gets the positions/coordinates right, but fails on the species ordering side, because the get_number_species_atoms implicitly relies on the sorting behaviour. This is really all too complicated for something that is already implemented elsewhere - e.g. the Structure.to(fmt="poscar", filename="POSCAR_test") from pymatgen.

I would prefer a complete refactoring of the write_poscar function in favour of a:

structure_pmg = pyiron_to_pymatgen(structure)
structure_pmg.to(fmt="poscar", filename="POSCAR_test")

This is a simple, quick and easy fix.

For the current default sorting behaviour, we can add that as an if-else statement.

However, it comes to my mind that I am not aware of how many other places rely on this kind of implicit ordering behaviour by pyiron. That could prove a major pain to weed out.

ligerzero-ai · 2022-10-06T21:58:07Z

So I can see that the get_number_species_atoms is called in 5 other places, so explicitly it's not too bad, since I can see that the others are just simple uses of that function without implicit reliance on sorting behaviour. But I don't know where else in the codebase there will be an implicit reliance on this kind of sorting behaviour.

I am especially worried about the reading of output, because if there is a similar implicit reliance on this kind of sorting behaviour for parsing it could be a major pain to refactor out of the codebase, not to mention future maintenance of backwards-compatibility with already-generated data.

Are there tests written in mind expecting this kind of implicit sorting behaviour? Or is input-output matching explicitly tested for without a reliance on this behaviour?

For example, are there tests for the OUTCAR parsing which read to see if the species order list matches what pyiron is expecting? e.g. does pyiron implicitly rely on this ordering behaviour to remap to the user-fed original ionic positions to store into the database? are there tests for this kind of implicit behaviour? e.g. if it suddenly stopped existing, will a test break? Or will it break silently?

tl;dr If I change the input now, won't output parsing which relies on this behaviour be completely f'd?

liamhuber · 2022-10-07T17:57:34Z

So the writing behaviour gets the positions/coordinates right, but fails on the species ordering side, because the get_number_species_atoms implicitly relies on the sorting behaviour.

@ligerzero-ai good testing and good catch!

I would prefer a complete refactoring of the write_poscar function in favour of a: ...

I am open to wrapping the pymatgen implementation for some/all of this functionality, although I don't know enough about it to comment on the exact implementation. In general, if someone already in our dependencies does something we want to do, that's the absolute perfect time to cut code out of our code base and just use their implementation!

So I can see that the get_number_species_atoms is called in 5 other places, so explicitly it's not too bad...

Indeed. Actually, the fact that get_number_species_atoms is returning a sorted list with all the types clustered together is totally fine. That's the behaviour I would want. So I don't think we will even need to "fix" those occurrences.

in lines 102-104 the search for species strings to be written to the POSCAR relies on the sorting behaviour.

Agreed, this is trouble. The most straightforward solution I see is to add a second if clause here that depends on allow_reordering. Something like:

if allow_reordering:
    atom_numbers = structure.get_number_species_atoms()
else:
    atom_numbers = get_ordered_species_count(structure)
if write_species:
    f.write(" ".join(atom_numbers.keys()) + endline)
num_str = [str(val) for val in atom_numbers.values()]

Where get_ordered_species_count counts up all the consecutive atoms of the same type in the structure.

tl;dr If I change the input now, won't output parsing which relies on this behaviour be completely f'd?

Maaaaaybe not? Elsewhere in pyiron_atomistics.vasp.structure, I see stuff that looks like it is reversing the reordering. If we're really lucky, it will just continue working in the case that allow_structure_reordering==True, and it will be a useless waste of flops but not break anything in the case that allow_structure_reordering==False.

Definitely I would be more comfortable if the VASP job simply stored (and serialized) the map to transform pyiron-indices to vasp-indices, and simply applied it in reverse on reading files. I'm cautiously optimistic that such a setup would simplify a lot of code in this module.

Be a bit patient with me here, I never looked in side vasp.structure until you brought up this issue 😝

For example, are there tests for the OUTCAR parsing which read to see if the species order list matches what pyiron is expecting? e.g. does pyiron implicitly rely on this ordering behaviour to remap to the user-fed original ionic positions to store into the database? are there tests for this kind of implicit behaviour? e.g. if it suddenly stopped existing, will a test break? Or will it break silently?

Excellent questions and I'm afraid I don't know. Reassuringly, there are a bunch of tests in tests/vasp, including test_structure (which in turn includes something called test_vasp_sorter which is great news), and test_outcar. Both of these make use of a relatively extensive set of static POSCAR and OUTCAR files, so (again, if we're lucky), there is only minimal work for us. All the tests passed on the continuous integration, but that's not shocking since none of the tests take us away from the default value of allow_structure_reordering=True. I suspect what's needed is to go through a bunch of the existing tests and add new if-branches where this is set to False and test that case too. Might also require some new reference files in tests/static/vasp_test_files to accomplish this.

Again though, I'm not a VASP user so I don't have any specific insight into this particular corner of our test suite. I'm happy to provide this sort of high level advice, but we should see at the meeting if there's someone at-institute to provide more fine-grained and real-time assistance. Maybe you already had a chance to talk with Sam about it?

tldr; Good investigating. Hopefully we will get away with a minor adjustment to the species and mimicking a bunch of existing tests 🤞

ligerzero-ai · 2022-10-07T18:21:36Z

Thanks so much for the response Liam, v. grateful.

I will take a deeper dive on the weekend, but I doubt that I will have something complete written by Monday. I will probably test this behaviour explicitly on the cluster using a simple case and see what the output reader does. I will also need to take a deeper look into the output parsing to see if I can catch anything there as well.

Hopefully there is no kind of implicit reliance on this kind of behaviour, but if there is I'll be sure to let the group know, and we can decide on how to proceed with grandfathering in of old datasets after we implement this change.

pyiron_atomistics/vasp/base.py

liamhuber · 2022-10-10T13:35:39Z

Per pyiron meeting (2022-10-10), renamed input.code_specific_options to input.options.

ligerzero-ai · 2022-10-11T10:48:28Z

So, it looks like the problems are only just beginning - I can see that the POTCAR generation isn't explicitly reading the structure at runtime and relies on the default sorting behaviour:

So I'm off to fix this (POTCAR generation), but I'm not holding my breath for output parsing being rigorous with it's implementation either.

ligerzero-ai · 2022-10-11T15:09:10Z

Okay, this just got way more complicated, actually ( :((((( <---- me ). I realise now that the potential generation is dependent on the implicit ordering, 2716561, but I also cannot just feed it the raw structure, because the sorting takes place at the write_poscar step, which means that there is no way to neatly feed the POTCAR the structure and have it generate it, outside of adding another flag which should disable the default sorting behaviour.

Imho, I feel quite strongly that it's terrible practice to include sorting inside input generation routines. The expected behaviour of any filewriter should be that it generates what is fed to it, without alterations.

So, it's my view that the writer should be "dumb" and the sorting should happen outside of it. If I feed a potcar a structure: structure it should generate the POTCAR directly from the structure. If you want a sorted POTCAR, you should sort your structure before you feed it into your POTCAR writer. Right now, the potcar generator is performing sorting inside of the function, similar to what is happening in write_poscar.

What is even more concerning is that this behaviour is scattered in the output as well: the output parsing is not explicitly reading a mapped list of the indices. It's relying on the sorting behaviour being the same as what is used to generate the structure, which you can see in lines 1978-onwards in the collect function in vasp.base. This is just really nasty overall.

Currently:
vasp job -> structure -> writers which perform sorting magic inside of them --> output parsed with implied sorting magic assuming that the same function was used.

My proposal:
vasp job -> structure -> sorted_structure (can be turned off with a flag) -> user struct---written input struct map stored to hdf5 -> "dumb" writers for input file generation --> output parsed as is --> re-mapping explicitly using the hdf5 stored user vs input map.

This leaves exactly 0 room for any kind of bs happening as a result of the sorting behaviour. There's no reliance whatsoever on the input and output functions calling the same function, and is far more maintainable overall.

Imho, this calls for some heavy re-writing of the behaviour of the vasp modules because the fact that the map is implicitly relying on the same function being called at input and output makes me feel terribly uncomfortable in general. If you change one, you will break the other, silently.

ligerzero-ai · 2022-10-11T15:52:00Z

And to top it all off, you can see that the unittests pass on my latest commit, which means that the tests are failing to catch the mismatch between input and output. This should fail with the elements in the generated POTCAR having a mismatch with what is specified in the generated POSCAR.

ligerzero-ai · 2022-10-11T17:28:31Z

Evidence of unsorted POSCARs breaking output parsing:

RAW DATA:
CONTCAR, magmom datablock from OUTCAR of job:

Fe14 Ni1 P1
   1.00000000000000
     5.7400000000000002    0.0000000000000000    0.0000000000000000
     0.0000000000000000    5.7400000000000002    0.0000000000000000
     0.0000000000000000    0.0000000000000000    5.7400000000000002
   Fe   P    Fe   Ni   Fe
     1     1     3     1    10
Direct
  0.0025664630704604  0.0032319176062993  0.0032274764708489
  0.2499994228784569  0.2500029153329241  0.2500007627117586
  0.0025666904584504  0.4967683803040508  0.0032278685645549
  0.2499989124577435  0.7499961246198761  0.2500012030899373
  0.4974336564145658  0.0032323325444499  0.0032278986920338
  0.7500007266307670  0.2500018099441704  0.2500006522263918
  0.4974329547739163  0.4967689527429046  0.0032274639824345
  0.7500011310338164  0.7499959463162306  0.2500012500073081
  0.0025662305143739  0.0032323470480516  0.4967722750902888
  0.2499989179606291  0.2500039379306188  0.7499988070948959
  0.0025672336441982  0.4967688955773852  0.4967728567290476
  0.2499989374875279  0.7499959701448191  0.7499987773856067
  0.4974331611726478  0.0032319315360019  0.4967728211408048
  0.7500011539600592  0.2500039814947535  0.7499987588800141
  0.4974333446765785  0.4967684450015159  0.4967723431120620
  0.7500010628658087  0.7499961118559484  0.7499987848220121

 magnetization (x)

# of ion       s       p       d       tot
------------------------------------------
    1       -0.012  -0.034   2.425   2.378
    2        0.004  -0.072   0.000  -0.068
    3       -0.012  -0.034   2.426   2.379
    4       -0.004  -0.041   2.547   2.501
    5       -0.012  -0.034   2.425   2.378
    6       -0.021  -0.058   0.807   0.728
    7       -0.012  -0.034   2.426   2.379
    8       -0.011  -0.032   2.622   2.578
    9       -0.012  -0.034   2.425   2.378
   10       -0.004  -0.041   2.547   2.501
   11       -0.012  -0.034   2.425   2.379
   12       -0.015  -0.043   2.667   2.610
   13       -0.012  -0.034   2.425   2.378
   14       -0.011  -0.032   2.622   2.578
   15       -0.012  -0.034   2.425   2.379
   16       -0.011  -0.068   2.561   2.483
--------------------------------------------------
tot         -0.172  -0.662  35.773  34.939

note P -0.068 and Ni 0.728

IN JUPYTER NOTEBOOK:

job.get_structure() prints

Fe: [0.01473148 0.01855122 0.01852574]
P: [4.30500608 4.30497767 4.304993  ]
Fe: [1.43499667 1.43501676 1.43500436]
Fe: [0.0147328  2.8514505  0.01852797]
Fe: [1.43499374 4.30497773 1.43500689]
Ni: [2.85526737 2.8514509  2.85147323]
Fe: [2.85526921 0.01855357 0.01852815]
Fe: [4.30500419 1.43501039 1.43500373]
Fe: [2.85526513 2.85145377 0.01852562]
Fe: [4.30500649 4.30497675 1.43500718]
Fe: [0.01473016 0.01855369 2.85147289]
Fe: [1.4349938  1.43502262 4.30499317]
Fe: [0.0147359  2.85145349 2.85147622]
Fe: [1.43499392 4.30497687 4.304993  ]
Fe: [2.85526634 0.01855128 2.85147599]
Fe: [4.3050066  1.43502285 4.30499288]
tags: 
    spin: [(0: 2.378) (1: 2.483) (2: -0.068) (3: 2.379) (4: 2.501) (5: 2.379) (6: 2.378) (7: 0.728) (8: 2.379) (9: 2.578) (10: 2.378) (11: 2.501) (12: 2.379) (13: 2.61) (14: 2.378) (15: 2.578)]
pbc: [ True  True  True]
cell: 
Cell([5.74, 5.74, 5.74])

note P 2.483 Ni 2.379, when in fact it should be -0.068 (2) and 0.728 (7).

ligerzero-ai · 2022-10-11T18:46:15Z

@samwaseda #769 (comment)

So what you are saying is that the class Output should also include a self.options = VaspSpecificOptions() as it is in the Input() class currently? Then we pass different structures using structure[id_pyi_to_dft] according to the bool value of self.options.allow_reordering?

Ok, great, I think I get it.

Then we can safely remove all the vasp_sorter and get_chemical_species calls inside the actual writers and parsers and just have the raw output parsed as is before adding the final conversion back at the end of variable assignment.

@property
def id_pyi_to_dft(self):
    if self.input.option.allow_reorder:
        return self._reordered_indices
    return np.arange(len(self.structure))

samwaseda · 2022-10-11T18:54:52Z

So what you are saying is that the class Output should also include a self.options = VaspSpecificOptions() as it is in the Input() class currently? Then we pass different structures according to the bool value of self.options.allow_reordering?

No I'm saying the permutation should happen only in this one variable, so that you don't need any other if ... else anywhere else. The storage should be only one place, and I'm fine with self.input.options.allow_reordering.

Bumps [pymatgen](https://github.com/materialsproject/pymatgen) from 2022.9.21 to 2022.10.22. - [Release notes](https://github.com/materialsproject/pymatgen/releases) - [Changelog](https://github.com/materialsproject/pymatgen/blob/master/CHANGES.rst) - [Commits](materialsproject/pymatgen@v2022.9.21...v2022.10.22) --- updated-dependencies: - dependency-name: pymatgen dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>

Bumps [mp-api](https://github.com/materialsproject/api) from 0.29.1 to 0.29.3. - [Release notes](https://github.com/materialsproject/api/releases) - [Commits](materialsproject/api@v0.29.1...v0.29.3) --- updated-dependencies: - dependency-name: mp-api dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>

Bumps [scipy](https://github.com/scipy/scipy) from 1.9.1 to 1.9.3. - [Release notes](https://github.com/scipy/scipy/releases) - [Commits](scipy/scipy@v1.9.1...v1.9.3) --- updated-dependencies: - dependency-name: scipy dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>

For Methfessel-Paxton smearing the occupation of orbitals can be negative. Checking >0 in check_band_occupancy is therefore not enough.

Bumps [pint](https://github.com/hgrecco/pint) from 0.19.2 to 0.20.1. - [Release notes](https://github.com/hgrecco/pint/releases) - [Changelog](https://github.com/hgrecco/pint/blob/master/CHANGES) - [Commits](hgrecco/pint@0.19.2...0.20.1) --- updated-dependencies: - dependency-name: pint dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>

Bumps [scikit-learn](https://github.com/scikit-learn/scikit-learn) from 1.1.2 to 1.1.3. - [Release notes](https://github.com/scikit-learn/scikit-learn/releases) - [Commits](scikit-learn/scikit-learn@1.1.2...1.1.3) --- updated-dependencies: - dependency-name: scikit-learn dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>

Previously get_structure returned added all elements in the storage to each structure and defined the indices accordingly, even if the structure only contains a subset of these elements. Not all codes can handle this, so I'm restricting this again.

review-notebook-app · 2022-11-04T21:06:32Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

stale · 2022-11-23T04:11:43Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

ligerzero-ai · 2022-11-23T21:54:40Z

Closed because rebase failed to satisfy unittests requirement for traitlets

Superceded by #880

ligerzero-ai and others added 7 commits September 20, 2022 22:54

Fixes issues raised by liam's review (?)

fc75547

Tests in next commit

Some other edits - remove verbose setter getter in the vaspbase class…

25dae53

…? Is this appropriate?

Make VASP specific inputs

3e8e911

Use vasp specific options

c79949a

De/Serialize options

23aebb6

liamhuber assigned ligerzero-ai Oct 6, 2022

liamhuber mentioned this pull request Oct 6, 2022

Create a setting to forbid sorting behaviour for POSCARs in pyiron #769

Closed

liamhuber commented Oct 10, 2022

View reviewed changes

pyiron_atomistics/vasp/base.py Outdated Show resolved Hide resolved

liamhuber commented Oct 10, 2022

View reviewed changes

pyiron_atomistics/vasp/base.py Outdated Show resolved Hide resolved

liamhuber commented Oct 10, 2022

View reviewed changes

pyiron_atomistics/vasp/base.py Outdated Show resolved Hide resolved

liamhuber commented Oct 10, 2022

View reviewed changes

pyiron_atomistics/vasp/base.py Outdated Show resolved Hide resolved

liamhuber added 4 commits October 10, 2022 06:34

Rename code_specific_options to options

a1d0ca3

Rename code_specific_options to options

a3543c4

Rename code_specific_options to options

34fc829

Rename code_specific_options to options

deea4fb

Fix potentials reliance on implicit chemical order

2716561

samwaseda and others added 20 commits November 4, 2022 22:06

replace start_job by ref_job

191fb16

[dependabot skip] Update environment

32a5029

[dependabot skip] Update environment

e6dae31

[dependabot skip] Update environment

389ef1b

Update environment.yml

bacb407

Update setup.py

206e032

add warning when job already compressed

5919cdf

satisfy Black

8b7979c

minor bugfix

efb4a8d

Add methods to convert MP results to StructureStorage/list

602676d

Check for negative occupation

809da14

For Methfessel-Paxton smearing the occupation of orbitals can be negative. Checking >0 in check_band_occupancy is therefore not enough.

[dependabot skip] Update environment

0f24f48

Codacy only for PRs, continue on error

29a0bca

[dependabot skip] Update environment

338a17d

Merge branch 'master' into vasp_options

d5c0867

stale bot added the stale label Nov 23, 2022

ligerzero-ai removed the stale label Nov 23, 2022

ligerzero-ai mentioned this pull request Nov 23, 2022

[new] vasp options #880

Closed

ligerzero-ai removed a link to an issue Nov 23, 2022

Pyiron generated VASP inputs: Option to preserve atomic ordering #752

Open

ligerzero-ai closed this Nov 23, 2022

liamhuber mentioned this pull request Jan 23, 2023

Add high-level spec pyiron/pyiron_base#974

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Vasp options #799

Vasp options #799

liamhuber commented Oct 6, 2022

coveralls commented Oct 6, 2022 •

edited

ligerzero-ai commented Oct 6, 2022 •

edited

ligerzero-ai commented Oct 6, 2022 •

edited

liamhuber commented Oct 7, 2022

ligerzero-ai commented Oct 7, 2022

liamhuber commented Oct 10, 2022

ligerzero-ai commented Oct 11, 2022

ligerzero-ai commented Oct 11, 2022 •

edited

ligerzero-ai commented Oct 11, 2022 •

edited

ligerzero-ai commented Oct 11, 2022

ligerzero-ai commented Oct 11, 2022 •

edited

samwaseda commented Oct 11, 2022

review-notebook-app bot commented Nov 4, 2022

stale bot commented Nov 23, 2022

ligerzero-ai commented Nov 23, 2022

Vasp options #799

Vasp options #799

Conversation

liamhuber commented Oct 6, 2022

coveralls commented Oct 6, 2022 • edited

Pull Request Test Coverage Report for Build 3226717749

💛 - Coveralls

ligerzero-ai commented Oct 6, 2022 • edited

ligerzero-ai commented Oct 6, 2022 • edited

liamhuber commented Oct 7, 2022

ligerzero-ai commented Oct 7, 2022

liamhuber commented Oct 10, 2022

ligerzero-ai commented Oct 11, 2022

ligerzero-ai commented Oct 11, 2022 • edited

ligerzero-ai commented Oct 11, 2022 • edited

ligerzero-ai commented Oct 11, 2022

ligerzero-ai commented Oct 11, 2022 • edited

samwaseda commented Oct 11, 2022

review-notebook-app bot commented Nov 4, 2022

stale bot commented Nov 23, 2022

ligerzero-ai commented Nov 23, 2022

coveralls commented Oct 6, 2022 •

edited

ligerzero-ai commented Oct 6, 2022 •

edited

ligerzero-ai commented Oct 6, 2022 •

edited

ligerzero-ai commented Oct 11, 2022 •

edited

ligerzero-ai commented Oct 11, 2022 •

edited

ligerzero-ai commented Oct 11, 2022 •

edited