Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VASP/Phonopy Phonon Workflow #137

Merged
merged 77 commits into from
Sep 6, 2022
Merged

VASP/Phonopy Phonon Workflow #137

merged 77 commits into from
Sep 6, 2022

Conversation

JaGeo
Copy link
Member

@JaGeo JaGeo commented May 23, 2022

Summary

This is a work in progress implementation of a Phonon Workflow with VASP and Phonopy. It is based on the work and experiences of @ab5424 in his Master's thesis (for atomate).

I'm still a long way from finishing the workflow, but I believe I've identified the most critical classes that need to be developed and have an idea regarding data transfer.

Of course, feedback is welcome now, but I believe it requires some more changes from my side before someone of you invests too much time.

@JaGeo
Copy link
Member Author

JaGeo commented May 23, 2022

As I know that @gpetretto and @davidwaroquiers are also working on transferring workflows from abiflows to atomate2, there might be already some ideas about a Document for PhononData. If so, I would be happy to hear about it.

@utf
Copy link
Member

utf commented May 23, 2022

Thanks very much for this Janine!

I'll aim to take a proper look later this week.

Would you mind installing the pre-commit? This will automatically raise (and most likely fix) the linting errors. To install you can run the following in the root directory:

pip install pre-commit
pre-commit install

For this first time, you can run the pre-commit on all files:

pre-commit run --all

Afterwards it will get run automatically whenever you commit something.

@codecov
Copy link

codecov bot commented May 23, 2022

Codecov Report

Merging #137 (faf9515) into main (e7e4133) will increase coverage by 1.48%.
The diff coverage is 92.16%.

@@            Coverage Diff             @@
##             main     #137      +/-   ##
==========================================
+ Coverage   73.51%   74.99%   +1.48%     
==========================================
  Files          49       52       +3     
  Lines        4349     4675     +326     
  Branches      686      736      +50     
==========================================
+ Hits         3197     3506     +309     
- Misses        980      983       +3     
- Partials      172      186      +14     
Impacted Files Coverage Δ
src/atomate2/vasp/sets/base.py 67.66% <83.33%> (+0.08%) ⬆️
src/atomate2/vasp/jobs/phonons.py 84.61% <84.61%> (ø)
src/atomate2/vasp/flows/phonons.py 94.44% <94.44%> (ø)
src/atomate2/vasp/schemas/phonons.py 95.68% <95.68%> (ø)
src/atomate2/common/jobs.py 100.00% <100.00%> (+100.00%) ⬆️

@JaGeo
Copy link
Member Author

JaGeo commented May 23, 2022

Thanks. Will do. As I said, it really still needs some days of work and testing but it probably helps to get some suggestions with regard to general design now.

@gpetretto
Copy link
Contributor

Hi Janine, it is very good to see that you started working on this.

I have one general comment. I think that this would be a good example where the workflow does not need to be specific to a DFT code. Roughly the basic idea is that it would be fine to pass Makers from a different code to static_energy_maker and the others. Correct me if I am wrong, but it seems that the portion of the code strictly related to vasp is already relatively small, so it might be feasible to isolate it and abstract what is needed from a calculation with a specific code.

I don't know if @utf has any thought about this. Anyway, since the implementation of the abinit workflows is not ready yet, I would propose to proceed with the implementation for vasp and we will try to generalize it at a later stage when more interfaces with other codes are available. Maybe already trying to keep this into account and trying to separate the part that are specific to vasp from those that are not. Does this seem reasonable?

Concerning the output, we do not have anything specific for the phonons yet, sorry. However, based on how I usually work with phonons, for phonopy I think it would be important to have access to the Phonopy instance to regenerate the BS and DOS with other options or any other property(i.e. to load it more or less like with phonopy.load). Since all the parsed data is written in yaml format by phonopy (using PhonopyYaml) it should not be complicated to just store it in the database as well.

jobs.append(vasp_displacement_calcs)

# Computation of BORN charges
if self.born_maker is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two comments about this point:

  1. Maybe you already planned to update this later, but for how it is written now the BECs are always calculated. However there are good reasons to skip this step (e.g. metals). It would be good that None means that BECs are not calculated.
  2. with the born_maker: BaseVaspMaker = field(default_factory=StaticSetGenerator) it will enter this if only if born_maker is explicitly set to None. Also the default of the of born_maker will not include lepsilon. An option would be to define the born_maker:
    born_maker: BaseVaspMaker = field(default_factory=lambda: StaticSetGenerator(lepsilon=True))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing this out. I will add this option by using born_maker: BaseVaspMaker |None = field(default_factory=lambda: StaticSetGenerator(lepsilon=True)) similar to what is done for the bulk_relax_maker. I was actually thinking about doing this but somehow did not completely implement this option ...

@JaGeo
Copy link
Member Author

JaGeo commented May 30, 2022

As mentioned, I have added one of the suggestions by Guido and the pre-commit flags. I am happy to hear more. Otherwise, I hope that I can refine the workflow in the coming weeks.
In parts of the workflow, I am using directories to access the files to create the output data. I guess it would be much cleaner if I would transfer forces and born data as outputs. This would be something, where I would appreciate an opinion.

@JaGeo
Copy link
Member Author

JaGeo commented May 31, 2022

Switched to transfer via outputs for now. I think this should simplify extending the workflow to other codes as well.

Copy link
Member

@utf utf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @JaGeo, apologies its taken some time to look at this in detail.

Overall it looks great! This is exactly along the lines of what I envisioned. I definitely think using the outputs directly (rather than files) is a better approach. I don't think the amount of data transfer will be too bad.

I have two main comments:

  1. We need to be careful about relying on objects inside of Flow Maker make functions. Any function that actually requires the objects should be wrapped in a job.
  2. I think it would be useful to refactor the generate_frequencies_eigenvectors so that most of the work is done inside the pydantic model.

I've gone into a lot more detail in the code review.

If anything isn't clear, I'd be happy to discuss on here/email/zoom. Thanks again for this. I'm very excited to have a phonon workflow in atomate2.

generate_phonon_displacements_kwargs: dict = field(default_factory=dict)
run_phonon_displacements_kwargs: dict = field(default_factory=dict)
born_maker: BaseVaspMaker | None = field(
default_factory=lambda: StaticSetGenerator(lepsilon=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a VaspMaker rather than an input set generator. There is already a core job Maker that does what you need:

from atomate2.vasp.jobs.core import DielectricMaker

And then:

Suggested change
default_factory=lambda: StaticSetGenerator(lepsilon=True)
default_factory=DielectricMaker

default_factory=lambda: DoubleRelaxMaker.from_relax_maker(TightRelaxMaker())
)
static_energy_maker: BaseVaspMaker | None = field(
default_factory=StaticSetGenerator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a VaspMaker rather than an input set generator. E.g.:

from atomate2.vasp.jobs.core import StaticMaker

And then:

Suggested change
default_factory=StaticSetGenerator
default_factory=StaticMaker

Comment on lines 143 to 144
sga = SpacegroupAnalyzer(structure, symprec=self.symprec)
structure = sga.get_primitive_standard_structure()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The make function should be designed so that it accepts OutputReferences as well as the actual structure objects. This allows you to use the Maker in the middle of a flow. E.g.

initial_relax_job = MyCustomRelaxMaker().make(structure)
phonon_flow = PhononMaker().make(initial_relax_job.output.structure)

The way the workflow is currently written, this will fail as SpacegroupAnalyzer will get passed an OutputReference object.

The way around this is doing everything that actually needs the real objects inside a job. E.g., you should make a new job that just converts the structure to the primitive cell and returns that. Something like:

@job
def structure_to_primitive(structure, symprec):
    sga = SpacegroupAnalyzer(structure, symprec=symprec)
    return sga.get_primitive_standard_structure()

And then inside the make function:

Suggested change
sga = SpacegroupAnalyzer(structure, symprec=self.symprec)
structure = sga.get_primitive_standard_structure()
prim_job = structure_to_primitive(structure, self.symprec)
structure = prim_job.output

symprec: float = SETTINGS.SYMPREC
displacement: float = 0.01
min_length: float = 20.0
conventional: bool = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little bit iffy about converting the structure to the primitive cell without any option to disable this. Could you turn option into something more sophisticated. E.g., something like "symmetrize" where they options could be "primitive", "conventional", None. Where None disables any structure modification.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed but then we should probably also make the computation of the band structure optional?

Copy link
Member

@utf utf Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially? Although it might be possible to use the AUTO option for PRIMITIVE_AXIS to automatically align the band structure whatever input structure you use?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, true.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I know this option. Need to check.

Copy link
Member Author

@JaGeo JaGeo Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to be able to use the kpath options from pymatgen. I am not sure phonopy uses the same convention for the primitive cell (actually think it does not).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have thought about it further:
what we could do is allowing to switch to another kpath scheme than seekpath when the primitive cell in the corresponding convention is chosen for the phonon run. We could use seekpath as a default though.


# Computation of BORN charges
if self.born_maker is not None:
born_job = StaticMaker(input_set_generator=self.born_maker).make(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If self.born_maker is a Maker then you don't need to initialise the maker like this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Understood!

return displacements


@job(output_schema=PhononBSDOSDoc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think phonon band structures and density of states should be stored in the data store (for large objects). You can do this automatically using:

Suggested change
@job(output_schema=PhononBSDOSDoc)
from pymatgen.phonon.bandstructure import PhononBandStructure
from pymatgen.phonon.dos import PhononDos
@job(output_schema=PhononBSDOSDoc, data=[PhononDos, PhononBandStructure])

# print(born)
if born is not None:
if not conventional:
primitive = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think phonopy now has tools to determine the primitive matrix automatically?

src/atomate2/vasp/jobs/phonons.py Show resolved Hide resolved
src/atomate2/vasp/jobs/phonons.py Show resolved Hide resolved
def get_phonon_object(
displacement, min_length, structure, sym_reduce, symprec, conventional
):
transformation = CubicSupercellTransformation(min_length=min_length)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than having this here (where CubicSupercellTransformation will get called multiple times at different parts of the workflow), could it be refactored so that the supercell matrix is calculated once at the beginning of the flow and then passed directly to the various other jobs throughout.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed. I will indeed change this.

@JaGeo
Copy link
Member Author

JaGeo commented Jun 8, 2022

Thank you very much for the detailed review. I will try to improve the code according to these suggestions. I hope I can manage to find time in the next 1-2 weeks :).

@JaGeo
Copy link
Member Author

JaGeo commented Jun 9, 2022

I will let you know once I have a next version. Still a lot of testing and debugging necessary and also work on the document.

@utf
Copy link
Member

utf commented Jun 9, 2022

The changes you've made already look great! Happy to help diagnose any bugs should they come up.

# I don't see it
# I think I am missing something in the code
phonon.save("phonopy.yaml")
phonon = load("phonopy.yaml")
Copy link
Member Author

@JaGeo JaGeo Jun 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@utf : I currently have a problem here. I can only get correct bandstructures with NAC correction when I save the phonon object and then reload. There should be another way but I currently don't find the bug. If you have any idea, I am happy to hear. I guess something goes wrong in the construction of the dynamical matrix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am attaching minimal example where it does not happen. I can currently not reproduce this bug outside of the workflow.
minimal_phonopy.zip

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid I can't see what is going wrong!

Can you confirm that you're using the same phonopy version outside of the workflow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will confirm this and check again but thanks already for having a look!

Copy link
Member Author

@JaGeo JaGeo Jun 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have upgraded to the latest phonopy version. It is still happening. I think I will continue with this file creation part for now and try to finalize the document part first. And I tested the other script with the phonopy versions from this year. They all don't have any problems at Gamma.
phonon_band_structure-01

Copy link
Member Author

@JaGeo JaGeo Jun 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the script that I am using for the workflow. I might need to try it in a completely new environment to make sure this is also not the reason for this problem.
start_workflow.zip
(There might be some differences in the q-points in the files. I know and I am on this but the problem is at Gamma and not affected)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this could be related to the same issue I had to fix in sumo. Basically, the default symprec for generating the distortions is 1e-5 but I was applying a different symprec during the analysis. If I used the same symprec of 1e-5 when generating the band structure it fixed the issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the workflow is using the same symprec throughout but it is quite high. I wonder if it will fix things if you use 1e-5?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try! Thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You were correct. It was the symprec... I have now introduced a second symprec setting.

I will try to further clean up the code in the next days and check the kpath setting.

@JaGeo
Copy link
Member Author

JaGeo commented Jun 17, 2022

Open points:

  • Code is still a bit messy, needs some refactoring
  • Cleanup of documentation, making sure every important parameter can be somehow adapted
  • Need to check kpath creation and correspondence with created cells again
  • Check if symprec is small enough for everything
  • Larger convergence test
  • make tests on real systems with real supercell sizes
  • check this phonopy object again and see why the object has to be saved and loaded
  • decision what we would like to save in the document (should some more fields live in a data store for large data?)
  • summarize some parameters in document in fields
  • write unit tests
  • potentially create another way to determine supercell matrices that prefers 90 degree angles [Added some option to pymatgen to test if the cell is 90 degree]

@JaGeo
Copy link
Member Author

JaGeo commented Jun 18, 2022

@utf in case you have some time in the next weeks, I would be happy about feedback on the document part.

I think we need to store force constants, born charges etc. to easily develop workflows for Grüneisen parameters/quasi-harmonic approximation.

)
# could be optional and implemented at a later stage?
npoints_band: int = Field("number of points for band structure computation")
kpath_scheme: str = Field("indicates the kpath scheme")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if these calculation settings could all be grouped into a subfield. Something e.g., just "calculation_settings" or "phonopy_settings". As if more get added later this could become quite messy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have to think about this field again. Will try to include it in one of the next pull-requests. Overall, the workflow is quite complex and I have to go through it again several times with different test data.

)
# have to add all folders for the computations!
# TODO: should we add folder for born and optimization as well?
uuids: List[str] = Field(None, description="The uuids of the displacement jobs.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you also want the UUID of the bulk structure to obtain the total energy?

src/atomate2/vasp/jobs/phonons.py Show resolved Hide resolved
src/atomate2/vasp/flows/phonons.py Show resolved Hide resolved
@utf
Copy link
Member

utf commented Jun 23, 2022

I think we need to store force constants, born charges etc. to easily develop workflows for Grüneisen parameters/quasi-harmonic approximation.

I think its a good idea to store the force constants too. Maybe there could be a flag to disable storing them if people are want to be very space conscious or e.g., if you are only doing convergence calculations.

@JaGeo
Copy link
Member Author

JaGeo commented Sep 2, 2022

Sorry... Added it to the wrong workflow ....

@JaGeo
Copy link
Member Author

JaGeo commented Sep 2, 2022

Had to add seekpath to the optional requirements [phonons] as well. Everytime I install phonopy, I forget to install seekpath ...

@JaGeo
Copy link
Member Author

JaGeo commented Sep 2, 2022

Tests have passed. I would reply to any further comments later today or next week.

Copy link
Member

@utf utf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @JaGeo, a few more observations.

I'm really happy how this PR turned out. Thanks for all your edits. Personally, I think once you've made these changes we can merge this and then do the more substantial testing afterwards.

@@ -26,7 +26,6 @@ jobs:
pip install .[strict]
pip install .[tests]
pip install .[dev]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add this back in please.

@@ -57,6 +56,7 @@ jobs:
pip install .[strict]
pip install .[tests]
pip install .[docs]
pip install .[phonons]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a thought. Could you also add phonopy and seekpath to the "strict" dependency set and specify the most recent version with ==. You can then remove this line.

This will mean that the tests are deterministic. E.g., if a new phonopy version is released that breaks the API, the tests won't start failing randomly.

pyproject.toml Show resolved Hide resolved
if set to True, supercell algorithm will first try to find a supercell
with 3 90 degree angles
get_supercell_kwargs: additional arguments
to determine supercell
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This the type for this parameter is wrong, it should be dict. Also can you list the function which these parameters will be passed to. That way the user can look this function up to see what options are supported.

Comment on lines 109 to 116
you can only use seekpath with any kind of cell
Otherwise, please use the standard primitive structure
Available schemes are:
"seekpath", "hinuma", "setyawan_curtarolo", "latimer_munro".
"seekpath" and "hinuma" are the same definition but
seekpath can be used with any kind of unit cell as
it relies on phonopy to handle the relationship
to the primitive cell and not pymatgen
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines are indented by 1 extra space.

src/atomate2/vasp/jobs/phonons.py Show resolved Hide resolved
if True, symmetry will be used to generate displacements
symprec: float
precision to determine symmetry
use_symmetrized_structure: str»|None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
use_symmetrized_structure: str»|None
use_symmetrized_structure: str or None

src/atomate2/vasp/jobs/phonons.py Show resolved Hide resolved
born: Matrix3D
Born charges
kwargs:
additional arguments that are passed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say where they are to PhononBSDOSDoc.from_forces_born.

Comment on lines 308 to 310
displacements
structure: original structure for meta data
supercell_matrix: supercell matrix for meta data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing types and descriptions.

@JaGeo
Copy link
Member Author

JaGeo commented Sep 5, 2022

Thanks. Will go through all of the suggested changes today!
I am currently trying to setup everything on another computer so that I can hopefully test more compounds faster. I still sometimes have weird issues with custodian and VASP for extremely simple systems that I need to resolve ...

@JaGeo
Copy link
Member Author

JaGeo commented Sep 5, 2022

I hope I added all suggestions. I made one more change on the tests. I realized on the weekend that I had several useless statements in there ...

We could rerun the tests again. I hope that I changed the installation configurations correctly. I will be really happy if you don't need to restart the automatic tests by hand every time I make a mistake ...

@JaGeo
Copy link
Member Author

JaGeo commented Sep 5, 2022

Yeah, of course, more typos. I usually use the read-out-loud function everywhere to find such things... Now, I think it should be fine ;).

@utf
Copy link
Member

utf commented Sep 5, 2022

I will be really happy if you don't need to restart the automatic tests by hand every time I make a mistake ...

I think once this is merged GitHub won't require me to do it manually any more, since you will officially be a contributors. 🤞

One final comment. Are you happy if I merge once that is resolved?

A structure.
symprec : float
The symmetry precision.
structure: Structure object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these should be indented an extra level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, tests have to be restarted, of course. Sorry.

@JaGeo
Copy link
Member Author

JaGeo commented Sep 5, 2022

Yes, I would be happy if this is merged. I will then run several test structures, use it in my research and start a new pull-request to fix potential bugs in case they show up.

@utf
Copy link
Member

utf commented Sep 6, 2022

Thanks very much @JaGeo. This is an exemplar workflow!

@utf utf merged commit 4dbeccb into materialsproject:main Sep 6, 2022
@JaGeo
Copy link
Member Author

JaGeo commented Sep 6, 2022

Thank you so much

@JaGeo JaGeo changed the title [WIP] VASP/Phonopy Phonon Workflow VASP/Phonopy Phonon Workflow May 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants