Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IO support for res files (AIRSS results) #2625

Merged
merged 26 commits into from
Aug 31, 2022

Conversation

ScottNotFound
Copy link
Contributor

@ScottNotFound ScottNotFound commented Aug 18, 2022

Summary

Include a summary of major changes in bullet points:

  • Adds res file io to handle results from airss searches, Parsing AIRSS results into structures #2560
  • This isn't proper support for the shelx format (does anyone use that anymore?), but if all we care about is airss then that's fine.

Additional dependencies introduced (if any)

  • no new dependencies

TODO (if any)

If this is a work-in-progress, write something about what else needs
to be done

  • It only provides functions to build Structure types. I think there is an Entry class somewhere that uses energy for phase diagrams and whatnot. The parser fills out a parse tree made of dataclasses, so the energy, pressure, volume, and other stuff airss sticks in the header is trivial to pull out. I can add that next time I have down time.
  • Make all the docstrings in the proper format

Checklist

Work-in-progress pull requests are encouraged, but please put [WIP]
in the pull request title.

Before a pull request can be merged, the following items must be checked:

  • Code is in the standard Python style. The easiest way to handle this
    is to run the following in the correct sequence on your local machine. Start with running
    black on your new code. This will automatically reformat
    your code to PEP8 conventions and removes most issues. Then run
    pycodestyle, followed by
    flake8.
  • Docstrings have been added in the Google docstring format.
    Run pydocstyle on your code.
  • Type annotations are highly encouraged. Run mypy to type check your code.
  • Tests have been added for any new functionality or bug fixes.
  • All linting and tests pass.
  • Support for Entries.

Note that the CI system will run all the above checks. But it will be much more efficient if you already fix most
errors prior to submitting the PR. It is highly recommended that you use the pre-commit hook provided in the pymatgen
repository. Simply cp pre-commit .git/hooks and a check will be run prior to allowing commits.

ScottNotFound and others added 14 commits August 18, 2022 16:47
- blank lines
- END tag
- empty tags
- typing names for py38
- pydoc for the things pydocstyle complained about
- Removed exposure of the Res dataclass.
- Added providers that expose features of the res file in a clean way.
- Moved all the private module level functions with bad names to appropriate classes for their use.
- Added a writer class that encapsulates all those private module level functions that supported writing.
@mkhorton
Copy link
Member

Thanks for working on this @ScottNotFound, it'll be a nice addition.

It looks like the PR is basically complete, are you waiting on anything besides linting? I'm not sure if the capitalized CELL etc. dataclass attributes on Res are a PEP8 violation, but otherwise the PR looks great.

Being unfamiliar with both the AIRSS file format and shelx, I will trust this is appropriate :) Are there any AIRSS input parameters that might substantially change the output format; are there any additional test example files that should be added?

@mkhorton
Copy link
Member

Ah, I see a test failure too:

E AttributeError: type object 'ResIO' has no attribute 'structure_to_txt'

I know this is marked as a WIP, so assume this is already known!

@ScottNotFound
Copy link
Contributor Author

The implementation is pretty much done. I was going to add some more tests for the module. After those I'll mark it ready for review.

PEP8 says something to the effect of constants being all caps, but I just named them the way the tags appear in the res file. They are frozen dataclasses.. not sure if that counts as constant. Those are all internal though so they can be changed if you prefer a different style.

I actually don't know anything about the shelx file format other than it exists. I considered supporting it, but it was hard to find information on it. Also not sure if anyone other that AIRSS still uses it.

As for AIRSS parameters changing the format, I'm not sure. I've only used CASTEP with AIRSS. I've assumed some things are common to all AIRSS runs, but it's possible that may not be the case. I be surprised if the parts that matter (like the structure and TITL entry) were different.

@ScottNotFound
Copy link
Contributor Author

Ah, I see a test failure too:

E AttributeError: type object 'ResIO' has no attribute 'structure_to_txt'

I know this is marked as a WIP, so assume this is already known!

Yeah I found that while writing a test after refactoring. VSCode is not as good as pycharm at refactoring method names.

@ScottNotFound ScottNotFound marked this pull request as ready for review August 29, 2022 23:51
@ScottNotFound ScottNotFound changed the title [WIP] IO support for res files (AIRSS results) IO support for res files (AIRSS results) Aug 29, 2022
@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 63.505% when pulling 31b9939 on ScottNotFound:res into beac8d5 on materialsproject:master.

pymatgen/io/res.py Show resolved Hide resolved
pymatgen/core/structure.py Show resolved Hide resolved
@ScottNotFound
Copy link
Contributor Author

Looks like the linter failed because mypy doesn't have type info for dateutil. I installed the type stubs into my env, but the linter env doesn't have them. I guess I can either:

  1. keep the code around dateutil unchanged, and you all add the type stubs or put in a mypy ignore for dateutil
  2. throw some type: ignores on those lines
  3. not use dateutil

pymatgen/io/res.py Outdated Show resolved Hide resolved
@janosh
Copy link
Member

janosh commented Aug 31, 2022

Very nice PR! Thanks @ScottNotFound.

@mkhorton
Copy link
Member

Thanks both for lending AIRSS expertise :)

Not relevant to this PR, but I did want to point out we have new-ish InputFile and InputSet base classes. If you're looking at expanding AIRSS support for writing inputs via pymatgen in the future this would be the way to go.

I also have my own stalled CASTEP PR @ #2048 which is actually almost ready to merge, but I wanted to update to use these base classes myself.

@ScottNotFound
Copy link
Contributor Author

Looks like Testing Linux / test (3.10) (pull_request) may be related to #2515?

Not sure why Linting / build (3.9) (pull_request) is failing but it looks unrelated.

@mkhorton
Copy link
Member

Thanks @ScottNotFound, I have no concerns about this PR -- I'll go ahead and merge, thanks again.

The unrelated failures I can look into separately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants