Add type annotations for io.vasp.inputs/optics #3740

Merged
merged 37 commits into from
Apr 21, 2024

Conversation

DanielYang59
Contributor

@DanielYang59 DanielYang59 commented Apr 7, 2024

Summary

  • Add type annotations for io.vasp.inputs/optics
  • Replace str as path type with PathLike
  • Docstring tweaks
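To illustrate the second bullet, here is a minimal sketch of a path parameter typed with a path-like alias instead of plain str. The alias definition and the helper name read_first_line are assumptions made for this example and are not taken from the PR.

```python
import os
from pathlib import Path
from typing import Union

# Hypothetical alias for illustration; the alias used in the PR may be defined differently.
PathLike = Union[str, os.PathLike]


def read_first_line(filename: PathLike) -> str:
    """Return the first line of a text file given as a str or a path-like object."""
    with open(filename, encoding="utf-8") as file:
        return file.readline().rstrip("\n")
```

With this signature, callers can pass either "POSCAR" or Path("POSCAR") without a type checker complaining.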

Originally

Proposed by @janosh in #3739 (comment).

  • Run ruff check . --select ANN204 --unsafe-fixes to auto-fix "ANN204", to experiment on auto typing fix with ruff.
  • 135 mypy errors were dug out and need to be fixed.

ANN204 refers to:

Checks that "special" methods, like __init__, __new__, and __call__, have return type annotations.
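As a small illustration of what the rule targets, the sketch below shows a special method before and after the kind of fix described. The class names are hypothetical and only serve the example.

```python
class Before:
    # ANN204 would flag this: the special method has no return annotation.
    def __init__(self, comment):
        self.comment = comment


class After:
    # The expected result of the fix: `-> None` added to __init__.
    def __init__(self, comment) -> None:
        self.comment = comment
```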

Summary by CodeRabbit

  • New Features

    • Added a 60-second timeout to HTTP requests across various modules to improve system reliability during data fetching.
  • Refactor

    • Enhanced code clarity by specifying explicit types for various attributes and variables across multiple modules.
  • Bug Fixes

    • Adjusted handling of subprocess output and method argument defaults to prevent errors and improve data handling.


coderabbitai bot commented Apr 7, 2024

Commits Files that changed from the base of the PR and between 2954c80 and 2991771.

Walkthrough

This update primarily focuses on enhancing code clarity and reliability across the pymatgen library and associated scripts. Key changes include the introduction of explicit type annotations for variables and attributes, the addition of timeouts to HTTP requests to prevent hanging processes, and minor adjustments such as renaming variables and updating method signatures. These modifications aim to improve the maintainability and performance of the codebase, ensuring more robust and clear code structure.

Changes

| File Path | Change Summary |
|---|---|
| dev_scripts/update_pt_data.py | Added a 60-second timeout to HTTP requests. |
| pymatgen/.../filters.py, transmuters.py | Specified structure_list and structure_data as lists. |
| pymatgen/analysis/... | Updated various attributes to explicitly specify types such as lists and dictionaries. |
| pymatgen/apps/..., pymatgen/command_line/... | Enhanced clarity with type hints and modified variable handling. |
| pymatgen/ext/..., pymatgen/io/... | Added timeouts to HTTP requests, updated type annotations in various I/O related files. |
| pymatgen/transformations/..., pymatgen/vis/... | Updated type specifications for attributes and parameters. |
| tasks.py, tests/ext/... | Standardized the addition of a 60-second timeout across various test and task scripts. |
| tests/io/vasp/test_outputs.py | Applied @pytest.mark.skip() decorator to skip certain tests, no significant functionality changes. |

This table groups files with similar changes, focusing on type annotations and the addition of timeouts, which are prevalent throughout this update.
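The timeout change summarized above can be sketched as follows. This is an illustration only: it uses the standard-library urllib client so that it is self-contained, whereas the PR itself may use a different HTTP client, and the names fetch_bytes and DEFAULT_TIMEOUT are made up for this example.

```python
import urllib.request

DEFAULT_TIMEOUT = 60  # seconds, matching the value named in the summary


def fetch_bytes(url: str, timeout: float = DEFAULT_TIMEOUT) -> bytes:
    """Fetch a URL; blocking network operations give up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

Without an explicit timeout, a request to an unresponsive server can block indefinitely, which is the hanging behaviour the walkthrough refers to.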



@DanielYang59
Contributor Author

@janosh I revisited this PR and realized it might not be a good idea to work on a single method globally like this; we should push the type annotation effort module by module.

I agree the ruff auto-fix looks promising, but we should not run it globally.

My reasoning: working this way, we only add types to a single function (or very few) among many other untyped functions. Because the types within a module depend heavily on each other, this would either dig out a huge amount of extra work or leave the types very loose.

I would prefer to close this and push our type annotation effort module by module as discussed previously (starting from core, io.vasp, electronic, phonon and such). What do you think?

@janosh
Member

janosh commented Apr 12, 2024

going folder by folder sounds good. no need to close this PR though, we can just start with pymatgen/io/vasp in this one.

i just reverted the initial

ruff check . --select ANN204 --unsafe-fixes --fix

and force pushed

ruff check pymatgen/io/vasp --select ANN204 --unsafe-fixes --fix

instead and keeping your manual fixes. you just need to run git reset --hard @{u} to get back to the tip of this branch

@DanielYang59
Contributor Author

DanielYang59 commented Apr 12, 2024

going folder by folder sounds good. no need to close this PR though, we can just start with pymatgen/io/vasp in this one.

Good point. Let's do it.

instead and keeping your manual fixes. you just need to run git reset --hard @{u} to get back to the tip of this branch

There aren't many valuable fixes though. Thanks for the tip!

@DanielYang59 DanielYang59 changed the title ruff auto typing of "ANN204" Add type annotations for io.vasp Apr 15, 2024

  # If default_names is specified (usually coming from a POTCAR), use
  # them. This is in line with VASP's parsing order that the POTCAR
  # specified is the default used.
- if default_names:
+ if default_names is not None:
Contributor Author

@DanielYang59 DanielYang59 Apr 18, 2024


The default value of default_names seems inconsistent in:

elif fmt_low == "poscar":
    from pymatgen.io.vasp import Poscar

    struct = Poscar.from_str(input_string, default_names=False, read_velocities=False, **kwargs).structure

Changed to None in 1a1752a.

And

@classmethod
def from_str(cls, data, default_names=None, read_velocities=True) -> Self:
    """
    Reads a Poscar from a string.

    The code will try its best to determine the elements in the POSCAR in
    the following order:

    1. If default_names are supplied and valid, it will use those. Usually,
    default names comes from an external source, such as a POTCAR in the
    same directory.

    2. If there are no valid default names but the input file is VASP5-like
    and contains element symbols in the 6th line, the code will use that.

    3. Failing (2), the code will check if a symbol is provided at the end
    of each coordinate.

    If all else fails, the code will just assign the first n elements in
    increasing atomic number, where n is the number of species, to the
    Poscar. For example, H, He, Li, .... This will ensure at least a
    unique element is assigned to each site and any analysis that does not
    require specific elemental properties should work fine.

    Args:
        data (str): String containing Poscar data.
        default_names ([str]): Default symbols for the POSCAR file,
            usually coming from a POTCAR in the same directory.
        read_velocities (bool): Whether to read or not velocities if they
            are present in the POSCAR. Default is True.

    Returns:
        Poscar object.
    """

@DanielYang59
Contributor Author

I'm trying to reorder the class methods so that dunder methods and properties come first, but this seems to pollute the git history. Is there a way to move code around without that side effect?

@DanielYang59 DanielYang59 changed the title Add type annotations for io.vasp Add type annotations for io.vasp.inputs/optics Apr 18, 2024
@DanielYang59
Contributor Author

DanielYang59 commented Apr 18, 2024

Can you please review this, @janosh? I decided to split the work by module because there are too many changes. For now, I removed the return types from some completely untyped __init__ methods in io.vasp.outputs so that mypy does not check them; I will work on those later.

There is one minor fix for the default value of default_names in #3740 (comment).

UPDATE: And there are two requests for changing the default values for type compatibility in #3740 (comment).

Other than these two, I don't expect further functional changes (the unit test failure is related to the requested default value change, I would need your confirmation before I could change the unit test :) ). Thanks!
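The default_names fix mentioned above comes down to the difference between a truthiness check and an explicit None check. The sketch below is a simplified illustration with hypothetical function names, not the actual Poscar code; it shows why a caller passing False (presumably harmless under the old check) would need to pass None once the check becomes `is not None`.

```python
from __future__ import annotations


def pick_symbols_truthy(parsed: list[str], default_names: list[str] | None = None) -> list[str]:
    # Truthiness check: None, False and [] all fall through to the parsed symbols.
    if default_names:
        return default_names
    return parsed


def pick_symbols_explicit(parsed: list[str], default_names: list[str] | None = None) -> list[str]:
    # Identity check: only None means "not supplied"; [] or False would be used as given.
    if default_names is not None:
        return default_names
    return parsed
```

Under the explicit check, None is the only value that reliably means "no default names were given", which also matches the default in the from_str signature quoted earlier.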

@DanielYang59 DanielYang59 marked this pull request as ready for review April 18, 2024 12:35
@janosh
Member

janosh commented Apr 19, 2024

Wondering if there is a way to relocate codes without such side effect?

not really. have a look at this video which doesn't solve your problem but is slightly relevant

@DanielYang59
Contributor Author

Wondering if there is a way to relocate codes without such side effect?

not really. have a look at this video which doesn't solve your problem but is slightly relevant

Thanks. It's always great to know more tips :)

Should be ready for reviewing then. I reverted the default values to None. Thanks!

@@ -908,7 +908,7 @@ def __init__(self, permutations_safe_override=False, only_symbols=None):

         self.minpoints = {}
         self.maxpoints = {}
-        self.separations_cg = {}
+        self.separations_cg: dict = {}
Member


can we make the types more specific? list/dict/set/... alone don't add any information for the reader. dict[str, CoordinationGeometry], list[Structure], etc. is more helpful

Contributor Author

@DanielYang59 DanielYang59 Apr 21, 2024


can we make the types more specific? list/dict/set/... alone don't add any information for the reader. dict[str, CoordinationGeometry], list[Structure], etc. is more helpful

Yes, this is very true. But in this PR I was adding types for io.vasp, and mypy would sometimes complain about unrelated modules (asking for a type annotation on some line) that I haven't had time to look into yet. In those cases I just added a general type to stop mypy from complaining (if I looked closely into every module, the work would never be done).

I will keep this in mind and try to use more specific types :)
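The difference between a generic and a specific annotation, as raised in the review comment, can be sketched as below. Geometry is a hypothetical stand-in for a class such as CoordinationGeometry, and Finder and its attribute names are likewise invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Geometry:
    """Hypothetical stand-in for a class such as CoordinationGeometry."""

    name: str


class Finder:
    def __init__(self) -> None:
        # Generic: satisfies mypy but tells the reader nothing about the contents.
        self.loose: dict = {}
        # Specific: documents key and value types and lets mypy check later uses.
        self.by_symbol: dict[str, Geometry] = {}
```

With the specific form, a type checker can reject something like `finder.by_symbol["O:6"] = 3`, while the generic form accepts any key and value.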

pymatgen/apps/battery/plotter.py Outdated Show resolved Hide resolved
@janosh janosh added the types Type all the things label Apr 21, 2024
Member

@janosh janosh left a comment


thanks so much @DanielYang59! 🙏

given the generic type annotations like var_name: list = [] are easy to find and fix later, we can merge this now

@janosh janosh merged commit ad6eafe into materialsproject:master Apr 21, 2024
22 checks passed
@DanielYang59 DanielYang59 deleted the ruff-typing branch April 21, 2024 09:49
@DanielYang59
Contributor Author

DanielYang59 commented Apr 21, 2024

Thanks for reviewing!

given the generic type annotations like var_name: list = [] are easy to find and fix later

Yes I don't know how mypy made those decisions, but it seems to require type annotations for some unrelated and un-typed code sections. Such general types were added to stop such mypy complaints :)
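One likely explanation, not stated in the thread: by default mypy skips the bodies of functions that carry no annotations at all, so adding `-> None` to an `__init__` turns it into a checked function. An empty container assigned inside it then has no inferable element type, and mypy reports that it needs a type annotation. The sketch below uses hypothetical class names to show the two situations.

```python
from __future__ import annotations


class Untyped:
    # No annotations in the signature: mypy skips this body by default.
    def __init__(self):
        self.points = {}


class Typed:
    # `-> None` makes this a typed function, so mypy checks the body and
    # needs an annotation to know what the empty dict will hold.
    def __init__(self) -> None:
        self.points: dict[str, float] = {}
```

Both classes behave identically at runtime; the difference only shows up when the type checker runs.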

@janosh janosh added io Input/output functionality vasp Vienna Ab initio Simulation Package labels Apr 21, 2024
Labels
io Input/output functionality types Type all the things vasp Vienna Ab initio Simulation Package
2 participants