Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pyright fixes for ext/io/phonon/symmetry/transformations/util/vis/dev_scripts and improve io.lobster #3757

Merged
merged 72 commits into from
Apr 15, 2024

Conversation

DanielYang59
Copy link
Contributor

@DanielYang59 DanielYang59 commented Apr 13, 2024

Summary

Started by @ml-evs in #3646:

  • Install pyright to CI (delayed until all chunks are finished)
  • Related pyright fixes
  • Improvement on io.lobster by @JaGeo
  • Format tweaks (sourcery), I should not have done this (make it hard to view difference)

Need confirmation on the following:

pyright is commented out until all chunks are finished, otherwise everyone's CI would fail.

There are too many (713) pyright errors to fix in a single PR, I'm planning to chop up into chunks. First chunk would fix the following modules:

  • vis
  • util
  • transformations
  • symmetry
  • phonon
  • io
  • ext
  • dev_scripts

Side note

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Enhanced the logic for setting calculation parameters based on symmetry and time reversal considerations.
    • Introduced better error handling and contextual exception messages across various modules.
    • Improved file handling with UTF-8 encoding for consistent data processing.
  • Bug Fixes

    • Fixed issues with string formatting and variable initialization that affected data parsing and output consistency.
    • Corrected logical errors in conditional statements that impacted computational routines.
  • Refactor

    • Streamlined string formatting using f-strings for better readability and performance.
    • Simplified and optimized various methods and functions for enhanced maintainability and clarity.
  • Documentation

    • Updated method descriptions and error messages to be more informative and user-friendly.
  • Style

    • Standardized variable naming and method calls to adhere to project coding standards.
  • Tests

    • Adjusted file encoding in test setups to ensure compatibility and correctness during testing.

ml-evs and others added 8 commits February 23, 2024 10:36
- Hopefully all tests have been written in such a way that they will be
  skipped if the underlying required package is not found
Signed-off-by: Matthew Evans <7916000+ml-evs@users.noreply.github.com>
Copy link

coderabbitai bot commented Apr 13, 2024

Walkthrough

The recent updates across the pymatgen library focus on improving code quality through enhanced readability, efficiency, and robustness. Changes include variable renaming for clarity, refactoring for better logic flow, and the integration of modern Python features like the walrus operator. Additionally, stricter type checking with pyright has been introduced to ensure higher code quality.

Changes

Files Change Summary
.github/workflows/lint.yml, .pre-commit-config.yaml, pyproject.toml Integrated pyright for improved linting and type checking.
pymatgen/command_line/vampire_caller.py, pymatgen/io/.../*.py, pymatgen/io/.../outputs.py, pymatgen/io/.../inputs.py Variable renaming, enhanced string formatting, and logic simplification.
pymatgen/core/..., pymatgen/ext/... Code refactoring for efficiency and readability, including regex and error handling enhancements.
pymatgen/symmetry/..., pymatgen/phonon/..., pymatgen/transformations/... Updated variable naming, added checks, and refactored logic.
pymatgen/util/..., dev_scripts/... Improved clarity in variable names, adjusted import handling, and updated encoding settings.
tests/... Updated test scripts to align with code changes and ensure robustness.

Poem

🐇💻✨
Oh, hop through the fields of code, so spry,
Where bugs once lurked, now none shall lie.
With pyright by side, we leap and bound,
In cleaner pastures, our bytes resound.
Celebrate, dear friends, with vim and cheer,
For cleaner, swifter code is here! 🎉🌟


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

structure = Structure(
Lattice(lattice),
species,
coords,
# DEBUG (@DanielYang59): need input on coords_are_cartesian
Copy link
Contributor Author

@DanielYang59 DanielYang59 Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need help on io.pwscf please @janosh (I never use pwscf): coords_are_cartesian is only defined in one of the many branches, and is unbound all elsewhere:

@classmethod
def from_str(cls, string: str) -> Self:
"""
Reads an PWInput object from a string.
Args:
string (str): PWInput string
Returns:
PWInput object
"""
lines = list(clean_lines(string.splitlines()))
def input_mode(line):
if line[0] == "&":
return ("sections", line[1:].lower())
if "ATOMIC_SPECIES" in line:
return ("pseudo",)
if "K_POINTS" in line:
return "kpoints", line.split()[1]
if "OCCUPATIONS" in line:
return "occupations"
if "CELL_PARAMETERS" in line or "ATOMIC_POSITIONS" in line:
return "structure", line.split()[1]
if line == "/":
return None
return mode
sections: dict[str, dict] = {
"control": {},
"system": {},
"electrons": {},
"ions": {},
"cell": {},
}
pseudo = {}
lattice = []
species = []
coords = []
structure = None
site_properties: dict[str, list] = {"pseudo": []}
mode = None
for line in lines:
mode = input_mode(line)
if mode is None:
pass
elif mode[0] == "sections":
section = mode[1]
m = re.match(r"(\w+)\(?(\d*?)\)?\s*=\s*(.*)", line)
if m:
key = m.group(1).strip()
key_ = m.group(2).strip()
val = m.group(3).strip()
if key_ != "":
if sections[section].get(key) is None:
val_ = [0.0] * 20 # MAX NTYP DEFINITION
val_[int(key_) - 1] = PWInput.proc_val(key, val)
sections[section][key] = val_
site_properties[key] = []
else:
sections[section][key][int(key_) - 1] = PWInput.proc_val(key, val)
else:
sections[section][key] = PWInput.proc_val(key, val)
elif mode[0] == "pseudo":
m = re.match(r"(\w+)\s+(\d*.\d*)\s+(.*)", line)
if m:
pseudo[m.group(1).strip()] = m.group(3).strip()
elif mode[0] == "kpoints":
m = re.match(r"(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)", line)
if m:
kpoints_grid = (int(m.group(1)), int(m.group(2)), int(m.group(3)))
kpoints_shift = (int(m.group(4)), int(m.group(5)), int(m.group(6)))
else:
kpoints_mode = mode[1]
kpoints_grid = (1, 1, 1)
kpoints_shift = (0, 0, 0)
elif mode[0] == "structure":
m_l = re.match(r"(-?\d+\.?\d*)\s+(-?\d+\.?\d*)\s+(-?\d+\.?\d*)", line)
m_p = re.match(r"(\w+)\s+(-?\d+\.\d*)\s+(-?\d+\.?\d*)\s+(-?\d+\.?\d*)", line)
if m_l:
lattice += [
float(m_l.group(1)),
float(m_l.group(2)),
float(m_l.group(3)),
]
elif m_p:
site_properties["pseudo"].append(pseudo[m_p.group(1)])
species.append(m_p.group(1))
coords += [[float(m_p.group(2)), float(m_p.group(3)), float(m_p.group(4))]]
if mode[1] == "angstrom":
coords_are_cartesian = True
elif mode[1] == "crystal":
coords_are_cartesian = False
structure = Structure(
Lattice(lattice),
species,
coords,
coords_are_cartesian=coords_are_cartesian,
site_properties=site_properties,
)
return cls(
structure=structure,
control=sections["control"],
pseudo=pseudo,
system=sections["system"],
electrons=sections["electrons"],
ions=sections["ions"],
cell=sections["cell"],
kpoints_mode=kpoints_mode,
kpoints_grid=kpoints_grid,
kpoints_shift=kpoints_shift,
)

I currently set the default coords_are_cartesian to False (the default of Structure), but I'm not sure if it's the default for pwscf too. Would need confirmation on this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, so many code branches and only one of them was actually tested?

[8.49746961, 10.62408601, 8.49746961],
[6.373854, 6.373854, 10.62565387],
[8.49746961, 8.49746961, 10.62408601],
[10.62408601, 10.62408601, 10.62408601],
]
)
pw_in = PWInput.from_str(string)

this seems in urgent need of more tests. i think defaulting coords_are_cartesian=False seems right. i also never used this code so it's hard to know for sure. apparently no one does or this would have been reported long ago

Copy link
Contributor Author

@DanielYang59 DanielYang59 Apr 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so many code branches and only one of them was actually tested?

Yes, coords_are_cartesian is defined in only one of the many branches and is unbound all elsewhere 😓, meaning the other branches were likely never entered.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

pymatgen/core/periodic_table.py Show resolved Hide resolved
pymatgen/io/vasp/outputs.py Show resolved Hide resolved
pymatgen/io/vasp/outputs.py Show resolved Hide resolved
pymatgen/io/vasp/outputs.py Show resolved Hide resolved
pymatgen/io/vasp/outputs.py Show resolved Hide resolved
@janosh janosh enabled auto-merge (squash) April 15, 2024 08:15
@janosh janosh added linting Linting and quality assurance housekeeping Moving around or cleaning up old code/files labels Apr 15, 2024
@janosh janosh merged commit 0e57abf into materialsproject:master Apr 15, 2024
22 checks passed
@DanielYang59 DanielYang59 deleted the pyright branch April 15, 2024 08:31
@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Apr 15, 2024

Thanks for reviewing @janosh and great to collaborate with you on this one @JaGeo.

Sorry for hijacking this topic @ml-evs.

@ml-evs
Copy link
Contributor

ml-evs commented Apr 15, 2024

Sorry for hijacking this topic @ml-evs.

No problem, this is great work! It looks this was fairly comprehensive; I'll try reopening my PR that adds the CI runs that triggered this.

@JaGeo
Copy link
Member

JaGeo commented Apr 15, 2024

Sorry for hijacking this topic @ml-evs.

No problem, this is great work! It looks this was fairly comprehensive; I'll try reopening my PR that adds the CI runs that triggered this.

This is only for some modules. More work is needed

@DanielYang59
Copy link
Contributor Author

This is only for some modules. More work is needed

Yes! It's only like 1/3 of the all work needed. I would ping you once it's all ready. @ml-evs

@ml-evs
Copy link
Contributor

ml-evs commented Apr 15, 2024

This is only for some modules. More work is needed

Yes! It's only like 1/3 of the all work needed. I would ping you once it's all ready. @ml-evs

Understood, I've opened #3759 with the same changes as before (except enabling pyright), hopefully the CI is now stable enough after the uv switch that it won't go stale again so quickly.

@janosh
Copy link
Member

janosh commented Apr 15, 2024

Yes! It's only like 1/3 of the all work needed. I would ping you once it's all ready. @ml-evs

@DanielYang59 thanks a lot for donating your time to work on this! 👍

@DanielYang59
Copy link
Contributor Author

@DanielYang59 thanks a lot for donating your time to work on this! 👍

Glad I can help :)

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Apr 15, 2024

@janosh I'm still experimenting on an issue I met with @JaGeo in #3757 (comment). I would appreciate it if you could give me some advice 😄

I just created a minimal code to recreate this issue:

from collections import UserDict

class TestClass(UserDict):

    def __init__(self, settingsdict: dict):
        super().__init__()
        self.update(settingsdict)

    def __setitem__(self, key, val):
        """Remove case sensitivity."""
        new_key = next((key_here for key_here in self if key.strip().lower() == key_here.lower()), key)

        super().__setitem__(new_key, val)

    def __getitem__(self, item):
        """Implements getitem from dict to avoid problems with cases."""
        new_item = next((key_here for key_here in self if item.strip().lower() == key_here.lower()), item)

        return super().__getitem__(new_item)


if __name__ == "__main__":
    test_dict = {"ConTent": "hello", }
    test_instance = TestClass(test_dict)

    print(f"{test_instance=}")
    print(test_instance.get("content"), test_instance["content"])

On this same machine (MacOS Sonoma 14.4.1, Apple Silicon and Python managed by miniconda3).

With Python 3.10 and 3.11, I got:

(base) yang@Yang-MacBook test % python3 -V
Python 3.10.14
(base) yang@Yang-MacBook test % python3 recreate.py 
test_instance={'ConTent': 'hello'}
hello hello
(base) yang@Yang-MacBook test % python3 -V
Python 3.11.8
(base) yang@Yang-MacBook test % python3 recreate.py 
test_instance={'ConTent': 'hello'}
hello hello

While with Python 3.12, the two methods get different results:

(base) yang@Yang-MacBook test % python3 -V               
Python 3.12.2
(base) yang@Yang-MacBook test % python3 recreate.py      
test_instance={'ConTent': 'hello'}
None hello

Is it some known issue/change from Python 3.12?

@JaGeo
Copy link
Member

JaGeo commented Apr 15, 2024

Ah, interesting, Python version 😅.
This is why it has not shown up on CI.

@JaGeo
Copy link
Member

JaGeo commented Apr 15, 2024

@DanielYang59 I think I found it: python/cpython#105524

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Apr 15, 2024

@DanielYang59 I think I found it: python/cpython#105524

👍 So speedy. Yes looks like exactly what we're experiencing. I guess it's not intended?

The get method:

    def get(self, key, default=None):
        if key in self:
            return self[key]
        return default

Which indeed does not use __getitem__.

But at least confirmed my machine is not the "outlier" 😅

@JaGeo
Copy link
Member

JaGeo commented Apr 15, 2024

@DanielYang59 I think I found it: python/cpython#105524

👍 So speedy. Yes looks like exactly what we're experiencing. I guess it's not intended?

The get method:

    def get(self, key, default=None):
        if key in self:
            return self[key]
        return default

Which indeed does not use __getitem__.

But at least confirmed my machine is not the "outlier" 😅

No idea if intended.
But a solution is described there as well. Could potentially affect other parts of the code as well.

@janosh
Copy link
Member

janosh commented Apr 15, 2024

there's only 1 use of UserDict in all of pmg i think. we can try the proposed fix or try to make it work without UserDict. the docs weakly recommend to subclass dict directly instead of UserDict

@JaGeo
Copy link
Member

JaGeo commented Apr 15, 2024

I think we recently switched to UserDict for other issues.

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Apr 16, 2024

there's only 1 use of UserDict in all of pmg i think. we can try the proposed fix or try to make it work without UserDict. the docs weakly recommend to subclass dict directly instead of UserDict

I would keep it in mind. I'm now secretly working on simplifying io.lobster.inputs (prefer not to push it here for now because I'm not having much time to work on this).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
housekeeping Moving around or cleaning up old code/files linting Linting and quality assurance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants