Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: Run PTH flake test in prep for supporting pathlib #16060

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

MridulS
Copy link
Contributor

@MridulS MridulS commented Feb 18, 2024

This PR starts work on making a unified i/o interface for reading and writing to disk. In some places using strings for filenames work, in other pathlib objects works too. We should be able to pass around Path objects all around the code. For a first go I want to make the PTH ruff flake8 linter happy.

As a start this fixes ruff complaints in couple of sub packages. Didn't want to make the PR too big.

Also want to check if this works with windows CI all fine. (pathlib has had some issues with windows from what I remember).

One way to go about #14893

Copy link

Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see instructions for rebase and squash.
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the "Extra CI" label. Codestyle issues can be fixed by the bot.
  • Is a change log needed? If yes, did the change log check pass? If no, add the "no-changelog-entry-needed" label. If this is a manual backport, use the "skip-changelog-checks" label unless special changelog handling is necessary.
  • Is this a big PR that makes a "What's new?" entry worthwhile and if so, is (1) a "what's new" entry included in this PR and (2) the "whatsnew-needed" label applied?
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate "backport-X.Y.x" label(s) before merge.

Copy link
Member

@nstarman nstarman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of good things in this PR! I have a few suggestions that are applicable in many places to further simplify the code, e.g. using read_text.

.github/workflows/update_astropy_iers_data_pin.py Outdated Show resolved Hide resolved
.github/workflows/update_astropy_iers_data_pin.py Outdated Show resolved Hide resolved
astropy/coordinates/name_resolve.py Outdated Show resolved Hide resolved
astropy/coordinates/transformations/graph.py Outdated Show resolved Hide resolved
astropy/coordinates/transformations/graph.py Outdated Show resolved Hide resolved
astropy/visualization/tests/test_lupton_rgb.py Outdated Show resolved Hide resolved
astropy/wcs/setup_package.py Outdated Show resolved Hide resolved
astropy/wcs/tests/test_wcs.py Outdated Show resolved Hide resolved
@MridulS MridulS requested review from a team as code owners February 19, 2024 09:35
Copy link
Member

@taldcroft taldcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for time.

astropy/coordinates/name_resolve.py Outdated Show resolved Hide resolved
astropy/cosmology/_io/tests/test_json.py Outdated Show resolved Hide resolved
astropy/cosmology/tests/test_connect.py Outdated Show resolved Hide resolved
astropy/logger.py Outdated Show resolved Hide resolved
astropy/logger.py Outdated Show resolved Hide resolved
astropy/wcs/tests/test_wcs.py Outdated Show resolved Hide resolved
astropy/wcs/tests/test_wcs.py Outdated Show resolved Hide resolved
astropy/wcs/wcsapi/fitswcs.py Outdated Show resolved Hide resolved
astropy/wcs/wcsapi/low_level_api.py Outdated Show resolved Hide resolved
astropy/wcs/wcsapi/low_level_api.py Outdated Show resolved Hide resolved
Copy link
Member

@nstarman nstarman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the os.path.join got replaced by os-dependent strings.

astropy/modeling/tests/test_polynomial.py Outdated Show resolved Hide resolved
astropy/time/setup_package.py Outdated Show resolved Hide resolved
astropy/wcs/tests/test_pickle.py Outdated Show resolved Hide resolved
astropy/wcs/tests/test_profiling.py Outdated Show resolved Hide resolved
Copy link
Member

@larrybradley larrybradley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine for convolution and stats.

@eerovaher
Copy link
Member

The way I understand it the intent behind the Ruff PTH rules is that if code is representing filesystem paths as Path instances then it's best to commit to doing that and to avoid converting them to str and then back to Path again. But what seems to be happening in this pull request is that some of the code involving our get_pkg_data_* functions is performing exactly this kind of back and forth shuffling between Path and str that the Ruff PTH rules are meant to guard against. It might therefore be better to ensure that the get_pkg_data_* functions can handle Path instances first, and then it should become possible for the relevant code to use Path throughout.

@pllim pllim added this to the v6.1.0 milestone Feb 19, 2024
@neutrinoceros
Copy link
Contributor

neutrinoceros commented Feb 20, 2024

There might be a slight misunderstanding here, that not what I understand the intention of this rule set to be. Here's the one liner description for flake8-use-pathlib (which ruff reimplements as PTH):

A plugin for flake8 finding use of functions that can be replaced by pathlib module.

That said, I agree with @eerovaher that back-and-forth conversions should be avoided !

Copy link
Member

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failure on Windows, see #16060 (comment)

Also please rebase and squash instead of pulling in merge commits, because at its current state, I cannot rebase this PR on top of latest main (gives me conflict). Thanks!

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, this is a good idea and very nice. That said, quite a few comments. A number are just that if a plain string with no directory is given, open is perfectly fine and reasonable, so no need to replace.

A more general one is whether we can pass "file names" through Path unconditionally; certainly, any time we do, we should adjust the docstring.

Finally, I'd hope we can get rid of a lot of str(path), by ensuring get_package_data_* work - would seem worth doing that here, given how many people have now looked at this...

astropy/coordinates/name_resolve.py Outdated Show resolved Hide resolved
astropy/coordinates/transformations/graph.py Outdated Show resolved Hide resolved
astropy/cosmology/_io/tests/test_json.py Show resolved Hide resolved
astropy/wcs/tests/test_wcs.py Show resolved Hide resolved
astropy/wcs/wcs.py Outdated Show resolved Hide resolved
astropy/wcs/wcs.py Show resolved Hide resolved
docs/conf.py Outdated Show resolved Hide resolved
docs/conf.py Show resolved Hide resolved
@MridulS
Copy link
Contributor Author

MridulS commented Feb 27, 2024

Finally, I'd hope we can get rid of a lot of str(path), by ensuring get_package_data_* work - would seem worth doing that here, given how many people have now looked at this...

Sure, I am happy to get this into this PR.

Should the changes be backward compatible? I am currently working on making the util functions work with the str and Path objects. Maybe it makes sense to just have a hard break here? (or maybe not as these things are up in the docs https://docs.astropy.org/en/latest/utils/data.html)

If not a hard break, does it make sense to deprecate usage of str only args for future removal?

@mhvk
Copy link
Contributor

mhvk commented Feb 27, 2024

For this PR at least, I don't think we should have anything that is not backwards compatible. But presumably passing a path through Path is super fast.

astropy/wcs/wcs.py Outdated Show resolved Hide resolved
@@ -34,7 +35,7 @@ def _docdir(request):
# Don't apply this fixture to io.rst. It reads files and doesn't write.
# Implementation from https://github.com/pytest-dev/pytest/discussions/10437
if "io.rst" not in request.node.name:
old_cwd = os.getcwd()
old_cwd = Path.cwd()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is os.getcwd() getting deprecated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is to make to standardized usage of PathLike objects rather than str.

(It's fine either way, but to not litter noqa all around we could make this change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pathlib was introduced in 3.4 and is now pretty feature-complete. Seems a good opportunity to update our approach to paths from the function-oriented manipulation of strings to an object-oriented manipulation of paths. os.getcwd isn't deprecated and probably isn't going to be. It's just the next best thing: superceded.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing to keep in mind though is that Path objects are much heavier than strings. It doesn't make a measurable difference here but in intensive loops, switching to pathlib can add a significant overhead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, and I don't think there are places where we are doing a lot of string/path manipulation inside astropy (maybe I missed something?). A lot of this is indeed aesthetic-y changes. But IMO we should indeed move on to using the obj oriented pathlib way as it make manipulating these path much easier, plus in a typing heavy future of python, os.PathLike gives a lot more information than str.

astropy/convolution/setup_package.py Show resolved Hide resolved
astropy/coordinates/tests/accuracy/generate_ref_ast.py Outdated Show resolved Hide resolved
astropy/coordinates/tests/accuracy/generate_ref_ast.py Outdated Show resolved Hide resolved
astropy/coordinates/tests/accuracy/generate_ref_ast.py Outdated Show resolved Hide resolved
astropy/coordinates/tests/accuracy/generate_ref_ast.py Outdated Show resolved Hide resolved
@@ -34,7 +35,7 @@ def _docdir(request):
# Don't apply this fixture to io.rst. It reads files and doesn't write.
# Implementation from https://github.com/pytest-dev/pytest/discussions/10437
if "io.rst" not in request.node.name:
old_cwd = os.getcwd()
old_cwd = Path.cwd()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pathlib was introduced in 3.4 and is now pretty feature-complete. Seems a good opportunity to update our approach to paths from the function-oriented manipulation of strings to an object-oriented manipulation of paths. os.getcwd isn't deprecated and probably isn't going to be. It's just the next best thing: superceded.

@nstarman
Copy link
Member

@MridulS. This is looking nearly complete. What are the next steps, fixing get_package_data_ first?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants