GH-117586: Speed up pathlib.Path.walk() by working with strings #117726

Merged
merged 1 commit into from Apr 11, 2024

@barneygale barneygale commented Apr 10, 2024

Move `pathlib.Path.walk()` implementation into `glob._Globber`. The new `glob._Globber.walk()` classmethod works with strings internally, which is a little faster than generating `Path` objects and keeping them normalized. The `pathlib.Path.walk()` method converts the strings back to path objects.

In the private pathlib ABCs, our existing subclass of `_Globber` ensures that `PathBase` instances are used throughout.
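
To illustrate the idea, here is a minimal sketch of a walk that works on plain strings and only builds `Path` objects at the boundary. It is not the code from this PR: `walk_strings` and `walk_paths` are hypothetical names, and the real `glob._Globber.walk()` differs in its details (error handling, symlink options, and how path joining is customized).

```python
import os
from pathlib import Path


def walk_strings(root, top_down=True):
    """Yield (dirpath, dirnames, filenames) tuples made of plain strings."""
    stack = [root]
    while stack:
        item = stack.pop()
        if isinstance(item, tuple):
            # A deferred result from bottom-up mode: all children are done.
            yield item
            continue
        dirnames, filenames = [], []
        try:
            with os.scandir(item) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        dirnames.append(entry.name)
                    else:
                        filenames.append(entry.name)
        except OSError:
            # Unreadable directories are skipped in this sketch.
            continue
        if top_down:
            # Yield first so the caller can prune dirnames in place.
            yield item, dirnames, filenames
        else:
            stack.append((item, dirnames, filenames))
        # Plain string joins; no Path objects are created while walking.
        stack.extend(os.path.join(item, name) for name in reversed(dirnames))


def walk_paths(root, top_down=True):
    """Convert only the yielded directory string back into a Path."""
    for dirpath, dirnames, filenames in walk_strings(os.fspath(root), top_down):
        yield Path(dirpath), dirnames, filenames
```

The point of the split is that the inner loop pays only for string joins, while the public wrapper creates a single `Path` per directory visited.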

Follow-up to #117589.

Timings:

$ ./python -m timeit -s "from pathlib import Path; p = Path.cwd()" "list(p.walk())"
10 loops, best of 5: 29.8 msec per loop  # before
10 loops, best of 5: 28.9 msec per loop  # after
# --> 1.03x faster
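
For anyone who prefers to reproduce a measurement like this from within Python, the following is a rough sketch using the standard `timeit` module. The helper names `best_msec` and `speedup` are hypothetical, and results will vary with the directory tree and machine.

```python
import timeit


def best_msec(func, number=10, repeat=5):
    """Return the best per-loop time in milliseconds, similar to `python -m timeit`."""
    timings = timeit.repeat(func, number=number, repeat=repeat)
    return min(timings) / number * 1000.0


def speedup(before_msec, after_msec):
    """Ratio of the old per-loop time to the new one."""
    return before_msec / after_msec
```

With `p = Path.cwd()`, calling `best_msec(lambda: list(p.walk()))` on each build gives the two per-loop figures, and `speedup(29.8, 28.9)` gives the ratio reported above.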

The speedup is nothing to write home about, but this PR has the benefit of keeping closely related recursive directory-walking code together and removing some redundant private pathlib methods.

@barneygale barneygale added performance Performance or resource usage topic-pathlib labels Apr 10, 2024
@barneygale barneygale merged commit 0cc71bd into python:main Apr 11, 2024
36 checks passed
diegorusso pushed a commit to diegorusso/cpython that referenced this pull request Apr 17, 2024