Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document importlib.resources.abc.Traversable #91298

Closed
encukou opened this issue Mar 28, 2022 · 5 comments · Fixed by #91623
Closed

Document importlib.resources.abc.Traversable #91298

encukou opened this issue Mar 28, 2022 · 5 comments · Fixed by #91623
Assignees
Labels
docs Documentation in the Doc dir

Comments

@encukou
Copy link
Member

encukou commented Mar 28, 2022

BPO 47142
Nosy @jaraco, @encukou

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = 'https://github.com/jaraco'
closed_at = None
created_at = <Date 2022-03-28.12:48:15.201>
labels = ['docs']
title = 'Document importlib.resources.abc.Traversable'
updated_at = <Date 2022-04-03.13:06:57.909>
user = 'https://github.com/encukou'

bugs.python.org fields:

activity = <Date 2022-04-03.13:06:57.909>
actor = 'jaraco'
assignee = 'jaraco'
closed = False
closed_date = None
closer = None
components = ['Documentation']
creation = <Date 2022-03-28.12:48:15.201>
creator = 'petr.viktorin'
dependencies = []
files = []
hgrepos = []
issue_num = 47142
keywords = []
message_count = 4.0
messages = ['416164', '416617', '416618', '416619']
nosy_count = 3.0
nosy_names = ['jaraco', 'petr.viktorin', 'docs@python']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue47142'
versions = []

@encukou
Copy link
Member Author

encukou commented Mar 28, 2022

importlib.resources.files is documented 0 as returning importlib.resources.abc.Traversable, which is an undocumented class (dead link) now. It should be documented.
The source has helpful docstrings, but is missing some details. I'd like to know these before writing some docs:

  • Can the argument to joinpath contain path separators?
  • Are the methods expected to raise specific exception types (e.g. if a resource is missing, or you call iterdir on a “file”)? This seems quite important for people who implement the protocol.

@encukou encukou added the docs Documentation in the Doc dir label Mar 28, 2022
@encukou encukou added the docs Documentation in the Doc dir label Mar 28, 2022
@jaraco jaraco assigned jaraco and unassigned docspython Mar 28, 2022
@jaraco
Copy link
Member

jaraco commented Apr 3, 2022

Can the argument to joinpath contain path separators?

Good question. Currently, I'm aware of three concrete implementations of Traversable.joinpath:

  • pathlib.Path.joinpath
  • zipfile.Path.joinpath
  • importlib.resources.simple.ResourceContainer.joinpath

Traversable.joinpath was modeled after pathlib.Path.joinpath, which itself is itself not rigorous in what path separators are allowed. The docstring does indicate that each of the parameters will be combined with the existing path and acknowledges that absolute paths are possible. The curated docs only give examples, and curiously, those examples only use posixpath.sep, suggesting that it's recommended when passing compound paths to use posixpath separators as they're more widely supported.

For zipfile.Path, I observe that joinpath does accept path separators but looking at the docs for zipfile, it's not even obvious to me what path separators are used in zipfiles, but I do observe there is coercion to posixpath.sep.

ResourceContainer, on the other hand, will not accept path separators. This class, however, may not be used anywhere and probably could be extended to support path separators as well.

Here's what I propose:

  • First, document that the interface is under-specified and that for maximum compatibility, a consumer should pass only relative paths using posixpath.sep or no separator at all. Additionally, explicitly declare that behavior with absolute paths is undefined and unsupported.
  • Second, expand the interface on Traversable.joinpath to stipulate that the interface should accept multiple relative paths.
  • Third, simple.ResourceContainer should be updated to meet that spec.

@jaraco
Copy link
Member

jaraco commented Apr 3, 2022

Are the methods expected to raise specific exception types (e.g. if a resource is missing, or you call iterdir on a “file”)?

The short answer is no. The protocol does not stipulate specific exceptions. Perhaps it should be documented that any exceptions that occur when accessing the backing resource will propagate. Although, I do believe that statement is true for all Python code unless stated otherwise.

I agree it would be nice if the protocol could stipulate which exceptions might be raised by a given implementation, but I'm not confident that's possible, at least not without wrapping/adapting the pathlib.Path and zipfile.Path objects.

Does that answer the question? Are you working on another provider that implements this protocol? I'd be happy to consult on that effort.

@jaraco
Copy link
Member

jaraco commented Apr 3, 2022

Correction. In msg416618, link should have been:

docs for zipfile

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@encukou
Copy link
Member Author

encukou commented Apr 13, 2022

not rigorous in what path separators are allowed

Right, that could be improved. IMO it could link to the constructor docs that have all the details.

Forward slashes work on all supported systems. On Windows they're equivalent to backslashes (everywhere except in old CLI commands, AFAIK). So it makes sense to normalize to them.

>>> PureWindowsPath('a/b\\c')
PureWindowsPath('a/b/c')

pathlib.Path.joinpath also allows .. and . segments in the relative paths. I assume those should not be allowed?

any exceptions [...] will propagate
Does that answer the question?

Yes. Thank you!
I was thinking about how legal code like this would be:

try:
    return pkg_files.joinpath(f'templates/{user_supplied_name}').read_text()
except (NoSuchFileError, IsADirectoryError):
    raise ...

which suggests another question: should I expect resilience against path traversal attacks?

Are you working on another provider that implements this protocol?

No, I'm just trying to make sense of it all.

jaraco added a commit that referenced this issue Apr 17, 2022
…5.7.1) (#91623)

* bpo-47142: Refine traversable (apply changes from importlib_resources 5.7.1)

* Replace changelog referencing github issue.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants