Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pathlib support #1074

Open
sattlerc opened this issue Oct 22, 2022 · 9 comments
Open

pathlib support #1074

sattlerc opened this issue Oct 22, 2022 · 9 comments

Comments

@sattlerc
Copy link
Contributor

Suppose we are in an empty bare repository:

git init --bare

This operation gives a TypeError (instead of a KeyError as expected):

import dulwich.repo
from pathlib import Path
r = dulwich.repo.Repo(Path())
r.refs.refpath(b'ref')

Error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/noname/.local/lib/python3.10/site-packages/dulwich/refs.py", line 676, in refpath
    return os.path.join(self.path, name)
  File "/usr/lib/python3.10/posixpath.py", line 90, in join
    genericpath._check_arg_types('join', a, *p)
  File "/usr/lib/python3.10/genericpath.py", line 155, in _check_arg_types
    raise TypeError("Can't mix strings and bytes in path components") from None
TypeError: Can't mix strings and bytes in path components
@jelmer
Copy link
Owner

jelmer commented Oct 22, 2022

I'm open to merging PRs that add pathlib support, but I don't believe there's anywhere in Dulwich today that promises pathlib support.

@jelmer jelmer changed the title Bare repository fails with pathlib.Path pathlib support Oct 22, 2022
@sattlerc
Copy link
Contributor Author

I guess I was mislead by this:

        if getattr(path, "encode", None) is not None:
            path = os.fsencode(path)

@jelmer
Copy link
Owner

jelmer commented Oct 22, 2022

Ah, I see. That's meant for plain str since RefsContainer uses bytes internally.

@sattlerc
Copy link
Contributor Author

I'm not clear about whether file paths in the API should be bytes or str. Is there a general convention? The functions themselves generally don't say.

  • Some functions only work with str, for example the constructor of Index (this only becomes clear when looking at Index.__repr__, which does string formatting).
  • Some functions work both with bytes and str, for example FilePackIndex. However, it is not clear to me whether the bytes functionality is intended or officially supported.
  • Some code seems designed to officially support both bytes and str, for example this bit and lots of (unnecessary) checks whether something is bytes before calling os.fsencode.

It would be nice if there was a uniform convention.

@jelmer
Copy link
Owner

jelmer commented Oct 23, 2022

There are conventions, but not particularly well documented. We should improve that and add more type signatures. It's basically:

  • Porcelain tries to work with strings (and generally accepts either strings or bytes)

In the plumbing:

  • Paths that live inside of git datastructures are bytes. This is to avoid constant conversion back and forth, but also because there is no well defined encoding for git and thus no way to lossily decode.

  • Some high performance low level functions that work with paths accept just bytes

  • All other paths (for functions meant to be used by library users) take os-native or accept both bytes and os-native. Mostly they use bytes but accept strings and then fsencode as encessary.

@sattlerc
Copy link
Contributor Author

sattlerc commented Oct 24, 2022

When you say os-native, does that mean str? For example, Repo.__init__ does not accept a bytes path.

@jelmer
Copy link
Owner

jelmer commented Oct 24, 2022

Yep.

@nanonyme
Copy link
Contributor

To be fair, using os.fspath() on inputs would probably help. It will as necessary convert pathlib objects into something you can pass further.

@jelmer
Copy link
Owner

jelmer commented Mar 20, 2023

I'm open to PRs that add more use of os.fspath to dulwich.porcelain in particular, if somebody wanted to work on that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants