Inverse of parse_wheel_filename #616

Closed
rth opened this issue Nov 14, 2022 · 8 comments

Comments

@rth

rth commented Nov 14, 2022

I was wondering if there is any existing function that could be used as the inverse of parse_wheel_filename?

For instance, given,

from packaging.utils import parse_wheel_filename

name, ver, build, tags = parse_wheel_filename("attrs-22.1.0-py2.py3-none-any.whl")

how do I reconstruct the original file name from those parts? Particularly given that,

>>> [str(el) for el in tags]
['py3-none-any', 'py2-none-any']

so it looks like there is some factorization logic to get py2.py3-none-any from those two tags.

I'm also not sure what to do with the build tuple, and PEP 427 is not very explicit about it. Is it just f"-{build[0]}{build[1]}" when it's a tuple[int, str]?
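
For reference, a minimal sketch of what parse_wheel_filename returns when a build tag is present, and how the build segment could be written back out. The file name example_pkg-1.0-1abc-py3-none-any.whl and the helper format_build are made up for illustration; format_build is not something packaging provides.

```python
from packaging.utils import parse_wheel_filename


def format_build(build):
    """Hypothetical helper: turn a parsed build tag back into a filename segment."""
    # packaging returns an empty tuple when there is no build tag,
    # otherwise a (number, suffix) pair such as (1, "abc").
    return f"-{build[0]}{build[1]}" if build else ""


# Illustrative file name with a build tag of "1abc".
name, ver, build, tags = parse_wheel_filename("example_pkg-1.0-1abc-py3-none-any.whl")
with_build = format_build(build)

# The file name from the question has no build tag at all.
_, _, no_build, _ = parse_wheel_filename("attrs-22.1.0-py2.py3-none-any.whl")
without_build = format_build(no_build)
```

Because the numeric part is parsed as an int, a build tag written with leading zeros would not round-trip exactly through this helper.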

@uranusjr
Member

There’s no build part in this filename, as far as I can tell? py2.py3-none-any is a compressed tag as described in PEP 425: https://peps.python.org/pep-0425/#compressed-tag-sets
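
A short illustration of that expansion, using packaging.tags.parse_tag, which is what produces the two tags shown in the question. This only demonstrates the parsing direction; the sorted call is there to make the output order predictable.

```python
from packaging.tags import parse_tag

# A compressed tag set expands to the product of its dot-separated parts:
# two interpreters x one ABI x one platform gives two tags.
expanded = sorted(str(tag) for tag in parse_tag("py2.py3-none-any"))
print(expanded)  # ['py2-none-any', 'py3-none-any']

# Two interpreters and two platforms give four tags.
four = parse_tag("cp36.cp37-abi3-linux_x86_64.manylinux1_x86_64")
```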

@rth
Author

rth commented Nov 14, 2022

> There’s no build part in this filename, as far as I can tell?

Well in that example no. The question was more about how it's supposed to be represented in the filename.

> py2.py3-none-any is a compressed tag as described in PEP 425

Yes, thanks. It's easy to do this in this example; my question is in general if there is a function that converts a sequence of N tags to their best-factorized string representation. I suppose packages that output wheels have to do something like this.

For more context, I'm writing a tool that py-compiles wheels in pyodide/pyodide#3253; while doing that, it needs to update the tags as well. For example:

  • attrs-22.1.0-py2.py3-none-any.whl -> attrs-22.1.0-cp310-none-any.whl
  • numpy-1.22.4-cp310-cp310-emscripten_3_1_24_wasm32.whl -> unchanged
  • pillow_heif-0.6.1-cp36-abi3-emscripten_3_1_24_wasm32.whl -> pillow_heif-0.6.1-cp310-cp310-emscripten_3_1_24_wasm32.whl (or maybe actually abi3 here)

and I'm looking for code that converts a list of tags to a wheel filename string (particularly in the factorized representation). Of course, I can write something that works in most cases but I would rather rely on some well-tested logic in packaging for this, and avoid dealing with the edge cases myself.

Unless you think this is out of scope for this package.

@uranusjr
Member

> my question is in general if there is a function that converts a sequence of N tags to their best-factorized string representation

I am not a maintainer of packaging, but personally I feel it’s not really a common use case and likely out of scope. It’s also difficult to maintain since there is likely a lot of specific logic that doesn’t work everywhere (for example, you changed cp36-abi3 to cp310-cp310, but there’s not a general future-proof rule for that). Given your needs, it’s probably a better approach to ditch parse_wheel_filename altogether and implement your own parser, since it’s actually easier to convert py2.py3 to cp310 (and other things) than merge multiple tags into one.
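
A minimal sketch of what such a custom parser might look like, operating on the filename string directly instead of going through parsed tags. The helper name replace_python_tag is hypothetical, and the rule for choosing the new tag is left to the caller, since that part is tool-specific.

```python
def replace_python_tag(filename: str, python_tag: str) -> str:
    """Hypothetical helper: swap the python tag of a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    # The last three dash-separated fields are python tag, ABI tag and
    # platform tag; everything before them (name, version, optional build
    # tag) is carried over untouched.
    prefix, _old_python, abi, platform = filename[: -len(".whl")].rsplit("-", 3)
    return f"{prefix}-{python_tag}-{abi}-{platform}.whl"


new_name = replace_python_tag("attrs-22.1.0-py2.py3-none-any.whl", "cp310")
print(new_name)  # attrs-22.1.0-cp310-none-any.whl
```

Working on the string keeps the project name and build tag exactly as they were written, which parse_wheel_filename does not guarantee because it canonicalizes the name.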

@rth
Author

rth commented Nov 14, 2022

Thanks for the feedback! To clarify, I'm not asking for functionality that changes the tags; that logic is indeed very specific and out of scope here. I only provided that information for context.

What I'm looking for, or would like to propose, is:

packaging.tags.tags_to_string(tags: Sequence[Tag]) -> str

which is very similar to what pypa/wheel does in
https://github.com/pypa/wheel/blob/43fcdfda8a224918eb846f8aa4f2dbe0d440889d/src/wheel/cli/pack.py#L82 or auditwheel does in https://github.com/pypa/auditwheel/blob/main/src/auditwheel/wheeltools.py#L245
Similarly, both poetry and flit do not currently build wheels with complex tags and so have much simpler custom logic.

So my point is that currently, each tool that generates wheels has its own implementation for this, and it would be good to have an official implementation in this package.

> to ditch parse_wheel_filename altogether and implement your own parser

Well, we have been doing a lot of that in pyodide, and we are trying to use shared tooling when possible :)

@uranusjr
Member

The wheel implementation is far more restrictive than your proposed interface; it simply merges all the provided tag segments, while your function signature above would open a big floodgate to all sorts of complex merging logic. This is the precise reason I noted above that makes me feel this is out of scope of packaging, and should be left to individual tools to suit their (ever so slightly) different needs instead.

@brettcannon
Member

Best we could do is:

# In packaging.utils ...
def combine_tags(tags: Iterable[Tag], /) -> str:
    """Naively create a tag set string."""
    interpreter = set()
    abi = set()
    platform = set()
    for tag in tags:
        interpreter.add(tag.interpreter)
        abi.add(tag.abi)
        platform.add(tag.platform)
    return f"{'.'.join(interpreter)}-{'.'.join(abi)}-{'.'.join(platform)}"

But I don't know how widely useful that would be (e.g. both auditwheel and flit wouldn't use this). Do we know which projects would actually use such a naive function?

@pradyunsg
Member

pradyunsg commented Nov 15, 2022

I'd add a sorted call, within the join. 😅

That's actually another thing here -- it's totally valid for someone to tag the wheel as py3.py2 AFAIK and packaging.tags does not preserve that ordering information (by design, since most use cases only involve the fully qualified tags and checking if "this tag" in supported tags).

FWIW, I'm in agreement with @uranusjr and @brettcannon that there's likely limited usability of such a generic function -- that said, I'm not opposed to adding it (assuming it's useful, at least occasionally).
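
To make the trade-offs discussed above concrete, here is a sketch of the naive merge with the suggested sorted calls added. It is an illustration only, not code that exists in packaging; combine_tags here is a standalone function that borrows the name from the earlier comment.

```python
from typing import Iterable

from packaging.tags import Tag, parse_tag


def combine_tags(tags: Iterable[Tag]) -> str:
    """Naively merge tags into one compressed tag set, with sorted segments."""
    tags = list(tags)
    interpreters = sorted({tag.interpreter for tag in tags})
    abis = sorted({tag.abi for tag in tags})
    platforms = sorted({tag.platform for tag in tags})
    return f"{'.'.join(interpreters)}-{'.'.join(abis)}-{'.'.join(platforms)}"


# The original ordering (py3.py2) is not preserved; sorting normalizes it.
simple = combine_tags(parse_tag("py3.py2-none-any"))

# When the inputs are not a full product, the merged string over-generates:
# two input tags come back as eight after parsing.
mixed = combine_tags(
    [Tag("py3", "none", "any"), Tag("cp310", "cp310", "linux_x86_64")]
)
expanded = parse_tag(mixed)
```

The second case shows why a general Sequence[Tag] to str function is harder than it looks: a single compressed string can only express a full product of its segments, so an arbitrary set of tags may not have an exact representation.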

@rth
Author

rth commented Nov 15, 2022

Thanks all for your feedback and code examples, it's very helpful! OK, let's close this given your feedback on limited usability.

rth closed this as completed Nov 15, 2022

4 participants