Add metadata field to pip list json output #11097

Closed
wants to merge 2 commits into from


sbidoul
Member

@sbidoul sbidoul commented May 7, 2022

A metadata field in the pip list JSON output is useful for tools that need to explore the contents of a Python environment without being installed in it (or that are written in languages other than Python).

This also helps answer feature requests about extending pip list query options: with rich output, we can point users to jq or a similar tool to process their queries.

This PR sits on top of #11095
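
As a rough illustration of the jq-style filtering mentioned above, here is a minimal Python sketch run against a sample document. The `name` and `version` keys are what `pip list --format=json` emits today; the nested `metadata` object and its `requires_dist` key are assumptions about the shape proposed here, not something this thread confirms.

```python
import json
import re

# Sample standing in for `pip list --format=json` output.
# "name" and "version" exist today; the "metadata" object and its keys
# are ASSUMED for illustration only.
SAMPLE = """
[
  {"name": "requests", "version": "2.27.1",
   "metadata": {"requires_dist": ["idna<4,>=2.5", "urllib3>=1.21.1"]}},
  {"name": "six", "version": "1.16.0",
   "metadata": {"requires_dist": []}},
  {"name": "legacy", "version": "0.1"}
]
"""

# Matches the project name at the start of a requirement string.
_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def packages_requiring(document, dependency):
    """Return names of packages whose (assumed) requires_dist mentions dependency."""
    result = []
    for entry in json.loads(document):
        # .get() keeps the consumer working when a field is absent or new
        # fields appear in the output.
        requires = entry.get("metadata", {}).get("requires_dist", [])
        for requirement in requires:
            match = _NAME.match(requirement)
            if match and match.group(0).lower() == dependency.lower():
                result.append(entry["name"])
                break
    return result


print(packages_requiring(SAMPLE, "idna"))
```

The same query could be written in jq; the point is only that filtering lives in the consumer rather than in new pip list options.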

@sbidoul
Member Author

sbidoul commented May 8, 2022

This one is ready but should be merged after #11095 on which it depends.

@sbidoul
Member Author

sbidoul commented May 30, 2022

I'm wondering if we should instead add a dist_info field that contains metadata, to allow for extensibility (see for instance a request to add WHEEL information in #11054 (comment), or license files).

On the other hand, metadata is special enough to stand alone and be more easily accessible.

@uranusjr
Member

I think if we want wheel, we can simply add it at the top level; the additional level doesn’t feel necessary to me.

@github-actions github-actions bot added the needs rebase or merge PR has conflicts with current master label Jun 12, 2022
@pypa-bot pypa-bot removed the needs rebase or merge PR has conflicts with current master label Jun 12, 2022
@github-actions github-actions bot added the needs rebase or merge PR has conflicts with current master label Jun 23, 2022
So they are resilient to new fields added to the json output.
@pypa-bot pypa-bot removed the needs rebase or merge PR has conflicts with current master label Jun 24, 2022
@sbidoul sbidoul added this to the 22.2 milestone Jun 24, 2022
@sbidoul sbidoul added the C: list/show 'pip list' or 'pip show' label Jun 26, 2022
@sbidoul sbidoul requested a review from a team July 10, 2022 11:13
@pradyunsg
Member

To be honest, I'm not too keen on this. I can see how this is functionally useful, but we're no longer merely "listing the installed packages" but instead providing a lot of the information that pip has about it as well.

I'd rather we make this sort of information available via an opt-in mechanism (eg: --format=json-with-metadata) or expose this via a different avenue (eg: different command / behind a different flag).

@pfmoore
Member

pfmoore commented Jul 10, 2022

Yeah, this feels very similar to the discussion in #11223.

I'm against the idea of evolving pip in the direction of being a tool for querying information about an environment. I don't think that fits pip's core purpose, and I think the logic that we do it in pip because pip is present in all environments is misguided. A script (or zipapp) that's run using the environment's Python is just as usable, and doesn't fail if people have environments without pip.

Even if we do want to see this as something pip should provide, I agree with @pradyunsg that it's veering away from the idea behind pip list - and making it available only in the JSON version just emphasises that for me. If we do this, let's make it a new subcommand, dedicated to querying information about an environment.

To be honest, though, I don't think it would be that hard to write this functionality as a standalone script. I'd even be willing to create an implementation, so can we at least hold off on this for a few days while I find some time to write a prototype?
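
For a sense of scale, a standalone script along these lines might look like the sketch below. It is not the prototype mentioned in this thread; it assumes it is run with the target environment's own Python, uses only the standard library's `importlib.metadata`, and the output keys are chosen purely for illustration.

```python
import json
from importlib import metadata


def describe_environment():
    """Collect name, version and a few core metadata fields for every
    distribution visible to the interpreter running this script."""
    entries = []
    for dist in metadata.distributions():
        meta = dist.metadata
        entries.append(
            {
                # .get() returns None when a field is missing from METADATA.
                "name": meta.get("Name"),
                "version": meta.get("Version"),
                "summary": meta.get("Summary"),
                "requires_python": meta.get("Requires-Python"),
                # Requires-Dist may appear multiple times, hence get_all().
                "requires_dist": meta.get_all("Requires-Dist") or [],
            }
        )
    # Sort case-insensitively by name; tolerate entries without a name.
    return sorted(entries, key=lambda e: (e["name"] or "").lower())


print(json.dumps(describe_environment(), indent=2))
```

This covers only the in-environment case; inspecting an environment from the outside would need more care.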

@sbidoul
Member Author

sbidoul commented Jul 10, 2022

To be honest, though, I don't think it would be that hard to write this functionality as a standalone script. I'd even be willing to create an implementation, so can we at least hold off on this for a few days while I find some time to write a prototype?

I have one already, don't bother :)
You also wrote something there: #10036 (comment)

But it's not completely trivial [to make correct and portable] either, especially if you want to run it from outside the environment. And with pip being ubiquitous in environments (either installed in the env, or via the zipapp or ...), it seems natural to me to provide it, as it is such an essential, basic and far-reaching feature.

If we do this,

With the (or is it mine only?) vision that pip is a CLI for managing the database of installed distribution, it sounds reasonable to have a query command that outputs rich information about what is installed.

It can be very simple (all the logic exists) and low maintenance cost.

And since we regularly receive requests to add various filters / query options to pip list, with such a command, we can orient users to jq or similar to do the filtering, avoiding scope creep in pip.

BTW, this also relates to pip show (see the work in #8008). Personally I'd be fine to have one command that outputs everything and let the user filter details about one single installed distribution, so it would encompass a broad scope of feature requests.

let's make it a new subcommand, dedicated to querying information about an environment.

It's true that pip list has quite a bit of historical baggage and became a little bit messy over time, so it may be better to start fresh.

So I'm perfectly fine with a new subcommand.

How would it be named: pip info (suggested recently by @pfmoore), pip inspect (once suggested by @pradyunsg, IIRC?), or something else?

@pfmoore
Member

pfmoore commented Jul 10, 2022

I have one already, don't bother :)
You also wrote something there: #10036 (comment)

lol, the level of déjà vu I'm getting at the moment is pretty high...

With the (or is it mine only?) vision that pip is a CLI for managing the database of installed distribution, it sounds reasonable to have a query command that outputs rich information about what is installed.

That's a reasonable view of pip's scope. It's not one I'd really thought in terms of previously, and I'm not 100% sure I'm comfortable with it, but if that is how we want to see pip, then I guess I can see how this fits (although until now we've done very little around queries, so it would still be a scope increase in some sense). But if that's how we want to view pip, then I'm unsure how well pip download and pip wheel fit with that. They are more about pip being a CLI for interacting with package indexes, which is a different, but also reasonable, characterisation (and probably closer to how I think of pip).

So I'm perfectly fine with a new subcommand.

Fair enough. High-level philosophy aside, I don't really have a strong opinion here so while I remain probably -0.5 on this, if we're going with it, I'm OK with whatever it gets named, and let's just put whatever seems useful based on the use cases we know about into it.

@sbidoul
Member Author

sbidoul commented Jul 10, 2022

I think the logic that we do it in pip because pip is present in all environments is misguided

Actually, Paul, the zipapp approach comforts me in thinking that pip can be seen as ubiquitous in python environments.

Basically, all a higher-level tool needs to be configured with is the pip command (/some/python -m pip or /some/python pip.pyz). If it can then rely on having pip inspect and pip install --report, a great deal can be done without asking much from pip maintainers.

how well pip download and pip wheel fit with that.

If the core is resolve (install --dry-run --report), build wheel, install, uninstall, inspect, then maybe download and wheel are second-order / convenience features of pip?

It should be pretty easy to write pip download on top of the installation/resolution report (and maybe the number of pip download feature requests will go down thanks to it).

And pip wheel is "just" download + build, which should be easy to compose too (the pip wheel cache being the highest value component here).
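
To make the "configured only with the pip command" idea concrete, here is a small sketch of a higher-level tool that takes the pip command as a list of arguments and parses its JSON output. Since pip inspect is only a proposal at this point in the thread, the sketch sticks to `pip list --format=json`; the function names and the injectable `run` parameter are assumptions made for illustration.

```python
import json
import subprocess


def build_list_command(pip_command):
    """Append the list subcommand to a configured pip command,
    e.g. ["/some/python", "-m", "pip"] or ["/some/python", "pip.pyz"]."""
    return [*pip_command, "list", "--format=json"]


def installed_versions(pip_command, run=subprocess.run):
    """Return a {name: version} mapping for the environment behind pip_command.

    `run` is injectable so the parsing can be exercised without invoking pip.
    """
    completed = run(
        build_list_command(pip_command),
        check=True,           # raise if pip exits with a non-zero status
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode stdout to str
    )
    return {entry["name"]: entry["version"] for entry in json.loads(completed.stdout)}


# Example call (requires a real interpreter with pip at that path):
# installed_versions(["/some/python", "-m", "pip"])
```

The tool never imports pip; the command line and the JSON document are the whole interface.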

@sbidoul sbidoul removed this from the 22.2 milestone Jul 10, 2022
@pfmoore
Member

pfmoore commented Jul 10, 2022

Actually, Paul, the zipapp approach comforts me in thinking that pip can be seen as ubiquitous in python environments.

Hmm, that's not how I see it, but I guess I get your point. I tend to think of this as more that any zipapp can be considered ubiquitous, and so we can distribute functionality how we see fit. But yes, having a zipapp that does everything you want in one place is useful.

Thinking further, you could use the runpy approach to have a zipapp that bundles pip and redirects myzipapp pip ... to run ... with pip, as well as having myzipapp something_else implemented independently.
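
A minimal sketch of that dispatcher, as it might appear in the zipapp's `__main__.py`, assuming pip is bundled inside the archive so that it is importable. The `myzipapp` and `something_else` names come from the comment above; the `main` function, its return codes and the injectable `run_module` parameter are assumptions for illustration.

```python
import runpy
import sys


def something_else(args):
    """Hypothetical subcommand implemented independently of pip."""
    print("something_else called with", args)
    return 0


def main(argv, run_module=runpy.run_module):
    """Dispatch `myzipapp pip ...` to the bundled pip, anything else locally."""
    if not argv:
        print("usage: myzipapp {pip|something_else} ...")
        return 2
    command, rest = argv[0], argv[1:]
    if command == "pip":
        # pip reads its arguments from sys.argv, so rewrite it first.
        sys.argv = ["pip", *rest]
        # Executes pip's __main__ module as if `python -m pip` had been run;
        # pip normally ends by raising SystemExit itself.
        run_module("pip", run_name="__main__", alter_sys=True)
        return 0
    if command == "something_else":
        return something_else(rest)
    print(f"unknown command: {command}")
    return 2


# In the zipapp's __main__.py this would be invoked as:
# sys.exit(main(sys.argv[1:]))
```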

@sbidoul
Member Author

sbidoul commented Jul 10, 2022

A more elaborate alternative is now in #11245

@sbidoul sbidoul closed this Jul 11, 2022
@sbidoul sbidoul deleted the list-json-metadata-sbi branch July 21, 2022 16:11
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 6, 2022
Labels
C: list/show 'pip list' or 'pip show'

6 participants