Provide an easy way to check modules exposed by the package through API #12710

Closed
mknorps opened this issue Dec 21, 2022 · 6 comments
Labels
feature request, requires triaging (maintainers need to do initial inspection of issue)

mknorps commented Dec 21, 2022

What's the problem this feature will solve?

There is no programmatic way to find out what modules a package exposes without downloading the package.
This feature is needed for the efficient translation of imports in Python code to the packages they may have come from.

For example, google-api-python-client exposes apiclient, googleapiclient and googleapiclient/discovery_cache. In a Python program, you might write:

import apiclient

The import alone does not tell you which package provides the module, unless that package is already installed.
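
For the "already installed" case, the lookup can be done locally. The following is a minimal sketch, assuming the installed distributions ship a top_level.txt file (setuptools writes it; other build backends may not). The function name `installed_import_map` is hypothetical and only for illustration.

```python
from collections import defaultdict
from importlib import metadata


def installed_import_map():
    """Map top-level import names to the installed distributions declaring them."""
    mapping = defaultdict(list)
    for dist in metadata.distributions():
        # read_text() returns None when the distribution has no top_level.txt.
        text = dist.read_text("top_level.txt")
        if not text:
            continue
        try:
            name = dist.metadata.get("Name")
        except Exception:
            # Broken or partial metadata: skip rather than fail the whole scan.
            continue
        if not name:
            continue
        for import_name in text.split():
            mapping[import_name].append(name)
    return dict(mapping)


if __name__ == "__main__":
    # Lists the providing distribution(s) only if one of them is installed.
    print(installed_import_map().get("apiclient", []))
```

This only answers the question for packages present in the current environment, which is exactly the limitation the issue describes.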

Describe the solution you'd like

There are two ends to this stick:

  1. Provide the exposed modules for each package via the API, for example by adding a field that contains the content of the top_level.txt file.
  2. Allow searching by import name (for example, an API that returns all packages exporting apiclient).
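
The two options are related: if per-package records carried the exposed modules, a search by import name is an inverted index over them. A rough sketch, assuming a hypothetical `top_level` field (not an existing PyPI API field) and hand-written sample records:

```python
from collections import defaultdict

# Hypothetical per-package records; "top_level" is an assumed field name.
# The google-api-python-client names are the ones quoted in this issue.
PACKAGES = {
    "google-api-python-client": {
        "top_level": ["apiclient", "googleapiclient", "googleapiclient/discovery_cache"],
    },
    "requests": {"top_level": ["requests"]},
}


def build_import_index(packages):
    """Option 2: invert the per-package field into import name -> package names."""
    index = defaultdict(set)
    for package, record in packages.items():
        for import_name in record["top_level"]:
            index[import_name].add(package)
    return index


def packages_exporting(import_name, index):
    """Return the sorted candidate packages for one import name."""
    return sorted(index.get(import_name, ()))


INDEX = build_import_index(PACKAGES)
print(packages_exporting("apiclient", INDEX))  # ['google-api-python-client']
```

Several packages can export the same import name, so the lookup returns a list of candidates rather than a single answer.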

Additional context

Libraries that automatically generate requirements.txt from Python source code deal with this in the following ways:

  1. Pipreqs has a static mapping file from package name to import names.
  2. Pigar combines a database shipped with the package with downloading packages from PyPI, unpacking them, and reading the content of their top_level.txt files.
mknorps added the feature request and requires triaging labels Dec 21, 2022
Member

di commented Dec 21, 2022

I think this is likely a duplicate of #5375?

Author

mknorps commented Dec 21, 2022

> I think this is likely a duplicate of #5375?

Partially: proposed solution number 2 is exactly issue #5375.
Solution 1 is, I guess, simpler to implement, though less useful for searching by import name. It adds a field containing the exposed modules to the JSON API, so at least there is no need to download the package from PyPI only to extract the content of top_level.txt.

So in a perfect world, I would like to have an API that I can search by import name to get a list of candidate packages.
In a reasonably nice world, I would like the import names for each PyPI package to be more easily available via the API (without downloading and unpacking). That would already help a lot.

Member

di commented Dec 21, 2022

I think at the root, the 'hard' part of both of these feature requests is the same: PyPI would need a) some way to introspect the artifacts that are uploaded, b) some way to derive the imports from that, and c) some way to store the results.

Given that, I think we can merge this into #5375, if you want to add a comment there?
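
As an illustration of step b), import names could be derived from an artifact's file listing when no top_level.txt is available. The sketch below is a simplification under stated assumptions: it only recognises regular packages (a directory with `__init__.py`) and single-file modules at the archive root, and it ignores namespace packages and compiled extension modules. The function name and the sample listing are hypothetical.

```python
from pathlib import PurePosixPath


def derive_import_names(file_names):
    """Guess top-level import names from the paths inside a wheel."""
    names = set()
    for file_name in file_names:
        parts = PurePosixPath(file_name).parts
        if not parts:
            continue
        first = parts[0]
        # Metadata and data directories are not importable.
        if first.endswith((".dist-info", ".data")):
            continue
        if len(parts) == 1:
            # Single-file module at the archive root, e.g. "six.py".
            if first.endswith(".py"):
                names.add(first[:-3])
        elif len(parts) == 2 and parts[1] == "__init__.py":
            # Regular package: "<name>/__init__.py".
            names.add(first)
    return sorted(names)


listing = [
    "pkg_a/__init__.py",
    "pkg_a/core.py",
    "single_module.py",
    "demo-1.0.dist-info/METADATA",
    "demo-1.0.data/scripts/tool.py",
]
print(derive_import_names(listing))  # ['pkg_a', 'single_module']
```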

Author

mknorps commented Dec 22, 2022

You can merge.
Can I help with solving this issue?
What additional context would I need?
#5375 has been there for some time, does it mean it is a hard issue or a non-priority, or both?

Member

di commented Jan 5, 2023

> Can I help with solving this issue?

Sure! A first step towards this would be prototyping how this would work in https://github.com/pypi/inspector.

> #5375 has been there for some time, does it mean it is a hard issue or a non-priority, or both?

It's hard! Introspecting the artifacts requires unpacking/consuming untrusted code in production, so we need a safe way to do this. https://github.com/pypi/inspector is a step towards this.

I'll close this in favor of #5375.

Member

dstufft commented May 23, 2023

Closing this, @di mentioned he was going to, then never did :)

dstufft closed this as completed May 23, 2023
3 participants