New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement provenance_url.json #11865
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Fridolin Pokorny <fridolin.pokorny@datadoghq.com>
|
Checking the failed test: I'm not sure whether this PR introduces the given failure. Also, it looks like it fails only on Mac. |
| url=self.download_info.redacted_url, | ||
| info=ArchiveInfo("sha256=" + sha256.hexdigest()), | ||
| provenance_file=True, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a quick drive-by comment. I'll review in depth when the pep is accepted. I note however that all the pieces should be in place to implement this in pip quite easily, since download_info is actually the same as provenance_url and is available when obtaining wheels from cache too.
To avoid confusion and adding a non-standard provenance_file flag to DirectUrl, I'd add a new provenance_url argument to install_wheel(), to which we can likely pass download_info as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also #11875
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still confused as to what the "url" should be in the case of something like pip install requests (and whether the download_info field holds that data) - in particular, given that pip builds and caches wheels locally, so that there may not actually be a URL for the wheel that got installed.
To be clear I think this is something that needs addressing in the PEP, not here, but I think it's an important difference between this case and the direct_url.json case (where there is a URL by definition). I don't think the PEP can assume that pip does the right thing in this case - it needs to say what it wants, and then we can check if download_info matches that (and if it does, following that spec becomes a new requirement on download_info for future changes to keep in mind).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In its current version the PEP is indeed unclear on this. I am assuming download_info corresponds to the intent of the PEP, but that intent absolutely needs to be made explicit in the PEP.
For reference, download_info is the URL of original artifact that was downloaded. So the sdist URL if the wheel was built locally. It works also when the locally built wheel comes from the cache, since we now record this URL in the cache entry. This is documented in the installation report specification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid confusion and adding a non-standard provenance_file flag to DirectUrl, I'd add a new provenance_url argument to install_wheel(), to which we can likely pass download_info as-is.
Thanks, will incorporate this suggestion.
I'm still confused as to what the "url" should be in the case of something like
pip install requests(and whether thedownload_infofield holds that data) - in particular, given that pip builds and caches wheels locally, so that there may not actually be a URL for the wheel that got installed.
pip creates origin.json in pip's cache (for both, cached wheels from http and the ones built from sdists). Info from this file seems to be used and correctly propagated internally inside pip source code (using pip with changes in this PR):
$ pip cache purge
$ pip uninstall requests
Found existing installation: requests 2.28.2
Uninstalling requests-2.28.2:
Would remove:
/Users/fridolin.pokorny/git/fridex/pip/.venv/lib/python3.8/site-packages/requests-2.28.2.dist-info/*
/Users/fridolin.pokorny/git/fridex/pip/.venv/lib/python3.8/site-packages/requests/*
Proceed (Y/n)? y
Successfully uninstalled requests-2.28.2
$ pip install requests
Collecting requests
Downloading requests-2.28.2-py3-none-any.whl (62 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.8/62.8 kB 1.8 MB/s eta 0:00:00
Requirement already satisfied: charset-normalizer<4,>=2 in ./.venv/lib/python3.8/site-packages (from requests) (3.1.0)
Requirement already satisfied: idna<4,>=2.5 in ./.venv/lib/python3.8/site-packages (from requests) (3.4)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in ./.venv/lib/python3.8/site-packages (from requests) (1.26.15)
Requirement already satisfied: certifi>=2017.4.17 in ./.venv/lib/python3.8/site-packages (from requests) (2022.12.7)
Installing collected packages: requests
Successfully installed requests-2.28.2
$ cat .venv/lib/python3.8/site-packages/requests-2.28.2.dist-info/provenance_url.json
{"archive_info": {"hash": "sha256=64299f4909223da747622c030b781c0d7811e359c37124b4bd368fb8c6518baa", "hashes": {"sha256": "64299f4909223da747622c030b781c0d7811e359c37124b4bd368fb8c6518baa"}}, "url": "https://files.pythonhosted.org/packages/d2/f4/274d1dbe96b41cf4e0efb70cbced278ffd61b5c7bb70338b62af94ccb25b/requests-2.28.2-py3-none-any.whl"}% $ pip uninstall requests
Found existing installation: requests 2.28.2
Uninstalling requests-2.28.2:
Would remove:
/Users/fridolin.pokorny/git/fridex/pip/.venv/lib/python3.8/site-packages/requests-2.28.2.dist-info/*
/Users/fridolin.pokorny/git/fridex/pip/.venv/lib/python3.8/site-packages/requests/*
Proceed (Y/n)? y
Successfully uninstalled requests-2.28.2
$ pip install requests
Collecting requests
Using cached requests-2.28.2-py3-none-any.whl (62 kB)
Requirement already satisfied: charset-normalizer<4,>=2 in ./.venv/lib/python3.8/site-packages (from requests) (3.1.0)
Requirement already satisfied: idna<4,>=2.5 in ./.venv/lib/python3.8/site-packages (from requests) (3.4)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in ./.venv/lib/python3.8/site-packages (from requests) (1.26.15)
Requirement already satisfied: certifi>=2017.4.17 in ./.venv/lib/python3.8/site-packages (from requests) (2022.12.7)
Installing collected packages: requests
Successfully installed requests-2.28.2
$ cat .venv/lib/python3.8/site-packages/requests-2.28.2.dist-info/provenance_url.json
{"archive_info": {"hash": "sha256=64299f4909223da747622c030b781c0d7811e359c37124b4bd368fb8c6518baa", "hashes": {"sha256": "64299f4909223da747622c030b781c0d7811e359c37124b4bd368fb8c6518baa"}}, "url": "https://files.pythonhosted.org/packages/d2/f4/274d1dbe96b41cf4e0efb70cbced278ffd61b5c7bb70338b62af94ccb25b/requests-2.28.2-py3-none-any.whl"}% $Similarly, provenance_url.json is also created when installing sdists (but wheel must be installed, not legacy installations).
@pfmoore Thanks for bringing this - we can definitely add this to the PEP to make it explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a very fast first review.
| @@ -0,0 +1,2 @@ | |||
| Implement storing provenance_url.json file as specified in PEP 9999. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the number won't change any more.
| Implement storing provenance_url.json file as specified in PEP 9999. | |
| Implement storing provenance_url.json file as specified in PEP 710. |
| direct_url_path = os.path.join(dest_info_dir, PROVENANCE_URL_METADATA_NAME) | ||
| else: | ||
| direct_url_path = os.path.join(dest_info_dir, DIRECT_URL_METADATA_NAME) | ||
| print(direct_url_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug print?
See Draft PEP: Recording provenance of installed packages and PEP-710 PR.