Installation/resolution report (aka pip install --dry-run --report) #10771

Merged
merged 14 commits into pypa:main from install-report-sbi
Jul 15, 2022

sbidoul
Member

@sbidoul sbidoul commented Jan 8, 2022

Following the conversation in #10748 and @cosmicexplorer's request that I describe more precisely what I had in mind, I created this proof of concept.

The output is conceptually very similar to #10748, but generated as part of the install command.

It enables the following use cases:

1. Tell me what pip installed: pip install --report
2. Tell me what pip would install: pip install --dry-run --report
3. Tell me what pip would install in an empty environment (aka resolve the requirements): pip install --dry-run --ignore-installed --report
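The three use cases differ only in which flags accompany --report. As a minimal sketch, the command lines could be assembled from Python like this, assuming pip is invoked as `python -m pip`; the helper name `build_report_command` and its parameters are hypothetical and not part of pip:

```python
import sys
from typing import List


def build_report_command(
    requirements: List[str],
    report_path: str,
    dry_run: bool = False,
    ignore_installed: bool = False,
) -> List[str]:
    """Assemble a `pip install --report` command line (hypothetical helper)."""
    cmd = [sys.executable, "-m", "pip", "install", "--report", report_path]
    if dry_run:
        # Resolve and report, but do not install anything.
        cmd.append("--dry-run")
    if ignore_installed:
        # Resolve as if no distribution were already installed.
        cmd.append("--ignore-installed")
    return cmd + list(requirements)


# Use case 3: resolve the requirements for an empty environment.
command = build_report_command(
    ["pydantic>=1.9"], "report.json", dry_run=True, ignore_installed=True
)
print(" ".join(command[1:]))
```

The resulting list can be passed to `subprocess.run(command, check=True)`; the report is then read back from the path given to --report.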

It is not exactly the same, as I took the opportunity to experiment a little with the output format.

My goal here is to have a standards-based output, using PEP 610 for download_info and PEP 566 for metadata.

So far I have not needed to intrude into the resolver innards, as I can obtain all the information from InstallRequirement, by way of a minor change to the preparer to record the downloaded artifact.

Still missing, compared to #10748, are the dependencies and required Python information, but these will be part of the metadata field obtained from the distribution metadata. The remaining TODO for metadata is to convert it to PEP 566 JSON.
Dependencies and required Python information are part of the metadata fields in PEP 566 JSON format.

Here is an example:

$ pip install "pydantic>=1.9" git+https://github.com/pypa/packaging@main --report=/tmp/report.json --ignore-installed --dry-run --no-cache

{
  "version": "0",
  "pip_version": "22.2.dev0",
  "install": [
    {
      "is_direct": false,
      "download_info": {
        "url": "https://files.pythonhosted.org/packages/78/b0/fdedac2f07344035607bfbf42217c103a6421e7845fc3cb9fd07f3fa0d2e/pydantic-1.9.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
        "archive_info": {
          "hash": "sha256=574936363cd4b9eed8acdd6b80d0143162f2eb654d96cb3a8ee91d3e64bf4cf9"
        }
      },
      "metadata": {
        "name": "pydantic",
        "version": "1.9.0",
        ...
      },
      "requested": true
    },
    {
      "is_direct": true,
      "download_info": {
        "url": "https://github.com/pypa/packaging",
        "vcs_info": {
          "vcs": "git",
          "requested_revision": "main",
          "commit_id": "ba124324e866e518ebfe0378a25b6ba4816e5880"
        }
      },
      "metadata": {
        "name": "packaging",
        "version": "21.4.dev0",
        "requires_dist": [
          "pyparsing (!=3.0.5,>=2.0.2)"
        ],
        "requires_python": ">=3.7",
        ...
      },
      "requested": true
    },
    {
      "is_direct": false,
      "download_info": {
        "url": "https://files.pythonhosted.org/packages/d9/41/d9cfb4410589805cd787f8a82cddd13142d9bf7449d12adf2d05a4a7d633/pyparsing-3.0.8-py3-none-any.whl",
        "archive_info": {
          "hash": "sha256=ef7b523f6356f763771559412c0d7134753f037822dad1b16945b7b846f7ad06"
        }
      },
      "metadata": {
        "name": "pyparsing",
        "version": "3.0.8",
        ...
      }
    },
    {
      "is_direct": false,
      "download_info": {
        "url": "https://files.pythonhosted.org/packages/75/e1/932e06004039dd670c9d5e1df0cd606bf46e29a28e65d5bb28e894ea29c9/typing_extensions-4.2.0-py3-none-any.whl",
        "archive_info": {
          "hash": "sha256=6657594ee297170d19f67d55c05852a874e7eb634f4f753dbd667855e07c1708"
        }
      },
      "metadata": {
        "name": "typing-extensions",
        "version": "4.2.0",
        ...
      }
    }
  ],
  "environment": {
    "implementation_name": "cpython",
    "implementation_version": "3.8.10",
    "os_name": "posix",
    "platform_machine": "x86_64",
    "platform_release": "...",
    "platform_system": "Linux",
    "platform_version": "...",
    "python_full_version": "3.8.10",
    "platform_python_implementation": "CPython",
    "python_version": "3.8",
    "sys_platform": "linux"
  }
}
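To illustrate how such a report might be consumed, here is a minimal sketch that turns the install entries into pinned requirement lines. It assumes the field layout shown in the example (name and version under metadata, PEP 610 data under download_info); the function name `to_requirement_lines` is hypothetical, and the sample data is a trimmed copy of two entries from the example:

```python
from typing import Any, Dict, List


def to_requirement_lines(report: Dict[str, Any]) -> List[str]:
    """Turn the "install" entries of a report into pinned requirement lines."""
    lines = []
    for item in report["install"]:
        name = item["metadata"]["name"]
        version = item["metadata"]["version"]
        info = item["download_info"]
        if "vcs_info" in info:
            vcs = info["vcs_info"]
            # Pin the VCS requirement to the exact commit that was resolved.
            lines.append(f"{name} @ {vcs['vcs']}+{info['url']}@{vcs['commit_id']}")
        elif "hash" in info.get("archive_info", {}):
            # The report stores "algorithm=digest"; pip's --hash option
            # expects "algorithm:digest".
            algorithm, _, digest = info["archive_info"]["hash"].partition("=")
            lines.append(f"{name}=={version} --hash={algorithm}:{digest}")
        else:
            lines.append(f"{name}=={version}")
    return lines


# Trimmed sample mirroring two entries of the report above.
sample_report = {
    "install": [
        {
            "download_info": {
                "url": "https://files.pythonhosted.org/packages/d9/41/d9cfb4410589805cd787f8a82cddd13142d9bf7449d12adf2d05a4a7d633/pyparsing-3.0.8-py3-none-any.whl",
                "archive_info": {
                    "hash": "sha256=ef7b523f6356f763771559412c0d7134753f037822dad1b16945b7b846f7ad06"
                },
            },
            "metadata": {"name": "pyparsing", "version": "3.0.8"},
        },
        {
            "download_info": {
                "url": "https://github.com/pypa/packaging",
                "vcs_info": {
                    "vcs": "git",
                    "requested_revision": "main",
                    "commit_id": "ba124324e866e518ebfe0378a25b6ba4816e5880",
                },
            },
            "metadata": {"name": "packaging", "version": "21.4.dev0"},
        },
    ]
}

lines = to_requirement_lines(sample_report)
print("\n".join(lines))
```

Whether lines like these are sufficient as a lock file depends on pip's hash-checking rules, which this sketch does not try to cover.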

Please let me know your thoughts and if I'm missing anything.

Closes #53
Closes #6430

@cosmicexplorer
Contributor

Note that in a67c59f I have refactored #10748 to make its output easier to consume from a pip install command.

@github-actions github-actions bot added the "needs rebase or merge" (PR has conflicts with current master) label Mar 26, 2022
@pypa-bot pypa-bot removed the "needs rebase or merge" (PR has conflicts with current master) label Apr 30, 2022
@sbidoul
Member Author

sbidoul commented Apr 30, 2022

I have updated this PR to produce hashes in more situations. The only missing part is the integration with the wheel cache (on which I started working in #11042).

All in all, this seems to fit pretty neatly in the pip code base.

The hardest remaining part is to produce PEP 566 compliant metadata. To achieve this we may need to vendor importlib_metadata, which has the metadata JSON conversion function. On the other hand, for an MVP it could be sufficient to export limited metadata information such as Name, Version and maybe Requires-Dist.

Also --dry-run needs to print/log something, but that should be easy.

If there is positive feedback in the short term (especially from people who want to produce lock files with hashes), I'm motivated to push this forward. I may be able to free up some time in May so as to land this as an experimental feature in 22.2 (July).

@sbidoul sbidoul changed the title from "[PROOF OF CONCEPT] Installation Report" to "[PROOF OF CONCEPT] Installation/resolution report (aka pip install --dry-run)" Apr 30, 2022
@pfmoore
Member

pfmoore commented Apr 30, 2022

> The hardest remaining part is to produce PEP 566 compliant metadata. To achieve this we may need to vendor importlib_metadata which has the metadata json conversion function.

I have an implementation that converts a METADATA file (more accurately, an email.message.Message object) to that format in https://github.com/pfmoore/pkg_metadata/blob/main/src/pkg_metadata/metadata.py (I'd recommend extracting that function, rather than vendoring pkg_metadata at this point, as I only just uploaded the first draft version of the project yesterday!). The msg_to_json function is pretty robust, though - it successfully translates the METADATA file for every wheel on PyPI (including some that have things like broken encodings).
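For readers unfamiliar with the conversion being discussed, the following is a minimal sketch of the general idea behind turning an email.message.Message into a PEP 566 style JSON-compatible dict: lowercase the field names, replace hyphens with underscores, and collect multiple-use fields into lists. It is not the msg_to_json function mentioned above, it skips the robustness work (such as broken encodings), and the function name `metadata_to_json` and the `MULTI_USE_FIELDS` set are assumptions made for illustration:

```python
import email
import json
from email.message import Message
from typing import Any, Dict

# Multiple-use fields from the core metadata specification, already in their
# transformed (lowercase, underscore) form. Treat this list as an assumption.
MULTI_USE_FIELDS = {
    "classifier", "dynamic", "obsoletes_dist", "platform", "project_url",
    "provides_dist", "provides_extra", "requires_dist", "requires_external",
    "supported_platform",
}


def metadata_to_json(msg: Message) -> Dict[str, Any]:
    """Convert parsed METADATA headers into a JSON-compatible dict (sketch)."""
    result: Dict[str, Any] = {}
    for name in msg.keys():
        key = name.lower().replace("-", "_")
        if key in result:
            continue  # repeated header, already collected via get_all()
        values = msg.get_all(name)
        if key in MULTI_USE_FIELDS:
            result[key] = list(values)
        elif key == "keywords":
            # Keywords become a list; splitting on commas is assumed here.
            result[key] = [k.strip() for k in values[0].split(",")]
        else:
            result[key] = values[0]
    body = msg.get_payload()
    if body:
        # The message body, if any, carries the long description.
        result["description"] = body
    return result


# Illustrative input built from the packaging fields shown in the report example.
raw = (
    "Name: packaging\n"
    "Version: 21.4.dev0\n"
    "Requires-Dist: pyparsing (!=3.0.5,>=2.0.2)\n"
    "Requires-Python: >=3.7\n"
)
converted = metadata_to_json(email.message_from_string(raw))
print(json.dumps(converted, indent=2))
```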

@sbidoul sbidoul changed the title from "[PROOF OF CONCEPT] Installation/resolution report (aka pip install --dry-run)" to "Installation/resolution report (aka pip install --dry-run)" Apr 30, 2022
@sbidoul sbidoul changed the title from "Installation/resolution report (aka pip install --dry-run)" to "Installation/resolution report (aka pip install --dry-run --report)" Apr 30, 2022
@sbidoul sbidoul force-pushed the install-report-sbi branch 2 times, most recently from 503702f to dec28be, April 30, 2022 13:08
@sbidoul sbidoul force-pushed the install-report-sbi branch 2 times, most recently from 8e8a97f to 43b98c0, May 2, 2022 07:20
@sbidoul
Member Author

sbidoul commented May 2, 2022

Thanks @pfmoore! I extracted your msg_to_json and it fits nicely.

The PR is still pretty small, but it will grow with tests, docs, etc. So my plan is to split it into easier-to-review chunks.

In the meantime, this branch is ready for experimenters, and I'm happy to discuss / bikeshed the report schema here.

@cosmicexplorer
Contributor

cosmicexplorer commented May 11, 2022

I am about to propose basing #10748 off of this PR, since I really like the formalization of metadata parsing in _internal/metadata/json.py and moving the InstallationReport itself into _internal/models.

I had written a whole treatise trying to convince you to move this to pip download --report, but after reading #10748 (comment) again, it makes a lot more sense to me, and I now agree with you on keeping it in pip install --report. Would you consider:

  1. renaming --ignore-installed to --empty-env?
  2. renaming InstallationReport to ResolutionReport?

And as I am about to propose in #10748, I think I can make that PR additionally avoid downloading any dists when --dry-run is provided.

EDIT: see proposal at #10748 (comment) (which doesn't change anything here, just refactors that PR to consume this one).

@sbidoul
Member Author

sbidoul commented May 22, 2022

> renaming --ignore-installed to --empty-env?

@cosmicexplorer actually, the --ignore-installed option already exists, so it would mean creating an alias. Perhaps this should be discussed in a separate issue?

> renaming InstallationReport to ResolutionReport?

I'll look into it. This could make sense if we use it in another context than the install command.
OTOH, this name does not appear in the JSON output so it is easy to change as it is part of the pip internals.

@sbidoul sbidoul force-pushed the install-report-sbi branch 2 times, most recently from 1682a1e to cc3844f, May 29, 2022 13:29
@sbidoul
Member Author

sbidoul commented May 29, 2022

I rebased on top of the other PRs and added integration tests. This is now feature complete.

I'll keep it as a draft until the other PRs are merged, but this PR (i.e. the last commit) is now as simple as it can get.

@github-actions github-actions bot added the "needs rebase or merge" (PR has conflicts with current master) label Jun 1, 2022
@pypa-bot pypa-bot removed the "needs rebase or merge" (PR has conflicts with current master) label Jun 1, 2022
@sbidoul
Member Author

sbidoul commented Jul 5, 2022

Looks good to me, simple and straightforward.

Thanks @uranusjr!

Also, the groundwork that was required to make this PR simple will be useful for other things.

@sbidoul
Member Author

sbidoul commented Jul 10, 2022

I have made a last tweak to this one. Instead of a dict keyed by canonical distribution name, the install field is now an array.
I feel the dictionary was not bringing any significant benefit to users, since nothing else in the data referred to these dictionary keys. The dictionary was also potentially restrictive (for example, it could lose installation order information).
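Consumers who still want lookup by name can rebuild the mapping from the array themselves. A minimal sketch, assuming each entry carries its name under metadata as in the example report; the helper names are hypothetical, and the normalization follows the PEP 503 rule:

```python
import re
from typing import Any, Dict, List


def canonicalize_name(name: str) -> str:
    """Normalize a project name following the PEP 503 rule."""
    return re.sub(r"[-_.]+", "-", name).lower()


def index_by_name(install: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Build a dict keyed by canonical name from the "install" array."""
    return {canonicalize_name(item["metadata"]["name"]): item for item in install}


# Two entries trimmed from the example report, in their original order.
install = [
    {"metadata": {"name": "pydantic", "version": "1.9.0"}},
    {"metadata": {"name": "typing-extensions", "version": "4.2.0"}},
]
by_name = index_by_name(install)
print(by_name["typing-extensions"]["metadata"]["version"])
```

Python dicts preserve insertion order, so iterating `by_name` still follows the order of the array.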

@sbidoul
Member Author

sbidoul commented Jul 10, 2022

I plan to merge this in about a week so it goes in 22.2, if there are no other comments or concerns in the meantime.

@sbidoul sbidoul requested a review from a team July 10, 2022 11:50
@sbidoul sbidoul force-pushed the install-report-sbi branch 2 times, most recently from 0cc7a39 to d11f3c3, July 14, 2022 19:00
@pradyunsg pradyunsg added the "type: enhancement" (Improvements to functionality) label Jul 15, 2022
@sbidoul sbidoul merged commit d830c96 into pypa:main Jul 15, 2022
@sbidoul sbidoul deleted the install-report-sbi branch July 15, 2022 10:29
@sbidoul
Member Author

sbidoul commented Jul 15, 2022

Thanks a lot to all who contributed to this!

@woodruffw
Member

Just wanted to say as a downstream user: thanks a ton for adding this!
