
Gate Python package artifacts on wheel METADATA vs setup.cfg dependency parity #4528

Closed
Copilot wants to merge 3 commits into main from copilot/fix-wheel-metadata-leak

Contributor

Copilot AI commented May 14, 2026

The existing wheel-metadata coverage only validated a single fixture target, so leaked Requires-Dist entries in real published wheels were not caught in CI. This change adds a post-package gate over actual built wheels and removes the fixture-only check path that missed regressions.

  • Post-build wheel metadata verifier (new script)

    • Added py/tools/publish_check/check_wheel_metadata.py to scan dist/*.whl after pants package ::.
    • For each wheel, it:
      • reads *.dist-info/METADATA
      • collects runtime Requires-Dist entries (skipping any gated by an extra == ... marker)
      • matches wheel dist name to py/*/setup.cfg via canonicalized package names
      • parses [options] install_requires using the same minimal parser pattern used in tooling macros
      • normalizes both sides with canonicalize_name + Requirement(...).specifier
      • enforces set equality and reports per-wheel unexpected / missing diffs.
    • Exit behavior:
      • 0 when every wheel matches
      • 1 on any mismatch
      • 2 when dist/ has no wheels.
  • CI package job gating

    • Updated .github/workflows/py.yml (package job) to run metadata verification between:
      • Run pants package
      • artifact upload (Archive wheels / Archive sdists).
    • This makes metadata leaks block artifact publication in the same job that builds publish artifacts.
  • Removed obsolete fixture-only coverage

    • Deleted py/tools/publish_check/test_publish_wheel_metadata.py.
    • Removed publish_wheel_metadata_check target from py/tools/publish_check/BUILD.
    • Kept publish_check intact (setup.cfg vs lockfile consistency remains unchanged).
  • Documentation

    • Added py/tools/publish_check/README.md note clarifying why the check runs outside the Pants test graph (it validates built dist/* artifacts).
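
The normalise-and-compare step in the list above can be sketched as follows. This is a minimal, stdlib-only illustration, not the script's actual code: `canonical`, `norm`, `diff` and `report` are hypothetical names, and the regex-based parsing is a simplified stand-in for `packaging.requirements.Requirement` and `packaging.utils.canonicalize_name`, which the PR says the real script uses.

```python
import re


def canonical(name: str) -> str:
    # PEP 503 name normalisation: runs of "-", "_" and "." collapse to "-".
    return re.sub(r"[-_.]+", "-", name).lower()


def norm(req: str) -> tuple[str, str]:
    """Reduce a requirement string to (canonical name, sorted specifier)."""
    head = req.split(";", 1)[0].strip()  # drop any environment marker
    m = re.match(r"([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[[^\]]*\])?\s*(.*)", head)
    if m is None:
        raise ValueError(f"unparseable requirement: {req!r}")
    name, spec = m.groups()
    # Sort the comma-separated clauses so ">=1,<2" and "<2, >=1" compare equal.
    clauses = sorted(
        re.sub(r"\s+", "", c) for c in spec.strip("() ").split(",") if c.strip()
    )
    return canonical(name), ",".join(clauses)


def diff(wheel_reqs: list[str], cfg_reqs: list[str]):
    """Return (unexpected, missing) as sorted lists of normalised pairs."""
    built = {norm(r) for r in wheel_reqs}
    declared = {norm(r) for r in cfg_reqs}
    return sorted(built - declared), sorted(declared - built)


def report(wheel: str, wheel_reqs: list[str], cfg_reqs: list[str]) -> bool:
    """Print a per-wheel diff; return True when both sides are equal."""
    unexpected, missing = diff(wheel_reqs, cfg_reqs)
    for name, spec in unexpected:
        print(f"{wheel}: unexpected {name}{spec}")
    for name, spec in missing:
        print(f"{wheel}: missing {name}{spec}")
    return not (unexpected or missing)
```

Because the comparison is set equality on (name, specifier) pairs, a pinned `pyyaml==6.0.3` in the wheel against an unpinned `pyyaml` in setup.cfg shows up twice: once as unexpected and once as missing.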

Example of the new CI gate step:

- name: Verify wheel METADATA matches setup.cfg
  run: |
    python -m pip install --quiet packaging
    python py/tools/publish_check/check_wheel_metadata.py

Original prompt

Context

PR #4527 fixed a wheel-METADATA leak in py/aio.core where pinned //py/deps:reqs#* targets attached to the inner python_sources (via toolshed_library(dependencies=[...])) were being baked into Requires-Dist. The published aio.core wheel ended up with entries like abstracts==0.2.0, pyyaml==6.0.3, trycast==1.3.0, plus stub-only packages types-orjson==3.6.2 and types-pyyaml==6.0.12.20260508 — none of which are in py/aio.core/setup.cfg install_requires.
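
To make the leak concrete, here is a small sketch of how such entries look when read back from a wheel's METADATA. The `Requires-Dist` values and the package name are taken from the paragraph above; the `Metadata-Version` and `Version` headers are placeholders, and `requires_dist` and `exact_pins` are hypothetical helper names.

```python
from email.parser import Parser

# Excerpt shaped like a wheel METADATA file. The Requires-Dist values are the
# leaked entries quoted above; the Version header is a made-up placeholder.
METADATA = """\
Metadata-Version: 2.1
Name: aio.core
Version: 0.0.0
Requires-Dist: abstracts==0.2.0
Requires-Dist: pyyaml==6.0.3
Requires-Dist: trycast==1.3.0
Requires-Dist: types-orjson==3.6.2
Requires-Dist: types-pyyaml==6.0.12.20260508
"""


def requires_dist(metadata_text: str) -> list[str]:
    # METADATA uses RFC 822 style headers, so the email parser can read it.
    msg = Parser().parsestr(metadata_text)
    return msg.get_all("Requires-Dist") or []


def exact_pins(entries: list[str]) -> list[str]:
    # Entries whose requirement part (before any marker) contains "==".
    return [e for e in entries if "==" in e.split(";", 1)[0]]


entries = requires_dist(METADATA)
stub_only = [e for e in entries if e.startswith("types-")]
```

Exact pins and `types-*` stub packages in a library wheel are a useful smell, but the gate described below does not rely on such heuristics: it compares against setup.cfg directly.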

The existing test at py/tools/publish_check/test_publish_wheel_metadata.py (target //py/tools/publish_check:publish_wheel_metadata_check) only exercises a single fixture package //py/_test_publish_pkg:package via runtime_package_dependencies, and asserts against hardcoded _EXPECTED_* constants. The fixture already has the "correct" shape, so the test stayed green even while every real py/* package was leaking. That is how the bug fixed in PR #4527 shipped to PyPI.

Goal

Catch wheel-METADATA leaks in CI on the actual wheels that get published, without rebuilding them. The package job in .github/workflows/py.yml already runs pants --colors package ::, which produces every wheel in dist/*.whl. Add a verification step in that same job, between Run pants package and Archive wheels, that:

  1. Walks every dist/*.whl.
  2. For each wheel, parses the *.dist-info/METADATA and collects all Requires-Dist: entries excluding those gated by ; extra == "..." markers (i.e. only the runtime/required dist deps, not extras).
  3. Locates the corresponding py/<pkg>/setup.cfg by matching the wheel's normalized dist name to [metadata] name = ... in each setup.cfg under py/*/setup.cfg.
  4. Parses [options] install_requires from that setup.cfg using the same minimal parser style already used in py/pants-toolshed/macros.py::_setup_cfg_install_requires and py/pants-toolshed/toolshed_setup_cfg.
  5. Normalises both sides using packaging.utils.canonicalize_name for the name and packaging.requirements.Requirement(...).specifier for the version specifier, then asserts the two sets are equal, not merely that one is a subset of the other. Mismatches in either direction (leaked entries OR missing requirements) must fail the job with a clear per-wheel diff showing the unexpected and missing entries.
  6. Fails with exit code 1 if any wheel mismatches; exits 0 otherwise. Also fails (exit code 2) if dist/ contains no wheels.

This step MUST run before Archive wheels / Archive sdists so that a leak gates the publish artefact upload.
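
The exit-code contract from step 6 is what lets this step gate the upload: any non-zero exit fails the job before the archive steps run. A minimal sketch of that contract follows, assuming a hypothetical `run` driver and a `wheel_ok` callback that stands in for the per-wheel comparison and returns `True` on a match.

```python
import pathlib
import sys
from typing import Callable


def run(dist_dir: pathlib.Path, wheel_ok: Callable[[pathlib.Path], bool]) -> int:
    """Return 0 if all wheels pass, 1 on any mismatch, 2 if none were found."""
    wheels = sorted(dist_dir.glob("*.whl"))
    if not wheels:
        # An empty dist/ most likely means packaging silently produced nothing.
        print(f"no wheels found in {dist_dir}", file=sys.stderr)
        return 2
    # Build the full list first (no short-circuit) so that every wheel is
    # checked and every diff is reported in a single CI run.
    results = [wheel_ok(w) for w in wheels]
    return 0 if all(results) else 1
```

A script entry point would then be `sys.exit(run(pathlib.Path("dist"), check_one_wheel))`, where `check_one_wheel` is a placeholder for the real per-wheel check.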

Concrete changes

1. New script: py/tools/publish_check/check_wheel_metadata.py

Standalone Python script (no Pants test framework needed; it is a post-build verification). Implementation outline already drafted in the chat discussion that led to this issue; reproduce it faithfully:

"""Verify built wheels in dist/ have Requires-Dist matching setup.cfg.

Runs after `pants package ::` in CI. Walks dist/*.whl, parses METADATA,
locates the package's setup.cfg, and asserts:

    set(Requires-Dist without `; extra == ...`)
        == set(install_requires from setup.cfg)

Exits non-zero with a per-package diff on mismatch.
"""

import glob
import pathlib
import re
import sys
import zipfile
from collections import defaultdict
from email.parser import Parser

from packaging.requirements import Requirement
from packaging.utils import canonicalize_name


def _norm(req_str: str) -> tuple[str, str]:
    r = Requirement(req_str)
    return canonicalize_name(r.name), str(r.specifier)


def _setup_cfg_install_requires(path: pathlib.Path) -> list[str]:
    reqs, in_options, in_ir = [], False, False
    for raw in path.read_text().splitlines():
        s = raw.strip()
        if s.startswith("[") and s.endswith("]"):
            in_options = s == "[options]"
            in_ir = False
            continue
        if not in_options:
            continue
        if s.startswith("install_requires"):
            in_ir = True
            if "=" in raw:
                v = raw.split("=", 1)[1].strip()
                if v:
                    reqs.append(v)
            continue
        if in_ir:
            if raw and not raw[0].isspace():
                in_ir = False
                continue
            if s and not s.startswith("#"):
                reqs.append(s)
    return reqs


def _wheel_runtime_requires(whl: pathlib.Path) -> list[str]:
    with zipfile.ZipFile(whl) as zf:
        meta = next(n for n in zf.namelist()
                    if n.endswith(".dist-info/METADATA"))
        msg = Parser().parsestr(zf.read(meta).decode())
    out = []
    for rd in msg.get_all("Requires-Dist") or []:
        if ";" in rd and "extra" in rd.split(";", 1)[1]:
            continue
        out.append(rd.split(";", 1)[0].strip())
    return out


def _pkg_name_from_wheel(whl: pathlib.Path) -> str:
    return whl.name.split("-")[0]


def _setup_cfg_for(dist_name: str) -> pathlib.Path | None:
    for cfg in pathlib...

*This pull request was created from Copilot chat.*

netlify Bot commented May 14, 2026

Deploy Preview for nifty-bassi-e26446 ready!

🔨 Latest commit: 612cbb2
🔍 Latest deploy log: https://app.netlify.com/projects/nifty-bassi-e26446/deploys/6a06386977c58b00084298a8
😎 Deploy Preview: https://deploy-preview-4528--nifty-bassi-e26446.netlify.app

Copilot AI changed the title from "[WIP] Fix wheel-METADATA leak in aio.core package" to "Gate Python package artifacts on wheel METADATA vs setup.cfg dependency parity" on May 14, 2026
Copilot finished work on behalf of phlax May 14, 2026 21:04
Copilot AI requested a review from phlax May 14, 2026 21:04
phlax closed this on May 15, 2026
2 participants