diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index 78c34156c01..f32ca251fc3 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -616,6 +616,7 @@ peps/pep-0735.rst @brettcannon peps/pep-0736.rst @gvanrossum @Rosuav peps/pep-0737.rst @vstinner peps/pep-0738.rst @encukou +peps/pep-0740.rst @dstufft # ... # peps/pep-0754.rst # ... diff --git a/peps/pep-0740.rst b/peps/pep-0740.rst new file mode 100644 index 00000000000..edb17b2df47 --- /dev/null +++ b/peps/pep-0740.rst @@ -0,0 +1,370 @@ +PEP: 740 +Title: Index support for digital attestations +Author: William Woodruff , + Facundo Tuesca +Sponsor: Donald Stufft +PEP-Delegate: Donald Stufft +Discussions-To: https://discuss.python.org/t/pre-pep-exposing-trusted-publisher-provenance-on-pypi/42337 +Status: Draft +Type: Informational +Topic: Packaging +Created: 08-Jan-2024 + +Abstract +======== + +This PEP proposes a collection of changes related to the upload and distribution +of digitally signed attestations and metadata used to verify them on a Python +package repository, such as PyPI. + +These changes have two subcomponents: + +* Changes to the currently unstandardized PyPI upload API, allowing clients + to upload digital attestations; +* Changes to the :pep:`503` and :pep:`691` "simple" APIs, allowing clients + to retrieve both digital attestations and + `Trusted Publishing `_ metadata + for individual release files. + +This PEP does not recommend a specific digital attestation format, nor does +it make a policy recommendation around mandatory digital attestations on +release uploads or their subsequent verification by installing clients like +``pip``. + +Rationale +========= + +Desire for digital signatures on Python packages has been repeatedly +expressed by both package maintainers and downstream users: + +* Maintainers wish to demonstrate the integrity and authenticity of their + package uploads; +* Individual downstream users wish to verify package integrity and authenticity + without placing additional trust in their index's honesty; +* "Bulk" downstream users (such as Operating System distributions) wish to + perform similar verifications and potentially re-expose or countersign + for their own downstream packaging ecosystems. + +This proposal seeks to accommodate each of the above use cases. + +While this PEP does not recommend a specific digital attestation format, +it does recognize the utility of Trusted Publishing as a pre-existing, +"zero-configuration" source of strong provenance for Python packages. +Consequently this PEP includes a proposed scheme for exposing each release +file's Trusted Publisher metadata, with the expectation that a future digital +attestation format will likely make use of it. + +Design Considerations +--------------------- + +This PEP identifies the following design considerations when evaluating +both its own proposed changes and previous work in the same or adjacent +areas of Python packaging: + +1. Index accessibility: digital attestations for Python packages + are ideally retrievable directly from the index itself, as "detached" + resources. + + This both simplifies some compatibility concerns (by avoiding + the need to modify the distribution formats themselves) and also simplifies + the behavior of potential installing clients (by allowing them to + retrieve each attestation before its corresponding package without needing + to do streaming decompression). + +2. Verification by the index itself: in addition to enabling verification + by installing clients, each digital attestation is *ideally* verifiable + in some form by the index itself. + + This both increases the overall quality + of attestations uploaded to the index (preventing, for example, users + from accidentally uploading incorrect or invalid attestations) and also + enables UI and UX refinements on the index itself (such as a "provenance" + view for each uploaded package). + +3. General applicability: digital attestations should be applicable to + *any and every* package uploaded to the index, regardless of its format + (sdist or wheel) or interior contents. + +4. Metadata support: this PEP refers to "digital attestations" rather than + just "digital signatures" to emphasize the ideal presence of additional + metadata within the cryptographic envelope. + + For example, to prevent domain separation between a distribution's name and + its contents, the digital attestation could be performed over + ``HASH(name || HASH(contents))`` rather than just ``HASH(contents)``. + +5. Consistent release attestations: if a file belonging to a release has a + set of digital attestations, then all of the other files belonging to that + release should also have the same types of attestations. + + This simplifies the downstream use story for digital attestations, and + prevents potentially vulnerable "swiss cheese" release patterns (where + a verifier checks for a valid attestation on ``HolyGrail-1.0.tar.gz`` + but their installing client actually resolves an attacker-controlled, + platform-specific ``.whl`` instead). + + +Previous Work +------------- + +PGP signatures +^^^^^^^^^^^^^^ + +PyPI and other indices have historically supported PGP signatures on uploaded +distributions. These could be supplied during upload, and could be retrieved +by installing clients via the ``data-gpg-sig`` attribute in the :pep:`503` +API, the ``gpg-sig`` key on the :pep:`691` API, or via an adjacent +``.asc``-suffixed URL. + +PGP signature uploads have been disabled on PyPI since +`May 2023 `_, after +`an investigation `_ +determined that the majority of signatures (which, themselves, constituted a +tiny percentage of overall uploads) could not be associated with a public key or +otherwise meaningfully verified. + +In their previously supported form on PyPI, PGP signatures satisfied +considerations (1) and (3) above but not (2) (owing to the need for external +keyservers and key distribution) or (4) (due to PGP signatures typically being +constructed over just an input file, without any associated signed metadata). +Similarly, PyPI's historical implementation of PGP did not satisfy consideration +(5), due to a lack of consistency checks between different release files +(and an inability to perform those checks due to no access to the signer's +public key). + +Wheel signatures +^^^^^^^^^^^^^^^^ + +:pep:`427` (and its :ref:`living PyPA counterpart `) +specify the :term:`wheel format `. + +This format includes accommodations for digital signatures embedded directly +into the wheel, in either JWS or S/MIME format. These signatures are specified +over a :pep:`376` RECORD, which is modified to include a cryptographic digest +for each recorded file in the wheel. + +While wheel signatures are fully specified, they do not appear to be broadly +used; the official `wheel tooling `_ deprecated +signature generation and verification support +`in 0.32.0 `_, which was +released in 2018. + +Additionally, wheel signatures do not satisfy any of +the above considerations (due to the "attached" nature of the signatures, +non-verifiability on the index itself, and support for wheels only). + +Specification +============= + +.. _upload-endpoint: + +Upload endpoint changes +----------------------- + +The current upload API is not standardized. However, we propose the following +changes to it: + +* In addition to the current top-level ``content`` and ``gpg_signature`` fields, + the index **SHALL** accept ``attestations`` as an additional multipart form + field. +* The new ``attestations`` field **SHALL** be a JSON object. +* The JSON object **SHALL** have one or more keys, each identifying an + attestation format known to the index. If any key does not identify an + attestation format known to the index, the index **MUST** reject the upload. +* The value associated with each well-known key **SHALL** be a JSON object. +* Each attestation value **MUST** be verifiable by the index. If the index fails + to verify any attestation in ``attestations``, it **MUST** reject the upload. + +In addition to the above, the index **SHALL** enforce a consistency +policy for release attestations via the following: + +* If the first file under a new release is supplied with ``attestations``, + then all subsequently uploaded files under the same release **MUST** also + have ``attestations``. Conversely, if the first file under a new release + does not have any ``attestations``, then all subsequent uploads under the + same release **MUST NOT** have ``attestations``. +* All files under the same release **MUST** have the same set of well-known + attestation format keys. + +The index **MUST** reject any file upload that does not satisfy these +consistency properties. + +Index changes +------------- + +.. _provenance-object: + +Provenance objects +^^^^^^^^^^^^^^^^^^ + +The index will serve uploaded attestations along with metadata that can assist +in verifying them in the form of JSON serialized objects. + +These "provenance objects" will be available via both the :pep:`503` Simple Index +and :pep:`691` JSON-based Simple API as described below, and will have the +following structure: + +.. code-block:: json + + { + "publisher": { + "type": "important-ci-service", + "claims": {}, + "vendor-property": "foo", + "another-property": 123 + }, + "attestations": { + "some-attestation": {/* ... */}, + "another-attestation": {/* ... */} + } + } + +* ``publisher`` is an **optional** JSON object, containing a + representation of the file's Trusted Publisher configuration at the time + the file was uploaded to the package index. The keys within the ``publisher`` + object are specific to each Trusted Publisher but include, at minimum: + + * A ``type`` key, which **MUST** be a JSON string that uniquely identifies the + kind of Trusted Publisher. + * A ``claims`` key, which **MUST** be a JSON object containing any context-specific + claims retained by the index during Trusted Publisher authentication. + + All other keys in the ``publisher`` object are publisher-specific. A full + illustrative example of a ``publisher`` object is provided in :ref:`appendix-2`. +* ``attestations`` is a **required** JSON object, containing one or + more attestation objects as identified by their keys. This object is + a superset of ``attestations`` object supplied by the uploader through the + ``attestations`` field, as described in :ref:`upload-endpoint`. + + Because ``attestations`` is a superset of the file's original uploaded attestations, + the index **MAY** chose to embed additional attestations of its own. + +Simple Index +^^^^^^^^^^^^ + +* When an uploaded file has one or more attestations, the index **MAY** include a + ``data-provenance`` attribute on its file link, with a value of either + ``true`` or ``false``. +* When ``data-provenance`` is ``true``, the index **MUST** serve a + :ref:`provenance object ` at the same URL, but with + ``.provenance`` appended to it. For example, if ``HolyGrail-1.0.tar.gz`` + exists and has associated attestations, those attestations would be located + within the provenance object hosted at ``HolyGrail-1.0.tar.gz.provenance``. + +JSON-based Simple API +^^^^^^^^^^^^^^^^^^^^^ + +* When an uploaded file has one or more attestations, the index **MAY** include a + ``provenance`` object in the ``file`` dictionary for that file. +* ``provenance``, when present, **MUST** be a :ref:`provenance object `. + +Security Implications +===================== + +This PEP is "mechanical" in nature; it provides only the plumbing for future +digital attestations on package indices, without specifying their concrete +cryptographic details. + +As such, we do not identify any positive or negative security implications +for this PEP. + +Index trust +----------- + +This PEP does **not** increase (or decrease) trust in the index itself: +the index is still effectively trusted to honestly deliver unmodified package +distributions, since a dishonest index capable of modifying package +contents could also dishonestly modify or omit package attestations. +As a result, this PEP's presumption of index trust is equivalent to the +unstated presumption with earlier mechanisms, like PGP and Wheel signatures. + +This PEP does not preclude or exclude future index trust mechanisms, such +as :pep:`458` and/or :pep:`480`. + +Recommendations +=============== + +This PEP does not recommend specific attestation formats. It does, +however, make the following recommendations to package indices seeking +to create new or implement pre-existing attestation formats: + +1. Consult the :ref:`living PyPA specifications ` + first, and determine if any currently defined attestation formats suit + your purpose. +2. If no suitable attestation format is defined under the PyPA specifications, + consider submitting it to the PyPA specifications for longevity and reuse + purposes. + +When designing a new attestation format, we make the following recommendations: + +1. Pick a short, but unique name for your attestation format; this name will + serve as the attestation's identifier in the upload and index APIs. + + When appropriate for an attestation format, we recommend using ``:`` as a + domain separator. For example, an attestation format that provides publish + provenance using `Sigstore `_ might have the + name ``sigstore:publish``. +2. Prefer parsimony in your format: avoid optional fields and functionality, + avoid unnecessary cryptographic agility and message malleability, and ensure + that verifying the attestation communicates something meaningful beyond a + basic integrity check (since the index itself already supplies cryptographic + digests for this purpose). + +.. _appendix-1: + +Appendix 1: Example Uploaded Attestations +========================================= + +This appendix provides a fictional example of the ``attestations`` field +submitted on file upload, with two fictional attestations (``publish`` and +``timestamp``): + +.. code-block:: json + + { + "publish": { + "mediaType": "application/vnd.dev.sigstore.bundle+json;version=0.2", + "verificationMaterial": { /* omitted for brevity */ }, + "messageSignature": { + "messageDigest": { + "algorithm": "some-hash-algo", + "digest": "digest-here" + }, + "signature": "signature-here" + } + }, + "timestamp": { + "cms": "some-long-blob-here" + } + } + +The payloads of these fictional attestations are purely illustrative. + +.. _appendix-2: + +Appendix 2: Example Trusted Publisher Representation +==================================================== + +This appendix provides a fictional example of a ``publisher`` key within +a :pep:`691` ``project.files[].provenance`` listing: + +.. code-block:: json + + "publisher": { + "type": "GitHub", + "claims": { + "ref": "refs/tags/v1.0.0", + "sha": "da39a3ee5e6b4b0d3255bfef95601890afd80709" + }, + "repository_name": "HolyGrail", + "repository_owner": "octocat", + "repository_owner_id": "1", + "workflow_filename": "publish.yml", + "environment": null + } + +Copyright +========= + +This document is placed in the public domain or under the +CC0-1.0-Universal license, whichever is more permissive.