Merged
125 changes: 70 additions & 55 deletions pep-0480.txt
@@ -1,5 +1,5 @@
PEP: 480
Title: Surviving a Compromise of PyPI: The Maximum Security Model
Title: Surviving a Compromise of PyPI: End-to-end signing of packages
Version: $Revision$
Last-Modified: $Date$
Author: Trishank Karthik Kuppusamy <karthik@trishank.com>,
@@ -23,13 +23,23 @@ developers to sign for the distributions that are downloaded by clients. The
minimum security model proposed by PEP 458 supports continuous delivery of
distributions (because they are signed by online keys), but that model does not
protect distributions in the event that PyPI is compromised. In the minimum
security model, attackers may sign for malicious distributions by compromising
the signing keys stored on PyPI infrastructure. The maximum security model,
security model, attackers who have compromised the signing keys stored on PyPI
infrastructure may sign for malicious distributions. The maximum security model,
described in this PEP, retains the benefits of PEP 458 (e.g., immediate
availability of distributions that are uploaded to PyPI), but additionally
ensures that end-users are not at risk of installing forged software if PyPI is
compromised.

This PEP requires some changes to the PyPI infrastructure and suggests some
changes for developers who wish to participate in end-to-end signing. These
changes include updating the metadata layout from PEP 458 to include delegations
to developer keys, adding a process to register developer keys with PyPI, and
changing the upload workflow for developers who take advantage of end-to-end
signing. All of these changes are described in detail later in this PEP. Package
managers that wish to take advantage of end-to-end signing do not need to do any
Review comment (Contributor):

With this all in one paragraph, it's easy to miss that the last sentence is about clients rather than the backend infrastructure. I'd suggest making it a separate paragraph and emphasising the *not*.

additional work beyond what is required to consume metadata described in PEP
458.

This PEP discusses the changes made to PEP 458 but excludes its informational
elements to primarily focus on the maximum security model. For example, an
overview of The Update Framework or the basic mechanisms in PEP 458 are not
@@ -67,6 +77,8 @@ software updaters [5]_ [7]_, such as mix-and-match and extraneous dependencies
attacks, it can be improved to support end-to-end signing and to prohibit
forged distributions in the event that PyPI is compromised.

PEP 480 builds on PEP 458 by adding support for developer signing, and
reducing the reliance on online keys to prevent malicious distributions.
The main strength of PEP 458 and the minimum security model is the automated
and simplified release process: developers may upload distributions and then
have PyPI sign for their distributions. Much of the release process is handled
@@ -115,25 +127,28 @@ encouraged to read about TUF's design principles [2]_. It is also RECOMMENDED
that the reader be familiar with the TUF specification [3]_, and PEP 458 [1]_
(which this PEP is extending).

Terms used in this PEP are defined as follows:
The following terms used in this PEP are defined in the Python Packaging
Glossary [4]_: *project*, *release*, *distribution*.

* Projects: Projects are software components that are made available for
integration. Projects include Python libraries, frameworks, scripts,
plugins, applications, collections of data or other resources, and various
combinations thereof. Public Python projects are typically registered on the
Python Package Index [4]_.
Terms used in this PEP are defined as follows:

* Releases: Releases are uniquely identified snapshots of a project [4]_.
* Distribution file: A versioned archive file that contains Python packages,
modules, and other resource files that are used to distribute a release. The
terms *distribution file*, *distribution package* [4]_, or simply
*distribution* or *package* may be used interchangeably in this PEP.

* Distributions: Distributions are the packaged files that are used to publish
and distribute a release.
* Simple index: The HTML page that contains internal links to distribution
files.

* Simple index: The HTML page that contains internal links to the
distributions of a project [4]_.
* Target files: As a rule of thumb, target files are all files on PyPI whose
integrity should be guaranteed with TUF. Typically, this includes
distribution files, and PyPI metadata such as simple indices.

* Roles: There is one *root* role in PyPI. There are multiple roles whose
responsibilities are delegated to them directly or indirectly by the *root*
role. The term "top-level role" refers to the *root* role and any role
* Roles: Roles in TUF encompass the set of actions a party is authorized to
perform, including what metadata they may sign and which packages they are
responsible for. There is one *root* role in PyPI. There are multiple roles
whose responsibilities are delegated to them directly or indirectly by the
*root* role. The term "top-level role" refers to the *root* role and any role
delegated by the *root* role. Each role has a single metadata file that it is
trusted to provide.

@@ -143,14 +158,10 @@ Terms used in this PEP are defined as follows:
* Repository: A repository is a resource comprised of named metadata and target
files. Clients request metadata and target files stored on a repository.

* Consistent snapshot: A set of TUF metadata and PyPI targets that capture the
* Consistent snapshot: A set of TUF metadata and target files that capture the
complete state of all projects on PyPI as they existed at some fixed point in
time.

* The *snapshot* (*release*) role: In order to prevent confusion due to the
different meanings of the term "release" used in PEP 426 [1]_ and the TUF
specification [3]_, the *release* role is renamed to the *snapshot* role.

* Developer: Either the owner or maintainer of a project who is allowed to
update TUF metadata, as well as distribution metadata and files for a given
project.
@@ -176,7 +187,8 @@ Maximum Security Model
======================

The maximum security model permits developers to sign their projects and to
upload signed metadata to PyPI. If the PyPI infrastructure were compromised,
upload signed metadata to PyPI. In the model proposed in this PEP, if the PyPI
infrastructure were compromised,
attackers would be unable to serve malicious versions of a *claimed* project
without having access to that project's developer key. Figure 1 depicts the
changes made to the metadata layout of the minimum security model, namely that
@@ -412,7 +424,7 @@ Snapshot Process
The snapshot process is fairly simple and SHOULD be automated. The snapshot
process MUST keep in memory the latest working set of *root*, *targets*, and
delegated roles. Every minute or so the snapshot process will sign for this
latest working set. (Recall that project transaction processes continuously
latest working set. (Recall that project uploads continuously
inform the snapshot process about the latest delegated metadata in a
concurrency-safe manner. The snapshot process will actually sign for a copy of
the latest working set while the latest working set in memory will be updated
@@ -430,7 +442,7 @@ two directories, /metadata/ (containing delegated targets metadata files) and
/targets/ (containing targets such as the project simple index and
distributions that are signed by the delegated targets metadata).

Whenever the project uploads metadata or targets to PyPI, PyPI SHOULD check the
Whenever the project uploads metadata or target files to PyPI, PyPI SHOULD check the
project TUF metadata for at least the following properties:

* A threshold number of the developers' keys registered with PyPI by that
@@ -446,13 +458,13 @@ project TUF metadata for at least the following properties:
delegator.

If PyPI chooses to check the project TUF metadata, then PyPI MAY choose to
reject publishing any set of metadata or targets that do not meet these
reject publishing any set of metadata or target files that do not meet these
requirements.
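
Note: the property list in this hunk is partly elided, so the following is only a minimal sketch of the first visible property (a threshold number of registered developer keys having signed the metadata). The function name, the ``keyid``/``sig`` field names, and the ``verify`` callable are assumptions made for illustration; they are not taken from the PEP or from any TUF library.

```python
from typing import Callable, Iterable, Mapping, Set


def meets_threshold(
    signatures: Iterable[Mapping[str, str]],
    registered_keyids: Set[str],
    threshold: int,
    verify: Callable[[str, str], bool],
) -> bool:
    """Return True if enough distinct registered keys produced valid signatures.

    `verify(keyid, sig)` is a hypothetical callable that checks one signature
    over the project metadata; real verification depends on the key type.
    """
    valid_keyids = set()
    for entry in signatures:
        keyid = entry["keyid"]
        # Ignore signatures from keys the project never registered.
        if keyid not in registered_keyids:
            continue
        if verify(keyid, entry["sig"]):
            # A set ensures that one key signing twice counts only once.
            valid_keyids.add(keyid)
    return len(valid_keyids) >= threshold
```

Counting distinct key IDs rather than signatures matters here: otherwise a single compromised key could satisfy a threshold greater than one by signing repeatedly.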

PyPI MUST enforce access control by ensuring that each project can only write
to the TUF metadata for which it is responsible. It MUST do so by ensuring that
project transaction processes write to the correct metadata as well as correct
locations within those metadata. For example, a project transaction process for
project upload processes write to the correct metadata as well as correct
locations within those metadata. For example, a project upload process for
an unclaimed project MUST write to the correct target paths in the correct
delegated unclaimed metadata for the targets of the project.

@@ -464,7 +476,10 @@ invalidate the signatures of the metadata as signed by developer keys.
Instead, package managers SHOULD be written to recognize and handle multiple
incompatible versions of TUF metadata so that claimed and recently-claimed
projects could be offered a reasonable time to migrate their metadata to newer
but backward-incompatible formats.
but backward-incompatible formats. One mechanism for handling this version
change is described in TAP 14__.

__ https://github.com/theupdateframework/taps/blob/master/tap14.md

If PyPI eventually runs out of disk space to produce a new consistent snapshot,
then PyPI MAY use something like a "mark-and-sweep" algorithm to delete
@@ -505,68 +520,68 @@ section. The focus of this section is on how PyPI will respond to a project
transaction.

Every metadata and target file MUST include in its filename the `hex digest`__
of its `SHA-256`__ hash, which PyPI may prepend to filenames after the files
of its `BLAKE2b-256`__ hash, which PyPI may prepend to filenames after the files
have been uploaded. For this PEP, it is RECOMMENDED that PyPI adopt a simple
convention of the form: *digest.filename*, where filename is the original
filename without a copy of the hash, and digest is the hex digest of the hash.

__ http://docs.python.org/2/library/hashlib.html#hashlib.hash.hexdigest
__ https://en.wikipedia.org/wiki/SHA-2
__ https://en.wikipedia.org/wiki/BLAKE_(hash_function)#BLAKE2
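
Note: the *digest.filename* convention can be illustrated with Python's standard ``hashlib`` module. This is a minimal sketch that follows the added (BLAKE2b-256) side of this diff; the function name and the sample filename are hypothetical, and it assumes "BLAKE2b-256" means BLAKE2b with a 32-byte digest.

```python
import hashlib


def digest_prefixed_name(filename: str, content: bytes) -> str:
    """Return '<hex digest>.<filename>' for the given file content.

    Assumes BLAKE2b with a 32-byte (256-bit) digest.
    """
    digest = hashlib.blake2b(content, digest_size=32).hexdigest()
    return f"{digest}.{filename}"


# Hypothetical usage: name a distribution file after it has been uploaded.
name = digest_prefixed_name("example-1.0.tar.gz", b"file contents")
```

Because the prefix is derived from the content alone, two uploads with identical bytes map to the same name, and any change to the bytes yields a different name.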

When an unclaimed project uploads a new transaction, a project transaction
process MUST add all new targets and relevant delegated unclaimed metadata.
The project transaction process MUST inform the snapshot process about new
process MUST add all new target files and relevant delegated unclaimed metadata.
The project upload process MUST inform the snapshot process about new
delegated unclaimed metadata.

When a *recently-claimed* project uploads a new transaction, a project
transaction process MUST add all new targets and delegated targets metadata for
the project. If the project is new, then the project transaction process MUST
upload process MUST add all new target files and delegated targets metadata for
the project. If the project is new, then the project upload process MUST
also add new *recently-claimed* metadata with the public keys (which MUST be
part of the transaction) for the project. *recently-claimed* projects have a
threshold value of "1" set by the transaction process. Finally, the project
transaction process MUST inform the snapshot process about new
threshold value of "1" set by the upload process. Finally, the project
upload process MUST inform the snapshot process about new
*recently-claimed* metadata, as well as the current set of delegated targets
metadata for the project.

The transaction process for a claimed project is slightly different in that
The upload process for a claimed project is slightly different in that
PyPI administrators periodically move (a manual process that MAY occur every
two weeks to a month) projects from the *recently-claimed* role to the
*claimed* role. (Moving a project from *recently-claimed* to *claimed* is a
manual process because PyPI administrators have to use an offline key to sign
the claimed project's distribution.) A project transaction process MUST then
the claimed project's distribution.) A project upload process MUST then
add new *recently-claimed* and *claimed* metadata to reflect this migration. As
is the case for a *recently-claimed* project, the project transaction process
MUST always add all new targets and delegated targets metadata for the claimed
project. Finally, the project transaction process MUST inform the consistent
is the case for a *recently-claimed* project, the project upload process
MUST always add all new target files and delegated targets metadata for the claimed
project. Finally, the project upload process MUST inform the consistent
snapshot process about new *recently-claimed* or *claimed* metadata, as well as
the current set of delegated targets metadata for the project.

Project transaction processes SHOULD be automated, except when PyPI
Project upload processes SHOULD be automated, except when PyPI
administrators move a project from the *recently-claimed* role to the *claimed*
role. Project transaction processes MUST also be applied atomically: either all
metadata and targets -- or none of them -- are added. The project transaction
role. Project upload processes MUST also be applied atomically: either all
metadata and target files -- or none of them -- are added. The project upload
processes and snapshot process SHOULD work concurrently. Finally, project
transaction processes SHOULD keep in memory the latest *claimed*,
upload processes SHOULD keep in memory the latest *claimed*,
*recently-claimed*, and *unclaimed* metadata so that they will be correctly
updated in new consistent snapshots.

The queue MAY be processed concurrently in order of appearance, provided that
the following rules are observed:

1. No pair of project transaction processes may concurrently work on the same
1. No pair of project upload processes may concurrently work on the same
project.

2. No pair of project transaction processes may concurrently work on
2. No pair of project upload processes may concurrently work on
*unclaimed* projects that belong to the same delegated *unclaimed* role.

3. No pair of project transaction processes may concurrently work on new
3. No pair of project upload processes may concurrently work on new
recently-claimed projects.

4. No pair of project transaction processes may concurrently work on new
4. No pair of project upload processes may concurrently work on new
claimed projects.

5. No project transaction process may work on a new claimed project while
another project transaction process is working on a new recently-claimed
5. No project upload process may work on a new claimed project while
another project upload process is working on a new recently-claimed
project and vice versa.
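
Note: the five rules above can be read as a pairwise conflict test that a queue scheduler applies before running two upload processes at the same time. The following is a minimal sketch under that reading; the ``Upload`` record, its field names, and the ``conflicts`` function are hypothetical and do not come from the PEP.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Upload:
    project: str
    role: str  # "unclaimed", "recently-claimed", or "claimed"
    is_new: bool = False  # True if the project is new to its role
    unclaimed_role: Optional[str] = None  # delegated unclaimed role, if any


def conflicts(a: Upload, b: Upload) -> bool:
    """Return True if the two uploads must not be processed concurrently."""
    # Rule 1: never work on the same project concurrently.
    if a.project == b.project:
        return True
    # Rule 2: unclaimed projects in the same delegated unclaimed role.
    if (
        a.role == b.role == "unclaimed"
        and a.unclaimed_role is not None
        and a.unclaimed_role == b.unclaimed_role
    ):
        return True
    # Rules 3, 4 and 5: two *new* projects in the recently-claimed or
    # claimed roles never overlap, in any combination of the two roles.
    claimed_roles = {"recently-claimed", "claimed"}
    if a.is_new and b.is_new and a.role in claimed_roles and b.role in claimed_roles:
        return True
    return False
```

Rules 3 to 5 collapse into one check because together they forbid every pairing of new recently-claimed and new claimed projects.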

These rules MUST be observed to ensure that metadata is not read from or
@@ -857,12 +872,12 @@ References
==========

.. [1] https://www.python.org/dev/peps/pep-0458/
.. [2] https://isis.poly.edu/~jcappos/papers/samuel_tuf_ccs_2010.pdf
.. [2] https://theupdateframework.io/papers/survivable-key-compromise-ccs2010.pdf
.. [3] https://github.com/theupdateframework/tuf/blob/develop/docs/tuf-spec.txt
.. [4] http://www.python.org/dev/peps/pep-0426/
.. [4] https://packaging.python.org/glossary
.. [5] https://github.com/theupdateframework/pip/wiki/Attacks-on-software-repositories
.. [6] https://mail.python.org/pipermail/distutils-sig/2013-September/022773.html
.. [7] https://isis.poly.edu/~jcappos/papers/cappos_mirror_ccs_08.pdf
.. [7] https://theupdateframework.io/papers/attacks-on-package-managers-ccs2008.pdf
.. [8] https://mail.python.org/pipermail/distutils-sig/2013-September/022755.html
.. [9] https://pypi.python.org/security
.. [10] https://mail.python.org/pipermail/distutils-sig/2013-August/022154.html