New RFC: signing registry index commits #2474

Open
wants to merge 1 commit into base: master

Conversation

@withoutboats
Contributor

withoutboats commented Jun 14, 2018

This RFC proposes signing the commits to the index of a registry, and for cargo to automatically verify these signatures. It additionally includes a proposed system for performing key rotations. It is intended to be a minimal, intermediate step toward improving the security of cargo registries (including crates.io) without completely replacing the existing index system.

Thanks to Henry de Valence, André Arko, Yehuda Katz and Tony Arcieri for providing feedback on this RFC prior to publishing.

Rendered

@withoutboats withoutboats added T-lang T-cargo and removed T-lang labels Jun 14, 2018

2. A keys file exists for that registry.

An attempt to update the `HEAD` of a signed registry to a commit that is not
signed by one of the existing committer keys is a hard failure that will


@Manishearth

Manishearth Jun 14, 2018

Member

modulo key rotation I presume?

(May need to be clearer about "existing" here)

For that reason, crates.io will adopt policy that `can-rotate` keys are stored
in an offline medium, with due security precautions. As a future hardening, we
could also support a threshold signature scheme, requiring signatures from
multiple `can-rotate` keys to perform a key rotation, reducing the impact of


@Manishearth

Manishearth Jun 14, 2018

Member

This can also be handled out of band; you can have one (or potentially more) can-rotate key that is Shamir'd among core team members. While there is a single key, it is only accessible if any X core team members (or other keyholders) come together.

This scheme does have the vulnerability of introducing a step where the keyholders need to combine their Shamir shares and decrypt the signing key; the key can still be leaked at that point and has zero protection after that.
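The k-of-n idea described here can be sketched concretely. Below is a minimal illustration of Shamir secret sharing over a prime field; it is not production code (a real deployment would use a vetted library, a proper CSPRNG, and handle secrets as bytes):

```python
# Minimal Shamir secret-sharing sketch over a prime field.
# Any k of the n shares reconstruct the secret; fewer reveal nothing.
import random

P = 2**127 - 1  # a Mersenne prime, large enough for a 16-byte secret

def split(secret: int, k: int, n: int):
    """Split `secret` into n shares, any k of which reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial
            y = (y * x + c) % P
        shares.append((x, y))
    return shares

def combine(shares):
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret
```

The leak-on-reconstruction problem noted above is visible here: `combine` materializes the full secret in one place, which is exactly the vulnerable step.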

@rvolgers


rvolgers commented Jun 14, 2018

There is one possible caveat to git's integrity guarantees which doesn't apply to Cargo anyway but might still be worth mentioning. At least I think it shows that while git does provide useful security properties, it is a flexible tool that is not necessarily designed to uphold those security properties in all usage scenarios. This might be worth mentioning in case of future custom Cargo deployments and other new developments.

Apparently git over the file:// or rsync:// transports also copies the index files from the remote repository (and doesn't verify them), which means the integrity guarantees are weaker.

Not sure if this is still the case. I wasn't able to find any updated information quickly and wanted to leave this note here in case I forget.

@est31

Contributor

est31 commented Jun 15, 2018

Thanks @withoutboats for writing this beautiful RFC!

I like that it strikes a balance between improving security on the one hand and making pragmatic choices on the other, e.g. sticking to SHA-1 until our stack of used technology supports SHA-2.
There has recently been an update on progress on that issue on the git mailing list: link.

I also agree that a switch to TUF would indeed be a bad idea. To me, there seems to be little benefit in knowing that some particular author has authored a crates.io package vs knowing that the author logged in to crates.io successfully. TUF also dramatically increases the risks of individual key loss.

As an open question, I wonder how source replacement should be handled after the change. When working on Rust projects offline, for example, I need to edit the index to point to my localhost server for storage. It would be cool to still allow such local edits without requiring signatures, and without me needing to change the key set of the local registry or perform other possibly complicated operations. A flag that can be appended to sources, like `verify-signatures = false`, would solve these issues.
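For illustration, that suggestion might look like this in a `.cargo/config` source replacement table. The `[source]` replacement syntax is cargo's existing mechanism; the `verify-signatures` key is the hypothetical flag from this comment, and the mirror URL is made up:

```toml
[source.crates-io]
replace-with = "local-mirror"

[source.local-mirror]
registry = "http://localhost:8080/index"
# Hypothetical flag proposed in this comment; not an implemented cargo option.
verify-signatures = false
```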

@burdges


burdges commented Jun 16, 2018

Just fyi, there is a Rust PGP implementation project: https://sequoia-pgp.org/

I support this direction overall of course. In fact, PGP/GPG brings an enormous if painful array of useful tooling, ala smart cards, which sounds useful but might not work with these keys in .cargo.

We've moved past the pain of PGP with modern messaging tools like signal, etc. which largely replace careful key management with strong automatic forward secrecy measures. I'd propose an automatic forward security solution for repositories together with a slightly simpler PGP integration than proposed here:

We could give all repositories with local commits an automatically generated private Ed25519 key in the .cargo directory, with the public key placed into the repository itself. We would then rotate this signing key with every commit (or publish, in this case), with the old key signing the next whole commit, including the new public key added to the repository.

These automatic keys may not restrict crates.io uploads of course, but they would provide useful forensic information, even for crates that never utilize PGP. Adversaries who compromise a developer's machine create a break in this chain of keys if they rewrite any commit history predating what is currently published. Also, we achieve cheap PGP integration with ordinary PGP-signed repository commits. In fact, crates can frequently benefit from PGP retroactively, assuming the developer never broke the association between the repository and the key in .cargo. We'd need a scheme to restrict crates.io index uploads to PGP-signed commits.

This automated scheme leaks more metadata about developers' workflow: any break in the signature chain indicates that repositories were cloned or copied between machines. If two signing chains remain active, it indicates the developer works from two different machines. Or worse, the absence of a break may indicate they use synchronization tools that themselves create an attack vector, a la Dropbox, etc.

tl;dr This RFC proposes a seemingly non-standard PGP integration that supports forward security properties, ala can-rotate. Instead, we should provide those forward security properties with a separate automatic always-on signing scheme, that provides forward security by rotating keys with every commit, while using more standard git PGP integration for access control to crates.io and long term key management.

@ishitatsuyuki

Member

ishitatsuyuki commented Jun 16, 2018

The motivation for signing the repository is not clear; it's no more than one more layer of access control. Doing MITM on HTTPS is only common for corporate proxies, and in that case 1) there doesn't seem to be a threat model, and 2) if tampering is required, it's likely that the organisation will force an alternative key to be trusted.

We're also different from TUF, in the sense that we're not distributing updates, just metadata. The point is that a registry is full of user uploaded content, in contrast to a package repository where only trusted users package the distributions.

Finally, putting ultimate trust in the centralised registry doesn't make sense, for the same reason as above. Blindly signing someone else's content is basically a false sense of security. What we need instead is user-signed crate tarballs, and maybe some "officially approved" keys (or signing chains) that help users identify genuine crates.

@withoutboats

Contributor

withoutboats commented Jun 16, 2018

@est31

As an open question, I wonder how source replacement should be handled after the change.

The supported source replacement mechanisms aren't impacted by this change; a registry replacement can be signed just like any registry can, whereas a local-registry or directory replacement have no additional security properties under this RFC.

What you're doing though - having users edit their own index - would stop working. I'm surprised it works today; I guess cargo performs a merge for you entirely automatically when it updates the index again? All around, I think your system would ideally have a solution that doesn't require editing the index. I'm not sure what that would look like, but it seems like a fruitful discussion to have on another thread.


@burdges

I didn't understand your comment. I think the miscommunication is around what git repositories this RFC is talking about. The index information for the crates.io repository, containing all of the information about crates.io packages and used in version resolution for those packages, is distributed as a git repository. End users are never intended to commit to this repository, they only fetch it.


@ishitatsuyuki

When you say that "doing MITM on HTTPS is only common for corporate proxies" I think you misunderstand the motivation of this RFC. You seem to be referring to the sort of aboveboard interception of HTTPS data, using a certificate that users have been required to trust, that is common in certain settings. But this RFC is talking about malicious attacks on the transport protocol used to obtain the index updates in general; a MITM attack in which an attacker obtains a false certificate for github.com and then secretly intercepts traffic between the user and GitHub would be an example of this sort of attack.

In other words, this RFC is about hardening us in the event that an attacker somehow successfully breaches the security of TLS or SSH. This RFC provides a second layer of authentication between cargo and a registry service that isn't dependent on the transport layer.

@mark-i-m

Contributor

mark-i-m commented Jun 16, 2018

@withoutboats Thanks! This is a great step.

Is the "security considerations" section intended to be a threat model? Could we possibly make it more explicit and label it "threat model"?

Also, I am assuming that we are choosing not to handle a few things in this first pass: namely, corruption of the .cargo/pubkeys file (or whatever it is called). For example, if I can somehow cause that file to be corrupted, then the user just has to start over, and they're stuck with the "trust the first download" thing. Alternately, if I can run a build.rs that adds my key to your ring...

If we're not dealing with this now, that's fine. I still think this is an improvement, but I think those caveats should be more explicit.

(Also, one potential mitigation is to run build.rs as the "nobody" user and give .cargo/pubkeys a mode like the .ssh dir)

@withoutboats

Contributor

withoutboats commented Jun 16, 2018

@mark-i-m Yea, this RFC is not intended to protect against an attacker with unfettered access to a user's home directory and the ability to remotely execute code.

@burdges


burdges commented Jun 17, 2018

I described a scheme that applies to anything remotely resembling a repository, but I could surely have described it more clearly.

In this RFC, you propose signing crate publications, but suggest doing this by managing per-crate PGP keys outside the usual PGP ecosystem, right? You should do roughly this, except (a) using a less complex signature scheme that can be fully automated and rotates keys with every publication, and (b) authenticating these automatically generated chains of signing keys by signing one link using PGP.

@ishitatsuyuki

Member

ishitatsuyuki commented Jun 17, 2018

But this RFC is talking about malicious attacks on the transport protocol used to obtain the index updates in general, and a MITM attack in which an attacker obtains a false certificate for github.com and then secretly intercepts traffic between the user and GitHub would be an example of this sort of attack.

If GitHub is compromised, we have other things to worry about. For example, you can steal an arbitrary user's session to gain access to crates.io. Also, to prevent the attack in your example, HPKP should be used for HTTPS; SSH already trusts only one key per host.

Also, I don't think that we're going to use Git indefinitely; downloading the full index doesn't scale, and what I propose is in rust-lang/cargo#5597. tldr, with that approach a per-crate/user signing scheme makes much more sense.

@est31

Contributor

est31 commented Jun 17, 2018

HPKP should be used for HTTPS

Github doesn't seem to use HPKP.

@titanous


titanous commented Jun 17, 2018

Github doesn't seem to use HPKP.

HPKP is deprecated: https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/he9tr7p3rZ8

@withoutboats

Contributor

withoutboats commented Jun 17, 2018

In this RFC, you propose signing crate publications, but suggest doing this by managing a per-crate PGP keys outside the usual PGP ecosystem, right?

This is not what the RFC proposes. This is about signing index commits to the registry index, which cargo uses to resolve dependencies, and which is stored as a git repository. It has nothing to do with per-crate PGP keys (I think you're confusing it with an earlier unrelated proposal, for end users signing their package).

@withoutboats

Contributor

withoutboats commented Jun 17, 2018

If GitHub is compromised, we have other things to worry about. For example, you can steal an arbitrary user's session to gain access to crates.io.

You can also create a new github account to "gain access" to crates.io. I assume you mean that you could gain an access token with that user's privileges, which is true, but out of scope of this RFC. If you can impersonate a user's GitHub account, you can impersonate that user with crates.io's current auth model, it's true. But this RFC is about protecting against attempts to impersonate the registry, not any particular user.

@burdges


burdges commented Jun 17, 2018

I see. I skipped the opening sentence, which defines what a registry is. oops ;)

Yes, I believe this all looks good in that case. Go for it! :)

[Insert image saying "SIGN ALL THE REPOS"]

I've already pointed out the sequoia-pgp.org project, which might prove helpful.

Just to be clear, there might still be advantages in rotating keys rapidly here too, but we usually envision forward-security as protecting humans from themselves, and good policies can achieve the same ends for organizations like registries.

We can harden crates.io against attacks at this point by distributing the
current trusted keys with the rustup distribution, allowing any security
mechanisms we put in place for the distribution of Rust and cargo binaries to
also secure this operation.


@nagisa

nagisa Jun 18, 2018

Contributor

Something I’ve seen done in Linux package managers is a simple prompt that asks the user whether they want to add some key to their key store.

This would require users of cargo on e.g. CI (where almost every use is a first use) to appropriately acknowledge that they are not getting any security, or to set up the CI so that the keys are cached/signed correctly between runs.

@ishitatsuyuki

Member

ishitatsuyuki commented Jun 23, 2018

What's your threat model?

First, there are only limited ways to compromise the registry.

  1. Compromising GitHub: very unlikely.
  2. MITM without gaining full access to the client: unlikely.
  3. Breaching a rust-lang admin's token: this can happen, and is the only way to write to crates.io-index.
  4. Compromising a rust-lang admin's account: harder. This allows hijacking the crates.io OAuth as well, allowing arbitrary crate uploads/updates.
  5. Breaching the S3 token: harder, I can think of two ways: 1. hack a dev's machine, 2. leak it from Heroku.
  6. Compromising crates.io: unlikely, but if so it's game over.

Second, if you could compromise the registry, what can you do? The real crate files are stored on S3, which means you need to compromise both the Git index and S3 to do real harm.

@est31

Contributor

est31 commented Jun 23, 2018

@ishitatsuyuki the git index includes the URL to use to download the crate files from. Right now it points to crates.io, but if you can freely modify the index to change checksums of crates, you can also edit it to point to another URL.

You also forgot another threat: breakage of the TLS connection between Github and the local cargo instance. This is more likely than you think, as many shady people have access to SSL root certificates.
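To make the coupling est31 describes concrete: each entry in the crates.io index is a JSON object that includes a sha256 `cksum` for the crate file, and the client rejects a download whose hash doesn't match. A minimal sketch of that check (the entry contents below are made up; the real index has more fields):

```python
# Sketch: a cargo-like client verifies a downloaded crate file against
# the sha256 checksum recorded in the registry index entry. Whoever
# controls the index therefore controls which bytes are accepted.
import hashlib
import json

crate_bytes = b"pretend this is a .crate tarball"

# A made-up index entry in the index's JSON-lines shape.
index_entry = json.dumps({
    "name": "example",
    "vers": "1.0.0",
    "cksum": hashlib.sha256(crate_bytes).hexdigest(),
})

def verify_download(entry_json: str, downloaded: bytes) -> bool:
    entry = json.loads(entry_json)
    return hashlib.sha256(downloaded).hexdigest() == entry["cksum"]
```

This is why tampering with the index alone is enough: an attacker who can rewrite `cksum` (and the download URL) never needs to touch S3.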

@burdges


burdges commented Jun 23, 2018

I'd consider it a "safe assumption" that for different users GitHub's CA is compromised, GitHub itself is compromised, Amazon is compromised, and some rust-lang accounts and servers are compromised.

Among consumer services a la Yahoo, etc., such compromises must be considered a given, post Snowden. About the only argument against this goes "We've seen APTs attack sysadmins, but not developers", which sounds pretty silly. It's also wrong, because the Juniper backdoor being shifted from NSA control to presumably Chinese control most likely occurred through developers being compromised.

@arielb1

Contributor

arielb1 commented Jun 30, 2018

@ishitatsuyuki

TLS is proxied often enough, and often enough by unreliable proxies (consider security appliances of dubious security) that relying on it for verification is non-ideal.


An attempt to update the `HEAD` of a signed registry to a commit that is not
signed by one of the existing committer keys is a hard failure that will
prevent cargo from resolving dependencies in that registry any longer. Until


@arielb1

arielb1 Jun 30, 2018

Contributor

Leaving malicious data in a git repository is dangerous because it might accidentally be read (consider all the apt-get problems with it skipping verification in some cases). Is there a way of preventing unverified contents from hitting the filesystem?


@arielb1

arielb1 Jun 30, 2018

Contributor

Also, are we relying on git for downgrade protection? i.e., does git prevent you from updating a commit to a non-child commit?


@withoutboats

withoutboats Jul 2, 2018

Contributor

Is there a way of preventing unverified contents from hitting the filesystem?

I don't know if libgit2 exposes a way to fetch into memory without saving to the file system. In general, as long as cargo traverses the database from HEAD (after verifying the signature of HEAD for each cargo command), I don't think there's a lot of risk here.

It might be nice to reverse the fetch after verification fails, but we want to keep the repo in a failed state somehow to avoid this being a way of DoSing the index.

does git prevent you from updating a commit to a non-child commit?

We're just pulling, so we do a fetch followed by a merge. We can make sure that the merge is a fast forward. I'm not sure what you mean by downgrade protection; I think this suggests you're thinking we'd want invalid updates to be rejected and for the index to then continue working as normal; this is actually undesirable in my opinion: we want it to fail loudly, so that for example this can't be used as an avenue for keeping a user from receiving valid updates to the index (by always serving them invalid updates, which they experience as having no update).


@arielb1

arielb1 Jul 2, 2018

Contributor

I don't know if libgit2 exposes a way to fetch into memory without saving to the file system.

I don't mean avoiding using the .git directory, just that we shouldn't check out the new branch before we verify its signature - there were too many git vulnerabilities that relied on checking out files with carefully-crafted names/contents.

Downgrade attacks

For kinds of attacks on package manager, see the classic paper at https://isis.poly.edu/~jcappos/papers/cappos_mirror_ccs_08.pdf

Apparently that paper calls a downgrade a "replay attack". By that, I mean sending a version of the crates.io metadata that is older than the version the client has, hoping that the client would go back to an older version. I think git fetch + merge will not accept downgrades, but it needs to be checked and written down.

we want it to fail loudly, so that for example this can't be used as an avenue for keeping a user from receiving valid updates to the index

An attacker that can defeat the transport encryption can always conduct a freeze attack and pretend that no update had happened yet (up to any expiration dates we might have). I don't see an advantage in attacking using invalid data.

However, still, if an invalid update is detected, cargo should report an error rather than silently not performing an update just to keep us sane.


@withoutboats

withoutboats Jul 2, 2018

Contributor

I don't mean avoiding using the .git directory, just that we shouldn't check out the new branch before we verify its signature

Definitely! We should make sure cargo never checks out FETCH_HEAD.

An attacker that can defeat the transport encryption can always conduct a freeze attack and pretend that no update had happened yet (up to any expiration dates we might have).

You're right, and I had forgotten I had removed an expiration mechanism from this RFC.


@est31

est31 Jul 2, 2018

Contributor

Cargo doesn't check out the index any more: rust-lang/cargo#4060. If you have a checkout at ~/.cargo/registry/index/github.com-1ecc6299db9ec823/ then it's because an older version of cargo still did a checkout.

Edit: it's a bit more complicated, sadly. rust-lang/cargo#4026 made cargo not check out the index, but that broke older cargo, so rust-lang/cargo#4060 made it check out the index again. If we want to make cargo not check out the index, we'd have to drop backwards compatibility.


@withoutboats

withoutboats Jul 3, 2018

Contributor

Right, we don't do real checkouts, but it's about managing what reference we use for our local HEAD. Right now we just get refs/remotes/origin/master; with this RFC we'll need to track a local master branch as well as origin/master (without doing a full checkout).


@withoutboats

withoutboats Jul 12, 2018

Contributor

@arielb1 looking at the code, we use the refspec "refs/heads/master:refs/remotes/origin/master"; because this doesn't have a +, we only perform a fast-forward fetch.
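The fast-forward property being discussed can be modeled simply: an update is acceptable only if the current HEAD is an ancestor of the proposed new HEAD, which is what rules out downgrades and rewritten history. A toy sketch (a parent-pointer dict stands in for git's object database; cargo really goes through libgit2):

```python
# Toy model of a fast-forward-only update: accept a new HEAD only when
# the current HEAD appears among its ancestors. A "downgrade" (new HEAD
# is an old commit) or a rewritten history fails this check.
def is_fast_forward(parents: dict, old_head: str, new_head: str) -> bool:
    """`parents` maps each commit id to its parent id (None at the root)."""
    cur = new_head
    while cur is not None:
        if cur == old_head:
            return True
        cur = parents.get(cur)
    return False
```

Note this addresses rollback to an older commit, not the freeze attack mentioned above, where an attacker simply withholds updates.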

signed (a 'signed registry'), the signature of HEAD is verified against the
public keys which are permitted to sign commits to the index.

## The keys TOML


@tarcieri

tarcieri Jul 20, 2018

This section (and by extension the key rotation section) is the only part of this proposal that's incompatible with TUF, because TUF defines a metadata format for the exact same thing.

TUF typically uses JSON (and Canonical JSON) for metadata, however I don't see any reason why it couldn't use TOML instead.

I think you could have a "minimum viable TUF" implementation that just uses a few more files and a slightly different structure for the data which otherwise would work the exact same way, and wouldn't (at least initially) be dependent on anything more heavy handed than just parsing the TOML files and verifying signatures the exact way you otherwise would in this proposal.


@justincormack

justincormack Aug 4, 2018

It would be trivial to convert the TOML to canonical JSON for signing.

@withoutboats

Contributor

withoutboats commented Aug 3, 2018

I have been implementing this RFC on a branch & I've mostly found it successful. I want to make a few small changes to the RFC based on my experience:

  1. Do not reject registries as broken when HEAD is signed but they're missing a keys file. This policy makes deciding to stop signing commits very disruptive for a registry & in retrospect provides little benefit: if they can delete your keys file, they can probably write their own public key into it.
  2. Rather than determining key rotation order by the commits the tags point to, which is potentially very expensive as you have to iterate over every parent ID since your last update, instead require registries to number their key rotations sequentially.
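Change 2 could be checked cheaply on the client side. A hedged sketch of what "sequentially numbered rotations" might look like; the field names and the authorization check (reduced here to key-set membership) are invented for illustration:

```python
# Sketch: a client that last accepted rotation N takes a new key set only
# if it is numbered exactly N + 1 and is authorized by a can-rotate key
# from the previously trusted set. Real verification would check an
# actual signature over the new key set, not just membership.
def accept_rotation(last_seen: int, rotation: dict, prev_can_rotate: set) -> bool:
    return (rotation["number"] == last_seen + 1
            and rotation["signed_by"] in prev_can_rotate)
```

Sequential numbers make ordering an O(1) comparison, versus walking every parent commit since the last update to order rotation tags.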
@trishankatdatadog


trishankatdatadog commented Aug 3, 2018

Hi @withoutboats,

Firstly, thanks for working on improving crates.io security! Someone has to start the hard work, and it's great that you're doing it.

@tarcieri, @SantiagoTorres and I have been discussing this. We think you should be able to transparently use the PEP 458 TUF security model we drafted for PyPI, but over git for crates.io. Allow me to elaborate a bit.

TUF is designed to handle the worst-case scenario where the software repository itself (in this case, crates.io) is compromised. I think it's safe to say that a repository compromise is no longer in the theoretical realm.

We understand your concerns that TUF looks a bit complicated, but there are good reasons for its design decisions:

  1. Separation of duties: different "roles" sign different types of metadata (with varying levels of importance) with different keys.
  2. Threshold signatures: metadata for really important roles are signed using m out of n keys.
  3. Explicit and implicit key rotation: there are mechanisms to recover from a key loss or compromise. TUF also provides a mechanism called delegations to transparently distribute and rotate keys for 3rd party developers.
  4. Minimizing risk with offline signatures: keys for really important roles are kept offline, or off the repository, in safe deposit boxes, for example, and used only rarely.
  5. Diversity of cryptographic algorithms: TUF uses multiple signing and hashing algorithms so that a compromise of any one of them is insufficient to cause a repository compromise.

The PEP 458 model basically starts with providing what we call the "minimum security model." In this simple model, automation running the repository (crates.io) will sign for all packages using online keys (or signing keys accessible automatically to the automation). All signed metadata (just JSON files) can be transported over any file transfer protocol, such as HTTP or even git. There are repository tools that will let the automation easily produce signed metadata, and I believe the rust-tuf client can parse this metadata (though @heartsucker should correct me if I am wrong).

Later on, we can talk about ramping up security with better security models such as PEP 480, but I strongly believe PEP 458 should not be too difficult to adapt for crates.io. Please let us know if you have questions, and we will be happy to help however we can.

Thanks,
Trishank

@withoutboats

Contributor

withoutboats commented Aug 4, 2018

Could you provide more material and specific suggestions for how the security properties of this proposal could be improved? My problem with TUF is not that its "a bit complicated," but that it is presented as a wholesale solution which is incompatible with our existing infrastructure. What material changes do you want to see made to this RFC, and how do you believe those changes would improve our security practices?

@withoutboats

Contributor

withoutboats commented Aug 5, 2018

Respectfully, none of these comments are responsive to my request.

@tarcieri


tarcieri commented Aug 5, 2018

@withoutboats there are a few things I think are vague or unaddressed in your current proposal which TUF addresses:

  • Threshold signatures: at least by my reading of your proposal, anyone with a can-rotate key can singlehandedly compromise the index. TUF, as part of its "survivable key compromise" goal, is designed to support a k-of-n model for sensitive keys like its root keys, preventing unilateral action by any individual keyholder. I think it might be possible to express this via signed git commits as signatures on git commits are just ASCII-armoured OpenPGP signatures. It'd be interesting to see how git would react to having more than one for a given commit (I understand it parses these signatures in an oddly "flexible" way).
  • Secure migration path for E2E signatures: There's already a large body of crates which are not E2E signed. Even if we deploy a scheme where every crate publisher starts signing now, we need a way to ensure that the existing body of crates is still installable, and that these crates aren't subject to downgrade attacks. This is somewhat tricky to solve, and I think TUF solves it well.

Really I think the differences between what you've proposed and what a TUF-enabled version of the same thing could look like ultimately just come down to the format and structure of the key management files (i.e. the TUF metadata). TUF is a flexible framework, so those files could potentially be expressed as git-authenticated TOML.

Update:

Looking at parse_gpg_output() in gpg-interface.c, it appears that if given a sequence of signature packets on the same commit, git will extract the last signature and ignore the others. This would (hopefully) allow threshold signature tooling to utilize all of the signatures, while things like git itself (and hopefully the GitHub UI) would only display one.
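The k-of-n rule itself is simple once signature extraction is abstracted away; a minimal sketch (the hard part, pulling multiple OpenPGP signatures out of one commit and verifying each, is elided here):

```python
# Sketch of a k-of-n threshold check: an action is authorized only if
# at least k distinct trusted keys produced valid signatures over it.
def threshold_met(signer_keys: set, trusted_keys: set, k: int) -> bool:
    return len(signer_keys & trusted_keys) >= k
```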

@SantiagoTorres


SantiagoTorres commented Aug 6, 2018

Respectfully, none of these comments are responsive to my request.

Sorry, I was following the train of thought of the comments above mine. Let me try to answer some of those questions.

Could you provide more material and specific suggestions for how the security properties of this proposal could be improved?

Beyond what @tarcieri mentioned above, there's also the fact that there's no implicit metadata revocation (or at least I couldn't find it anywhere in the proposed RFC). That is, there needs to be a source of freshness (e.g., a timestamp role) to ensure you are not being fed outdated metadata by a frozen mirror.
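In TUF terms, that source of freshness is metadata that is frequently re-signed with a short expiry; a client-side sketch (the `expires` field name is assumed for illustration):

```python
# Sketch: signed metadata carries an expiry timestamp, and a client
# rejects expired metadata. A frozen mirror can then stall updates only
# until the metadata it serves expires.
import time

def metadata_fresh(metadata, now=None):
    if now is None:
        now = time.time()
    return metadata["expires"] > now
```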

This is tangential, but I thought it was worth mentioning: there's also the concern of git metadata tampering attacks. In this case a malicious repository or MiTM can replace valid metadata (e.g., a signed tag refspec) to point to another git object. I've done proofs of concept of this against django repositories. The tags are still valid, signed tags by the right party but not the tag you were looking for (more on this below). If you were to make key rotation here, you'd have to make sure git metadata tampering is not a concern.

My problem with TUF is not that its "a bit complicated," but that it is presented as a wholesale solution which is incompatible with our existing infrastructure.

This is true, the use of git as a metadata transport/snapshotting solution may appear as an issue in the beginning. However, I already see that you're thinking of overriding the gpg-signing program to create different header information. If that is the case then embedding TUF-like metadata in the commit object doesn't sound too hard to do ™️

What material changes do you want to see made to this RFC

As a followup of metadata tampering attacks: git signed pushes are a good way to avoid these, but as far as I'm aware there is no support for them on the GitHub side of things (you'd need a *-receive hook on your side). On that note, having implicit metadata expiration could help. You could expire OpenPGP (RFC 4880) signatures but, as per the specification, expiration is not mandatory, and it's not added to a signature as a subpacket by default by gpg 1 or 2* 🤷‍♂️

I can always help formalizing this if you'd like.

  • (key expiration, on the other hand, is generally added)
@trishankatdatadog

trishankatdatadog commented Aug 7, 2018

Hi @withoutboats,

Sorry for the late reply, got a bit busy with work. Also, thanks for pitching in @tarcieri @heartsucker @SantiagoTorres and @justincormack.

Here are some key material differences between the current proposal, and using what I will call git+tuf://:

  1. The current proposal allows for a security model like PEP 458, where all packages are signed using online keys. This allows packages to be published instantly, but offers security guarantees pretty much equivalent to simply using TLS.

OTOH, as @heartsucker pointed out, TUF allows for building more nuanced security models that allow some packages to be signed by machines, and others to be signed using offline keys belonging to developers. For more information, please see the Diplomat presentation, and PEP 480. Using TUF, you can start with PEP 458 for usability, and slowly but surely move towards security (protecting users even if crates.io itself has been compromised) with PEP 480.

To the best of my understanding, the current proposal does not provide any mechanism to protect users if crates.io itself has been compromised.

  1. It is not clear whether the current proposal handles all of the security attacks that TUF does for the same threat model. For example, TUF includes a bandwidth-efficient feature to prevent rollback attacks on packages yet to be installed even if the repository has been compromised.

I am sure that some of us here are happy to write an RFC and code to help crates.io use TUF over git. We strongly believe that TUF will help crates.io to provide much stronger security guarantees while maintaining usability.

Thanks,
Trishank

@arielb1

Contributor

arielb1 commented Aug 7, 2018

@trishankatdatadog

  1. The goal of this proposal is to protect users that use Cargo through a mirror, CDN, or TLS interceptor. It does not try to survive repository key compromise.

  2. In a Cargo-like setting, I do not understand which attackers the "TUF minimum security" model protects against that the "single repository model" doesn't. I would expect any attacker that can compromise the timestamp key to be also able to compromise all the other repository keys, and therefore to be able to sign their own packages.

    The bins key is the most vulnerable, and in a Cargo-like setting it has to rubber-stamp packages sent to it by @bors.

  3. This is a bit independent of the key assignment, but moving the @bors signing key to a separate hardened server (I think this can even be done on an Amazon KMS if someone can understand how to make it work?) and keeping an secured audit log of all commit signatures (while still "robo-signing" all commits) would be possible in both models.
    With some more intelligence, the hardened server and/or a separate monitor could make sure that the git tree is valid (making sure that any attacker-provided malware is recorded for posterity) and prevent changing the code of already-installed package versions (making the attack only work when victims run cargo update).
    In both models, it would make attacks slower and more visible - certificate-transparency style - because attackers couldn't serve infected packages only to some targets.

  4. I think that one of the primary threats Cargo faces is the "generalized eslint attack" - @bors's authentication path is compromised (either at the maintainer end or at @bors's end), and that is used to create a malicious update of a common package, that then takes over many developers' computers.
    The mitigation is that this attack is fairly noisy - it publishes the malware to the entire world.
    The main solution to this attack is hardening the authentication path ala 2FA and TUF maximum security. That requires coordination with developers, and therefore deserves a different RFC than this one.

Moving to a TUF-based model would be a fairly invasive change, and therefore I would really prefer that it will take place in a different RFC with the proper concern for security.

@trishankatdatadog

trishankatdatadog commented Aug 7, 2018

@arielb1 I understand. As I mentioned before, @tarcieri, @SantiagoTorres, and I will be happy to write that different RFC.

@arielb1

Contributor

arielb1 commented Aug 7, 2018

@withoutboats

The greatest strength and weakness of TUF is that it is very customizable and can be made to work with basically any security model.

If we are not using developer-signed packages, then TUF's main advantage is that it allows the use of threshold signatures for the key rotation role.

The second advantage of TUF is that it has a built-in infrastructure for supporting developer-signed packages. I think that if/when we want to implement developer-signed packages, using TUF would probably be a good idea.

BTW, git metadata tampering: according to @withoutboats:

@arielb1 looking at the code, we use a refspec "refs/heads/master:refs/remotes/origin/master"; because this doesn't have a +, we only perform a fast-forward fetch

Not sure how well this works as a security mechanism, but this is allegedly a thing.
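For readers unfamiliar with the mechanism being discussed: the fast-forward-only behavior comes purely from the shape of the fetch refspec. A sketch of the relevant `.git/config` entry (the URL here is illustrative, not necessarily what cargo writes):

```ini
# .git/config fragment (illustrative). A fetch refspec WITHOUT a
# leading "+" tells git not to force-update the tracking ref, so any
# fetch that is not a fast-forward of the current tracking ref fails.
[remote "origin"]
    url = https://github.com/rust-lang/crates.io-index
    fetch = refs/heads/master:refs/remotes/origin/master
# By contrast, "+refs/heads/master:refs/remotes/origin/master" would
# permit forced, non-fast-forward updates.
```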

@SantiagoTorres

SantiagoTorres commented Aug 7, 2018

BTW, git metadata tampering: according to @withoutboats:

@arielb1 looking at the code, we use a refspec "refs/heads/master:refs/remotes/origin/master"; because this doesn't have a +, we only perform a fast-forward fetch

Not sure how well this works as a security mechanism, but this is allegedly a thing.

AFAIU there are two issues with this:

  • If you're hardcoding the refspec in the client, then it'll be easy to change it later down the line. If not, the server can always advertise another refspec to the clients when they do the initial fetch.
  • If you are using key rotation using signed tags, the server can simply not serve you the signed tags and skip key rotation altogether.

A key rotation tag should point to the HEAD commit of the registry index at the
time it is made. The commit it points to should be signed using a key which is
in the post-rotation key set.

arielb1 Aug 7, 2018

Contributor

If we implement some sort of auditing scheme for the can-commit keys using a Cloud HSM, that points out strongly to having the key rotation commit signature be in the pre-rotation key set.

Otherwise, an attacker that had compromised a (quorum of, if we ever use threshold signatures) rotation keys can bypass the audit scheme by creating a rotation commit signed with a key they own.

The problem with using the pre-rotation key set is that the Cloud HSM is a "single point of failure" - losing it will force a "hard update" outside of the secure updater, which might be too bad of an operational cost. I am not familiar enough with Cloud HSMs to be able to judge the risk of that happening.

I am not sure what is the best way to handle this.

tarcieri Aug 30, 2018

Unless I'm reading this wrong, it really looks like it should say that the commit is signed by a key in the pre-rotation set. The post-rotation set would mean anyone could add any key so long as the commit is signed by that key.

arielb1 Aug 31, 2018

Contributor

@tarcieri

There are 2 signature checks on a key rotation commit:

  1. The key rotation commit must correspond to a tag, signed by a key rotation key.
  2. The key rotation commit is itself a commit, and therefore it must be signed by a commit key.

If the commit signature can be done by a "rotated" key, then the commit signature is worthless, but the tag signature still exists (i.e., anyone with a rotation key can rotate keys, but they don't need a commit key).

@arielb1

Contributor

arielb1 commented Aug 7, 2018

If you are using key rotation using signed tags, the server can simply not serve you the signed tags and skip key rotation altogether.

How is that different from a freeze attack? New commits will not be signed with the old keys, so either:

  1. The attacker had compromised an old key, in which case he can push malicious commits until the old key rotation expires. That seems to be impossible to avoid.
  2. The attacker had not yet compromised an old key, in which case they will "get stuck" after they run out of commits signed with the old key.

OTOH, this means that if keys are rotated in parts instead of in whole, then the attacker can delay key rotations until the key used to sign new commits had been rolled out.

To mitigate that, each commit can include in it the set of keys it trusts, which is verified to be the set of keys that Cargo trusts. That will detect "tag skipping" attacks where the attacker can delay "partial" rotations until they need to use a new signing key.

@withoutboats

Contributor

withoutboats commented Aug 8, 2018

I'm really happy to see the conversation starting to move away from a blanket statement about "using TUF" to a nuanced conversation about specific security practices. I want to address each of these in turn after making a brief note about our general constraints:

  1. We have an existing infrastructure which we need to be backwards compatible with, because to a first approximation all of our users currently rely on it, and it is an unreasonable infrastructural overhead to introduce two non-overlapping services for crates.io. This means the fundamental structure of the git index needs to stay the same.
  2. cargo and registries are a polycentric system. Though crates.io is orders of magnitude larger than any other registry, users can depend on many different registries which use the same index system and which can determine their own security policies (such as not using signatures at all, as crates.io does now). It should remain possible for registries to be unsigned.
  3. If you want to set up a read-only registry, it can be as simple as creating a GitHub repo and putting some packages in an S3 bucket. Automating publication is the only part that actually requires a programmatic service. Ideally, you could also sign the data in such a registry without requiring you to set up a service either.

For these reasons, it is strongly preferable to be able to transfer all data using the git index, which was a design constraint for this RFC. That is, no completely new index system, and no additional APIs to crates.io.

Threshold signatures

I considered incorporating support for threshold signatures into this RFC (using the same mechanism for transport that you describe, @tarcieri - multiple signature packets in a single ASCII armored detached signature block), but decided against it. Threshold signatures are obviously a good idea to minimize the damage of key compromise when it is institutionally affordable.

However, I think it is unlikely that we would adopt a threshold greater than 1 for crates.io for either commits or rotations. Commits obviously are performed automatically, but I also don't think we can commit to the coordination overhead of requiring multiple administrators to sign off on a key rotation.

Instead, the currently adoptable mitigation for crates.io is to keep admin keys offline and rotate the online key often, which is much easier if rotation is cheap and easy and doesn't require multiple admins to coordinate.

If other registries come to us with a strong desire for threshold signatures, then we could incorporate that into the implementation by allowing them to specify the signature threshold for their registry in the metadata about that registry.

Outdated metadata

A malicious intermediate failing to forward you updates to the index is an outstanding attack that this RFC does not attempt to address. I intended to make it the next phase of hardening registries after this RFC was accepted and implemented.

I have a sketch of a potential solution to this attack:

  1. A signed registry can specify the frequency with which its index is guaranteed to be updated.
  2. Every index update has a signed timestamp (as part of the git commit metadata). If a registry's index hasn't had a new commit within the interval it specifies, cargo will treat it as damaged & broken, failing builds.
  3. crates.io will specify some short interval (e.g. an hour) and guarantee that it makes commits, even if they're null, within that interval (in practice crates.io is published to roughly every few minutes right now).

It's possible there are flaws with this solution that I haven't seen, but it would mean no additional services, roles, or keys. In any event, I think however this attack is solved, it can and should be solved in a separate later round.

Delegating trust

Like the problem of outdated metadata, I considered this a problem to be solved in the future. There are a few interesting trade offs in this space:

  1. It would be nice to have some kind of e2e security for every crate which is automated and zero overhead, so that everyone benefits. On the other hand, some crates will want the overhead of having greater control over how their packages are signed: e.g. they want to use offline keys that they manage themselves, or threshold signatures, and so on. I'm thinking here of security critical core libraries.
  2. Fundamentally, we can provide e2e authentication that even crates.io cannot fabricate by having packages be signed prior to publication. However, we obviously cannot provide e2e authorization because crates.io controls the right to publish. For example, we've discussed designs in which cargo can check that a crate was published by someone with access to a certain GitHub account, but crates.io ultimately has to decide if that GitHub account is allowed to publish any given package.
  3. Similarly, if that delegation is performed by an always-online key, it is as weak as that always-online key. But the infrastructural costs of an administrator manually performing a delegation using an offline key are quite high: how do we field requests for that delegation, since delegation would be a scarce resource, how do we decide how to distribute it if there is too much demand, etc?
  4. Can we provide delegated signatures of package metadata, as well as the package itself, and is it worthwhile? A problem for an ecosystem like crates.io is version resolution: if only the packages are signed, it's possible for someone to introduce a constraint like <=7.1.0 on a transitive dependency, forcing version resolution to reject version 7.1.1, which has an important security update. This is one reason why, in this proposal, the registry index is signed rather than the packages themselves - but for delegated trust, signing the metadata for individual versions of individual packages with the existing crates.io registry index format is a bit challenging.

I don't have any particular answer to any of these questions. Just as you all have suggested with your comparison to certain PEPs, I would prefer to leave this problem to a future RFC. I think there is no reason that the proposal here is not forward compatible with that sort of extension.

@tarcieri

tarcieri commented Aug 8, 2018

For these reasons, it is strongly preferable to be able to transfer all data using the git index

TUF metadata can be TOML files committed to Git (i.e. the git+tuf:// approach @trishankatdatadog was suggesting). In fact this entire proposal can be made "TUF compatible" just by using the TUF file structure rather than the one in this post, and leaving everything else the same.

Why do that? Well...

However, I think it is unlikely that we would adopt a threshold greater than 1 for crates.io for either commits or rotations

TUF would make it easy to support threshold signatures in the future without having to commit to them today.

A malicious intermediate failing to forward you updates to the index is an outstanding attack that this RFC does not attempt to address. I intended to make it the next phase of hardening registries after this RFC was accepted and implemented.

TUF is already designed to make it possible to separate these roles in the future.

Fundamentally, we can provide e2e authentication that even crates.io cannot fabricate by having packages be signed prior to publication. However, we obviously cannot provide e2e authorization because crates.io controls the right to publish.
Similarly, if that delegation is performed by an always-online key, it is as weak as that always-online key.

Delegations can be performed by an online key in a TOFU-ish model where an online key can be used to "claim" packages (i.e. add an initial E2E key), but once trust has been delegated to an end-user key, the online key cannot be used to make further changes to that crate. Instead, an existing owner would need to sign off on adding/removing existing owners. This can use the existing crates.io flows (i.e. cargo owner) but supplement them with end-user signatures.

This would require something like a git pre-receive hook that verifies E2E signatures verify. If crates.io were to violate these rules, it would trip this hook, and the index update would fail.

In general, reading over your response @withoutboats, it sounds like you would like to leave the door open to implement many more features that TUF is already designed to support. If TUF can deliver the minimum viable feature set today using git-transported TOML files, is there a particularly compelling reason not to use their file formats, given they have already thought through all of these problems?

@withoutboats

Contributor

withoutboats commented Aug 8, 2018

@tarcieri can you be more specific about what changes it would mean to "use TUF file formats"? I have to admit if, as you seem to claim, it results in no material changes to the implementation, I also see no reason to do it - but I think you're mistaken in at least some regards. For example, you say "TUF would make it easy to support threshold signatures in the future without having to commit to them today," but I think you misunderstood me: I don't want to bother implementing support for understanding threshold signatures in cargo until there is a need for them, but I believe when you refer to "TUF's file formats," that would include an ability to specify a threshold higher than 1.

@tarcieri

tarcieri commented Aug 8, 2018

can you be more specific about what changes it would mean to "use TUF file formats"?

TUF uses a set of metadata files which are described by the structs here:

https://github.com/heartsucker/rust-tuf/blob/develop/src/metadata.rs

Serialization is handled here... we could add a TOML serializer (possibly just adding #[derive(Serialize, Deserialize)] to the metadata structs and using toml-rs):

https://github.com/heartsucker/rust-tuf/tree/develop/src/interchange

You'd wind up with a slightly different set of files (separated by TUF roles), but the net effect would be the same.
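To make the "TUF metadata as TOML files in the git index" idea concrete, here is a purely hypothetical sketch of what a TUF-style root metadata file could look like rendered as TOML. Field names loosely follow the TUF specification; the key IDs, signature value, and expiry date are all invented:

```toml
# Hypothetical root.toml committed to the registry index.
# Not a real crates.io artifact; all values are illustrative.
[signed]
_type = "root"
version = 1
expires = "2019-08-29T00:00:00Z"

[signed.roles.root]          # maps to the RFC's can-rotate privilege
keyids = ["3ec89f23"]        # illustrative, truncated key ID
threshold = 1                # could be raised later without format changes

[signed.roles.timestamp]     # maps to the RFC's can-commit privilege
keyids = ["9a1f40c7"]
threshold = 1

[[signatures]]
keyid = "3ec89f23"
sig = "invented-signature-value"
```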

For example, you say "TUF would make it easy to support threshold signatures in the future without having to commit to them today," but I think you misunderstood me: I don't want to bother implementing support for understanding threshold signatures in cargo until there is a need for them, but I believe when you refer to "TUF's file formats," that would include an ability to specify a threshold higher than 1.

For now, you could require the threshold be 1. But the entire system would already be designed to support increasing that in the future.

Likewise, all the TUF roles could be supported, but for now the keys could be the same. This would allow a single system/service (i.e. @bors) to perform all the TUF roles for now (well, except for the root role) while leaving the door open to separate them in the future.

My general worry here is when I read over what you may want to support in the future, I'm getting a bit of a "Greenspun's tenth rule" vibe, i.e. it sounds like you may want to eventually support many of the features that TUF is already designed to support.

@SantiagoTorres

SantiagoTorres commented Aug 8, 2018

I'm really happy to see the conversation starting to move away from a blanket statement about "using TUF" to a nuanced conversation about specific security practices.

I'm happy about this too, and that we're having a well-tempered(?) discussion about all of this 👍

I do have a couple of concerns here, mostly the fact that this architecture is tying itself to 1) Git as a transport/security foundation and 2) GitHub as an identity management system. While I see the benefit of it in terms of ease of implementation, I think you may eventually end up painting yourself in the corner (e.g., it's starting to look like that in the case of supporting threshold signing and trust delegation).

Having said this, your proposals look like they would work IMHO.

cargo and registries are a polycentric system. Though crates.io is orders of magnitude larger than any other registry, users can depend on many different registries which use the same index system and which can determine their own security policies (such as not using signatures at all, as crates.io does now). It should remain possible for registries to be unsigned.

This is a very valid point, but I don't see how it conflicts with anything proposed. Notary uses TUF constructions and everyone can create a registry themselves. Whether the registry uses Notary to sign or not is a decision left to the administrator of the registry.

Regarding threshold signatures:

multiple signature packets in a single ASCII armored detached signature block), but decided against it. Threshold signatures are obviously a good idea to minimize the damage of key compromise when it is institutionally affordable.

This could be a good/interesting solution, however:

If other registries come to us with a strong desire for threshold signatures, then we could incorporate that into the implementation by allowing them to specify the signature threshold for their registry in the metadata about that registry.

I feel that this is slowly turning into TUF. Likewise:

Every index update has a signed timestamp (as part of the git commit metadata). If a registry's index hasn't had a new commit within the interval it specifies, cargo will treat it as damaged & broken, failing builds.

This is pretty much what a timestamp role does. Whether you'd like to merge this responsibility with that of a top-level snapshot or targets role (in TUF lingo) to simplify things operationally is an alternative that has been explored before. For example, Notary takes the same approach: timestamp and snapshot are kept with the same host/authority for those who want to avoid the burden of maintaining multiple keys.

I do think that your proposals for these issues sound sensible (esp. with @tarcieri 's suggestions for delegating trust). I may be biased, but it'd start looking like TUF in the end. Regardless of the transport and metadata limitations, this doesn't sound like such a daunting thing to do from the get-go, or at least leave as a design target so as to avoid making any changes that could block this from happening later down the line.

@SantiagoTorres

SantiagoTorres commented Aug 8, 2018

How is that different from a freeze attack? New commits will not be signed with the old keys, so either:

  • The attacker had compromised an old key, in which case he can push malicious commits until the old key rotation expires. That seems to be impossible to avoid.
  • The attacker had not yet compromised an old key, in which case they will "get stuck" after they run out of commits signed with the old key.

I think then it'd boil down to implementation details. Keep in mind that a server can also add arbitrary metadata (e.g., create a "new" tag refspec that points to an old tag). When this happens it'd depend on how the implementation walks these new tags:

  • if it'd take the commit from where it started and walk back, then you should be fine
  • if instead it just adds the new key sets to the set of trusted keys then you're in trouble

Either way, it may not be such a big concern as long as cases like these are considered and accounted for in the implementation details.

@trishankatdatadog

trishankatdatadog commented Aug 8, 2018

I agree with @tarcieri and @SantiagoTorres that TUF can support the security guarantees that everyone is seeking now and going forward.

@erickt

erickt commented Aug 8, 2018

For reference, OPAM has a proposal and paper for a TUF variant built upon their git-based package repository. It doesn't look like it's been implemented yet though. It sounds pretty similar to this proposal.

@arielb1

Contributor

arielb1 commented Aug 8, 2018

@SantiagoTorres

One thing I like about performing the signatures in git is that the entire git tree is verified, and therefore if we avoid checking out the git repository before we verify signatures on it, attackers without the commit key can't perform "git -> filesystem" attacks (the likes of CVE-2018-11235) or the kinds of "forgotten verification attacks" that plague APT.

Of course, that is not really compatible with delegating trust for specific packages. I think that requiring the root repository to be signed by a certificate-transparency-style mechanism is plenty of mitigation against these sort of attacks, because the attacker would have to deploy their exploit for all the world to see.

Therefore, if we implement a "package signing" mechanism while using git as a transport, I believe we should implement it in addition to signing all commits.

@arielb1

Contributor

arielb1 commented Aug 8, 2018

That basically sums up my position for having separate "package signing" and "commit signing" mechanisms, with the package signing mechanism piggybacking off the commit merkle tree for downgrade protection.

@tarcieri

tarcieri commented Aug 10, 2018

@arielb1 that sounds good to me

@tarcieri

tarcieri commented Aug 29, 2018

I took a stab at modifying this proposal to use TUF file formats and concepts:

withoutboats#7

I think they map pretty naturally: I mapped the can-rotate privilege to the TUF root role, and the can-commit privilege to the TUF timestamp role.

I think there's a lot of value in using TUF for this purpose, particularly for a "Phase 2" of supporting end-to-end signatures on crates created by developers at the time the crates are published. It also doesn't require that many changes: really it's just changes to the file formats.

@tarcieri tarcieri referenced this pull request Aug 29, 2018

Open

Security model / TUF #75

@burdges

burdges commented Aug 31, 2018

Avoid true threshold signatures here, ala BLS, because they anonymize the signers within the key set, which weakens accountability in any application like this or TUF. There is nothing wrong with ordinary logic that says "approve rotation if k of these n keys sign", but you need the specific k signers to be designated in each signature. In fact, there is nothing even wrong with BLS aggregation via delinearization used as a fake threshold signature scheme, so that verification requires identifying the signers. Apologies for nit-picking about the terminology... maybe multi-signature sounds less ambiguous.

@tarcieri

tarcieri commented Aug 31, 2018

@burdges the threshold signature implementation I think might make sense here is using a sequence of signature packets within a single ASCII "armored" OpenPGP message, particularly since it allows doing threshold signing in a backwards compatible way, where existing tools can still verify the signature but ignore the additional signature packets in the same message. This approach works with the signature algorithm of your choice.

@burdges

burdges commented Sep 1, 2018

Yeah, that's harmless of course.

@Centril Centril added the A-security label Nov 22, 2018

@ebkalderon

ebkalderon commented Dec 4, 2018

Quick question about key rotation and the handling of compromised keys. If a previously trusted key used to sign commits is suddenly known to be compromised or revoked, what is the procedure for recovering from that? For example, are the commits to the registry index signed by that key cherry-picked, verified by the original committer, and re-signed by a new trusted key? I could not determine from the text of this RFC how compromised keys are handled.

@tarcieri

tarcieri commented Dec 4, 2018

@ebkalderon glossing over the weaknesses in SHA-1 for a moment, signing any given commit authenticates the entire history, as the entire history forms a single hash-based data structure.

This is covered on L46-L51 of this proposal. L58 describes how authentication occurs: only the signature on the latest commit (i.e. HEAD) is checked. The other signatures may be invalid (and MUST be treated as invalid if they were created by a compromised key).
