Add a document describing key revocation #47

mnm678 · 2021-03-02T21:46:36Z

This document lays out some of the options for key revocation that have been discussed. These options might eventually fit better as part of the key management document, but are posted separately for the sake of discussion.

This document may eventually be part of the key management requirements. It describes a few common mechanisms for key revocation. Signed-off-by: Marina Moore <mnm678@gmail.com>

joshuagl · 2021-03-03T11:51:35Z

key-revocation.md

+
+One of the goals of Notary v2 is to build in solutions for key revocation that are easy to use and ensure that users will always use non-compromised keys. This document discusses some potential mechanisms for key revocation.
+
+In existing systems, there are three main approaches to key revocation: automatic revocation through key expiration, key revocation lists, and distribution of trusted keys. I discuss some of the benefits and pitfalls of each of these techniques, and how some of these techniques are combined to provide a wholistic approach to key revocation in TUF.


Suggested change

In existing systems, there are three main approaches to key revocation: automatic revocation through key expiration, key revocation lists, and distribution of trusted keys. I discuss some of the benefits and pitfalls of each of these techniques, and how some of these techniques are combined to provide a wholistic approach to key revocation in TUF.

In existing systems, there are three main approaches to key revocation: automatic revocation through key expiration, key revocation lists, and distribution of trusted keys. I discuss some of the benefits and pitfalls of each of these techniques, and how some of these techniques are combined to provide a holistic approach to key revocation in TUF.

joshuagl · 2021-03-03T11:53:17Z

key-revocation.md

+
+## Distribution of trusted keys
+
+Instead of distributing untrusted keys, this method distributes a list of currently trusted keys. If a key needs to be revoked, it is removed from the list of trusted keys. This technique as the added benefit of ensuring that users have access to the new trusted key as soon as they learn of a revocation.


Suggested change

Instead of distributing untrusted keys, this method distributes a list of currently trusted keys. If a key needs to be revoked, it is removed from the list of trusted keys. This technique as the added benefit of ensuring that users have access to the new trusted key as soon as they learn of a revocation.

Instead of distributing untrusted keys, this method distributes a list of currently trusted keys. If a key needs to be revoked, it is removed from the list of trusted keys. This technique has the added benefit of ensuring that users have access to the new trusted key as soon as they learn of a revocation.

sudo-bmitch

Each of these options has pros and cons. Thinking through them:

Key Expiration: this has the advantage of being automatically enforced, even in disconnected environments. I'd question whether this means everything the key has previously signed would need to be re-signed. I'm fairly certain the answer is yes (otherwise an attacker who breaches a key with only 2 hours left before its certificate expires could sign a malicious image that stays trusted for 10 years). That would result in lots of re-signing of old images for every key rotation. Perhaps that could be made easier by having a single signature for a list of images (digests), rather than a separate signature for each individual image.
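A minimal sketch of the "single signature for a list of digests" idea, so that a key rotation re-signs one manifest instead of every image. HMAC-SHA256 stands in for a real public-key signature here, and the names `manifest_bytes`, `sign_manifest`, and `is_covered` are hypothetical, not part of any Notary v2 design.

```python
import hashlib
import hmac
import json


def manifest_bytes(digests):
    # Canonical encoding: sorted, de-duplicated digests as compact JSON.
    return json.dumps(sorted(set(digests)), separators=(",", ":")).encode()


def sign_manifest(digests, key):
    # Assumption: HMAC-SHA256 is only a stand-in for a real signature scheme.
    return hmac.new(key, manifest_bytes(digests), hashlib.sha256).hexdigest()


def is_covered(digest, digests, signature, key):
    # An image is accepted only if the manifest signature verifies under the
    # current key and the image digest is listed in that manifest.
    expected = sign_manifest(digests, key)
    return hmac.compare_digest(expected, signature) and digest in digests
```

After a rotation, only `sign_manifest` has to be re-run with the new key; a manifest signed by the old key no longer verifies.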

Revocation lists: while it's convenient that this has an immediate effect when the revocation is published, I'm seeing multiple downsides. In disconnected environments, that query may fail, or it may be sent to a mirror server that the client in that environment is told to trust. A stale mirror in the disconnected environment could be used to send malicious images, though in that case it's the client intentionally choosing to trust a mirror that shouldn't have been trusted. The more concerning scenario to me is devices with access to the public internet using the upstream revocation list. What do we do when access to that revocation list goes down? Do we fail insecure and potentially allow a vulnerability, or fail secure and cause an outage? Last year's Apple scenario showed we can have the worst of both, where the revocation server could be extremely slow and only eventually time out on the response.
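The fail-insecure versus fail-secure choice can be made explicit in client policy. The sketch below is illustrative only: `fetch_revoked` is a hypothetical callable that returns a set of revoked key IDs or raises `OSError` when the list is unreachable or too slow, and the 2 second timeout is an arbitrary assumption.

```python
from enum import Enum


class OnFailure(Enum):
    FAIL_CLOSED = "fail_closed"  # treat every key as untrusted (risk: outage)
    FAIL_OPEN = "fail_open"      # keep trusting keys (risk: revoked key accepted)


def key_is_usable(key_id, fetch_revoked, on_failure, timeout_s=2.0):
    """Return True if key_id may be used under the given failure policy."""
    try:
        # Hypothetical fetcher; TimeoutError is a subclass of OSError.
        revoked = fetch_revoked(timeout_s)
    except OSError:
        # The list could not be fetched in time: the policy decides the outcome.
        return on_failure is OnFailure.FAIL_OPEN
    return key_id not in revoked
```

A bounded timeout at least avoids the slow-server case where the client neither fails nor proceeds.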

For the TUF scenario, I think we want to explore what it would look like for the root key to eventually expire (with a relatively long lifetime). And with short lifetimes on the timestamp signing, what does that look like for mirrors and popular registries that want to push as much as possible out to CDNs?

And it's bigger than the scope of this document, but we also need to explore what key distribution looks like with v2. If we are avoiding TOFU by having clients explicitly trust a root key for the organization, how is that root key first deployed, how does it get rotated, and if we do in-band rotation, can we trust a chain of rotated root keys that leads back to one or more now-expired keys?

sudo-bmitch · 2021-03-03T12:49:00Z

key-revocation.md

+
+However, the user must be able to ensure that the key revocation list is accurate and up to date. If an attacker is able to replay an old revocation list, the user may continue to trust compromised keys. Therefore the distribution of the key revocation list must allow the user to verify authenticity and timeliness.
+
+Also, for security reasons, keys cannot be removed from a key revocation list, so the list will grow larger and larger over time and may eventually have a noticeable bandwidth impact.


To mitigate the risk of an ever-growing revocation list, the two approaches can be combined: give keys a longish expiration time and still allow them to be revoked. A revoked key can then be removed from the list once it has expired.
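A minimal sketch of that combination, assuming the revocation list records each revoked key's expiration time (the function names and data layout are hypothetical, and all times are assumed to share one time zone):

```python
from datetime import datetime


def prune_revocations(revoked: dict, now: datetime) -> dict:
    """Drop entries whose key has already expired.

    revoked maps a key ID to that key's expiration time.
    """
    return {key_id: exp for key_id, exp in revoked.items() if exp > now}


def key_is_trusted(key_id: str, key_expires: datetime, revoked: dict, now: datetime) -> bool:
    # Expiration must be enforced strictly; otherwise pruning an entry would
    # silently un-revoke the key.
    return now < key_expires and key_id not in revoked
```

Pruning is only safe because every verifier also rejects expired keys, which is why the expiration check and the list lookup appear together.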

sudo-bmitch · 2021-03-03T13:17:57Z

I could also see a use case for a middle ground, where there are several types of keys used by an organization: one to identify "this content was produced by this organization" and another that claims "we believe this content is currently secure". That would allow Docker to use a longer lifetime key on something like a 4 year old ubuntu image that we know has vulnerabilities, but where we still want to know that Docker signed it. And since that image isn't changing, it could go through a more controlled offline signing process with a longer lived key, while the ubuntu image that was just updated yesterday, which we think is secure, may get signed with a key that expires in a week.
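As a rough sketch of how a verifier might treat the two claims separately: the claim names `provenance` and `currently-secure` and the `evaluate` helper are made up for illustration, and signature verification is assumed to have already happened.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Claim:
    kind: str          # hypothetical values: "provenance" or "currently-secure"
    expires: datetime  # expiration of the key that made the claim


def evaluate(claims, now):
    # Only claims whose signing key has not expired are counted.
    live = {c.kind for c in claims if c.expires > now}
    return {
        "from_publisher": "provenance" in live,
        "believed_secure": "currently-secure" in live,
    }
```

An old image would keep `from_publisher` for years while `believed_secure` lapses about a week after the last refresh.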

dlorenc · 2021-03-03T13:30:20Z

I haven't been looking at the problem as broadly as all of you - on purpose - but I'm strongly in favor of expiration approaches rather than revocation ones.

Transparency logs + timestamps are better than revocation IMO - and can also be made compatible with most air-gapped scenarios.

So, overall strong +1 to this entire document @mnm678 :)

Signed-off-by: Marina Moore <mnm678@gmail.com>

Add an initial list of pros and cons for each technique and add a few clarifications Signed-off-by: Marina Moore <mnm678@gmail.com>

dlorenc · 2021-03-21T15:00:07Z

This LGTM!

SteveLasker · 2021-03-23T03:21:03Z

I started writing up some thoughts and concerns about the scaling problem of maintaining all the public and private content we expect to see now and in the coming years.
Before getting into the tradeoffs of short-lived, long-lived keys, how we might continually update those, or how we might manage revocations or allow lists, I figured I'd outline a sense of scale.
Please see: Scaling Public & Private Registries, and perhaps we can have a discussion there.

My hope is we can have a baseline of scale, and then the various approaches may start to become more obvious.

dlorenc · 2021-03-23T03:57:57Z

To be honest those are still tiny numbers compared to other public-key crypto key management systems in the wild today. Could you be more specific on what you think will be a scaling issue here?

SteveLasker · 2021-03-26T22:50:27Z

@mnm678, I can see this doc being used in one of two ways:

1. A good generalization of the key revocation scenarios, as a general background reference we consider throughout our designs.
2. A discussion of how TUF solves some of the problems (not trying to state quantity, just that it's an opinionated approach)

If you can strip this down to general background on the scenarios, we can merge it into the repo as a good overview for the reader, regardless of implementation. As we develop the key management specs of Notary v2, we can reference this, and update it to reflect how Notary v2 solves these problems.

Or, we can transfer this as a discussion, capturing an opinionated view on TUF.

The difference allows us to merge content (1) we haven't closed on a direction with, vs. opinionated content on a specific solution we haven't committed to yet (2)

sudo-bmitch · 2021-03-29T17:35:55Z

I think there's a different take on the challenge that TUF provides, but agree that doesn't need to be specifically called out as requiring TUF in this document. Instead I'd keep the last section but rephrase it to not specifically name TUF, but instead describe some of the qualities of an intermediate solution, one that compresses the short expiration certificates into a single signature on a collection of artifacts, which is separate from the individual signatures within that collection.

Signed-off-by: Marina Moore <mnm678@gmail.com>

mnm678 · 2021-04-02T19:16:36Z

Thanks @SteveLasker and @sudo-bmitch. I updated the description of TUF's approach to key revocation to more generally describe combining explicit and implicit revocation.

SteveLasker · 2021-04-02T23:14:08Z

Thanks @mnm678,
The description has references to TUF's implementation, so I think you're asking for this to move to a discussion for how TUF solves these problems, as opposed to the general background, not referencing specific implementations.
We can transfer "issues" to "discussions", but we can't transfer PRs to "discussions".
Can you create a discussion titled something like "Describing Key Revocation with TUF", link this PR and close this one?

mnm678 · 2021-04-03T01:02:02Z

I guess I wasn't clear. I re-worded the final option to explain how implicit and explicit key revocation can be used in general so that it can be discussed in the context of other approaches to key revocation. The technique of combining implicit and explicit key revocation was introduced in this paper by @JustinCappos and others and refined in this paper by @trishankatdatadog and others. It has only been used widely in TUF and related projects, so I think a mention of TUF is necessary to understand the technique and know where to look for more information.

More fundamentally, when talking about key revocation the specific implementation is important, because like other security systems, it is only as secure as its weakest link. That's the benefit of using existing, well tested security mechanisms instead of attempting to build them from the ground up.

trishankatdatadog · 2021-04-03T03:01:19Z

> The description has references to TUF's implementation, so I think you're asking for this to move to a discussion for how TUF solves these problems, as opposed to the general background, not referencing specific implementations.

Steve, I'm hard-pressed to see how you could discuss a comprehensive background without discussing specific implementations. That's like citing papers without naming their authors.

> Can you create a discussion titled something like "Describing Key Revocation with TUF", link this PR and close this one?

Is there a good reason to close this PR and open a discussion instead, other than citing specific implementations?

sudo-bmitch · 2021-04-15T01:28:02Z

For the last option, while TUF may be the only solution we know of that takes this approach, we've done such a good job keeping the other sections abstract and not listing the various implementations of each technique that naming TUF in the last section comes across as a sales pitch.

Here's my own attempt to reword this:

## Combining explicit and implicit revocation

By using a hierarchical combination of keys, a trusted root key can delegate signing to various keys that expire. Additionally, artifacts may be signed by more than one key, allowing automated tooling to provide short lived signatures that verify the signer and artifact have not been revoked. Clients then verify the necessary collection of signatures is found on the artifact.

This method allows signers to have relatively long lived keys, to simplify their workflow and avoid needing to resign the artifacts themselves, while enabling timely revoking of the signer key or a single artifact signature.

For efficiency, a meta-artifact can be created and maintained, containing references to a collection of currently signed artifacts. The short lived signature can then be created for this single artifact, rather than for every artifact individually.

Pros:

* Keys may be quickly revoked
* Individual artifact signatures may be quickly revoked
* Signers do not need to frequently resign all artifacts
* Verifiers only need to trust the root key, all delegated keys can be verified against this

Cons:

* Requires maintenance of an automated system to refresh short lived signatures
* A root key compromise requires updating all signers, clients, and signatures on the artifacts
* Signatures in disconnected environments and on artifact copies may quickly become stale
* Updating short lived signatures on a large number of artifacts may encounter scaling challenges and loses some of the caching efficiencies of content addressable storage in registries
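
The client-side check described above can be sketched as follows. This is a simplified model under stated assumptions: signature verification against the root key is taken as already done, the `Delegation` and `MetaArtifact` types and the `accept` function are hypothetical names, and the meta-artifact is assumed to list both the current artifact digests and the signer keys that have not been revoked.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Delegation:
    key_id: str
    expires: datetime        # relatively long lived signer key


@dataclass(frozen=True)
class MetaArtifact:
    digests: frozenset       # artifacts whose signatures are still current
    trusted_keys: frozenset  # signer keys that have not been revoked
    expires: datetime        # short lived, refreshed by automated tooling


def accept(digest, signer, delegations, meta, now):
    d = delegations.get(signer)
    if d is None or d.expires <= now:
        return False  # implicit revocation: unknown or expired signer key
    if meta.expires <= now:
        return False  # stale meta-artifact, e.g. in a disconnected environment
    # Explicit revocation: the signer or the artifact was dropped on refresh.
    return signer in meta.trusted_keys and digest in meta.digests
```

Revoking a key or a single artifact then means publishing the next meta-artifact without it; nothing signed by the long lived key has to be re-signed.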

mnm678 · 2021-04-27T18:04:44Z

Thanks @sudo-bmitch. I updated the pr.

gokarnm · 2021-05-25T23:01:45Z

key-revocation.md

+
+## Key Expiration
+
+Adding an expiration time to every key allows keys to automatically be revoked after a certain period of time. The expiration time is usually included with the key so that it is easy for users to find. This technique does not require any action from the key holder, and ensures that users will have to refresh their trusted keys before those keys expire.


It might help to clarify that the key expiry (metadata) needs to be signed by an issuing key that the client trusts.

gokarnm · 2021-05-25T23:03:11Z

key-revocation.md

+
+Cons:
+* Keys can't be revoked before expiration
+* Artifacts must be re-signed after expiration


Timestamping is an option that allows use of signed artifacts after the key expires.
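
A minimal sketch of the timestamping rule, assuming `signed_at` comes from a timestamp the verifier already trusts (for example a countersignature from a timestamping service); the function and parameter names are illustrative.

```python
from datetime import datetime
from typing import Optional


def signature_acceptable(
    signed_at: datetime,
    not_before: datetime,
    not_after: datetime,
    revoked_at: Optional[datetime] = None,
) -> bool:
    # The signature must have been made while the key was valid...
    if not (not_before <= signed_at <= not_after):
        return False
    # ...and before any revocation of that key took effect.
    return revoked_at is None or signed_at < revoked_at
```

The current time does not appear in the check, which is what lets an artifact stay usable after its signing key expires.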

gokarnm · 2021-05-25T23:06:13Z

key-revocation.md

+* Artifacts must be re-signed after expiration
+
+
+## Key revocation lists (Deny lists)


This section and the next are similar to Allowlist and Denylist in Repository; the challenge is synchronizing deny lists in multi-registry scenarios. Also, this doc doesn't cover artifact-level revocation or specify where allow/deny lists will be stored.

I added a note about synchronization. I purposefully left out artifact level revocation in this document for now, but we can combine those discussions in a later draft.

gokarnm · 2021-05-25T23:18:44Z

threatmodel.md

@@ -10,3 +10,10 @@ It is assumed that an attacker may perform one or more the following actions:

 While it is not always possible to protect against all scenarios, the system should to the extent possible mitigate and/or reduce the damage caused by a successful attack, detect the occurrence of an attack and notify appropriate parties, yet remain usable for parties operating the system.  Furthermore, the system should recover from successful attacks in a way that presents low operational overhead and risk to users.

+Attacker Goals:


Should these be reworded to be less generic terminology, and targeted for artifact registry and consumers? Also is this intended to be an initial version? I think the final threat model and analysis would be detailed.

It looks like these are duplicated in #35, and they are a bit out of scope for this pr, so I'll remove them here and try to address your comment there

mnm678 · 2021-05-27T22:32:24Z

Thanks for the review @gokarnm! I updated the document and responded inline to a couple of comments.

mnm678 · 2021-06-06T13:42:50Z

@sudo-bmitch @gokarnm I updated the intro as discussed in the meeting. Can I get a review/approval to merge?

sudo-bmitch · 2021-06-06T14:58:53Z

Just noticed the DCO validation error. Not sure if that will block the ability to merge.

Still LGTM and hope we can merge and iterate forward. Thanks for driving this @mnm678 !

gokarnm · 2021-06-06T15:53:17Z

@mnm678 LGTM!

SteveLasker

If we can add a con to the meta-artifact section, addressing the multi-registry challenge for moving individual artifacts

SteveLasker · 2021-06-07T16:23:42Z

key-revocation.md

+
+However, the user must be able to ensure that the key revocation list is accurate and up to date. If an attacker is able to replay an old revocation list, or show different versions to different registries, the user may continue to trust compromised keys. Therefore the distribution of the key revocation list must allow the user to verify authenticity and timeliness.
+
+Also, for security reasons, keys cannot be removed from a key revocation list, so the list will grow larger and larger over time and may eventually have a noticeable bandwidth impact, although this can be mitigated by combining key revocation lists with keys that expire.


Is this true, that once in a list, it's never removed? Or, can keys that are known to have expired be removed at a later date? Perhaps 50% longer than the life of the key or something. It does seem like a non-scalable solution that needs mitigation.

This could be combined with strongly enforced key expiration to allow the keys to eventually be deleted.
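
On the replay concern in the quoted text, one common way to let a client check timeliness is to give each published list a version number and its own short expiration. The sketch below assumes the list's signature has already been verified; `RevocationList` and `accept_update` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RevocationList:
    version: int        # increases with every published list
    expires: datetime   # the list itself is short lived
    revoked: frozenset  # revoked key IDs


def accept_update(candidate: RevocationList, last_seen_version: int, now: datetime) -> bool:
    if candidate.expires <= now:
        return False  # stale: an old list cannot be replayed indefinitely
    # Reject rollbacks to a version older than one this client has seen.
    return candidate.version >= last_seen_version
```

This does not by itself detect different versions being shown to different registries; that would need some way to compare what each party was served.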

SteveLasker · 2021-06-07T16:28:44Z

key-revocation.md

+
+This method allows signers to have relatively long lived keys, to simplify their workflow and avoid needing to resign the artifacts themselves, while enabling timely revoking of the signing key or a single artifact signature.
+
+For efficiency, a meta-artifact can be created and maintained, containing references to a collection currently signed artifacts. And the short lived signature can be created for this single artifact, rather than every artifact individually.


How would this interact with individual artifacts moving within and across registries?

it depends on the implementation, but the meta-artifact would have to be updated as the collection changes.

SteveLasker · 2021-06-07T16:29:55Z

key-revocation.md

+Cons:
+* Requires maintenance of an automated system to refresh short lived signatures
+* A root key compromise requires updating all signers, clients, and signatures on the artifacts
+* Updating short lived signatures on a large number of artifacts may encounter scaling challenges and loses some of the caching efficiencies of content addressable storage in registries


Can we add a con here that the meta-artifact collection of keys needs to somehow be parseable for individual artifact movement within and across registries?

The meta-artifact is an optional efficiency feature, so I added the con to the paragraph describing the feature.

SteveLasker · 2021-06-07T16:31:40Z

@mnm678, can you also solve the DCO issue?

@sudo-bmitch

Update the combined key revocation option to remove references to TUF and more generically describe the way it allows for both explicit and implicit key revocation. Thanks to @sudo-bmitch for wording suggestions. Signed-off-by: Marina Moore <mnm678@gmail.com>

Signed-off-by: Marina Moore <mnm678@gmail.com>

dlorenc · 2021-06-25T15:30:56Z

This looks great to me, we can definitely keep iterating after merge.

hallyn · 2021-06-25T15:31:59Z

+1 on merge

SteveLasker

LGTM

Add a document describing key revocation.

20b9189

This document may eventually be part of the key management requirements. It describes a few common mechanisms for key revocation. Signed-off-by: Marina Moore <mnm678@gmail.com>

joshuagl reviewed Mar 3, 2021


sudo-bmitch reviewed Mar 3, 2021


mnm678 added 2 commits March 4, 2021 10:02

Clarify and fix typos

e1a096b

Signed-off-by: Marina Moore <mnm678@gmail.com>

Add pros and cons

7ec9000

Add an initial list of pros and cons for each technique and add a few clarifications Signed-off-by: Marina Moore <mnm678@gmail.com>

Generalize description of key revocation in TUF

5f9966b

Signed-off-by: Marina Moore <mnm678@gmail.com>

gokarnm reviewed May 25, 2021


mnm678 force-pushed the key-revocation branch from 22bbd4b to eb7e969 Compare June 7, 2021 13:34

mnm678 mentioned this pull request Jun 7, 2021

Add scenario for user-specified key #75

Merged

SteveLasker reviewed Jun 7, 2021


mnm678 added 2 commits June 8, 2021 10:42

Add clarifications

99ee8db

Signed-off-by: Marina Moore <mnm678@gmail.com>

mnm678 added 3 commits June 8, 2021 10:44

remove attacker goals from this pr

45a42ae

Signed-off-by: Marina Moore <mnm678@gmail.com>

remove references to solution in the intro for this overview document

3d31406

Signed-off-by: Marina Moore <mnm678@gmail.com>

Add description of meta-artifact downsides for artifact movement

08647d2

Signed-off-by: Marina Moore <mnm678@gmail.com>

mnm678 force-pushed the key-revocation branch from eb7e969 to 08647d2 Compare June 8, 2021 14:45

SteveLasker approved these changes Jun 28, 2021


SteveLasker merged commit ab8fd3a into notaryproject:main Jun 28, 2021

iamsamirzon added this to All Close PRs in Test2Project Sep 15, 2021

JustinCappos mentioned this pull request Dec 14, 2022

Health of the Notary V2 project cncf/toc#981

Closed


