Anti replay #1005
@@ -489,6 +489,14 @@ informative:
      author:
      -
        ins: H. Krawczyk
  Mac17:
    title: "Security Review of TLS1.3 0-RTT"
    date: 2017
    target: https://github.com/tlswg/tls13-spec/issues/1001
    author:
    -
      ins: C. MacCarthaigh

--- abstract

@@ -605,6 +613,9 @@ draft-21

- Add a per-ticket nonce so that each ticket is associated with a
  different PSK (*).

- Add discussion of 0-RTT and replay. Recommend that implementations
  implement some anti-replay mechanism.

draft-20

- Add "post_handshake_auth" extension to negotiate post-handshake authentication
@@ -1375,7 +1386,7 @@ keys derived using the offered PSK.
Unless the server takes special measures outside those provided by TLS,
the server has no guarantee that the same
0-RTT data was not transmitted on multiple 0-RTT connections
-(see {{replay-time}} for more details).
+(see {{replay-time}} and {{replay-0rtt}} for more details).
This is especially relevant if the data is authenticated either
with TLS client authentication or inside the application layer
protocol. However, 0-RTT data cannot be duplicated within a connection (i.e., the server
@@ -2943,62 +2954,6 @@ servers MUST process the client's ClientHello and then immediately
send the ServerHello, rather than waiting for the client's
EndOfEarlyData message.

#### Replay Properties {#replay-time}

As noted in {{zero-rtt-data}}, TLS provides a limited mechanism for
replay protection for data sent by the client in the first flight.
This mechanism is intended to ensure that attackers cannot replay
ClientHello messages at a time substantially after the original
ClientHello was sent.

To properly validate the ticket age, a server needs to store
the following values, either locally or by encoding them in
the ticket:

- The time that the server generated the session ticket.
- The estimated round trip time between the client and server;
  this can be estimated by measuring the time between sending
  the Finished message and receiving the first message in the
  client's second flight, or potentially using information
  from the operating system.
- The "ticket_age_add" parameter from the NewSessionTicket message in
  which the ticket was established.

The server can determine the client's view of the age of the ticket by
subtracting the ticket's "ticket_age_add value" from the
"obfuscated_ticket_age" parameter in the client's "pre_shared_key"
extension. The server can independently determine its view of the
age of the ticket by subtracting the the time the ticket was issued
from the current time. If the client and server clocks were running
at the same rate, the client's view of would be shorter than the
actual time elapsed on the server by a single round trip time. This
difference is comprised of the delay in sending the NewSessionTicket
message to the client, plus the time taken to send the ClientHello to
the server.

The mismatch between the client's and server's views of age is thus
given by:

~~~~
mismatch = (client's view + RTT estimate) - (server's view)
~~~~

There are several potential sources of error that make an exact
measurement of time difficult. Variations in client and server clock
rates are likely to be minimal, though potentially with gross time
corrections. Network propagation delays are the most likely causes of
a mismatch in legitimate values for elapsed time. Both the
NewSessionTicket and ClientHello messages might be retransmitted and
therefore delayed, which might be hidden by TCP. For browser clients
on the Internet, this implies that an
allowance on the order of ten seconds to account for errors in clocks and
variations in measurements is advisable; other deployment scenarios
may have different needs. Outside the selected range, the
server SHOULD reject early data and fall back to a full 1-RTT
handshake. Clock skew distributions are not
symmetric, so the optimal tradeoff may involve an asymmetric range
of permissible mismatch values.

## Server Parameters

The next two messages from the server, EncryptedExtensions and
@@ -3593,6 +3548,127 @@ appropriate application traffic key as described in {{updating-traffic-keys}}.
In particular, this includes any alerts sent by the
server in response to client Certificate and CertificateVerify messages.
## 0-RTT and Anti-Replay {#replay-time}

> **Comment:** Not that it matters much, but should this section be labeled
As noted in {{zero-rtt-data}}, unlike 1-RTT data, TLS does not provide
inherent replay protections for 0-RTT data. Instead, it provides
mechanisms which allow a server to implement a number of limited
server-side anti-replay defenses. Servers SHOULD implement
either Single-Use Tickets {{single-use-tickets}} or
Client Hello Recording {{client-hello-recording}}
as described below, and if not, SHOULD implement the stateless mechanism
described in {{stateless-anti-replay}}.
See {{replay-0rtt}} for more information on the limitations
of these mechanisms.
> **Comment:** Perhaps mention that clients and application protocols must assume in their security profile (and in terms of what they are willing to send via 0-RTT) that servers are only implementing stateless anti-replay? (When the "server" spans multiple clusters with non-trivial latency between them, stateless anti-replay is the only one that works, and it has the weakest guarantees.)

> **Comment:** (This is mentioned below, but repeating it here as well may be worthwhile.)

> **Comment:** It's also possible to use an instance of Client Hello Recording per cluster in a distributed setup, effectively reducing the number of possible replays for a "Large Number" to 1 per cluster.

> **Comment:** Aren't the three alternatives in the SHOULD all equal? Servers need not permit 0-RTT at all, but those which do SHOULD implement either Single-Use Tickets {{single-use-tickets}}, Client Hello Recording {{client-hello-recording}}, or the stateless mechanism described in {{stateless-anti-replay}}.

> **Comment:** This is a WG decision ultimately, but IMO they are not. The first two clearly are stronger.

> **Comment:** I'd go further: the third method doesn't work - it doesn't prevent replays. It permits thousands to billions of replays, depending on the amount of bandwidth and hosts available, and it doesn't mitigate several of the attacks. It should be taken out - it's insecure.

> **Comment:** Here I would argue for something stronger: "0-RTT server implementations that must interoperate with third-party systems and applications MUST implement a robust anti-replay mechanism". My reasoning here is that a CDN or TLS accelerator that enables 0-RTT without robust anti-replay will break other downstream systems (for example, upstream 0-RTT leading to throttle exhaustion downstream). I want clear and strong language so that there can be no ambiguity when a CVE is requested against the upstream component. It's not OK, IMO, to break a basic assumption about the internet like that.
### Single-Use Tickets

The simplest form of anti-replay defense is for the server to only
allow each session ticket once. In order to implement this, the server
maintains a database of all outstanding valid tickets; deleting each
ticket from the database as it is used. If an unknown ticket is
provided, the server falls back to a full handshake as normal.
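The single-use discipline above can be sketched in a few lines. This is an illustrative model, not implementation guidance from the draft; the class and method names are invented for the sketch, and a real deployment would need an atomic check-and-delete against the shared session database.

```python
class SingleUseTicketStore:
    """Hypothetical store for tickets that are database keys (not self-contained)."""

    def __init__(self):
        self._tickets = {}  # ticket identity -> stored PSK secret

    def issue(self, ticket_id: bytes, psk: bytes) -> None:
        # Record the ticket when the server sends NewSessionTicket.
        self._tickets[ticket_id] = psk

    def redeem(self, ticket_id: bytes):
        # Look up and delete in one step; None means "unknown ticket",
        # so the server falls back to a full handshake.
        return self._tickets.pop(ticket_id, None)

store = SingleUseTicketStore()
store.issue(b"ticket-1", b"psk-secret")
first = store.redeem(b"ticket-1")   # first use returns the PSK
replay = store.redeem(b"ticket-1")  # replay finds nothing -> full handshake
```

Because `redeem` deletes on first use, a replayed ticket can never resolve to a PSK, which is the whole mechanism.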
If the tickets are not self-contained but rather are database keys,
and these PSKs are deleted upon use, then connections established
using one PSK enjoy forward security with respect to other PSKs
established on the same connection. This is a security advantage for
all 0-RTT data and for PSK usage when PSK is used without DH.

> **Comment:** I'm not sure this part really fits in the current exposition as-is. (Also, doesn't the addition of a per-ticket nonce into the PSK ticket derivation give self-contained tickets the same forward secrecy property?)

> **Comment:** No, because compromise of the STEK leads to compromise of all tickets.

> **Comment:** Oh, database-key tickets certainly have better forward security properties than self-contained tickets. I'm just not sure about what the "with respect to other PSKs established on the same connection" means.

> **Comment:** I'll see what I can do to rewrite it.
Because this mechanism requires sharing the session database between
server nodes, it may be hard to achieve high rates of PSK and and
0-RTT success when compared with self-encrypted tickets which do not
require consistent server-side storage for basic functionality but
only for 0-RTT anti-replay.

> **Comment:** Minor mistake: "and and"

> **Comment:** "between server nodes in environments with multiple servers acting as endpoints for the same service"

> **Comment:** Perhaps replace

> **Comment:** There should probably be some more clarification that 0-RTT success and PSK success are partially independent, and that tickets can still be used for PSK even if the single-use property cannot be guaranteed; that is, PSK can succeed even in cases where 0-RTT must be rejected for safety. (Unless I misunderstand?)
### Client Hello Recording

An alternative form of anti-replay is to record each ClientHello or a
unique value derived from the ClientHello and reject
duplicates. However, recording all ClientHellos causes state to grow
without bound, so in practice the server must instead record
ClientHellos within a given time window based on the
"obfuscated_ticket_age" value provided by the client.

> **Comment:** I would instead say: "Recording all ClientHellos causes state to grow without bound. A server can instead record ClientHellos within a given time window and use the "obfuscated_ticket_age" to ensure that tickets aren't reused outside that window."

> **Comment:** MT's version is better, yes.
In order to implement this mechanism, a server needs to store the
following values, either locally or by encoding them in the ticket:

- The time that the server generated the session ticket.
- The estimated round trip time between the client and server;
  this can be estimated by measuring the time between sending
  the Finished message and receiving the first message in the
  client's second flight, or potentially using information
  from the operating system.

> **Comment:** I think lines 3600 - 3604 are probably impractical. If folks do want to use STEK-encrypted tickets for global resumption, then the RTT to one location will be very different than another. Even absent that, RTTs can vary quite a lot for mobile users. As an implementor, I'd just use a global tolerance value (like 500ms or something).

> **Comment:** I think it would make more sense to just say the server should store one thing:
> That also makes it more clear that only information in the ticket can be used for these calculations (any RTT information in the resumption handshake can not be trusted).

- The "ticket_age_add" parameter from the NewSessionTicket message in
  which the ticket was established.
The server can determine the client's view of the age of the ticket by
subtracting the ticket's "ticket_age_add" value from the
"obfuscated_ticket_age" parameter in the client's "pre_shared_key"
extension. The server can determine the approximate time that the
client sent the ClientHello as:

~~~~
creation time + (client's view - RTT estimate/2)
~~~~

> **Comment:** I don't think the division by two is necessary. The ticket gets delayed by half an RTT on the way to the client in the first place, and half an RTT on the way back to the server. So it nets out to one RTT of difference. Means we also needn't worry about any asymmetry between the two directions.

> **Comment:** I thought that initially, but I believe that's wrong, because we're interested in the client's claimed sending time. Consider the case where clocks are globally synchronized and we have a 200 ms RTT.
> Now, if the client had put the absolute time in CH, it would have been 1100, but it puts in relative time so that's 1000. When we add 1/2 RTT, we get 1100. If we were to add RTT, we would get 1200, which is wrong, because that's when the server got it. This is different from below, where we are interested in the mismatch.

> **Comment:** Oh you're right! Wait a second, no. I'm still wrong. But you're not totally right either. If we want to get very pedantic about it (pedantic on a crypto spec? no!), all we know is that the client is some portion of the RTT behind. We don't know that it's 1/2. The RTT might be asymmetric. Imagine it's 20ms server -> client, but 180ms client to server.

> **Comment:** Oh, totally. It's just the best we can do. Remember that we're not trusting this value, we're just using it to have the best chance of getting the time within the window we are saving CH for.

> **Comment:** Hmm... Well, we are sort of trusting this, because we are using it to distinguish between hard fail and forced 1-RTT. OTOH, the attacker can always force us into that posture by just delaying the packet until it's out of window, so I don't think that it's an issue. But it might be easiest to just use "issue time + obfuscated ticket age"....

> **Comment:** Sorry @ekr, the comments are so deep I got lost on which section we're in. I thought this was about the stateless anti-replay.

> **Comment:** It may be that you always want to force 1-RTT and never hard-fail on detecting replays. There may be some legitimate scenarios to receive a replay. For example, TCP Fast Open (or QUIC) combined with 0-RTT and packet duplication in the network. Hard-fail could result in some weird race conditions here. Having hard-fail (fatal alert, I assume?) as distinct from forcing 1-RTT could also just give an attacker more information.

> **Comment:** TCP FO is still TCP - a duplicate packet will be rejected by the TCP state machine and shouldn't make it as far as TLS.

> **Comment:** I tend to agree with Nygren about not giving the attacker more information -- that is, don't hard-fail and just fall back to 1-RTT [unless you're under attack and need to shed load]. The 1-RTT will fail for the attacker's replays, of course.
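The age recovery and the send-time estimate above can be sketched numerically. This is a hedged illustration, not text from the draft: the function names are invented, times are in milliseconds, and the subtraction is taken modulo 2^32 because the age fields are 32-bit values.

```python
MODULUS = 2**32  # the age fields are 32-bit millisecond counters

def client_ticket_age_ms(obfuscated_ticket_age: int, ticket_age_add: int) -> int:
    # Client's view of the ticket age: undo the obfuscation by
    # subtracting ticket_age_add, modulo 2^32.
    return (obfuscated_ticket_age - ticket_age_add) % MODULUS

def estimated_send_time_ms(creation_time_ms: int,
                           obfuscated_ticket_age: int,
                           ticket_age_add: int,
                           rtt_estimate_ms: int) -> int:
    # creation time + (client's view - RTT estimate / 2), as in the
    # formula above; the half-RTT term is debated in the comments.
    age = client_ticket_age_ms(obfuscated_ticket_age, ticket_age_add)
    return creation_time_ms + age - rtt_estimate_ms // 2

# Example: ticket_age_add = 12345, client held the ticket for 1000 ms,
# RTT estimate 200 ms, ticket created at server time 50_000 ms.
obfuscated = (1000 + 12345) % MODULUS
send_time = estimated_send_time_ms(50_000, obfuscated, 12345, 200)  # 50_900
```

Note that, per the review discussion, only values carried in (or stored with) the ticket can feed this computation; nothing in the resumption handshake itself can be trusted.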
For a given storage window, the server implements anti-replay as
follows:

> **Comment:** Do we need to introduce the phrase "storage window" somehow (or reword)?
1. If the creation time is outside of the window, then accept the PSK but
   reject 0-RTT.

2. If the ClientHello matches an existing ClientHello, then
   abort the handshake using an "illegal_parameter" alert
   (this should never happen in a functional system).

> **Comment:** To make this more efficient, implementations can use some hashing/Bloom filter rather than storing the entire ClientHello. This will have false positives, in which case 0-RTT data should just be rejected, not the connection aborted.

> **Comment:** Ahh, hipster crypto to the rescue. Efficient storage isn't so much the problem as global synchronization within reasonable time frames.

> **Comment:** Sure, the entire ClientHello can be stored, but why not do it more efficiently? I don't think this state is global (i.e., one ClientHello cache per cluster).
3. Otherwise, store the ClientHello during the window
   and accept 0-RTT.

> **Comment:** It might be wise to explicitly recommend storing the PSK binder rather than the entire ClientHello. This has the benefit of essentially being a compact token cryptographically tied to the 0-RTT key (also preventing someone from polluting the replay cache with random data).

> **Comment:** If the ClientHello needs to be valid, then polluting the cache is as trivial as just creating new and different ClientHello values. In a way, the binder is just a way of having the other side calculate your hash for you.

> **Comment:** Kyle is right that you need to validate the binder, because it changes the cost of polluting the cache to the cost of getting a new PSK (and you can use PSK-specific filtering to blacklist bad actors). Maybe it's obvious to others, but we also can't use a hash of the packet, because if the CH contains two PSKs, then the attacker can corrupt the second binder without detection and potentially pollute the cache. So I think you want either CH.Random or the binder.

> **Comment:** It might be worth adding something here à la: ... It is also critical to ensure that the record of ClientHellos that led to accepted 0-RTT sessions from a given window is complete before accepting any new 0-RTT sessions for that same window. For example, if the system recording ClientHellos crashes with no durable record of the ClientHellos previously accepted, then the system needs to wait at least one full window before accepting any ClientHellos.

> **Comment:** That's probably not going to fly. The window is large and waiting that long would kill all the benefits that 0-RTT provides. I would instead recognize that challenges exist in synchronizing state across participating nodes. See Erik Nygren's comments on the list that amount to basically "not gonna happen", which I agree with. It's fine to recommend this design, but the need to have globally consistent state is a massive hurdle.

> **Comment:** The window here is just the clock skew tolerance window (i.e., 10s of seconds). Waiting 1 window before starting to accept 0-RTT data sounds very reasonable to me. I believe the intent is to have 1 of these ClientHello record caches per cluster, rather than a global state.

> **Comment:** Sorry, my wording confused things. All I was referring to is that when you restart a strike register with a clean slate, you need to wait the window period of time before accepting any new entries.
> The reason is because of this race: T1: Strike register accepts key K
> To avoid that, the register needs to have a pause on start-up. Or it can record everything durably, but that's very slow. The register can still respond with microseconds during ordinary operation.
Because this mechanism does not require storing all outstanding
tickets, it may be easier to implement in distributed systems with
high rates of resumption and 0-RTT, at the cost of potentially
weaker anti-replay defense because of the difficulty of reliably
storing and retrieving the received ClientHello messages.
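The three-step procedure above can be sketched as a small window-bounded cache. This is an illustrative model under assumptions from the review thread (it keys on a value derived from the ClientHello, such as the PSK binder, rather than the full message); class and method names are invented, and times are in milliseconds.

```python
class ClientHelloRecorder:
    """Hypothetical per-window replay cache for derived ClientHello values."""

    def __init__(self, window_ms: int):
        self.window_ms = window_ms
        self._seen = {}  # derived value (e.g. PSK binder) -> estimated send time

    def check(self, derived_value: bytes, est_send_time_ms: int,
              now_ms: int) -> str:
        # 1. Creation/send time outside the window: accept the PSK
        #    but reject 0-RTT (fall back to 1-RTT).
        if not (now_ms - self.window_ms <= est_send_time_ms <= now_ms):
            return "reject-0rtt"
        # 2. Duplicate within the window: treat as a replay and abort
        #    with illegal_parameter (should never happen legitimately).
        if derived_value in self._seen:
            return "abort"
        # 3. Otherwise record it for the window and accept 0-RTT.
        self._seen[derived_value] = est_send_time_ms
        self._expire(now_ms)
        return "accept-0rtt"

    def _expire(self, now_ms: int) -> None:
        # Drop entries older than the window so state stays bounded.
        cutoff = now_ms - self.window_ms
        self._seen = {v: t for v, t in self._seen.items() if t >= cutoff}
```

A production version would need the atomicity and crash-recovery properties discussed in the comments (e.g., pausing for one full window after a restart before accepting 0-RTT).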
### Stateless Time-Based Anti-Replay {#stateless-anti-replay}

> **Comment:** I think stateless mitigation is pointless, as it doesn't bound the number of replays, but some notes anyway ...

> **Comment:** "Limited" should probably be in the subsection heading.
Finally, the server can implement a very rough anti-replay mechanism
merely by measuring the mismatch between client and server views of
time. The server can determine its view of the age of the ticket by
subtracting the the time the ticket was issued from the current
time. If the client and server clocks were running at the same rate,
the client's view of would be shorter than the actual time elapsed on
the server by a single round trip time. This difference is comprised
of the delay in sending the NewSessionTicket message to the client,
plus the time taken to send the ClientHello to the server.

> **Comment:** I'd almost call this a "replay reduction" or "replay limitation" mechanism instead of "anti-replay", as "anti-replay" could be read as meaning it is a stronger mechanism than it actually is.

> **Comment:** "the the"

> **Comment:** Something missing after

> **Comment:** Oops, "ticket_age"

> **Comment:** It's a little unfortunate to have the discussion of the time calculations both here and in Client Hello Recording, though I don't have an alternative proposal.
The mismatch between the client's and server's views of age is thus
given by:

~~~~
mismatch = (client's view + RTT estimate) - (server's view)
~~~~
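Applied as a check, the formula above reduces to a few lines. This is a hedged sketch, not the draft's prescription: the function name and default bounds are invented (the text suggests an allowance on the order of ten seconds, possibly asymmetric since clock skew distributions are not symmetric), and times are in milliseconds.

```python
def accept_early_data(client_view_ms: int, rtt_estimate_ms: int,
                      server_view_ms: int,
                      tolerance_early_ms: int = 10_000,
                      tolerance_late_ms: int = 10_000) -> bool:
    # mismatch = (client's view + RTT estimate) - (server's view)
    mismatch = (client_view_ms + rtt_estimate_ms) - server_view_ms
    # Outside the permitted (possibly asymmetric) range, reject early
    # data and fall back to a full 1-RTT handshake.
    return -tolerance_early_ms <= mismatch <= tolerance_late_ms

accept_early_data(1_000, 200, 1_200)    # clocks agree: accept 0-RTT
accept_early_data(1_000, 200, 60_000)   # replayed much later: reject
```

Note that, as the reviewers stress, this bounds only *how long* a ClientHello may be replayed, not *how many times* within the window.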
There are several potential sources of error that make an exact
measurement of time difficult. Variations in client and server clock
rates are likely to be minimal, though potentially with gross time
corrections. Network propagation delays are the most likely causes of
a mismatch in legitimate values for elapsed time. Both the
NewSessionTicket and ClientHello messages might be retransmitted and
therefore delayed, which might be hidden by TCP. For browser clients
on the Internet, this implies that an
allowance on the order of ten seconds to account for errors in clocks and
variations in measurements is advisable; other deployment scenarios
may have different needs. Outside the selected range, the
server SHOULD reject early data and fall back to a full 1-RTT
handshake. Clock skew distributions are not
symmetric, so the optimal tradeoff may involve an asymmetric range
of permissible mismatch values.

> **Comment:** The more we talk about this, I kind of want to drop down to less than 10 seconds, like maybe 5 or even 1. (Yes, I know the text here is just talking order of magnitude.)

> **Comment:** Once the discussion above about storing received ClientHello(-related stuff) settles, we should probably normalize this text with what we end up with there. Also, a server could rate-limit how often it accepts 0-RTT, to provide some reduction in the amount of replay possible. The amount of reduction gained probably is not enough to make it worth doing, but I'll toss it out there.

> **Comment:** Here I would suggest adding something that says: "Note that while stateless anti-replay can bound how long in time a packet may be replayed, the total amount of replays tolerated is bounded by bandwidth and system capacity. This can be thousands to billions of replays in real-world settings." And I'd argue for adding this too: "Stateless anti-replay SHOULD NOT be used in environments without strong assurance of application and system behavior and MUST NOT be used in environments that must interoperate with third-party systems and applications."

> **Comment:** I added the clarification but not the normative requirement.
## End of Early Data

%%% Updating Keys
@@ -5594,6 +5670,45 @@ application protocols separately ensuring that confidential
information is not inadvertently leaked.
## Replay Attacks on 0-RTT {#replay-0rtt}

Replayable 0-RTT data presents a number of security threats to
TLS-using applications. Specifically, if applications are not
engineered to be idempotent, then duplication of requests
may cause side effects (e.g., purchasing an item or transferring
money) to be duplicated, thus harming the site or the user.

> **Comment:** I think this is some HTTPS mindset sneaking in. (Which is not necessarily wrong, just something to be aware of.)
In addition, if data can be replayed a large number of times,
this enables a variety of attacks via side channels such
as cache timing or measuring the speed of cryptographic
operations {{Mac17}}.

> Perhaps say "idempotent and side-effect free" rather than just "idempotent"? DELETE and PUT are idempotent but do have side effects, and without additional application-layer controls an attacker doing 0-RTT replay could reorder them.

> +1 to side-effect free.

> First of all, I would talk about "actions" rather than requests. The things that the server does in response to receiving 0-RTT are what will be exploited. Side-effect free is useful, but forbidding that doesn't really cover it. In @enygren's example, the side effects are relevant, but the primary effect (creation/update of a resource vs. removal) is what we're really concerned with. The only way to ensure that this is perfectly safe is to use the "safe" definition in HTTP, that is, the request does nothing but generate a response. And even then, it's rare that such a request is ever free from side effects or side channels. This risks us defining something that is very HTTP-centric. I would prefer that we instead say that idempotency is desirable for the actions that the server takes, but that idempotency could be insufficient. That is more or less what the text here is getting at.

> I rewrote this a bit, but note that I'm not sure that any anti-replay mechanism we have considered will handle this in the face of sufficient client and server complicity. Consider the following case:

> Ugh

The limited anti-replay mechanisms described in {{replay-time}} are
intended to prevent large-scale replay but do not provide complete
protection against replays. Specifically, they fall back to the 1-RTT
handshake when the server does not have any information about the
client, e.g., because it is in a different cluster which does not
share state or because the ticket has been deleted as described in
{{single-use-tickets}}. If the application layer protocol retransmits
data in this setting, then it is possible for an attacker to induce a
replay attack by sending the ClientHello to both the original cluster
(which processes the data immediately) and another cluster which will
fall back to 1-RTT and process the data upon application layer
replay. The scale of this attack is limited by the client's
willingness to replay and therefore only allows a small number of
replays, which will also use different encryption keys.

> I would remove the "therefore only allows a small number of replays". That's all up to the client. I don't consider 10 to be small, which is where we are at in Firefox.

> Again I agree with MT.

> I changed "small" to "limited".

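To make the fallback hazard concrete, here is a minimal sketch, with invented Python names, of the single-use ticket approach referenced in {{single-use-tickets}}: a store shared within one cluster deletes each ticket atomically on first use, so a replayed ClientHello carrying the same ticket is forced into a 1-RTT fallback; the residual attack is exactly the application-layer retransmission through a second, non-sharing cluster. This is a sketch under stated assumptions, not an implementation from any TLS stack.

```python
class SingleUseTicketStore:
    """Per-cluster store of outstanding tickets (illustrative sketch)."""

    def __init__(self):
        self._psks = {}  # ticket identity -> PSK

    def issue(self, ticket_id: bytes, psk: bytes) -> None:
        self._psks[ticket_id] = psk

    def redeem(self, ticket_id: bytes):
        """Atomic look-up-and-delete: each ticket yields its PSK at most once."""
        return self._psks.pop(ticket_id, None)


def early_data_disposition(store: SingleUseTicketStore, ticket_id: bytes) -> str:
    # An unknown or already-consumed ticket forces a 1-RTT fallback; if the
    # client then retransmits its request at the application layer, an
    # attacker who forwarded the ClientHello to a second cluster that does
    # not share this store obtains one replay.
    return "accept-0RTT" if store.redeem(ticket_id) is not None else "fallback-1RTT"


store = SingleUseTicketStore()
store.issue(b"ticket-1", b"psk-secret")
print(early_data_disposition(store, b"ticket-1"))  # accept-0RTT
print(early_data_disposition(store, b"ticket-1"))  # fallback-1RTT (replay)
```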
If implemented correctly,
the mechanisms described in {{single-use-tickets}} and
{{client-hello-recording}} prevent a
replayed ClientHello and its associated 0-RTT data from being accepted
multiple times by any cluster with consistent state. However, if
state is not completely consistent, then an attacker might be able to
have multiple copies of the data accepted during the replication
window. The stateless mechanism described in
{{stateless-anti-replay}} only prevents replay outside the
time window.

> Probably should reiterate that this can be tens or hundreds of thousands of replays.

> I added that separately.

> In the main body text, not the security considerations, if I'm reading the line numbers correctly.

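One possible reading of the ClientHello-recording strategy of {{client-hello-recording}} combined with the time window of {{stateless-anti-replay}} can be sketched as follows (all names invented): the server remembers a digest of each accepted ClientHello for the window's duration and rejects duplicates; anything older than the window fails the freshness check instead, which is what keeps the recorded state bounded.

```python
import hashlib
import time


class ClientHelloRecorder:
    """Illustrative sketch: record ClientHello digests for `window` seconds
    and reject duplicates; replays older than the window are assumed to be
    caught by the separate freshness check, so old digests can be pruned."""

    def __init__(self, window: float = 10.0, clock=time.monotonic):
        self.window = window
        self.clock = clock            # injectable for testing
        self._seen = {}               # SHA-256 digest -> arrival time

    def accept_0rtt(self, client_hello: bytes) -> bool:
        now = self.clock()
        # Prune digests that have aged out of the window; state stays bounded.
        self._seen = {d: t for d, t in self._seen.items()
                      if now - t < self.window}
        digest = hashlib.sha256(client_hello).digest()
        if digest in self._seen:
            return False              # replay within the window: reject 0-RTT
        self._seen[digest] = now
        return True                   # first sighting: 0-RTT may proceed
```

Note that, as the comment threads above stress, per-node state like this still admits one acceptance per non-sharing node unless the store is consistent across the whole deployment.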
> Add something like this which covers the fundamental requirement and responsibilities: "The onus is on clients not to send messages in 0-RTT data which are not safe to have replayed and which they would not be willing to retry across multiple 1-RTT connections. The onus is on servers to protect themselves against attacks employing 0-RTT data replication." (Or "___ have responsibility to" instead of "the onus is on"?)

> This is pretty obvious, but worth stating, I think.

> Yeah, worth stating again. Maybe also that the application profile should tell the client to do so.

> Added it to an earlier location.

# Working Group Information

The discussion list for the IETF TLS working group is located at the e-mail

> Technically it's "MacCárthaigh", but maybe RFCs have to be ascii. And now my first comment gets to be super vain! Oh man.

> Correct, ASCII only. Feel free to supply some other flattening :)

> We're almost able to put people's names in RFCs, but the road is a long one (for reasons that I won't burden you with).

> Ah, the 8th fallacy of naming! (Not all names are representable in Unicode.) ((Number probably wrong; I made it up.))