-
Notifications
You must be signed in to change notification settings - Fork 205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stateless reset fixes #1346
Stateless reset fixes #1346
Conversation
The Stateless Reset is just a bunch of random bytes except the first one and the token, so I simplified the spec to reflect that.
Improved logical ordering of paragraphs.
Otherwise, there is a DoS risk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to send a stateless reset in response to a long header? This would not happen very often, but if so we probably ought to send SR with a Long Header, and specify that somewhere in here.
Otherwise, this looks good to me.
draft-ietf-quic-transport.md
Outdated
|
||
This design relies on the peer always sending a connection ID in its packets so | ||
that the endpoint can use the connection ID from a packet to reset the | ||
connection. An endpoint that uses this design cannot allow its peers to use a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"... cannot allow its peers to send packets with a zero-length destination connection ID"
Thanks for the quick review. Nothing specifically prevents an endpoints from sending stateless reset in response to a long header. I've opened #1348 in case I missed something, because I have to move on to something else now :(. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One nit.
draft-ietf-quic-transport.md
Outdated
connection. An endpoint that uses this design cannot allow its peers to use a | ||
zero-length connection ID. | ||
connection. An endpoint that uses this design cannot allow its peers to send | ||
packets with a zero-length connection ID. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"...zero-length destination connection ID"
draft-ietf-quic-transport.md
Outdated
as long as the key is valid. If instances that share a stateless reset key | ||
allow connections with the same connection ID to be created, then the stateless | ||
reset token for one connection could be used to terminate any connection that | ||
has the same connection ID. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like you're trying to close #1258 and #1259, too. However, I don't think this language is strong enough. It's not just that it MUST allocate from the same space without reusing; it MUST NOT generate a stateless reset for a connection ID that it would not have allocated, even if it has the proper key to do so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think the "instance identifier" discussed above would mitigate this. It will generate an SR, but it won't be valid from a different instance.
Notes from editor's meeting: the last paragraph can be removed, then we will add another PR to address the routing issue underlying all this.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Just a few comments.
draft-ietf-quic-transport.md
Outdated
The message consists of a header octet, followed by random octets of arbitrary | ||
length, followed by a Stateless Reset Token. | ||
|
||
The endpoint SHOULD send a packet with a short header. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a strange SHOULD. The Stateless Reset packet is defined here with a short header. How can this be a long header?
draft-ietf-quic-transport.md
Outdated
|
||
The endpoint SHOULD send a packet with a short header. | ||
|
||
Assuming a short header, the Random Octets field needs to include at least 20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the above comment holds, remove "Assuming a short header"
draft-ietf-quic-transport.md
Outdated
header, therefore it cannot set the Destination Connection ID in the stateless | ||
reset packet. The destination connection ID will therefore differ from the | ||
value used in previous packets. A random Destination Connection ID makes the | ||
connection ID appear to be the result of moving to new connection ID that was |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"moving to new connection ID" -> "moving to a new connection ID"
draft-ietf-quic-transport.md
Outdated
reset packet. The destination connection ID will therefore differ from the | ||
value used in previous packets. A random Destination Connection ID makes the | ||
connection ID appear to be the result of moving to new connection ID that was | ||
provided using the NEW_CONNECTION_ID frame ({{frame-new-connection-id}}). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/the/a/
draft-ietf-quic-transport.md
Outdated
An endpoint detects a potential stateless reset when a packet with a short | ||
header either cannot be decrypted or is marked as a duplicate packet. The | ||
endpoint then compares the last 16 octets of the packet with the Stateless Reset | ||
Token provided by its peer, either from the NEW_CONNECTION_ID frame or the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/from/in/
s/the NEW_CONNECTION_ID frame/a NEW_CONNECTION_ID frame/
s/server/server's/
use HMAC {{?RFC2104}} (for example, HMAC(static_key, server_id || | ||
that takes three inputs: the static key, the connection ID chosen by the | ||
endpoint (see {{connection-id}}), and an instance identifier. An endpoint could | ||
use HMAC {{?RFC2104}} (for example, HMAC(static_key, instance_id || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, incorporating an instance_id doesn't work for connections that are diverted to a new instance, which is a common case, possibly even the most common case. Server restarts are unlikely to be a common use case, since usually servers are drained before restarting, and server crashes are not frequent. Short-term routing flaps however are common. I think the text should say that for Stateless Reset to work across instances, the HMAC could exclude the instance_id. (Correct me if my understanding is wrong.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've addressed that below. But I've expanded on it a little in my latest changes. The trick here is that you can recover the instance ID using the connection ID. I will follow up with a change that includes the fixes we discussed. Part of that change involves removing the instance ID from this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And of course, the issue is that if you enable servers to SR after a routing flap, you also enable those servers to be used as oracles. The only defense is to make it impossible for the routing to be affected by the attacker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martinthomson : Ah, right, I see it's addressed below. Given this is an example construction, I would suggest making that explicitly clear up at the beginning of this paragraph.
@MikeBishop : Yes, that is true, but defending the routing infrastructure against attacks seems out of scope for us, since similar attacks can be launched against TCP as well, where you'd receive an RST or ICMP unreachable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defending it is out of scope. Noting that you create an attack vector if you fail to defend it yourself, however, bears mentioning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, modulo my earlier (take-it-or-leave-it) comment.
See #1386 for the follow-up. |
Hi,
But on the other side (section about Connection IDs) :
So if the server looses state, how do you retrieve the Destination Connection Id from a short header ? ('ve been reading your work and discussions since recently, great project!) |
@Pascalh2001 you could imagine one server cluster settling on 16-bit identifiers throughout, and another cluster that uses a 4-bit type prefix from which the length can be derived, and a third cluster that register connection ID's in an in-memory database. You save bandwidth by leaving it up to the endpoint rather than forcing the length to be transported at all times. |
A cluster that wants the length present simply includes the length of the CID as part of the CID it tells the client to send it. The encoding inside is totally up to the server, so that might be the explicit length, or a four-bit encoding of it, or even a single bit that toggles between the two lengths the cluster typically uses. |
Inspired by @martinduke's #1328, I set to cleaning a few things up. You can see separate commits here that:
Closes #466, #1328.