Stateless Reset needs "on-path" proof #1230
Comments
Another attack to consider along similar lines: If an active attacker observes the CID being used by a client, sends a packet with that CID to a different server endpoint that shares the same SR token algorithm/key, then it can get the other endpoint to effectively be an oracle for the stateless reset token. It can then inject a Stateless Reset with the token into the current connection. One mitigation would be to say that it's the server deployment's problem -- you should have different SR token-generation keys between different endpoints. Another is to require that a Stateless Reset was sent by someone who had both a specific packet you sent and the token, without disclosing the token in a form that could be adapted to a different rejected packet. (This is a similar case to #1264; since the generator of the SR can't validate the crypto, they also can't know whether it was generated by the client, misdirected by an attacker, or fully generated by an attacker. That means they also can't avoid being a CID->SR oracle.) |
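To make the oracle concrete, here is a minimal sketch. It assumes the token is derived as a truncated HMAC of the connection ID under a static key, which is one plausible construction and not something this thread specifies. The `Endpoint` class, the key values, and the connection ID are all hypothetical.

```python
import hashlib
import hmac


def stateless_reset_token(static_key: bytes, connection_id: bytes) -> bytes:
    """Derive a 16-byte token from a static key and a CID (assumed construction)."""
    return hmac.new(static_key, connection_id, hashlib.sha256).digest()[:16]


class Endpoint:
    """Hypothetical server endpoint that answers unknown CIDs with a reset token."""

    def __init__(self, static_key: bytes, known_cids=()):
        self.static_key = static_key
        self.known_cids = set(known_cids)

    def receive(self, connection_id: bytes):
        # A known connection is processed normally; no token is disclosed.
        if connection_id in self.known_cids:
            return None
        # No state for this CID: the endpoint emits a stateless reset token.
        return stateless_reset_token(self.static_key, connection_id)


shared_key = b"k" * 32
cid = bytes.fromhex("0a0b0c0d")

endpoint_a = Endpoint(shared_key, known_cids=[cid])  # holds the connection
endpoint_b = Endpoint(shared_key)                    # same key, no state
endpoint_c = Endpoint(b"z" * 32)                     # separate key, no state

# The token the client was given for this connection.
token_client_holds = stateless_reset_token(shared_key, cid)

# The attacker replays the observed CID at endpoint B and learns the token.
leaked = endpoint_b.receive(cid)
oracle_works = hmac.compare_digest(leaked, token_client_holds)

# With a distinct key per endpoint, the disclosed value is useless.
mitigated = not hmac.compare_digest(endpoint_c.receive(cid), token_client_holds)
```

In this sketch `oracle_works` and `mitigated` both end up true: the shared key is what turns the second endpoint into an oracle, and separating the keys removes it.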
Yes, this is a deployment issue. If packets with the same connection ID can end up at a node that shares a stateless reset key, but can't access connection state so that it generates a stateless reset, we have an oracle. We should document that attack. |
While I agree that this can be considered a deployment issue, I think we should forbid servers deploying the same static secret used for generating the stateless reset token among the servers that do not share the connection state, because it not only goes against what "authenticated" reset is but also has privacy concerns. My understanding is that many of the server-side deployments that care about this are those that use BGP to distribute their packets among multiple POPs. Those operators tend to serve multiple hostnames. That means unless we mandate such server operators to provide stateless reset tokens only in a secure manner, attackers can force a connection to terminate (by using a stateless reset token obtained from a different POP) and then see the SNI carried in the handshake of a new connection. Therefore, I think that we should either forbid server deployments from sharing the static key without sharing connection state, or, look for a technical approach to prevent the attack. One such approach would be something like below:
There could be other ways, but I think sending encrypted data in a stateless reset and verifying that on the client side would be the necessary thing to do here. WDYT? |
Yes, if the extent of the state storage distribution matches the extent of the key distribution, which is a natural design, then this is naturally OK. @kazuho, I'm struggling with your state store ID design. If the state and key storage is co-extant, then the packet that hits a different store will generate a response that the client won't recognize as a valid stateless reset. So I'm failing to see how the extra steps you describe help with the problem. Unless you are suggesting that this key needs to be global (unlike the token, which is scoped to a particular zone). I don't think going global works though. An attacker can obtain a key for a given connection ID in one cluster and use it - in combination with the stateless reset oracle Mike described - to attack another. If we wanted to prove that the server can generate the Stateless Reset without compromising later uses of the Stateless Reset, then we could feed the incoming packet into the process. For example, rather than include the token in the stateless reset directly, it includes a constant value that is generated by using Of course, the risk here is that clients don't track enough from their outbound packets to reconstruct the input to the HMAC. After all, it requires tracking packet ciphertext - no other part of a packet is unpredictable enough. So they might have to treat receipt of a possible stateless reset as a trigger to enable whatever tracking they need, after which they send another packet and hopefully get a stateless reset. Aside from adding a whole round trip to the process, it just more than doubled the cost of a stateless reset at the server. [1] We originally had another hash for stateless resets, but removed that in favour of the current, simpler design. |
That is exactly what I am suggesting, under the assumption that some would be willing to do so (if none are, we should simply ban a single key shared among the servers that do not share the connection state, instead of discussing how a client should act against such servers as proposed in #1259). In my proposal, a stateless reset token becomes an identifier of the connection (or path) that is being reset. It is the tag of the encrypted data (that contains the "state-store ID") that proves that the stateless reset has been sent from a server. Then, a client uses the "state-store ID" to see if the reset was sent by a server that should have known the connection (means a connection reset), or not (means a path being rejected). In other words, a stateless reset token becomes usable once per every "state-store", instead of once per connection. PS. maybe I am too abstract (I am trying to not go into certain design decisions). I can come up with a more concrete description if that's preferable. |
OK, thanks for clearing that up. I think that you have an attack, though it might not be that interesting. Connection IDs are scoped to a state store, whereas your proposed key is global. That means that an attacker that can learn the key for a given connection ID in any state store can use it to attack all other state stores. For high-entropy connection IDs, that takes some doing, but it isn't generically safe, especially now that we allow as little as 32 bits. What is generically safe is co-extant state. That is, the connection ID has to be valid everywhere the static key is used and thus a packet with a given connection ID either causes a valid stateless reset for that connection or it is accepted. Given the complexity of the additional mechanism and that exposure, I'd rather concentrate on the attack that this issue was originally raised to address: the absence of a proof-of-receipt in the stateless reset packet. Given that we now have as little as 32 bits of entropy (or less) in a connection ID, the potential for a stateless reset oracle is bothering me a little. |
Thank you for considering the approach and pointing out the issue. I had not considered that attack vector. Having considered it, I realize that there is a less complex approach. Assuming that the ID of the POP (i.e. the "state-store") is included in the DCID, a server can determine whether it should have known the state of the connection that the DCID designates, and send a connection reset or a path rejection based on that. With that said, I am personally fine with requiring co-extant state. As expressed in #1230 (comment), my intent behind the proposal has been under the premise that some might want to reject path creation when packets arrive at a POP that does not have access to the state store. |
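The decision described in this comment can be sketched as follows. The encoding is an assumption made only for illustration: the first byte of the DCID is taken to be the POP ID. The `Action` enum and `classify` function are hypothetical names, not part of any QUIC implementation.

```python
from enum import Enum


class Action(Enum):
    PROCESS = "process"
    CONNECTION_RESET = "connection reset"
    PATH_REJECTION = "path rejection"


def classify(dcid: bytes, local_pop_id: int, known_cids: set) -> Action:
    """Decide how a POP reacts to a packet, assuming DCID byte 0 is the POP ID."""
    if dcid in known_cids:
        # State is available: handle the packet as usual.
        return Action.PROCESS
    if len(dcid) > 0 and dcid[0] == local_pop_id:
        # This POP should have known the connection, so the state was lost.
        return Action.CONNECTION_RESET
    # The connection belongs to another POP: only this path is rejected.
    return Action.PATH_REJECTION


local_pop = 7
known = {bytes([7, 1, 2, 3])}

live = classify(bytes([7, 1, 2, 3]), local_pop, known)
lost = classify(bytes([7, 9, 9, 9]), local_pop, known)
misrouted = classify(bytes([4, 1, 2, 3]), local_pop, known)
```

Here `live` is `Action.PROCESS`, `lost` is `Action.CONNECTION_RESET`, and `misrouted` is `Action.PATH_REJECTION`. Note that the POP ID is visible to anyone who observes the DCID, so this only tells the server which response to send; it does not by itself authenticate anything.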
Yeah, I don't think that path rejection is going to work that well. Packets that arrive at a POP that doesn't have access to a state store will either enter a black-hole (if they know about the connection ID, in which case it will be dropped after packet protection removal fails), or generate a stateless reset. More of the latter as connection IDs get longer and more sparse. That's what makes me think that we need some sort of verification of intent in addition to a token. A routing flap, misconfiguration, or attack might cause a storm of stateless resets that might be reusable by an attacker. Adding a liveness check is probably a good idea to avoid that. |
@martinduke made a good point about this, which made me reconsider this idea. A man-on-the-side attacker can copy whatever details they need from packets an endpoint sends in its use of a stateless reset oracle. That is, if the stateless reset depends on data in a packet that an endpoint sends (and it can't depend on anything more than that), then the attacker simply copies whatever it needs from a genuine packet. A liveness check therefore only really makes the attacker's job harder. Since the fix here that forces the attacker to be live is to make the stateless reset more complex, we should be very careful to consider the trade-off. |
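A small sketch of why binding the reset to a received packet stops a blind attacker but not a man-on-the-side. The construction is an assumption, not the design discussed in this thread: the reset carries an HMAC, keyed by the token, over a sample of ciphertext from the packet that triggered it. All function names and values are hypothetical.

```python
import hashlib
import hmac


def token_for(static_key: bytes, cid: bytes) -> bytes:
    """Per-connection token (assumed HMAC construction)."""
    return hmac.new(static_key, cid, hashlib.sha256).digest()[:16]


def reset_proof(token: bytes, packet_sample: bytes) -> bytes:
    """Bind the reset to one specific packet (assumed construction)."""
    return hmac.new(token, packet_sample, hashlib.sha256).digest()[:16]


def oracle(static_key: bytes, cid: bytes, packet_sample: bytes) -> bytes:
    """Stateless endpoint: cannot tell who sent the packet, so it always answers."""
    return reset_proof(token_for(static_key, cid), packet_sample)


def client_accepts(client_token: bytes, sent_samples, proof: bytes) -> bool:
    """The client checks the proof against the packets it remembers sending."""
    return any(
        hmac.compare_digest(reset_proof(client_token, sample), proof)
        for sample in sent_samples
    )


static_key = b"s" * 32
cid = bytes.fromhex("0a0b0c0d")
client_token = token_for(static_key, cid)
genuine_sample = b"ciphertext-bytes-of-a-real-packet"

# Blind attacker: knows the CID but has to invent the packet contents.
blind_proof = oracle(static_key, cid, b"guessed-bytes")
off_path_accepted = client_accepts(client_token, [genuine_sample], blind_proof)

# Man-on-the-side: copies a genuine packet and forwards it to the oracle.
copied_proof = oracle(static_key, cid, genuine_sample)
on_side_accepted = client_accepts(client_token, [genuine_sample], copied_proof)
```

In this sketch `off_path_accepted` is false and `on_side_accepted` is true: the proof forces the attacker to see a live packet, but anyone who can see one can still drive the oracle.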
A loose thought, but intuition suggests: With more roundtrips it would probably be possible to design a challenge-response scheme. Dead endpoint issues a stateless reset challenge. Live endpoint responds by hashing the challenge with a negotiated reset key. Dead endpoint issues a new reset by hashing the live response. MITM cannot use observed packets to issue new resets without triggering a roundtrip with proper routing between the endpoints, and the supposedly dead endpoint would not cooperate if it isn't dead. |
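One possible reading of this exchange, as a rough sketch only. It assumes the "negotiated reset key" can be re-derived by the dead endpoint from a static secret and the connection ID, and that the challenge is echoed back alongside the response so the dead endpoint needs no memory of it. None of these details come from the comment itself, and the function names are hypothetical.

```python
import hashlib
import hmac
import os


def mac(key: bytes, data: bytes) -> bytes:
    """Truncated HMAC-SHA256 used for every step of the sketch."""
    return hmac.new(key, data, hashlib.sha256).digest()[:16]


def derive_reset_key(static_key: bytes, cid: bytes) -> bytes:
    """Assumption: the reset key is recomputable from static key and CID."""
    return mac(static_key, cid)


def final_reset(static_key: bytes, cid: bytes, challenge: bytes, response: bytes):
    """Dead endpoint: verify the live response, then hash it into the reset."""
    key = derive_reset_key(static_key, cid)
    if not hmac.compare_digest(mac(key, challenge), response):
        return None  # response was not produced by the holder of the reset key
    return mac(key, response)


static_key = b"s" * 32
cid = bytes.fromhex("0a0b0c0d")
live_reset_key = derive_reset_key(static_key, cid)  # learned during the handshake

# Step 1: the dead endpoint issues a fresh challenge.
challenge = os.urandom(16)
# Step 2: the live endpoint hashes the challenge with its reset key.
response = mac(live_reset_key, challenge)
# Step 3: the dead endpoint hashes the live response into the actual reset.
reset = final_reset(static_key, cid, challenge, response)
# Step 4: the live endpoint checks the reset against its own response.
reset_valid = reset is not None and hmac.compare_digest(
    reset, mac(live_reset_key, response)
)

# A response forged without the reset key yields no reset at all.
forged = final_reset(static_key, cid, challenge, os.urandom(16))
```

The extra round trip is what buys the liveness property: a reset is only produced after the live endpoint has answered a fresh challenge, so a previously observed reset cannot simply be replayed.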
Discussed in Kista. Conclusion was that the marginal benefit was minor, if not non-existent, and the cost was significant. A man-on-the-side could build the oracle. We will instead concentrate on documenting the constraints on the routing infrastructure. |
Stateless Reset contains a proof that the Server sent it. However, if the server's key is compromised, Stateless Reset can be forged wholesale off-path. Stateless Reset must also contain a proof that the sender observed the original packet that caused Stateless Reset. Clearly, there are many simple ways to do so.