Additional security section on fragmentation reassembly attacks #444
Conversation
Describe the equivalent of the Teardrop attack for QUIC, and propose mitigation.
I have lots of emotional things to say about such checks...
This doesn't talk about flow control. It needs to.
Like #443, I think that this is far more detailed than we need. The point of this is to make an implementer aware that a malicious peer might intentionally fragment the data on receive buffers in order to cause disproportionate memory commitment (either disproportionate to the number of bytes that were transmitted, or disproportionate to the flow control offset that was provided, in practice probably both are necessary to make the attack worthwhile). This can be said more concisely, I think.
The most interesting case for this attack is where receivers over-commit memory and advertise flow control offsets in the aggregate that exceed actual available memory. This strategy works in most cases given that most clients are not attempting denial of service. The very tail of a receive window is rarely needed in practice. Over-commitment fails badly when under this kind of attack.
draft-ietf-quic-transport.md
Outdated
An adversarial client may attempt to
exhaust server memory resource by performing
a stream fragmentation and reassembly attack, similar to the UDP/ICMP
"Teardrop" fragmentation attacks. The adversarial client would open a stream,
citation?
Just dropping the name quoting. Could not find a good Teardrop reference.
draft-ietf-quic-transport.md
Outdated
@@ -2697,6 +2697,43 @@ also be forward-secure encrypted. Since the attacker will not have the forward
secure key, the attacker will not be able to generate forward-secure encrypted
packets with ACK frames.

## Stream fragmentation and reassembly attacks
Title Case
draft-ietf-quic-transport.md
Outdated
@@ -2697,6 +2697,43 @@ also be forward-secure encrypted. Since the attacker will not have the forward
secure key, the attacker will not be able to generate forward-secure encrypted
packets with ACK frames.

## Stream fragmentation and reassembly attacks

An adversarial client may attempt to
This is any endpoint, though I agree that it's (usually) not very interesting for a server to mount the attack.
OK. Rewriting to "endpoint" instead of "client".
draft-ietf-quic-transport.md
Outdated
This attack can be mitigated by not
committing memory for stream data reassembly,
and simply keeping the STREAM DATA frames until enough fragments have been
received and the data can be delivered to the application in proper sequence.
Saving STREAM frames only works if the data provided is sufficiently sparse. At some point the overhead of saving the frames exceeds the overhead of assembling the data into a buffer and tracking the holes.
The real mitigation is not to over-commit on flow control.
Saving frames is the only way the connection-level flow control window makes sense. Otherwise, you'd have to commit (number of streams)*(stream flow control window) memory.
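To make the arithmetic in this comment concrete, here is a small sketch. The function names and the numbers (100 streams, a 64 KiB stream window, a 1 MiB connection window) are illustrative assumptions, not values from the draft.

```python
def preallocated_commit(num_streams: int, stream_window: int) -> int:
    """Memory committed if a full reassembly buffer is reserved per stream."""
    return num_streams * stream_window


def saved_frames_commit(num_streams: int, stream_window: int,
                        connection_window: int) -> int:
    """Payload bytes held if only received frames are stored.

    The connection-level window caps the total; per-frame bookkeeping
    overhead is deliberately ignored in this sketch.
    """
    return min(connection_window, num_streams * stream_window)


# Assumed example values.
streams = 100
stream_window = 64 * 1024
connection_window = 1024 * 1024

print(preallocated_commit(streams, stream_window))                     # 6553600
print(saved_frames_commit(streams, stream_window, connection_window))  # 1048576
```

The second figure counts payload only. The bookkeeping cost of each saved frame is exactly what a fragmentation attack would try to inflate.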
Hmm, that's true. That suggests a different way to write this: assume that frames are saved (and maybe merged opportunistically). Then the attack is on the overhead associated with saved frames.
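A rough sketch of that model, assuming a receiver that stores received ranges and merges them opportunistically. The class name, the 64-byte per-fragment overhead, and the range-only bookkeeping (no payload is stored) are all hypothetical simplifications for illustration.

```python
class FragmentStore:
    """Tracks received stream data as sorted, non-overlapping [start, end) ranges."""

    PER_FRAGMENT_OVERHEAD = 64  # assumed bookkeeping bytes per stored range

    def __init__(self):
        self.ranges = []  # sorted list of (start, end) tuples

    def add(self, offset, length):
        """Record a fragment, merging it with overlapping or adjacent ranges."""
        if length <= 0:
            return
        start, end = offset, offset + length
        kept = []
        for s, e in self.ranges:
            if e < start or s > end:
                kept.append((s, e))  # disjoint and not adjacent: keep as-is
            else:
                start, end = min(s, start), max(e, end)  # absorb into new range
        kept.append((start, end))
        kept.sort()
        self.ranges = kept

    def holes(self):
        """Count gaps below the highest received offset."""
        count, prev_end = 0, 0
        for s, e in self.ranges:
            if s > prev_end:
                count += 1
            prev_end = e
        return count

    def payload_bytes(self):
        return sum(e - s for s, e in self.ranges)

    def overhead_bytes(self):
        return len(self.ranges) * self.PER_FRAGMENT_OVERHEAD


# A well-behaved sender: contiguous data collapses into one range.
friendly = FragmentStore()
friendly.add(0, 100)
friendly.add(100, 100)

# An adversarial sender: 1-byte fragments at every other offset never merge.
hostile = FragmentStore()
for off in range(0, 200, 2):
    hostile.add(off, 1)

print(len(friendly.ranges), friendly.overhead_bytes())  # 1 64
print(len(hostile.ranges), hostile.overhead_bytes())    # 100 6400
```

In this toy model the hostile sender delivers 100 bytes of payload but forces 6400 bytes of bookkeeping, which is the disproportion the attack relies on.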
Depending on what is meant by using up system memory, an attack may focus on locking out other connections, new or existing, by forcing low congestion windows. Avoiding overcommits can make this situation worse if the attacker successfully increases flow control budgets.
For practical large-scale solutions, the implementation needs to overcommit very significantly. There can be 100K connections of which only a few hundred are active. Each of these needs write capacity immediately. If not, it both impacts responsiveness and ties up resources by having more concurrent active work going on. The same applies to the number of streams versus active streams in some use cases, while other cases normally expect streams to be active or closed. If each connection classified as active gets a connection-level budget, it doesn't really matter what the stream budget is; that is more for application consumption management. If the connection budget is abused with holes, it just hurts the throughput of the sender and could limit the ability to start new streams. The real problem is to decide which connections are active and which are sleeping without preventing fast ramp-up of new and sleeping connections, and how to throttle back when connections are no longer active. The attack is then to appear active while committing the least possible sender resources. A heuristic could be the age of holes. If retransmission does not kick in in a timely manner, packets could be dropped deliberately on that connection despite it having a reasonable connection-level budget.
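The "age of holes" heuristic mentioned here could be sketched roughly as follows. The function name, the dictionary of first-seen timestamps, and the 3-second threshold are hypothetical choices for illustration; nothing in the draft specifies them.

```python
def stale_holes(hole_first_seen, now, max_age):
    """Return offsets of holes that have stayed open longer than max_age seconds.

    hole_first_seen maps the starting offset of each gap to the time
    (in seconds) at which the receiver first noticed it.
    """
    return [off for off, seen in hole_first_seen.items() if now - seen > max_age]


# Assumed example: one gap has been open for 9 s, another for 0.5 s.
holes = {0: 1.0, 4096: 9.5}
print(stale_holes(holes, now=10.0, max_age=3.0))  # [0]
```

A receiver might start dropping packets on a connection once such stale holes appear, on the assumption that an honest sender's retransmissions would have filled them by then.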
I should perhaps clarify that in the above, a connection-level budget is not a linear function of flow control. There is a fixed amount of internal memory, and as that is released, the congestion window is expanded. So storing lots of fragments will use memory faster and release memory slower, and thus reduce the connection-level congestion window. And when the memory fills, packets start to drop. In this way, the worst case is that the full budget is consumed with holes, whereas a friendly peer would fill the same budget with linear stream data. The adversary can only create so many holes before the cost of hole punching exceeds that of sending linear data. Of course, there are endless different ways this can be handled, and it depends on the use case, risk, degradation when not under attack, etc.; therefore it is hard to provide general advice.
As you say, Mikkel, "it is hard to provide general advice". So maybe that's how we should rewrite the "advice" part of this PR. Something like "It is hard to provide general advice. QUIC deployments SHOULD provide mitigations against the stream fragmentation attack, which MAY be avoiding over-committing memory, delaying reassembly of STREAM DATA frames, or implementing heuristics based on the age and duration of reassembly holes."
Shortened the text, added reference to flow control. The point is that (some) receivers will over-commit, and will need to mitigate the attack. This will require some kind of heuristic. I proposed one -- counting holes, and if they are not commensurate with the packet loss rate abort the connection. If you believe there is something smarter to do, please chime in.
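One possible reading of the hole-counting heuristic, as a minimal sketch. The function name, the `slack` multiplier, and the `floor` value are assumptions made for illustration; the PR text does not define how "commensurate" should be measured.

```python
def holes_look_suspicious(hole_count, packets_received, loss_rate,
                          slack=4.0, floor=8):
    """Flag a connection whose reassembly holes far exceed what loss explains.

    loss_rate is the observed fraction of lost packets (0.0 to 1.0).
    slack and floor are assumed tuning knobs: the hole count must exceed
    both `floor` and `slack` times the number of losses expected.
    """
    expected_losses = packets_received * loss_rate
    return hole_count > max(floor, slack * expected_losses)


# 1% loss over 1000 packets explains roughly 10 holes.
print(holes_look_suspicious(3, 1000, 0.01))    # False
print(holes_look_suspicious(500, 1000, 0.01))  # True
```

Whether to abort the connection or merely throttle it when the check fires is a deployment decision this sketch leaves open.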
On Wed, Apr 19, 2017 at 9:15 PM, Marten Seemann ***@***.***> wrote:
Saving frames is the only way the connection-level flow control window
makes sense. Otherwise, you'd have to commit (number of streams)*(stream
flow control window) memory.
I don't see it: why would you have to commit more than the connection flow
control window?
Aaron: "why would you have to commit more than the connection flow control window?" It happens if the sum of the per stream windows is larger than the congestion window. For example, when the endpoint cannot predict which of the streams the other endpoint will fill first.
Since there is no "one size fits all" mitigation, simplify the recommendations. The point is to draw attention to the problem, and trust developers to do the right thing.
Modified the mitigations part. Martin, I think that the new text addresses your review. Can you give it a look? Thanks.
Yep, looks fine. I'll give others a chance to poke at this a little before merging.
draft-ietf-quic-transport.md
Outdated
The attack is mitigated if flow control windows correspond to
available memory. However, some receivers will over-commit memory and advertise
flow control offsets in the aggregate that exceed actual available memory.
The over-commitment strategy may leads to better performance when
"may leads" -> "can lead"
draft-ietf-quic-transport.md
Outdated
the stream fragmentation attack.

QUIC deployments SHOULD provide mitigations against the stream fragmentation
attack. Mitigations MAY consist of avoiding over-committing memory, delaying
Don't use "MAY" here, it's not permissive. "could" is fine.
draft-ietf-quic-transport.md
Outdated
QUIC deployments SHOULD provide mitigations against the stream fragmentation
attack. Mitigations MAY consist of avoiding over-committing memory, delaying
reassembly of STREAM DATA frames, implementing heuristics based on the
"STREAM frames"
OK, I believe I fixed all that...