Fix race condition between 0-RTT and Incoming #1821

Merged (2 commits) May 1, 2024

Conversation

gretchenfrage (Contributor)

Closes #1820

The fix:

  • Endpoint now maintains a slab with an entry for each pending Incoming, used to buffer data received for that connection attempt (sketched below).
  • ConnectionIndex now maps the initial DCID to that slab key immediately upon construction of the Incoming.
  • If the Incoming is accepted, the DCID association is replaced with an association to the new ConnectionHandle, and all buffered datagrams are fed to the newly constructed Connection.
  • If the Incoming is refused/retried/ignored, or accepting fails with an error, the association and slab entry are cleaned up to prevent a memory leak.
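
Concretely, the bookkeeping amounts to roughly the following simplified, std-only sketch. The type and field names here are illustrative only; the real implementation lives in the Endpoint and ConnectionIndex and uses a slab rather than hash maps.

use std::collections::HashMap;

type IncomingKey = usize;

#[derive(Default)]
struct EarlyBuffers {
    // One buffer per pending Incoming: datagrams received before the
    // application has decided what to do with the connection attempt.
    buffers: HashMap<IncomingKey, Vec<Vec<u8>>>,
    // Maps the client's initial DCID to the pending Incoming's slab key.
    by_initial_dcid: HashMap<Vec<u8>, IncomingKey>,
    next_key: IncomingKey,
}

impl EarlyBuffers {
    // Called when a connection-creating packet produces a new Incoming.
    fn insert(&mut self, initial_dcid: Vec<u8>) -> IncomingKey {
        let key = self.next_key;
        self.next_key += 1;
        self.buffers.insert(key, Vec::new());
        self.by_initial_dcid.insert(initial_dcid, key);
        key
    }

    // Called for follow-up datagrams (e.g. 0-RTT) that arrive before a decision.
    fn buffer(&mut self, initial_dcid: &[u8], datagram: Vec<u8>) -> bool {
        match self.by_initial_dcid.get(initial_dcid) {
            Some(key) => {
                self.buffers.get_mut(key).unwrap().push(datagram);
                true
            }
            None => false, // no pending Incoming for this DCID
        }
    }

    // Called on accept (buffered datagrams are fed to the new Connection) and
    // on refuse/retry/ignore or accept error (the datagrams are discarded).
    fn remove(&mut self, initial_dcid: &[u8]) -> Vec<Vec<u8>> {
        match self.by_initial_dcid.remove(initial_dcid) {
            Some(key) => self.buffers.remove(&key).unwrap_or_default(),
            None => Vec::new(),
        }
    }
}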

Additional considerations:

  • The Incoming::ignore operation can no longer be implemented as just dropping it. To help prevent incorrect API usage, proto::Incoming is modified to log a warning if it is dropped without being passed to Endpoint::accept/refuse/retry/ignore (a rough sketch of this drop guard appears after this list).

  • To help protect against memory exhaustion attacks, per-Incoming buffered data is limited to twice the receive window or 10 KB, whichever is larger. Packets in excess of that limit are silently dropped.

    • Does this introduce a new vulnerability, in which an attacker could spam a server with 0-RTT packets carrying the same connection ID it observed a client using to initiate a 0-RTT connection to the server? I do think so.

      Is this a severe problem? Here are two reasons I don't think so:

      1. The default receive window is set to the maximum value, so this limit won't actually kick in unless the user is already hardening against adverse conditions.
      2. It is already possible for an on-path attacker to disrupt a connection handshake if 0.5-RTT data is being used, so this probably doesn't expand the set of situations in which a connection is vulnerable to this kind of attack.

      Could this be avoided? Possibly, by adding enough state to the buffering logic to validate whether these packets are validly encrypted for the associated connection. However, that may risk making these operations costly enough that they start to defeat the DDoS resistance of the Incoming API.
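
A rough sketch of the drop warning mentioned in the first consideration above (a drop guard dismissed by Endpoint::accept/refuse/retry/ignore); the names here are illustrative, not quinn-proto's actual internals:

// Hypothetical drop guard: warns if an Incoming is dropped without being
// handed back to the Endpoint. Sketch only; not the real field layout.
struct ImproperDropWarner;

impl ImproperDropWarner {
    // Called by Endpoint::accept/refuse/retry/ignore once they take ownership.
    fn dismiss(self) {
        std::mem::forget(self); // skip the Drop impl below
    }
}

impl Drop for ImproperDropWarner {
    fn drop(&mut self) {
        tracing::warn!(
            "Incoming dropped without being passed to Endpoint::accept/refuse/retry/ignore"
        );
    }
}

pub struct Incoming {
    // ... connection attempt state ...
    warner: ImproperDropWarner,
}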

@gretchenfrage gretchenfrage marked this pull request as ready for review April 14, 2024 21:31
@Ralith (Collaborator) left a comment

Great catch, thanks! This was very likely to be an issue in practice because 0-RTT packets are likely to be sent, and hence received, immediately after the first Initial, and might hence routinely be processed by the Endpoint before the application has even had a chance to see the Incoming.

I think the strategy here is on the right track, but we should consider carefully what the default buffering limit should be. While it's true that the default receive_window is unlimited, the default stream receive window is not, and neither is the default stream concurrency limit, so there are effective limits by default, and we should preserve that pattern here.

@gretchenfrage (Contributor, Author)

This was very likely to be an issue in practice because 0-RTT packets are likely to be sent, and hence received, immediately after the first Initial, and might hence routinely be processed by the Endpoint before the application has even had a chance to see the Incoming.

Indeed, I discovered this not by thinking about the code really hard but through empirical testing. I was a bit embarrassed when I realized that the problem was caused by me; I initially thought it was pre-existing.



How do we limit the number of these? Buffer size limits will not be an effective defense if the number of buffers is unbounded.

Good catch. And solving this is kind of tricky. But here's an approach I've implemented now that I think works, let me know what you think:

Firstly, we can now move the MAX_INCOMING_CONNECTIONS check from quinn to proto. (Also, rename it to just MAX_INCOMING, which we sort of forgot to do in the original PR).

However, that is not sufficient to prevent memory exhaustion by filling many Incoming with buffered early data, because these limits multiply to far too high a number. With the default transport config:

(100 max concurrent bidi streams + 100 max concurrent uni streams) × 1.25 MB stream receive window = 250 MB per incoming

× 2^16 max incoming ≈ 16.4 TB

Which I'm pretty sure is a lot of RAM. So to avoid that, we implement another limit, MAX_ALL_INCOMING_BUFFERED, set to 100 MiB, which caps the total number of bytes buffered across all incoming within the endpoint. If either the per-incoming limit or the MAX_ALL_INCOMING_BUFFERED limit is exceeded, the packet is dropped.

So to summarize:

  • Upon receipt of an incoming-creating packet, if the number of pending incoming exceeds MAX_INCOMING = 2^16, the endpoint automatically responds with a refuse packet.
  • Upon receipt of an incoming-associated packet, if the number of buffered packet bytes for that incoming exceeds the per-incoming limit, calculated as min(receive_window, (max_concurrent_bidi_streams + max_concurrent_uni_streams) * stream_receive_window), the endpoint drops the packet.
  • Upon receipt of an incoming-associated packet, if the number of buffered packet bytes for all incoming collectively exceeds MAX_ALL_INCOMING_BUFFERED = 100 MiB, the endpoint drops the packet (sketched below).
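
In code, the checks amount to roughly the following sketch; the constant and field names come from this discussion rather than the final API, and per_incoming_limit holds the min(...) value above:

// Sketch of the two buffering limits described above (illustrative names).
const MAX_INCOMING: usize = 1 << 16;
const MAX_ALL_INCOMING_BUFFERED: u64 = 100 * 1024 * 1024; // 100 MiB

struct IncomingLimits {
    pending_incoming: usize,  // Incoming awaiting an application decision
    total_buffered: u64,      // bytes buffered across all pending Incoming
    per_incoming_limit: u64,  // min(receive_window, (bidi + uni) * stream_receive_window)
}

impl IncomingLimits {
    // On a connection-creating packet: refuse if too many Incoming are pending.
    fn allow_new_incoming(&self) -> bool {
        self.pending_incoming < MAX_INCOMING
    }

    // On a packet associated with a pending Incoming: buffer it only if neither
    // the per-incoming nor the endpoint-wide budget would be exceeded.
    fn try_buffer(&mut self, buffered_for_this_incoming: u64, datagram_len: u64) -> bool {
        let within_incoming = buffered_for_this_incoming
            .checked_add(datagram_len)
            .map_or(false, |n| n <= self.per_incoming_limit);
        let within_total = self
            .total_buffered
            .checked_add(datagram_len)
            .map_or(false, |n| n <= MAX_ALL_INCOMING_BUFFERED);
        if within_incoming && within_total {
            self.total_buffered += datagram_len;
            true
        } else {
            false // drop the packet
        }
    }
}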

@gretchenfrage (Contributor, Author)

I'm not sure how to debug the FreeBSD CI failure; I don't see logs for it.

@Ralith (Collaborator) commented Apr 16, 2024

That's weird; normally it logs just fine. GitHub infra flake, perhaps? I've restarted it manually.

@gretchenfrage (Contributor, Author)

Thanks. It passed this time, so I guess it was just a spurious CI failure.

@gretchenfrage gretchenfrage mentioned this pull request Apr 17, 2024
@Ralith (Collaborator) left a comment

Should we expose the new limits on ServerConfig? 100MiB makes sense for a lot of applications, but someone might want to use Quinn in a memory-constrained environment (e.g. a router).

@gretchenfrage gretchenfrage force-pushed the 1661-bug-fix branch 4 times, most recently from cc4bd2e to 0f19a07 on April 20, 2024 21:52
@gretchenfrage (Contributor, Author)

Should we expose the new limits on ServerConfig? 100MiB makes sense for a lot of applications, but someone might want to use Quinn in a memory-constrained environment (e.g. a router).

Good idea. Now that this is all happening in proto, there's no friction to us doing that. Added three new settings to ServerConfig: max_incoming, max_buffer_bytes_per_incoming, and max_buffer_bytes_all_incoming.
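
For example, a memory-constrained deployment could dial all three down. A hypothetical sketch, using the setter names as they stand in this comment (the merged commit renames the two buffer limits to incoming_buffer_size and incoming_buffer_size_total; see the commit message below) and arbitrary illustrative values:

// Hypothetical tuning for a low-memory server; values are illustrative only,
// and these setter names follow this comment rather than the final API.
fn tune_for_low_memory(config: &mut quinn_proto::ServerConfig) {
    config.max_incoming(1 << 12);                          // fewer pending Incoming
    config.max_buffer_bytes_per_incoming(64 * 1024);       // 64 KiB of early data each
    config.max_buffer_bytes_all_incoming(8 * 1024 * 1024); // 8 MiB across the endpoint
}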

What if a connection is configured for use with application datagrams (or some other future messaging primitive) only, and stream limits are hence set to zero? Might be simplest to only use a dedicated limit.

I made it so the default ServerConfig constructor calculates max_buffer_bytes_per_incoming the same way it was previously calculated there, although I did switch it to just hard-code the overhead to 1200 bytes; let me know if you'd rather I not do that, but it seems more straightforward than the trick with the comparison. Anyway, if a user configures the stream limits to zero but still wants 0-RTT datagrams in excess of 1200 bytes to work more than they otherwise would, they can manually set max_buffer_bytes_per_incoming to some appropriately high number. Let me know if you think this is a pitfall likely enough to be worth documenting.

It's confusing that this clause checks after adding the next datagram whereas the first clause checks before.

Rounding up to the next packet isn't necessarily about allowing for overhead (which might vary dramatically according to sender discretion), just erring on the side of being permissive.

If we use < here, then 0 has a more intuitive effect.

I believe these should all have been fixed in this change.

@gretchenfrage gretchenfrage force-pushed the 1661-bug-fix branch 2 times, most recently from 676c523 to 3dcc72a on April 24, 2024 05:16
@gretchenfrage (Contributor, Author)

Thanks for the feedback. My last round of changes was a bit sloppier than I would've liked it to be, especially with me missing the += bug. I think the feedback should be addressed now though.

  • Reworked the limit checking logic to be based on checked_add in a way I think is a lot cleaner.
  • Largely rewrote the added config docs.
  • Made other more minor changes.

@djc (Collaborator) left a comment

Some minorish feedback and a more existential question.

This all seems like substantial complexity. How do we feel about alternative solutions where we force callers to address one Incoming at a time? Do we really need the flexibility of keeping multiple incoming packets in flight at the same time? It feels like a larger change that is sort of flying in here under the radar as a very flexible solution to a problem (the race condition identified in #1820), with the flexibility also causing an increase in fragility.

(The answer might well be yes, but I wanted to make this more explicit anyway.)

Minor nit: with the frequent force-pushes, keeping an issue reference in the commit message is a bit of a pain because it adds a whole bunch of backreferences in the issue. Consider leaving the issue reference out of the commit message in the future, in favor of keeping it in the PR description.

@gretchenfrage (Contributor, Author)

This all seems like substantial complexity. How do we feel about alternative solutions where we force callers to address one Incoming at a time? Do we really need the flexibility of keeping multiple incoming packets in flight at the same time? It feels like a larger change that is sort of flying in here under the radar as a very flexible solution to a problem (the race condition identified in #1820), with the flexibility also causing an increase in fragility.

It's a worthwhile question.

Merely limiting the endpoint to a single Incoming at a time would not let us fully avoid buffering early datagrams that aren't associated with a connection handle. It would allow us to indiscriminately put all such early datagrams into a single buffer rather than into per-Incoming buffers, but that doesn't seem like much of a simplification.

To avoid having to buffer early datagrams not associated with a connection handle, we would need to be able to make the decision whether to accept/refuse/retry/ignore an incoming connection attempt immediately and synchronously when the endpoint receives the connection-creating packet, before it tries to process any further packets (as the decision affects which subsequently received packets get routed to a connection and which get discarded).

This could be achieved by removing the Incoming API as it exists now and instead just putting some sort of incoming_logic: Box<dyn FnMut(&IncomingConnectionInfo) -> IncomingConnectionDecision> field in the server config which the endpoint can call synchronously. I don't think we should do that.

One example of a situation where allowing multiple Incoming to be in motion at the same time would be useful is if there is an IP block list (or allow list) that's stored in a database but has an in-memory Bloom filter to accelerate it:

while let Some(incoming) = endpoint.accept().await {
    let ip = incoming.remote_address().ip();
    if !block_list_bloom_filter.maybe_contains(ip) {
        // Fast path: the Bloom filter says the IP is definitely not blocked,
        // so accept without waiting on the database.
        task::spawn(async move {
            handle_incoming(incoming.accept().expect("TODO put real error handling here")).await;
        });
    } else {
        // Possible false positive: confirm against the database before accepting.
        task::spawn(async move {
            if !block_list_database.contains(ip).await {
                handle_incoming(incoming.accept().expect("TODO put real error handling here")).await;
            }
        });
    }
}

This would be a situation where allowing multiple Incoming in motion simultaneously, and even allowing the application to make decisions on them in a different order than they were produced, could improve performance and/or attack mitigation effectiveness.

@gretchenfrage gretchenfrage force-pushed the 1661-bug-fix branch 2 times, most recently from c63dc5d to 03294a8 on April 26, 2024 04:31
@djc (Collaborator) commented Apr 26, 2024

To avoid having to buffer early datagrams not associated with a connection handle, we would need to be able to make the decision whether to accept/refuse/retry/ignore an incoming connection attempt immediately and synchronously when the endpoint receives the connection-creating packet, before it tries to process any further packets (as the decision affects which subsequently received packets get routed to a connection and which get discarded).

I was thinking we might have accept() take ownership of the Endpoint, storing the Endpoint in Incoming, and "giving it back" after accept/reject/ignore. This would still allow the caller to handle incoming connections asynchronously, but not in parallel (thus avoiding buffering issues). It's less flexible but simpler in the end.
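
Roughly, the shape of this alternative (a toy illustration of the ownership flow, not a concrete API proposal):

// Toy types: only one Incoming can exist at a time because it owns the
// Endpoint, and deciding the connection attempt hands the Endpoint back.
struct Endpoint;

struct Incoming {
    endpoint: Endpoint,
    // ... connection attempt state ...
}

impl Endpoint {
    fn next_incoming(self) -> Incoming {
        Incoming { endpoint: self }
    }
}

impl Incoming {
    fn accept(self) -> Endpoint {
        // ... construct the connection ...
        self.endpoint
    }

    fn refuse(self) -> Endpoint {
        self.endpoint
    }
}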

Let's see what @Ralith thinks.

@Ralith (Collaborator) commented Apr 26, 2024

have accept() take ownership of the Endpoint

This would complicate the async layer considerably. We would need to move the proto::Endpoint out of the quinn::endpoint::State each time a connection is incoming, then restore it and wake the endpoint driver after a decision is made. It's also difficult to allow such an accept to be used concurrently with any other endpoint access (e.g. connect, close, assorted getters). Because typical servers will always be waiting on accept, surfacing such a limitation to the public API is likely to degrade usability.

More importantly, it wouldn't solve anything: between GRO and recvmmsg, we may receive many datagrams instantaneously. One batch might include both the first datagram for an incoming connection and follow-up initial or 0-RTT packets for that same connection. These must be either buffered or (as in the status quo) dropped, and it's most convenient to do so at the proto layer, where we at least have the option to correlate them, discard non-QUIC datagrams, and respond directly for stateless cases.

Finally, if we could suspend receipt of datagrams immediately after receiving a connection's first datagram, that would be at odds with future work to parallelize datagram receipt and other endpoint driver work, which is our major remaining milestone for intra-endpoint scaling.

In sum, the current form of this PR is strongly motivated.

@Ralith (Collaborator) left a comment

Implementation LGTM. Remaining items:

  • Some final nits on the docs
  • Decide whether to discard per-Incoming buffer limits for simplicity
  • Test coverage

@gretchenfrage gretchenfrage force-pushed the 1661-bug-fix branch 2 times, most recently from fc23130 to eba944e on April 30, 2024 23:18
@gretchenfrage (Contributor, Author)

Comments tweaked. Tests added. As suggested, the new tests are based on the zero_rtt_happypath test and validate the number of dropped packets. Also added to TestEndpoint / IncomingConnectionBehavior the ability to push Incoming onto a waiting_incoming vec to be dealt with later rather than immediately.

Closes quinn-rs#1820

The fix:

- Endpoint now maintains a slab with an entry for each pending Incoming
  to buffer received data.
- ConnectionIndex now maps initial DCID to that slab key immediately
  upon construction of Incoming.
- If Incoming is accepted, the DCID association is replaced with an
  association to the ConnectionHandle, and all buffered datagrams are
  fed to the newly constructed Connection.
- If Incoming is refused/retried/ignored, or accepting errors, the
  association and slab entry are cleaned up to prevent a memory leak.

Additional considerations:

- The Incoming::ignore operation can no longer be implemented as just
  dropping it. To help prevent incorrect API usage, proto::Incoming is
  modified to log a warning if it is dropped without being passed to
  Endpoint::accept/refuse/retry/ignore.
- Three things protect against memory exhaustion attacks here:

  1. The MAX_INCOMING_CONNECTIONS limit is moved from quinn to proto,
     limiting the number of concurrent incoming connections for which
     datagrams will be buffered before the application decides what to
     do with them. Also, it is changed from a constant to a field of the
     server config, max_incoming.
  2. Per-incoming buffered data is limited to a new limit stored in the
     server config, incoming_buffer_size, beyond which subsequent
     packets are discarded if received in these conditions.
  3. The sum total of all incoming buffered data is limited to a new
     limit stored in the server config, incoming_buffer_size_total,
     beyond which subsequent packets are discarded if received in these
     conditions.
@djc djc merged commit 30e1d6f into quinn-rs:main May 1, 2024
8 checks passed
@djc (Collaborator) commented May 1, 2024

Thanks for all the effort here!

Linked issue: 0-RTT packets can be lost due to race condition introduced along with Incoming