[chttp2] Rollup of fixes for CVE-2023-44487 #34763

Merged: 15 commits, Oct 23, 2023

@ctiller commented Oct 20, 2023

No description provided.

ctiller and others added 15 commits October 20, 2023 14:42
Really minimal change to make the output buffer for chttp2 be a
`grpc_core::SliceBuffer` so that we can start mixing in the new framer
code.

---------

Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Isolate ping callback tracking to its own file.
Also takes the opportunity to simplify keepalive code by applying the
ping timeout to all pings.
Adds an experiment to allow multiple pings outstanding too (this was
originally an accidental behavior change of the work, but one that I
think may be useful going forward).

---------

Co-authored-by: ctiller <ctiller@users.noreply.github.com>
Force clients mounting HTTP/2 rapid reset attacks to read data by sending a
ping for a randomly chosen fraction of the reset streams received at the server.
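
As a rough illustration of the sampling idea above, the sketch below decides per received RST_STREAM whether a ping should be sent. The class name `RstStreamPingSampler`, the `ping_probability` knob, and the seed parameter are hypothetical and are not taken from chttp2.

```cpp
#include <random>

// Hypothetical helper (not the chttp2 code): samples which received
// RST_STREAM frames should trigger a PING back to the client.
class RstStreamPingSampler {
 public:
  // ping_probability is an assumed tuning knob in [0, 1].
  RstStreamPingSampler(double ping_probability, unsigned seed)
      : dist_(ping_probability), rng_(seed) {}

  // Returns true when a PING should be sent for this reset stream.
  bool ShouldPingOnRstStream() { return dist_(rng_); }

 private:
  std::bernoulli_distribution dist_;
  std::mt19937 rng_;
};
```

Because the client cannot tell which resets will be followed by a ping, it has to keep reading from the connection to stay within protocol.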
Previously, chttp2 allowed an unlimited number of requests before the
settings ack, because the agreed-upon limit in that state is unbounded.
Instead, after MAX_CONCURRENT_STREAMS requests have been attempted,
start blanket cancelling requests until the settings ack is received.
This can be done efficiently without allocating request state
structures.
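
The admission rule described above could be sketched as follows. `PreAckStreamGate` and its method names are hypothetical; the point is only that the decision needs a counter and a flag, with no per-request state.

```cpp
#include <cstdint>

// Hypothetical admission gate (not the chttp2 code). Before the peer has
// acked our SETTINGS, only MAX_CONCURRENT_STREAMS attempts are admitted;
// later attempts are cancelled without allocating any request state.
class PreAckStreamGate {
 public:
  explicit PreAckStreamGate(std::uint32_t max_concurrent_streams)
      : max_concurrent_streams_(max_concurrent_streams) {}

  // Called once the settings ack arrives; normal limits apply afterwards
  // and are assumed to be enforced elsewhere.
  void OnSettingsAck() { settings_acked_ = true; }

  // Returns true to admit the stream, false to cancel it immediately.
  bool AdmitNewStream() {
    if (settings_acked_) return true;
    if (attempted_before_ack_ >= max_concurrent_streams_) return false;
    ++attempted_before_ack_;
    return true;
  }

 private:
  const std::uint32_t max_concurrent_streams_;
  std::uint32_t attempted_before_ack_ = 0;
  bool settings_acked_ = false;
};
```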
Cap the number of requests and the number of RST_STREAM frames handled per read.
If these caps are exceeded, offload processing of the connection to a
backing thread pool, and allow other connections to make progress.
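
A minimal sketch of the per-read caps, assuming simple integer budgets. `PerReadLimiter`, `ReadAction`, and the budget parameters are illustrative names; the actual offload to a thread pool is left out.

```cpp
// Hypothetical per-read budget tracker (not the chttp2 code).
enum class ReadAction { kContinueInline, kOffloadToThreadPool };

class PerReadLimiter {
 public:
  PerReadLimiter(int max_requests_per_read, int max_rst_streams_per_read)
      : max_requests_(max_requests_per_read),
        max_rst_streams_(max_rst_streams_per_read) {}

  // Resets the counters at the start of each read.
  void StartRead() {
    requests_ = 0;
    rst_streams_ = 0;
  }

  ReadAction OnRequest() {
    ++requests_;
    return Decide();
  }

  ReadAction OnRstStream() {
    ++rst_streams_;
    return Decide();
  }

 private:
  // Once either cap is exceeded, the caller should hand the connection to
  // a background pool so that other connections can make progress.
  ReadAction Decide() const {
    const bool exceeded =
        requests_ > max_requests_ || rst_streams_ > max_rst_streams_;
    return exceeded ? ReadAction::kOffloadToThreadPool
                    : ReadAction::kContinueInline;
  }

  const int max_requests_;
  const int max_rst_streams_;
  int requests_ = 0;
  int rst_streams_ = 0;
};
```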
Previously our settings changes carried no timeout; this change adds one.

The default timeout starts at keepalive_timeout*2.
If a request is invalid, take a random amount of time before sending the
RST_STREAM, so that the number of streams remaining under
MAX_CONCURRENT_STREAMS becomes unpredictable.
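
One way to picture the randomized delay is a uniform draw up to some maximum. The function name `PickRstStreamDelay` and the `max_delay` bound are assumptions; the source does not state the distribution or the range used.

```cpp
#include <chrono>
#include <random>

// Hypothetical helper (not the chttp2 code): picks how long to wait before
// sending RST_STREAM for an invalid request. A uniform distribution over
// [0, max_delay] is an assumption made for illustration.
std::chrono::milliseconds PickRstStreamDelay(
    std::mt19937& rng, std::chrono::milliseconds max_delay) {
  std::uniform_int_distribution<std::chrono::milliseconds::rep> dist(
      0, max_delay.count());
  return std::chrono::milliseconds(dist(rng));
}
```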
Experiment 1: On RST_STREAM: reduce MAX_CONCURRENT_STREAMS for one round
trip.
Experiment 2: If a settings frame is outstanding with a lower
MAX_CONCURRENT_STREAMS than is configured, and we receive a new incoming
stream that would exceed the new cap, randomly reject it.
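
Experiment 2 can be read as the decision function sketched below. All names here are hypothetical, and `reject_probability` is an assumed parameter since the source does not say how often streams are rejected. The random draw is passed in so the logic stays deterministic and testable.

```cpp
#include <cstdint>

// Hypothetical sketch of Experiment 2 (not the chttp2 code).
enum class StreamDecision { kAccept, kReject };

struct ConcurrencyState {
  std::uint32_t open_streams;  // streams currently open
  std::uint32_t current_max;   // limit currently in force
  std::uint32_t pending_max;   // lower limit sent but not yet acked
  bool settings_outstanding;   // true while the SETTINGS frame is unacked
};

// random_draw is expected to be uniform in [0, 1).
StreamDecision DecideIncomingStream(const ConcurrencyState& s,
                                    double random_draw,
                                    double reject_probability) {
  // The limit already in force is always a hard cap.
  if (s.open_streams >= s.current_max) return StreamDecision::kReject;
  // While a lower cap is in flight, streams beyond it are rejected at random.
  const bool over_pending_cap = s.settings_outstanding &&
                                s.pending_max < s.current_max &&
                                s.open_streams >= s.pending_max;
  if (over_pending_cap && random_draw < reject_probability) {
    return StreamDecision::kReject;
  }
  return StreamDecision::kAccept;
}
```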

---------

Co-authored-by: ctiller <ctiller@users.noreply.github.com>
…c#34589)

We probably shouldn't count the time it takes us to write out data as
part of the ping timeout.
The `TickForDuration()` method was using `grpc_core::Timestamp::Now()`
to get the current time, but that was not in sync with the `now_` value
inside the Fuzzing EE itself, with the result that after two subsequent
250ms increments, timers were not being properly fired. I've added a
test that demonstrates this failure without the fix.
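
To illustrate why the tick must be computed from the engine's own clock, here is a toy timer engine, not the real FuzzingEventEngine. `FakeTimerEngine` and its members are hypothetical; the key line is the deadline derived from `now_` rather than from a wall clock.

```cpp
#include <chrono>
#include <functional>
#include <map>
#include <utility>

// Toy simulated-time engine (hypothetical, for illustration only).
class FakeTimerEngine {
 public:
  using Duration = std::chrono::milliseconds;

  // Schedules cb to run once simulated time reaches now_ + delay.
  void RunAfter(Duration delay, std::function<void()> cb) {
    timers_.emplace(now_ + delay, std::move(cb));
  }

  // Advances simulated time by d, firing every timer that comes due.
  void TickForDuration(Duration d) {
    // The deadline comes from the internal clock. Using a separate
    // wall-clock reading here could drift from now_ and skip timers.
    const Duration deadline = now_ + d;
    while (!timers_.empty() && timers_.begin()->first <= deadline) {
      auto it = timers_.begin();
      now_ = it->first;
      auto cb = std::move(it->second);
      timers_.erase(it);
      cb();
    }
    now_ = deadline;
  }

  Duration now() const { return now_; }

 private:
  Duration now_{0};
  std::multimap<Duration, std::function<void()>> timers_;
};
```

With this shape, a timer set for 400ms stays pending after one 250ms tick and fires during the second.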
Fix b/304114403

- adds a new experimental tracer useful for diagnosing ping timeout
failures in unit tests
- adds a pair of experimental tracers for the fuzzing event engine
- fixes the behavior of FuzzingEventEngine so that a RunAfter(0, ...) runs
in the same tick
- raises the send rate (reduces the send delay) so that fuzzed e2e unit
tests are guaranteed to be able to send 200kb/sec

---------

Co-authored-by: ctiller <ctiller@users.noreply.github.com>
…outs (grpc#34647)

Just seeing data flowing in after a ping is enough to establish liveness
of a connection, and so we can limit keepalive timeouts to that. Ping
timeouts are necessary for protocol correctness, but may be stuck behind
other traffic, so give them a little more of a grace period.
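
The split between the two timeouts might look like the sketch below. The type and function names are hypothetical, and it assumes `ping_timeout` is configured to be longer than `keepalive_timeout`.

```cpp
#include <chrono>

// Hypothetical sketch (not the chttp2 code) of separating keepalive
// liveness from ping-ack correctness.
enum class PingVerdict { kHealthy, kKeepaliveTimeout, kPingTimeout };

struct PingObservation {
  std::chrono::milliseconds since_ping_sent;
  bool data_received_since_ping;  // any inbound data, not only the ack
  bool ping_acked;
};

PingVerdict EvaluatePing(const PingObservation& o,
                         std::chrono::milliseconds keepalive_timeout,
                         std::chrono::milliseconds ping_timeout) {
  if (o.ping_acked) return PingVerdict::kHealthy;
  // Liveness: any data arriving after the ping shows the peer is alive.
  if (!o.data_received_since_ping &&
      o.since_ping_sent >= keepalive_timeout) {
    return PingVerdict::kKeepaliveTimeout;
  }
  // Correctness: the ack itself must still arrive, with a longer grace
  // period because it may be queued behind other traffic.
  if (o.since_ping_sent >= ping_timeout) return PingVerdict::kPingTimeout;
  return PingVerdict::kHealthy;
}
```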

---------

Co-authored-by: ctiller <ctiller@users.noreply.github.com>
…34665)

Instead of fixing a target size for writes, try to adapt it a little to
observed bandwidth.

The initial algorithm tries to get large writes within 100-1000ms
maximum delay - this range probably wants to be tuned, but let's see.

The hope here is that on slow connections we avoid buffering so much
data that, when we need to send a ping-ack, it cannot go out without
great delay.
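
A minimal sketch of an adaptive write target, assuming the last write's duration is used as a proxy for observed bandwidth. Only the 100-1000ms window comes from the text above; the doubling and halving steps, the clamp bounds, and the name `NextWriteTarget` are assumptions and not the algorithm in this PR.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>

// Hypothetical sketch: adjust the target write size so that large writes
// tend to complete within the 100-1000ms window.
std::size_t NextWriteTarget(std::size_t current_target,
                            std::chrono::milliseconds last_write_duration,
                            std::size_t min_target, std::size_t max_target) {
  using std::chrono::milliseconds;
  std::size_t next = current_target;
  if (last_write_duration > milliseconds(1000)) {
    // The write took too long: the link is slow, so buffer less next time.
    next = current_target / 2;
  } else if (last_write_duration < milliseconds(100)) {
    // The write drained quickly: larger writes are affordable. The check
    // guards against overflow when doubling.
    next = current_target > max_target / 2 ? max_target : current_target * 2;
  }
  return std::min(std::max(next, min_target), max_target);
}
```

Keeping the target small on slow links leaves less data queued ahead of a ping-ack.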
…rpc#34697)

If a single channel has too much concurrency, cancel streams so that
the server can recover.

There seems to be a convergence in the HTTP2 community about this being
a reasonable thing to do, so I'd like to try it in some real scenarios.

If this pans out well then I'll likely drop the
`red_max_concurrent_streams` and the `rstpit` experiments in preference
to this.

I'm also considering tying in resource quota so that under high memory
pressure we just default to this path.

---------

Co-authored-by: ctiller <ctiller@users.noreply.github.com>
@ctiller merged commit e33af6c into grpc:v1.59.x Oct 23, 2023
61 of 69 checks passed
@gnossen added the `release notes: yes` label Oct 30, 2023
4 participants