
add write queue to Http2Connection to optimize write handling #39166

Merged Jul 14, 2020 (6 commits)

Conversation

geoffkizer
Copy link
Contributor

This gives about a 33% improvement for the raw variation and a 120% improvement for the full gRPC client variation on the gRPC benchmark at https://github.com/JamesNK/Http2Perf.

@stephentoub @JamesNK @dotnet/ncl

@ghost
Copy link

ghost commented Jul 12, 2020

Tagging subscribers to this area: @dotnet/ncl
Notify danmosemsft if you want to be subscribed.

Excerpts from the diff under review:

```csharp
private async Task PerformWriteAsync<T>(int writeBytes, T state, Func<T, Memory<byte>, FlushTiming> lockedAction, CancellationToken cancellationToken = default)

PerformWriteAsync(0, 0, (_, __) => true, cancellationToken);

private abstract class WriteQueueEntry : TaskCompletionSource
```
Copy link
Member

Doesn't need to be part of this PR, but given we're now allocating this on every write, it'd be really good to look at changing the base type to be one that implements IValueTaskSource and making this reusable.

Copy link
Contributor Author

Agreed, though it gets a little weird because we have a bunch of different write operations with different kinds of state.
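To illustrate the direction being suggested, here is a minimal sketch of a reusable entry backed by ManualResetValueTaskSourceCore (the class name and shape are hypothetical, not the PR's code; the per-state wrinkle mentioned above is not addressed here):

```csharp
#nullable enable
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical sketch: a write-queue entry backed by
// ManualResetValueTaskSourceCore instead of TaskCompletionSource, so a
// pooled instance can be awaited, Reset(), and reused, avoiding a fresh
// allocation on every write.
public sealed class ResettableWriteEntry : IValueTaskSource
{
    private ManualResetValueTaskSourceCore<bool> _core;

    // Awaited by the caller that queued the write.
    public ValueTask WaitAsync() => new ValueTask(this, _core.Version);

    // Called by the writer loop when the entry has been processed.
    public void SetComplete() => _core.SetResult(true);
    public void SetFailed(Exception e) => _core.SetException(e);

    // Reset before returning the instance to a pool for reuse.
    public void Reset() => _core.Reset();

    void IValueTaskSource.GetResult(short token) => _core.GetResult(token);
    ValueTaskSourceStatus IValueTaskSource.GetStatus(short token) => _core.GetStatus(token);
    void IValueTaskSource.OnCompleted(Action<object?> continuation, object? state,
        short token, ValueTaskSourceOnCompletedFlags flags) =>
        _core.OnCompleted(continuation, state, token, flags);
}
```

Each instance supports one await per cycle; after the await completes, Reset() advances the version token so the entry can go back to a pool.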

Copy link
Member

There's a tradeoff to be made here. This is similar to what pipelines does: you don't wait on individual writes, but you can have backpressure based on how much outstanding buffer is unflushed/unconsumed. That removes the per-operation overhead and allocations.

Do you care about cancelling individual writes or do you care about stopping the overall writing operation?
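As a rough sketch of that tradeoff (illustrative, not the PR's code), a bounded Channel<T> gives backpressure by capacity rather than by awaiting each write:

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Sketch: writers enqueue buffers into a bounded channel and do not await
// each write individually. Backpressure comes from the channel's capacity:
// WriteAsync only waits when too much is outstanding, removing the
// per-operation wait in the common case.
var channel = Channel.CreateBounded<byte[]>(new BoundedChannelOptions(capacity: 16)
{
    SingleReader = true,                   // one loop drains to the socket
    FullMode = BoundedChannelFullMode.Wait // writers wait only when full
});

// Producer side: enqueue and move on; this only blocks when the queue is full.
await channel.Writer.WriteAsync(new byte[] { 1, 2, 3 });
channel.Writer.Complete();

// Consumer side: a single loop drains queued buffers and flushes in one pass.
await foreach (byte[] buffer in channel.Reader.ReadAllAsync())
{
    Console.WriteLine($"flushing {buffer.Length} bytes");
}
```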

@geoffkizer
Copy link
Contributor Author

Pushed changes to address everything except Channel<T>, which is more involved. Will try that change and see what happens...

@JamesNK
Copy link
Member

JamesNK commented Jul 12, 2020

I'm curious why the gRPC client gets such a large performance improvement compared to the "raw HttpClient" scenario. If anything I would have expected a smaller benefit to the gRPC client just because HttpClient+HTTP/2 makes up a smaller percentage of the work happening when it is used.

Have you profiled where the improvement comes from?

@geoffkizer
Copy link
Contributor Author

I'm curious why the gRPC client gets such a large performance improvement compared to the "raw HttpClient" scenario. If anything I would have expected a smaller benefit to the gRPC client just because HttpClient+HTTP/2 makes up a smaller percentage of the work happening when it is used.

My assumption is that the "raw" mode isn't actually equivalent to the full gRPC client in terms of HTTP/2 usage. That said, I haven't looked at the code for either in detail. Are you aware of any differences here?

Have you profiled where the improvement comes from?

No.

@JamesNK
Copy link
Member

JamesNK commented Jul 13, 2020

gRPC client has a custom HttpContent and reads content from Stream.

If you check out the readme at https://github.com/JamesNK/Http2Perf you can see there are four raw options. The closest one to the gRPC client is "r-stream-all".

@geoffkizer
Copy link
Contributor Author

I'll try running with "r-stream-all" when my machine is back in a normal state (hopefully soon).

@davidfowl
Copy link
Member

davidfowl commented Jul 13, 2020

Feels like this could use a Channel<T>

edit: Just saw this #39166 (comment)

@geoffkizer
Copy link
Contributor Author

I see similar (but slightly better) results with "r-stream-all" as compared to "g":

r-stream-all:
baseline: 56
this PR: 129
difference: +130%

g (full gRPC client):
baseline: 51
this PR: 113
difference: +122%

@JamesNK
Copy link
Member

JamesNK commented Jul 13, 2020

Ok! So this change offers bigger perf benefits to custom HttpContent implementations than to ByteArrayContent.

For reference: https://github.com/JamesNK/Http2Perf/blob/65e634321bdb76dcb6311dc2396a9e0504fbe189/GrpcSampleClient/PushUnaryContent.cs#L22-L36

Edit: Interesting, the benchmark still has the older copy of PushUnaryContent with multiple writes; they were merged into one write to speed up the gRPC client. Is there still a performance hit from writing to the request stream multiple times?
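For illustration, a hypothetical single-write HttpContent along those lines (the class name is assumed, not the benchmark's code; the prefix layout follows the gRPC message format: 1 compression byte plus a 4-byte big-endian length):

```csharp
#nullable enable
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Sketch: combine the gRPC message prefix and the payload into one buffer
// so the request stream sees a single write instead of several.
public sealed class SingleWriteContent : HttpContent
{
    private readonly byte[] _payload;

    public SingleWriteContent(byte[] payload) => _payload = payload;

    protected override Task SerializeToStreamAsync(Stream stream, TransportContext? context)
    {
        var buffer = new byte[5 + _payload.Length];
        // buffer[0] stays 0: uncompressed message.
        buffer[1] = (byte)(_payload.Length >> 24);   // 4-byte message length,
        buffer[2] = (byte)(_payload.Length >> 16);   // big-endian
        buffer[3] = (byte)(_payload.Length >> 8);
        buffer[4] = (byte)_payload.Length;
        _payload.CopyTo(buffer, 5);
        // One combined write to the request stream.
        return stream.WriteAsync(buffer, 0, buffer.Length);
    }

    protected override bool TryComputeLength(out long length)
    {
        length = 5 + _payload.Length;
        return true;
    }
}
```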

@geoffkizer
Copy link
Contributor Author

Is there still a performance hit from writing to the request stream multiple times?

There could be.

@geoffkizer
Copy link
Contributor Author

The Channel<T> stuff works nicely, and seems to be a wash in terms of perf.

Copy link
Member

@JamesNK JamesNK left a comment

From my HttpClient novice perspective 😄

@davidfowl
Copy link
Member

@geoffkizer would you be open to trying a branch using pipelines internally (even if we don't check it in) for a performance comparison?

The only reason I ask is because the Pipe implementation is really close to the usage here.

@stephentoub stephentoub merged commit a00b1e2 into dotnet:master Jul 14, 2020
@geoffkizer
Copy link
Contributor Author

@davidfowl I'd be happy to give it a try, but I'm not sure how you see pipelines fitting in here. Can you explain?

@davidfowl
Copy link
Member

The mix of ArrayBuffer + Channel is basically what the Pipe implementation gives you. This logic introduced into Http2Connection is what Kestrel does with a pipe: the Pipe is used as the queue of bytes being shuffled between the application logic (the Http2Stream) and the logic writing to the socket (the Http2Connection).
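A minimal sketch of that Pipe shape (illustrative, not Kestrel's actual code): one side writes bytes into the pipe, a separate loop reads them back out toward the socket, and the Pipe supplies the buffering, queueing, and backpressure that ArrayBuffer + Channel would otherwise provide.

```csharp
using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Text;
using System.Threading.Tasks;

var pipe = new Pipe();

// Application side (e.g. an HTTP/2 stream) writes payload bytes and flushes.
async Task ProduceAsync()
{
    byte[] payload = Encoding.ASCII.GetBytes("frame payload");
    await pipe.Writer.WriteAsync(payload);
    await pipe.Writer.CompleteAsync();
}

// Connection side (e.g. the socket write loop) drains whatever is buffered.
async Task ConsumeAsync()
{
    while (true)
    {
        ReadResult result = await pipe.Reader.ReadAsync();
        ReadOnlySequence<byte> buffer = result.Buffer;
        Console.WriteLine($"writing {buffer.Length} bytes to the socket");
        pipe.Reader.AdvanceTo(buffer.End);
        if (result.IsCompleted) break;
    }
    await pipe.Reader.CompleteAsync();
}

await Task.WhenAll(ProduceAsync(), ConsumeAsync());
```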

@geoffkizer
Copy link
Contributor Author

Yeah, but we are not just shuffling bytes here. The callback is responsible for a bunch of stuff other than just pushing bytes to the connection, e.g. assigning stream IDs, framing, etc.

@davidfowl
Copy link
Member

davidfowl commented Jul 14, 2020

Yeah, but we are not just shuffling bytes here. The callback is responsible for a bunch of stuff other than just pushing bytes to the connection, e.g. assigning stream IDs, framing, etc.

In the end it's all bytes; you'd just shift from doing it earlier to doing it later. You enqueue the write and then frame the buffers later, on the way out. This has the added benefit of potentially batching bigger writes together, framed as a single HTTP/2 DATA frame on the wire.
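As a sketch of "framing on the way out" (illustrative; the header layout follows the HTTP/2 DATA frame format from RFC 7540 §6.1), the write loop would batch whatever payload bytes are queued and prepend a single 9-byte frame header when writing:

```csharp
using System;

// Sketch: prepend one 9-byte HTTP/2 frame header (24-bit length, type,
// flags, 31-bit stream id) around a batched payload, instead of framing
// each individual write eagerly.
static byte[] FrameData(byte[] payload, int streamId, bool endStream)
{
    var frame = new byte[9 + payload.Length];
    frame[0] = (byte)(payload.Length >> 16);      // 24-bit payload length,
    frame[1] = (byte)(payload.Length >> 8);       // big-endian
    frame[2] = (byte)payload.Length;
    frame[3] = 0x0;                               // frame type: DATA
    frame[4] = endStream ? (byte)0x1 : (byte)0x0; // flags: END_STREAM
    frame[5] = (byte)(streamId >> 24);            // 31-bit stream identifier,
    frame[6] = (byte)(streamId >> 16);            // big-endian (high bit reserved)
    frame[7] = (byte)(streamId >> 8);
    frame[8] = (byte)streamId;
    payload.CopyTo(frame, 9);
    return frame;
}

byte[] framed = FrameData(new byte[] { 1, 2, 3 }, streamId: 1, endStream: true);
Console.WriteLine($"framed {framed.Length} bytes");
```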

For reference, here's Kestrel's implementation:

This is where the loop is kicked off:

https://github.com/dotnet/aspnetcore/blob/a861d18d244a632281ef4a1402f29ad1051e5bf9/src/Servers/Kestrel/Core/src/Internal/Http2/Http2OutputProducer.cs#L77

This is the loop itself:
https://github.com/dotnet/aspnetcore/blob/a861d18d244a632281ef4a1402f29ad1051e5bf9/src/Servers/Kestrel/Core/src/Internal/Http2/Http2OutputProducer.cs#L392

Http2FrameWriter is the component that does the flow control and adds HTTP/2 framing to the bytes:
https://github.com/dotnet/aspnetcore/blob/a861d18d244a632281ef4a1402f29ad1051e5bf9/src/Servers/Kestrel/Core/src/Internal/Http2/Http2OutputProducer.cs#L451

The callsite writes to the pipe directly and flushes:
https://github.com/dotnet/aspnetcore/blob/a861d18d244a632281ef4a1402f29ad1051e5bf9/src/Servers/Kestrel/Core/src/Internal/Http2/Http2OutputProducer.cs#L233

cc @halter73 to clarify if I got anything wrong

@geoffkizer
Copy link
Contributor Author

Are you talking about specifically for the request body (not headers etc)? Because if so, then I get it and I agree. I'm planning on investigating something like this shortly. There's some additional groundwork that needs to happen first, though.

@JamesNK
Copy link
Member

JamesNK commented Jul 15, 2020

Are you talking about specifically for the request body (not headers etc)? Because if so, then I get it and I agree.

Yes, although in Kestrel's case everything is reversed because it is the server. The pipe that is being read from contains body bytes written by the executing RequestDelegate. They are intended to be the response's DATA frames.

The ProcessDataWrites method writes the bytes it reads from the pipe as DATA frames (I believe the first write will send headers) and waits for ReadResult.IsCompleted so it can end the stream (either with DATA + END_STREAM flag or HEADERS trailers + END_STREAM flag).

@geoffkizer
Copy link
Contributor Author

Yes, we do something similar for reading from the connection today; what we don't do is any buffering when we write to the connection. Instead we just pass the user's buffer along and wait for the write to be processed. Part of what I'm planning to explore is buffering user writes, which potentially enables a few things: (1) not having to wait on the write queue (assuming buffer space isn't exhausted); (2) consolidating user writes, including the end-stream flag; (3) simplifying some handling in the connection write logic.

@karelz karelz added this to the 5.0.0 milestone Aug 18, 2020
@geoffkizer geoffkizer deleted the http2writequeue branch November 7, 2020 23:42
@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020