In the test, I set up an HTTP/2 server/client pair, limited the number of TCP connections that could carry HTTP requests, and started a large number of HTTP requests. I then measured how long it took to create and cancel one additional request.
What did you expect to see?
I expected the speed of creating and canceling additional requests to not vary based on the number of other HTTP requests waiting on the TCP connection.
What did you see instead?
Creating and canceling an HTTP request gets slow as a function of how many requests are waiting for the same TCP connection when HTTP/2 is active.
The code in net/http/h2_bundle.go that bridges between the channel for canceling a single request and the sync.Cond that guards the TCP connection (net/http.http2ClientConn.awaitOpenSlotForRequest) responds to request cancelation by waking up every goroutine that's waiting to use the TCP connection (with sync.Cond.Broadcast). Each of those goroutines in sequence will acquire the lock on the *http2ClientConn to check if there's room to send another request.
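To make the wake-up pattern concrete, here is a minimal standalone sketch. It is not the net/http code: countWakeups and its variables are hypothetical names, and the only assumption carried over from the description above is that one cancelation results in one sync.Cond.Broadcast on a condition variable shared by every waiter.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// countWakeups parks n goroutines on a single sync.Cond, issues one
// Broadcast (standing in for one canceled request), and returns how many
// goroutines woke up, took the lock, and found nothing to do.
func countWakeups(n int) int {
	var mu sync.Mutex
	cond := sync.NewCond(&mu)
	var wg sync.WaitGroup
	parked, wakeups, done := 0, 0, false

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			parked++
			for !done {
				cond.Wait() // releases mu while parked, reacquires on wake
				if !done {
					wakeups++ // woke up, but there is still no free slot
				}
			}
		}()
	}

	// waitFor polls under the lock until the predicate holds.
	waitFor := func(pred func() bool) {
		for {
			mu.Lock()
			ok := pred()
			mu.Unlock()
			if ok {
				return
			}
			time.Sleep(time.Millisecond)
		}
	}

	waitFor(func() bool { return parked == n }) // every waiter is in Wait
	cond.Broadcast()                            // one simulated cancelation
	waitFor(func() bool { return wakeups == n })

	mu.Lock()
	total := wakeups
	done = true // let the waiters exit
	mu.Unlock()
	cond.Broadcast()
	wg.Wait()
	return total
}

func main() {
	fmt.Println(countWakeups(1000))
}
```

In this sketch a single Broadcast costs one lock acquisition per parked goroutine, which is the per-cancelation cost described above.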
On top of that, the contention on the sync.Mutex protecting a single connection results in a slowdown on the sync.Mutex protecting the Transport's HTTP/2 connection pool when *http2clientConnPool.getClientConn calls cc.idleState while holding the pool's lock.
In the reproducer, the baseline speed of creating and canceling an HTTP/2 request is 1ms since the test waits that long before canceling to give the RoundTrip goroutine time to find the TCP connection in the pool and start waiting for an available slot.
When there are a small number of outstanding requests (100 or 200), creating and canceling an additional request takes about 1.3ms: that baseline of 1ms plus 300µs of actual work.
As the number of outstanding requests grows past the capacity of the single TCP connection (the default Go HTTP/2 server sets that to 250 HTTP requests), creating and canceling a request wakes up more and more goroutines. The cost of this is still small with 1600 idle requests (1.1ms over the 1ms baseline), but with 6400 idle requests it has grown to 5.9ms over the 1ms baseline. With 100k idle requests, the cost to cancel one is nearly one second of work.
With N idle requests that all time out or are canceled, the total cost is O(N^2), because each of the N cancelations wakes every request that is still waiting. The cost should be O(N).
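The quadratic total can be checked with a little arithmetic. The sketch below assumes a simplified model, not measured behavior: each cancelation broadcasts to every request still waiting (including the one being canceled), so the total is N + (N-1) + ... + 1 = N(N+1)/2. The name totalWakeups is hypothetical.

```go
package main

import "fmt"

// totalWakeups returns how many goroutine wake-ups occur if n waiting
// requests are canceled one at a time and every cancelation wakes all
// requests that are still waiting (simplified model).
func totalWakeups(n int64) int64 {
	var total int64
	for waiting := n; waiting > 0; waiting-- {
		total += waiting
	}
	return total
}

func main() {
	for _, n := range []int64{1600, 6400, 100000} {
		fmt.Printf("N=%d: broadcast wake-ups=%d, one wake-up per cancelation=%d\n",
			n, totalWakeups(n), n)
	}
}
```

Under this model, waking only the canceled request would need N wake-ups in total, which is the O(N) cost the report asks for.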