
net/http: http/2 throughput is very slow compared to http/1.1 #47840

Open
kgersen opened this issue Aug 20, 2021 · 16 comments
Labels
NeedsInvestigation
Milestone
Backlog

Comments

@kgersen commented Aug 20, 2021

What version of Go are you using (go version)?

1.17

$ go version
go version go1.17 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

Linux, Windows, etc

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/user/.cache/go-build"
GOENV="/home/user/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/user/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/user/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/user/dev/http2issue/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build38525640=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Tested HTTP/2 vs HTTP/1.1 transfer speed with a client and server from the standard library.

See a complete POC here: https://github.com/nspeed-app/http2issue

The issue is general: it occurs both on loopback (localhost) and over the wire.
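
For reference, a minimal sketch of this kind of test (not the linked POC; the address, payload size, and chunk size here are arbitrary), using an h2c server built on golang.org/x/net so that HTTP/2 runs without TLS:

package main

import (
    "crypto/tls"
    "fmt"
    "io"
    "net"
    "net/http"
    "time"

    "golang.org/x/net/http2"
    "golang.org/x/net/http2/h2c"
)

func main() {
    const size = 1 << 30 // 1 GiB payload, enough to measure throughput

    // Handler streams `size` bytes of zeros in fixed-size chunks.
    handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        chunk := make([]byte, 64<<10)
        for sent := 0; sent < size; sent += len(chunk) {
            if _, err := w.Write(chunk); err != nil {
                return
            }
        }
    })

    // HTTP/2 without TLS (h2c), so the comparison with HTTP/1.1 is like for like.
    srv := &http.Server{
        Addr:    "127.0.0.1:8080",
        Handler: h2c.NewHandler(handler, &http2.Server{}),
    }
    go srv.ListenAndServe()
    time.Sleep(200 * time.Millisecond) // crude wait for the listener

    // h2c client: an HTTP/2 transport that dials plain TCP for http:// URLs.
    client := &http.Client{Transport: &http2.Transport{
        AllowHTTP: true,
        DialTLS: func(network, addr string, _ *tls.Config) (net.Conn, error) {
            return net.Dial(network, addr)
        },
    }}

    start := time.Now()
    resp, err := client.Get("http://127.0.0.1:8080/")
    if err != nil {
        panic(err)
    }
    n, _ := io.Copy(io.Discard, resp.Body)
    resp.Body.Close()
    elapsed := time.Since(start)
    fmt.Printf("%s: read %d bytes in %v (%.1f Gbit/s)\n",
        resp.Proto, n, elapsed, float64(n)*8/elapsed.Seconds()/1e9)
}

Swapping the client's Transport for a plain &http.Transport{} gives the HTTP/1.1 baseline against the same server, since h2c.NewHandler falls back to HTTP/1.1 for requests that don't speak h2c.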

What did you expect to see?

Throughput of the same order as HTTP/1.1.

What did you see instead?

HTTP/2 is at least 5x slower, or worse.

@neild (Contributor) commented Aug 20, 2021

You're comparing encrypted HTTP/2 with unencrypted HTTP. You need to compare HTTP/2 with HTTPS to compare like with like.

@mknyszek added the NeedsInvestigation label Aug 20, 2021
@mknyszek added this to the Backlog milestone Aug 20, 2021

@kgersen (Author) commented Aug 20, 2021

Hi, the POC uses h2c to avoid encryption.
I also compared encrypted versions with the product we're developing, and we see the same issue.

@kgersen (Author) commented Aug 25, 2021

Caddy, a web server written in Go, has the same issue. I've updated the POC with a sample Caddy config: https://github.com/nspeed-app/http2issue/tree/main/3rd-party/Caddy

@neild (Contributor) commented Nov 8, 2021

> hi, the POC is using h2c to avoid encryption.

Apologies, I missed this in the original issue and didn't see the followup.

I have not had time to look at this further, adding to my queue. (But no promises on timing.)

@andig (Contributor) commented Nov 8, 2021

Wouldn't the first step be to run this against a non-Go server/client, to localize the problem to one side if possible?

@tandr commented Nov 10, 2021

Robert Engels has a CL related to this https://go-review.googlesource.com/c/net/+/362834

@robaho commented Nov 10, 2021

Copying my comment from https://groups.google.com/d/msgid/golang-nuts/89926c2f-ec73-43ad-be49-a8bc76a18345n%40googlegroups.com

HTTP/2 is a multiplexed protocol with independent streams. The Go implementation uses a common reader goroutine to read all of the connection content, then demuxes the streams and passes the data via pipes to the stream readers. This multithreaded design requires locks to coordinate. By managing the window size, the connection reader should never block writing to a stream buffer - but a stream reader may stall waiting for data to arrive, get descheduled, and then be quickly rescheduled when the reader places more data in the buffer - which is inefficient.

Out of the box on my machine, HTTP/1.1 is about 37 Gbps and HTTP/2 is about 7 Gbps.

Some things that jump out:

  1. The chunk size is too small. Using 1 MB pushed HTTP/1.1 from 37 Gbps to 50 Gbps, and HTTP/2 to 8 Gbps.

  2. The default buffer in io.Copy() is too small. Use io.CopyBuffer() with a larger buffer - I changed to 4 MB. This pushed HTTP/1.1 to 55 Gbps, and HTTP/2 to 8.2 Gbps. Not a big difference, but needed for later (see the sketch after this comment).

  3. The http2 receiver frame size of 16k is way too small. There is overhead on every frame - the most costly is updating the window.

I made some local mods to the net library, increasing the frame size to 256k, and the http2 performance went from 8 Gbps to 38 Gbps.

  4. I haven't tracked it down yet, but I don't think the window size update code is working as intended - it seems to be sending window updates (which are expensive due to locks) far too frequently. I think this is the area that could use the most improvement - using some heuristics, it should be possible to detect the sender rate and adjust the refresh rate (using high/low water marks).

  5. The implementation might need improvements using lock-free structures, atomic counters, and busy-waits in order to achieve maximum performance.

So 38 Gbps for HTTP/2 vs 55 Gbps for HTTP/1.1. Better, but still not great. Still, with some minor changes, the net package could allow setting a large frame size on a per-stream basis, which would enable much higher throughput. The gRPC library allows this.
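
A minimal sketch of points 1 and 2 above (illustrative only; the 1 MB chunk and 4 MB buffer are simply the values quoted above, and the names here are made up):

package bench

import (
    "io"
    "net/http"
)

// Server side (point 1): write the response body in large chunks instead of
// many small ones.
func payload(w http.ResponseWriter, r *http.Request) {
    chunk := make([]byte, 1<<20) // 1 MB per Write call
    for i := 0; i < 1024; i++ {  // ~1 GiB total
        if _, err := w.Write(chunk); err != nil {
            return
        }
    }
}

// countWriter is a plain io.Writer (no ReadFrom method), so io.CopyBuffer
// really uses the supplied buffer instead of an internal fast path.
type countWriter struct{ n int64 }

func (c *countWriter) Write(p []byte) (int, error) {
    c.n += int64(len(p))
    return len(p), nil
}

// Client side (point 2): drain the body with io.CopyBuffer and a 4 MB buffer
// rather than io.Copy's default 32 KB.
func drain(resp *http.Response) (int64, error) {
    defer resp.Body.Close()
    var cw countWriter
    buf := make([]byte, 4<<20)
    _, err := io.CopyBuffer(&cw, resp.Body, buf)
    return cw.n, err
}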

@robaho commented Nov 12, 2021

My CL goes a long way to addressing the issue.

Still, some additional testing has shown that the calls to update the window from the client (the downloader) don't seem to be optimal for large transfers - even with the 10x frame size. The window update calls cause contention on the locks.

@robaho commented Nov 12, 2021

Changing transport.go:2418 from
if v < transportDefaultStreamFlow-transportDefaultStreamMinRefresh {
to
if v < transportDefaultStreamFlow/2 {
results in a nearly 50% increase in throughput using the default frame size of 16k.

Ideally, this code would make the determination based on receive and consume rates along with the frame size.
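
A rough sketch of what such a heuristic could look like (purely illustrative; the names and thresholds below are invented, not the x/net/http2 code):

// shouldRefreshWindow reports whether the client should send a WINDOW_UPDATE,
// refreshing in large batches sized relative to the window and frame size
// rather than after every small read.
func shouldRefreshWindow(consumed, streamWindow, maxFrameSize int32) bool {
    threshold := streamWindow / 2
    if m := 4 * maxFrameSize; m > threshold {
        threshold = m // never refresh more often than every few full frames
    }
    return consumed >= threshold
}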

@aojea commented Dec 1, 2021

I don't know if you are referring to this buffer in one of your comments, but the profiler shows a lot of contention on the bufWriterPool:

go/src/net/http/h2_bundle.go

Lines 3465 to 3468 in 0e1d553

// TODO: pick a less arbitrary value? this is a bit under
// (3 x typical 1500 byte MTU) at least. Other than that,
// not much thought went into it.
const http2bufWriterPoolBufferSize = 4 << 10

Increasing the value there to the maxFrameSize value, and using the code shared in the description, takes throughput from 6 Gbps to 12.6 Gbps.
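
For reference, one way to surface this kind of contention (standard Go profiling, nothing specific to this issue) is to enable the mutex profile and expose pprof while the transfer runs:

package main

import (
    "net/http"
    _ "net/http/pprof" // registers the /debug/pprof/* handlers
    "runtime"
)

func main() {
    runtime.SetMutexProfileFraction(5) // sample roughly 1 in 5 contention events
    // ... start the h2c server / transfer from the POC here ...
    http.ListenAndServe("localhost:6060", nil)
}

Then inspect it with: go tool pprof http://localhost:6060/debug/pprof/mutex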

@robaho commented Dec 1, 2021

As I pointed out above, increasing the frame size can achieve 38 Gbps. The issue is that the constant is used for all connections, while the 'max frame size' is connection-dependent.

More importantly, that constant does not exist in golang.org/x/net/http2 - which is the basis of the future version.

@aojea commented Dec 2, 2021

> As I pointed out above, increasing the frame size can achieve 38 Gbps. The issue is that the constant is used for all connections, while the 'max frame size' is connection-dependent.

Yeah, I should have explained myself better, sorry. Increasing the frame size requires manual configuration; it's a user decision made to maximize throughput. I just wanted to add that maybe we can improve the default throughput with that change, since the author clearly left that parameter open to debate and it is a 2x win (at the cost of more memory, of course, but the buffer lives in a sync.Pool, which may alleviate that a bit).

> More importantly, that constant does not exist in golang.org/x/net/http2 - which is the basis of the future version.

it does, just with a different name 😄

https://github.com/golang/net/blob/0fccb6fa2b5ce302a9da5afc2513d351bd175889/http2/http2.go#L256-L259

IIUC, the http2 code in golang/go is a bundle generated from x/net/http2.

@neild (Contributor) commented Dec 6, 2021

Thanks for the excellent analysis!

Ideally, we shouldn't require the user to twiddle configuration parameters to get good performance. However, making the maximum client-initiated frame size user-configurable seems like a reasonable first step.

@gopherbot commented Dec 6, 2021

Change https://golang.org/cl/362834 mentions this issue: http2: add Transport.MaxReadFrameSize configuration setting
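
For illustration, once that CL lands, using the new setting would look roughly like this (a sketch assuming golang.org/x/net/http2 is imported directly; the 256 KB value mirrors the earlier experiment in this thread):

package tuned

import (
    "net/http"

    "golang.org/x/net/http2"
)

// NewClient returns an *http.Client whose HTTP/2 transport advertises a
// larger max read frame size than the 16 KB default.
func NewClient() (*http.Client, error) {
    t := &http.Transport{} // plus TLS configuration as needed
    h2, err := http2.ConfigureTransports(t) // attach HTTP/2 support to t
    if err != nil {
        return nil, err
    }
    h2.MaxReadFrameSize = 256 << 10 // larger frames mean fewer frame and window updates
    return &http.Client{Transport: t}, nil
}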

@kgersen (Author) commented May 19, 2022

Any update on this? Is someone at Google working on it, or is there no point in waiting?

@robaho commented May 20, 2022

There has been a CL submitted - it is stuck in review.
