
net/http2: possible memory leak in frameScratchBuffer #38049

Open

zoic21 opened this issue Mar 24, 2020 · 7 comments

@zoic21 commented Mar 24, 2020

What version of Go are you using (go version)?

$ go version
go version go1.14.1 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/root/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build656777711=/tmp/go-build -gno-record-gcc-switches"

What did you do?

To begin with, I'm new to the Go community, so this may be an issue on my side. I use the project https://github.com/mmatczuk/go-http-tunnel to build an HTTP/2 reverse proxy. I tested with about 700 connected clients (but I get the same issue with 0). Memory grows with every request; after some analysis with pprof I got:

(pprof) top
Showing nodes accounting for 449.13MB, 98.24% of 457.15MB total
Dropped 42 nodes (cum <= 2.29MB)
Showing top 10 nodes out of 14
      flat  flat%   sum%        cum   cum%
  446.12MB 97.59% 97.59%   446.12MB 97.59%  golang.org/x/net/http2.(*ClientConn).frameScratchBuffer
    2.51MB  0.55% 98.14%     2.51MB  0.55%  bufio.NewWriterSize (inline)
    0.50MB  0.11% 98.24%     3.01MB  0.66%  golang.org/x/net/http2.(*Transport).newClientConn
         0     0% 98.24%     2.51MB  0.55%  bufio.NewWriter (inline)
         0     0% 98.24%     5.01MB  1.10%  crypto/tls.(*Conn).Handshake
         0     0% 98.24%     5.01MB  1.10%  crypto/tls.(*Conn).serverHandshake
         0     0% 98.24%     5.01MB  1.10%  crypto/tls.(*serverHandshakeStateTLS13).handshake

Going deeper:

(pprof) list frameScratchBuffer
Total: 457.15MB
ROUTINE ======================== golang.org/x/net/http2.(*ClientConn).frameScratchBuffer in /root/go/src/golang.org/x/net/http2/transport.go
  446.12MB   446.12MB (flat, cum) 97.59% of Total
         .          .    884:                   cc.mu.Unlock()
         .          .    885:                   return buf[:size]
         .          .    886:           }
         .          .    887:   }
         .          .    888:   cc.mu.Unlock()
  446.12MB   446.12MB    889:   return make([]byte, size)
         .          .    890:}
         .          .    891:
         .          .    892:func (cc *ClientConn) putFrameScratchBuffer(buf []byte) {
         .          .    893:   cc.mu.Lock()
         .          .    894:   defer cc.mu.Unlock()

The same query a few minutes later shows 517.31MB (I am continuously sending requests to the HTTP/2 server).

What did you expect to see?

Memory usage to stabilize.

What did you see instead?

Memory grows without stopping

I am just discovering the Go universe, and I hope this is a bad analysis on my part, but if someone could help me it would be a pleasure.

Thanks in advance.
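
For context, a heap profile like the ones above can be captured via net/http/pprof; here is a minimal sketch (the listen address is a placeholder, not taken from the report):

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof handlers on the default mux
)

func main() {
	// Expose the profiling endpoints on a side port; the address is illustrative.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// ... start the reverse proxy here ...
	select {}
}

Running go tool pprof http://localhost:6060/debug/pprof/heap against it then gives the (pprof) prompt used above.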

@fraenkel (Contributor) commented Mar 25, 2020

Every new connection has its own set of scratch buffers which can be at most 4 x 512KB.
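
For reference, this is roughly the logic behind the allocation shown in the profile, paraphrased from the golang.org/x/net/http2 transport.go quoted above (package internals; exact details may differ between releases):

const maxAllocFrameSize = 512 << 10 // 512KB cap per scratch buffer

// frameScratchBuffer returns a write buffer, reusing one from the
// connection's free list when a large enough buffer is available.
func (cc *ClientConn) frameScratchBuffer() []byte {
	cc.mu.Lock()
	size := cc.maxFrameSize
	if size > maxAllocFrameSize {
		size = maxAllocFrameSize
	}
	for i, buf := range cc.freeBuf {
		if len(buf) >= int(size) {
			cc.freeBuf[i] = nil
			cc.mu.Unlock()
			return buf[:size]
		}
	}
	cc.mu.Unlock()
	return make([]byte, size) // the allocation visible at transport.go:889 above
}

// putFrameScratchBuffer returns a buffer to the free list,
// retaining at most 4 buffers per connection.
func (cc *ClientConn) putFrameScratchBuffer(buf []byte) {
	cc.mu.Lock()
	defer cc.mu.Unlock()
	const maxBufs = 4
	if len(cc.freeBuf) < maxBufs {
		cc.freeBuf = append(cc.freeBuf, buf)
		return
	}
	for i, old := range cc.freeBuf {
		if old == nil {
			cc.freeBuf[i] = buf
			return
		}
	}
	// otherwise drop the buffer and let the GC reclaim it
}

So the retained memory is bounded per connection, but a large number of live connections multiplies that bound.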

@zoic21 (Author) commented Mar 25, 2020

OK, so if I understand correctly, the maximum memory should be around 2MB * number of clients? I will run some tests to verify that.

Would it be possible to change this to 2 * 512KB buffers, for example?
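
(Worked out from the figures above, assuming every buffer reaches the 512KB cap: 4 × 512 KB = 2 MB retained per connection, so 700 connections could hold on to roughly 700 × 2 MB ≈ 1.4 GB in the worst case.)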

@fraenkel (Contributor) commented Mar 25, 2020

@andybons (Member) commented Mar 25, 2020

Will await test from @zoic21

@andybons andybons added this to the Unplanned milestone Mar 25, 2020
@andybons andybons changed the title net/http2 Possible memory leak in frameScratchBuffer net/http2: possible memory leak in frameScratchBuffer Mar 25, 2020
@zoic21 (Author) commented Mar 26, 2020

After a one-day test with 700 clients and 5 parallel requests, running continuously for 10 hours, memory does this:
[image: memory usage graph]

I think it's OK. We can see that memory grows very quickly at the beginning (presumably while the buffers fill up) and then calms down, stabilizing around 1.1GB.

But if someday someone adds an option to change the maximum buffer size I'll take it, or maybe a mechanism to remove old clients' buffers. For information, here is what I have now in go pprof:

Showing nodes accounting for 1205.87MB, 97.59% of 1235.60MB total
Dropped 81 nodes (cum <= 6.18MB)
Showing top 10 nodes out of 27
      flat  flat%   sum%        cum   cum%
 1198.35MB 96.98% 96.98%  1198.35MB 96.98%  golang.org/x/net/http2.(*ClientConn).frameScratchBuffer
    7.03MB  0.57% 97.55%     7.03MB  0.57%  golang.org/x/net/http2/hpack.(*headerFieldTable).addEntry
    0.50MB  0.04% 97.59%    10.54MB  0.85%  golang.org/x/net/http2.(*Transport).newClientConn
         0     0% 97.59%     7.01MB  0.57%  crypto/tls.(*Conn).Handshake
         0     0% 97.59%     7.01MB  0.57%  crypto/tls.(*Conn).serverHandshake
         0     0% 97.59%     7.01MB  0.57%  crypto/tls.(*serverHandshakeStateTLS13).handshake
         0     0% 97.59%     7.17MB  0.58%  github.com/jeedom/gotunnel.(*Server).ServeHTTP
         0     0% 97.59%    18.04MB  1.46%  github.com/jeedom/gotunnel.(*Server).handleClient
         0     0% 97.59%    10.54MB  0.85%  github.com/jeedom/gotunnel.(*connPool).AddConn
         0     0% 97.59%     6.67MB  0.54%  github.com/jeedom/gotunnel.transfer
(pprof) list frameScratchBuffer
Total: 1.21GB
ROUTINE ======================== golang.org/x/net/http2.(*ClientConn).frameScratchBuffer in /root/go/src/golang.org/x/net/http2/transport.go
    1.17GB     1.17GB (flat, cum) 96.98% of Total
         .          .    884:                   cc.mu.Unlock()
         .          .    885:                   return buf[:size]
         .          .    886:           }
         .          .    887:   }
         .          .    888:   cc.mu.Unlock()
    1.17GB     1.17GB    889:   return make([]byte, size)
         .          .    890:}
         .          .    891:
         .          .    892:func (cc *ClientConn) putFrameScratchBuffer(buf []byte) {
         .          .    893:   cc.mu.Lock()
         .          .    894:   defer cc.mu.Unlock()

So for me it's OK now that I understand how it works; we can close the issue (sorry for the inconvenience) if everybody agrees.

@fraenkel (Contributor) commented Mar 26, 2020

Given go-http-tunnel has vendored the http2 code, you could experiment and adjust the value. I would however be curious how it affects throughput as well as memory.
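
A minimal sketch of that kind of experiment, assuming the constant names match the transport.go quoted above (they may differ between x/net versions), would be to lower the per-buffer cap and the number of retained buffers in the vendored copy:

// in the vendored golang.org/x/net/http2/transport.go
const maxAllocFrameSize = 256 << 10 // was 512 << 10: halves the per-buffer cap

// and inside putFrameScratchBuffer:
const maxBufs = 2 // was 4: retain fewer scratch buffers per connection

Smaller buffers mean more frequent allocations for large request bodies, so it would be worth measuring throughput alongside memory.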

@zoic21 (Author) commented Mar 26, 2020

I had already switched to the latest version of net/http2; here is my go.mod:

module github.com/jeedom/gotunnel

go 1.14

require (
	github.com/calmh/luhn v2.0.0+incompatible
	github.com/felixge/tcpkeepalive v0.0.0-20160804073959-5bb0b2dea91e
	github.com/go-sql-driver/mysql v1.5.0
	github.com/inconshreveable/go-vhost v0.0.0-20160627193104-06d84117953b
	golang.org/x/net v0.0.0-20200324143707-d3edc9973b7e
	golang.org/x/text v0.1.1-0.20171102192421-88f656faf3f3 // indirect
	gopkg.in/yaml.v2 v2.2.2
	github.com/golang/mock v1.2.0
	github.com/cenkalti/backoff v2.1.1+incompatible
)

Same result
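
To experiment with a locally modified copy of x/net (for example with the constants above tweaked) instead of the published module, a replace directive would be one way to do it; the local path here is hypothetical:

module github.com/jeedom/gotunnel

go 1.14

require golang.org/x/net v0.0.0-20200324143707-d3edc9973b7e

// point the build at a local checkout containing the modified http2 code
replace golang.org/x/net => ../net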

ribbybibby added a commit to utilitywarehouse/vault-kube-cloud-credentials that referenced this issue May 4, 2020
We were leaking memory steadily in
`golang.org/x/net/http2.(*ClientConn).frameScratchBuffer` every time we
retrieved a new set of credentials.

I believe this was because we were creating a new `*vault.Config` every time we
retrieved credentials from vault so that we would pick up any new changes to
`VAULT_CACERT`. This had the unfortunate side effect of creating, amongst other
things, a new pool of connections for each new config. Each connection has a set
of 4 scratch buffers with a maximum size of 512KB each. I believe that by
creating new connection pools we were proliferating these buffers.

See: golang/go#38049.

I haven't investigated and confirmed this theory fully but in any case, creating
one `*vault.Config` and simply reading config values from the environment with
`config.ReadEnvironment()` stops the leak.

I've also added the UW operational endpoint so that we can profile these issues
live in future.
ribbybibby added a commit to utilitywarehouse/vault-kube-cloud-credentials that referenced this issue May 5, 2020
We were leaking memory steadily in
`golang.org/x/net/http2.(*ClientConn).frameScratchBuffer` every time we
retrieved a new set of credentials.

I believe this is because we were never closing the response body when we were
retrieving a lease from vault. From my testing, it seems that the rate at which
memory was leaking was also exacerbated by the fact that we were creating a new
`*vault.Config` every time we retrieved credentials from vault. Presumably this
was because we were creating a new pool of connections (and therefore a new set
of scratch buffers) for each config, although I haven't fully validated this hypothesis.

See: golang/go#38049.

Ensuring that the response body is closed after decoding the json from it and
using one `*vault.Config` seems to stop the leak.

I've also added the UW operational endpoint so that we can profile these issues
live in future.
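
A sketch of the pattern described in this commit message (the package, type, function and URL below are illustrative, not taken from the repository): decode the JSON, and make sure the response body is closed on every path.

package credentials

import (
	"encoding/json"
	"net/http"
)

// Lease is a placeholder for the decoded credentials payload.
type Lease struct {
	LeaseID string `json:"lease_id"`
}

// fetchLease always closes the response body, even on error paths, so the
// underlying HTTP/2 connection resources can be reused or released.
func fetchLease(client *http.Client, url string) (*Lease, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close() // the missing piece described above

	var lease Lease
	if err := json.NewDecoder(resp.Body).Decode(&lease); err != nil {
		return nil, err
	}
	return &lease, nil
}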