Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: http2 requests blocked after timeout #57478

Open
stampy88 opened this issue Dec 27, 2022 · 3 comments
Open

net/http: http2 requests blocked after timeout #57478

stampy88 opened this issue Dec 27, 2022 · 3 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@stampy88
Copy link

What version of Go are you using (go version)?

$ go version
go version go1.18.6 linux/amd64

Does this issue reproduce with the latest release?

Have not attempted yet as it is hard to reproduce. I am trying to attempt to write a reproducer.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/XXXX/.cache/go-build"
GOENV="/home/XXXX/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/XXXX/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/XXXX/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.18.6"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3899516689=/tmp/go-build -gno-record-gcc-switches"

What did you do?

The application does HTTP POSTs to configured clients endpoints whenever an event is received. The HTTP client is configured with a 5 second timeout so that long running requests don't block an event consumer too long.

What did you expect to see?

When a timeout occurs, I expected all go routines that the std library creates to handle the HTTP request to be cleaned up.

What did you see instead?

The server the app is connecting to supports HTTP 2. During periods where timeouts are occurring we can see the number of go routines steadily increase until the server it is trying to communicate to starts responding again. See stack traces below:

goroutine profile: total 1451
1258 @ 0x438936 0x449473 0x44944d 0x464865 0x4726e5 0x81d5af 0x81d59a 0x81bfee 0x468741
#	0x464864	sync.runtime_SemacquireMutex+0x24			/usr/local/go/src/runtime/sema.go:71
#	0x4726e4	sync.(*Mutex).lockSlow+0x164				/usr/local/go/src/sync/mutex.go:162
#	0x81d5ae	sync.(*Mutex).Lock+0x30e				/usr/local/go/src/sync/mutex.go:81
#	0x81d599	net/http.(*http2clientStream).cleanupWriteRequest+0x2f9	/usr/local/go/src/net/http/h2_bundle.go:8159
#	0x81bfed	net/http.(*http2clientStream).doRequest+0x2d		/usr/local/go/src/net/http/h2_bundle.go:7905

36 @ 0x438936 0x406a0c 0x406478 0x1005f13 0x468741
#	0x1005f12	main.main.func3.1+0xf2	/XXXXXX/main.go:383

16 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x787c1d 0x4f8b78 0x787e05 0x785416 0x78ae6f 0x78ae70 0x51af74 0x4c5f7a 0x7fe96e 0x7fe928 0x7ff1d5 0x8214f0 0x82094f 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259	/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247			/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28				/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44				/usr/local/go/src/net/net.go:183
#	0x787c1c	crypto/tls.(*atLeastReader).Read+0x3c		/usr/local/go/src/crypto/tls/conn.go:785
#	0x4f8b77	bytes.(*Buffer).ReadFrom+0x97			/usr/local/go/src/bytes/buffer.go:204
#	0x787e04	crypto/tls.(*Conn).readFromUntil+0xe4		/usr/local/go/src/crypto/tls/conn.go:807
#	0x785415	crypto/tls.(*Conn).readRecordOrCCS+0x115	/usr/local/go/src/crypto/tls/conn.go:614
#	0x78ae6e	crypto/tls.(*Conn).readRecord+0x16e		/usr/local/go/src/crypto/tls/conn.go:582
#	0x78ae6f	crypto/tls.(*Conn).Read+0x16f			/usr/local/go/src/crypto/tls/conn.go:1285
#	0x51af73	bufio.(*Reader).Read+0x1b3			/usr/local/go/src/bufio/bufio.go:236
#	0x4c5f79	io.ReadAtLeast+0x99				/usr/local/go/src/io/io.go:331
#	0x7fe96d	io.ReadFull+0x6d				/usr/local/go/src/io/io.go:350
#	0x7fe927	net/http.http2readFrameHeader+0x27		/usr/local/go/src/net/http/h2_bundle.go:1566
#	0x7ff1d4	net/http.(*http2Framer).ReadFrame+0x94		/usr/local/go/src/net/http/h2_bundle.go:1830
#	0x8214ef	net/http.(*http2clientConnReadLoop).run+0x12f	/usr/local/go/src/net/http/h2_bundle.go:8820
#	0x82094e	net/http.(*http2ClientConn).readLoop+0x6e	/usr/local/go/src/net/http/h2_bundle.go:8716

13 @ 0x438936 0x449473 0x44944d 0x464745 0x474172 0x100ab15 0x10099af 0x1006073 0x468741
#	0x464744	sync.runtime_Semacquire+0x24		/usr/local/go/src/runtime/sema.go:56
#	0x474171	sync.(*WaitGroup).Wait+0x51		/usr/local/go/src/sync/waitgroup.go:136
#	0x100ab14	main.(*server).doProcessPacket+0x1034	XXXXXX/processpacket.go:168
#	0x10099ae	main.(*server).processPacket+0x18e	XXXXXX/processpacket.go:45
#	0x1006072	main.main.func3.1+0x252			XXXXXX/main.go:390

12 @ 0x438936 0x448392 0x857275 0x468741
#	0x857274	net/http.(*persistConn).writeLoop+0xf4	/usr/local/go/src/net/http/transport.go:2392

12 @ 0x438936 0x449473 0x44944d 0x464865 0x4726e5 0x825fb2 0x825f7e 0x81d545 0x81bfee 0x468741
#	0x464864	sync.runtime_SemacquireMutex+0x24			/usr/local/go/src/runtime/sema.go:71
#	0x4726e4	sync.(*Mutex).lockSlow+0x164				/usr/local/go/src/sync/mutex.go:162
#	0x825fb1	sync.(*Mutex).Lock+0x51					/usr/local/go/src/sync/mutex.go:81
#	0x825f7d	net/http.(*http2ClientConn).writeStreamReset+0x1d	/usr/local/go/src/net/http/h2_bundle.go:9574
#	0x81d544	net/http.(*http2clientStream).cleanupWriteRequest+0x2a4	/usr/local/go/src/net/http/h2_bundle.go:8145
#	0x81bfed	net/http.(*http2clientStream).doRequest+0x2d		/usr/local/go/src/net/http/h2_bundle.go:7905

11 @ 0x438936 0x448392 0x81bb71 0x817cb7 0x8264db 0x8264d5 0x84b790 0x832ab9 0x7f43f8 0x7f3c9b 0x7f60d5 0x100e0e6 0x100e0d9 0x468741
#	0x81bb70	net/http.(*http2ClientConn).RoundTrip+0x450		/usr/local/go/src/net/http/h2_bundle.go:7874
#	0x817cb6	net/http.(*http2Transport).RoundTripOpt+0x1b6		/usr/local/go/src/net/http/h2_bundle.go:7218
#	0x8264da	net/http.(*http2Transport).RoundTrip+0x1a		/usr/local/go/src/net/http/h2_bundle.go:7179
#	0x8264d4	net/http.http2noDialH2RoundTripper.RoundTrip+0x14	/usr/local/go/src/net/http/h2_bundle.go:9682
#	0x84b78f	net/http.(*Transport).roundTrip+0x38f			/usr/local/go/src/net/http/transport.go:539
#	0x832ab8	net/http.(*Transport).RoundTrip+0x18			/usr/local/go/src/net/http/roundtrip.go:17
#	0x7f43f7	net/http.send+0x5d7					/usr/local/go/src/net/http/client.go:252
#	0x7f3c9a	net/http.(*Client).send+0x9a				/usr/local/go/src/net/http/client.go:176
#	0x7f60d4	net/http.(*Client).do+0x8f4				/usr/local/go/src/net/http/client.go:725
#	0x100e0e5	net/http.(*Client).Do+0x985				/usr/local/go/src/net/http/client.go:593
#	0x100e0d8	main.(*server).sendRestMessage+0x978			/XXXXXX/rest.go:89

10 @ 0x438936 0x406a0c 0x406438 0x8f57d8 0x468741
#	0x8f57d7	google.golang.org/grpc.(*addrConn).resetTransport+0x477	/var/go/go/pkg/mod/google.golang.org/grpc@v1.28.1/clientconn.go:1155

10 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x787c1d 0x4f8b78 0x787e05 0x785416 0x78ae6f 0x78ae70 0x51af74 0x4c5f7a 0x86c2ce 0x86c288 0x86cb35 0x889c73 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88					/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31					/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259				/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247						/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28							/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44							/usr/local/go/src/net/net.go:183
#	0x787c1c	crypto/tls.(*atLeastReader).Read+0x3c					/usr/local/go/src/crypto/tls/conn.go:785
#	0x4f8b77	bytes.(*Buffer).ReadFrom+0x97						/usr/local/go/src/bytes/buffer.go:204
#	0x787e04	crypto/tls.(*Conn).readFromUntil+0xe4					/usr/local/go/src/crypto/tls/conn.go:807
#	0x785415	crypto/tls.(*Conn).readRecordOrCCS+0x115				/usr/local/go/src/crypto/tls/conn.go:614
#	0x78ae6e	crypto/tls.(*Conn).readRecord+0x16e					/usr/local/go/src/crypto/tls/conn.go:582
#	0x78ae6f	crypto/tls.(*Conn).Read+0x16f						/usr/local/go/src/crypto/tls/conn.go:1285
#	0x51af73	bufio.(*Reader).Read+0x1b3						/usr/local/go/src/bufio/bufio.go:236
#	0x4c5f79	io.ReadAtLeast+0x99							/usr/local/go/src/io/io.go:331
#	0x86c2cd	io.ReadFull+0x6d							/usr/local/go/src/io/io.go:350
#	0x86c287	golang.org/x/net/http2.readFrameHeader+0x27				/var/go/go/pkg/mod/golang.org/x/net@v0.0.0-20220425223048-2871e0cb64e4/http2/frame.go:237
#	0x86cb34	golang.org/x/net/http2.(*Framer).ReadFrame+0x94				/var/go/go/pkg/mod/golang.org/x/net@v0.0.0-20220425223048-2871e0cb64e4/http2/frame.go:498
#	0x889c72	google.golang.org/grpc/internal/transport.(*http2Client).reader+0x232	/var/go/go/pkg/mod/google.golang.org/grpc@v1.28.1/internal/transport/http2_client.go:1270

10 @ 0x438936 0x448392 0x8790b5 0x879745 0x882045 0x468741
#	0x8790b4	google.golang.org/grpc/internal/transport.(*controlBuffer).get+0x114	/var/go/go/pkg/mod/google.golang.org/grpc@v1.28.1/internal/transport/controlbuf.go:395
#	0x879744	google.golang.org/grpc/internal/transport.(*loopyWriter).run+0x84	/var/go/go/pkg/mod/google.golang.org/grpc@v1.28.1/internal/transport/controlbuf.go:513
#	0x882044	google.golang.org/grpc/internal/transport.newHTTP2Client.func3+0x64	/var/go/go/pkg/mod/google.golang.org/grpc@v1.28.1/internal/transport/http2_client.go:346

9 @ 0x438936 0x448392 0x81c16c 0x81bfde 0x468741
#	0x81c16b	net/http.(*http2clientStream).writeRequest+0x14b	/usr/local/go/src/net/http/h2_bundle.go:7929
#	0x81bfdd	net/http.(*http2clientStream).doRequest+0x1d		/usr/local/go/src/net/http/h2_bundle.go:7904

6 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x787c1d 0x4f8b78 0x787e05 0x785416 0x78ae6f 0x78ae70 0x85476e 0x51a9a3 0x51aafd 0x85558c 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259	/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247			/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28				/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44				/usr/local/go/src/net/net.go:183
#	0x787c1c	crypto/tls.(*atLeastReader).Read+0x3c		/usr/local/go/src/crypto/tls/conn.go:785
#	0x4f8b77	bytes.(*Buffer).ReadFrom+0x97			/usr/local/go/src/bytes/buffer.go:204
#	0x787e04	crypto/tls.(*Conn).readFromUntil+0xe4		/usr/local/go/src/crypto/tls/conn.go:807
#	0x785415	crypto/tls.(*Conn).readRecordOrCCS+0x115	/usr/local/go/src/crypto/tls/conn.go:614
#	0x78ae6e	crypto/tls.(*Conn).readRecord+0x16e		/usr/local/go/src/crypto/tls/conn.go:582
#	0x78ae6f	crypto/tls.(*Conn).Read+0x16f			/usr/local/go/src/crypto/tls/conn.go:1285
#	0x85476d	net/http.(*persistConn).Read+0x4d		/usr/local/go/src/net/http/transport.go:1929
#	0x51a9a2	bufio.(*Reader).fill+0x102			/usr/local/go/src/bufio/bufio.go:106
#	0x51aafc	bufio.(*Reader).Peek+0x5c			/usr/local/go/src/bufio/bufio.go:144
#	0x85558b	net/http.(*persistConn).readLoop+0x1ab		/usr/local/go/src/net/http/transport.go:2093

3 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x787c1d 0x4f8b78 0x787e05 0x785416 0x78ae6f 0x78ae70 0x85476e 0x51a9a3 0x51b58f 0x7f12a5 0x7f0d28 0x7f0fee 0x8492dc 0x8491a5 0x858ea2 0x517fbf 0x517bbb 0x517818 0xf2b36b 0xf2afba 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88				/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31				/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259			/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247					/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28						/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44						/usr/local/go/src/net/net.go:183
#	0x787c1c	crypto/tls.(*atLeastReader).Read+0x3c				/usr/local/go/src/crypto/tls/conn.go:785
#	0x4f8b77	bytes.(*Buffer).ReadFrom+0x97					/usr/local/go/src/bytes/buffer.go:204
#	0x787e04	crypto/tls.(*Conn).readFromUntil+0xe4				/usr/local/go/src/crypto/tls/conn.go:807
#	0x785415	crypto/tls.(*Conn).readRecordOrCCS+0x115			/usr/local/go/src/crypto/tls/conn.go:614
#	0x78ae6e	crypto/tls.(*Conn).readRecord+0x16e				/usr/local/go/src/crypto/tls/conn.go:582
#	0x78ae6f	crypto/tls.(*Conn).Read+0x16f					/usr/local/go/src/crypto/tls/conn.go:1285
#	0x85476d	net/http.(*persistConn).Read+0x4d				/usr/local/go/src/net/http/transport.go:1929
#	0x51a9a2	bufio.(*Reader).fill+0x102					/usr/local/go/src/bufio/bufio.go:106
#	0x51b58e	bufio.(*Reader).ReadSlice+0x2e					/usr/local/go/src/bufio/bufio.go:371
#	0x7f12a4	net/http/internal.readChunkLine+0x24				/usr/local/go/src/net/http/internal/chunked.go:129
#	0x7f0d27	net/http/internal.(*chunkedReader).beginChunk+0x27		/usr/local/go/src/net/http/internal/chunked.go:48
#	0x7f0fed	net/http/internal.(*chunkedReader).Read+0x14d			/usr/local/go/src/net/http/internal/chunked.go:98
#	0x8492db	net/http.(*body).readLocked+0x3b				/usr/local/go/src/net/http/transfer.go:844
#	0x8491a4	net/http.(*body).Read+0x124					/usr/local/go/src/net/http/transfer.go:836
#	0x858ea1	net/http.(*bodyEOFSignal).Read+0x141				/usr/local/go/src/net/http/transport.go:2774
#	0x517fbe	encoding/json.(*Decoder).refill+0x17e				/usr/local/go/src/encoding/json/stream.go:165
#	0x517bba	encoding/json.(*Decoder).readValue+0xba				/usr/local/go/src/encoding/json/stream.go:140
#	0x517817	encoding/json.(*Decoder).Decode+0x77				/usr/local/go/src/encoding/json/stream.go:63
#	0xf2b36a	XXXXXX/kuberesolver.(*streamWatcher).Decode+0x6a	XXXXXX/kuberesolver@v2.1.7+incompatible/stream.go:96
#	0xf2afb9	XXXXXX/kuberesolver.(*streamWatcher).receive+0x119	XXXXXX/kuberesolver@v2.1.7+incompatible/stream.go:72

3 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x8345ed 0x51a9a3 0x51b58f 0x51b7e7 0x7d4ff9 0x82f9b9 0x82f9ba 0x835b8a 0x839f8b 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259	/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247			/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28				/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44				/usr/local/go/src/net/net.go:183
#	0x8345ec	net/http.(*connReader).Read+0x16c		/usr/local/go/src/net/http/server.go:780
#	0x51a9a2	bufio.(*Reader).fill+0x102			/usr/local/go/src/bufio/bufio.go:106
#	0x51b58e	bufio.(*Reader).ReadSlice+0x2e			/usr/local/go/src/bufio/bufio.go:371
#	0x51b7e6	bufio.(*Reader).ReadLine+0x26			/usr/local/go/src/bufio/bufio.go:400
#	0x7d4ff8	net/textproto.(*Reader).readLineSlice+0x98	/usr/local/go/src/net/textproto/reader.go:57
#	0x82f9b8	net/textproto.(*Reader).ReadLine+0x78		/usr/local/go/src/net/textproto/reader.go:38
#	0x82f9b9	net/http.readRequest+0x79			/usr/local/go/src/net/http/request.go:1029
#	0x835b89	net/http.(*conn).readRequest+0x249		/usr/local/go/src/net/http/server.go:988
#	0x839f8a	net/http.(*conn).serve+0x32a			/usr/local/go/src/net/http/server.go:1891

3 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x85476e 0x51a9a3 0x51aafd 0x85558c 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259	/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247			/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28				/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44				/usr/local/go/src/net/net.go:183
#	0x85476d	net/http.(*persistConn).Read+0x4d		/usr/local/go/src/net/http/transport.go:1929
#	0x51a9a2	bufio.(*Reader).fill+0x102			/usr/local/go/src/bufio/bufio.go:106
#	0x51aafc	bufio.(*Reader).Peek+0x5c			/usr/local/go/src/bufio/bufio.go:144
#	0x85558b	net/http.(*persistConn).readLoop+0x1ab		/usr/local/go/src/net/http/transport.go:2093

3 @ 0x438936 0x448392 0x856179 0x468741
#	0x856178	net/http.(*persistConn).readLoop+0xd98	/usr/local/go/src/net/http/transport.go:2213

3 @ 0x438936 0x448392 0x8ed3cf 0x468741
#	0x8ed3ce	google.golang.org/grpc.(*ccBalancerWrapper).watcher+0x8e	/var/go/go/pkg/mod/google.golang.org/grpc@v1.28.1/balancer_conn_wrappers.go:69

3 @ 0x438936 0x448392 0xf29867 0xf289db 0xf2b5de 0xf2b50e 0x468741
#	0xf29866	XXXXXX/kuberesolver.(*kResolver).watch+0x206		XXXXXX/kuberesolver@v2.1.7+incompatible/builder.go:276
#	0xf289da	XXXXXX/kuberesolver.(*kubeBuilder).Build.func1+0x3a	XXXXXX/kuberesolver@v2.1.7+incompatible/builder.go:161
#	0xf2b5dd	XXXXXX/kuberesolver.until.func1+0x3d			XXXXXX/kuberesolver@v2.1.7+incompatible/util.go:19
#	0xf2b50d	XXXXXX/kuberesolver.until+0x4d				XXXXXX/kuberesolver@v2.1.7+incompatible/util.go:20

2 @ 0x438936 0x448392 0x81c9e9 0x81bfde 0x468741
#	0x81c9e8	net/http.(*http2clientStream).writeRequest+0x9c8	/usr/local/go/src/net/http/h2_bundle.go:8042
#	0x81bfdd	net/http.(*http2clientStream).doRequest+0x1d		/usr/local/go/src/net/http/h2_bundle.go:7904

2 @ 0x438936 0x448392 0x858134 0x84bbc9 0x832ab9 0x7f43f8 0x7f3c9b 0x7f60d5 0x100e0e6 0x100e0d9 0x468741
#	0x858133	net/http.(*persistConn).roundTrip+0x973	/usr/local/go/src/net/http/transport.go:2620
#	0x84bbc8	net/http.(*Transport).roundTrip+0x7c8	/usr/local/go/src/net/http/transport.go:594
#	0x832ab8	net/http.(*Transport).RoundTrip+0x18	/usr/local/go/src/net/http/roundtrip.go:17
#	0x7f43f7	net/http.send+0x5d7			/usr/local/go/src/net/http/client.go:252
#	0x7f3c9a	net/http.(*Client).send+0x9a		/usr/local/go/src/net/http/client.go:176
#	0x7f60d4	net/http.(*Client).do+0x8f4		/usr/local/go/src/net/http/client.go:725
#	0x100e0e5	net/http.(*Client).Do+0x985		/usr/local/go/src/net/http/client.go:593
#	0x100e0d8	main.(*server).sendRestMessage+0x978	XXXXXXs/rest.go:89

1 @ 0x40bf14 0x464eaf 0xf2c6d9 0x468741
#	0x464eae	os/signal.signal_recv+0x2e	/usr/local/go/src/runtime/sigqueue.go:151
#	0xf2c6d8	os/signal.loop+0x18		/usr/local/go/src/os/signal/signal_unix.go:23

1 @ 0x438936 0x406a0c 0x406438 0x10055e9 0x438572 0x468741
#	0x10055e8	main.main+0x1028	XXXXXX/main.go:425
#	0x438571	runtime.main+0x211	/usr/local/go/src/runtime/proc.go:250

1 @ 0x438936 0x406a0c 0x406438 0xf38312 0x468741
#	0xf38311	XXXXXX/app.(*Server).handleSignals+0x151	XXXXXX/app/server.go:194

1 @ 0x438936 0x406a0c 0x406478 0xea7614 0x468741
#	0xea7613	github.com/eclipse/paho%2emqtt%2egolang.startComms.func2+0x73	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/net.go:430

1 @ 0x438936 0x406a0c 0x406478 0xea76b1 0x468741
#	0xea76b0	github.com/eclipse/paho%2emqtt%2egolang.startComms.func1+0x70	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/net.go:407

1 @ 0x438936 0x406a0c 0x406478 0xeaab1c 0x468741
#	0xeaab1b	github.com/eclipse/paho%2emqtt%2egolang.(*router).matchAndDispatch.func2+0xbb	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/router.go:174

1 @ 0x438936 0x406a0c 0x406478 0xf62628 0x468741
#	0xf62627	github.com/streadway/amqp.(*consumers).buffer+0x107	/var/go/go/pkg/mod/github.com/streadway/amqp@v0.0.0-20190404075320-75d898a42a94/consumers.go:61

1 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x51af74 0x4c5f7a 0xf63993 0xf6396f 0xf6048a 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88			/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31			/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259		/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247				/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28					/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44					/usr/local/go/src/net/net.go:183
#	0x51af73	bufio.(*Reader).Read+0x1b3				/usr/local/go/src/bufio/bufio.go:236
#	0x4c5f79	io.ReadAtLeast+0x99					/usr/local/go/src/io/io.go:331
#	0xf63992	io.ReadFull+0x52					/usr/local/go/src/io/io.go:350
#	0xf6396e	github.com/streadway/amqp.(*reader).ReadFrame+0x2e	/var/go/go/pkg/mod/github.com/streadway/amqp@v0.0.0-20190404075320-75d898a42a94/read.go:49
#	0xf60489	github.com/streadway/amqp.(*Connection).reader+0x1c9	/var/go/go/pkg/mod/github.com/streadway/amqp@v0.0.0-20190404075320-75d898a42a94/connection.go:521

1 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x787c1d 0x4f8b78 0x787e05 0x785416 0x78ae6f 0x78ae70 0x4c5f7a 0xe8111f 0xe81102 0xea4ce5 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88					/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31					/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259				/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247						/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28							/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44							/usr/local/go/src/net/net.go:183
#	0x787c1c	crypto/tls.(*atLeastReader).Read+0x3c					/usr/local/go/src/crypto/tls/conn.go:785
#	0x4f8b77	bytes.(*Buffer).ReadFrom+0x97						/usr/local/go/src/bytes/buffer.go:204
#	0x787e04	crypto/tls.(*Conn).readFromUntil+0xe4					/usr/local/go/src/crypto/tls/conn.go:807
#	0x785415	crypto/tls.(*Conn).readRecordOrCCS+0x115				/usr/local/go/src/crypto/tls/conn.go:614
#	0x78ae6e	crypto/tls.(*Conn).readRecord+0x16e					/usr/local/go/src/crypto/tls/conn.go:582
#	0x78ae6f	crypto/tls.(*Conn).Read+0x16f						/usr/local/go/src/crypto/tls/conn.go:1285
#	0x4c5f79	io.ReadAtLeast+0x99							/usr/local/go/src/io/io.go:331
#	0xe8111e	io.ReadFull+0x5e							/usr/local/go/src/io/io.go:350
#	0xe81101	github.com/eclipse/paho.mqtt.golang/packets.ReadPacket+0x41		/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/packets/packets.go:131
#	0xea4ce4	github.com/eclipse/paho%2emqtt%2egolang.startIncoming.func1+0xe4	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/net.go:124

1 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x83411f 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259	/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247			/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28				/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44				/usr/local/go/src/net/net.go:183
#	0x83411e	net/http.(*connReader).backgroundRead+0x3e	/usr/local/go/src/net/http/server.go:672

1 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ceb05 0x4ceaf2 0x694229 0x6a5525 0x788528 0x7889f1 0x789cf4 0x8176e9 0x51c5a2 0x825fdc 0x81d545 0x81bfee 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88			/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31			/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ceb04	internal/poll.(*pollDesc).waitWrite+0x2c4		/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
#	0x4ceaf1	internal/poll.(*FD).Write+0x2b1				/usr/local/go/src/internal/poll/fd_unix.go:391
#	0x694228	net.(*netFD).Write+0x28					/usr/local/go/src/net/fd_posix.go:96
#	0x6a5524	net.(*conn).Write+0x44					/usr/local/go/src/net/net.go:195
#	0x788527	crypto/tls.(*Conn).write+0x107				/usr/local/go/src/crypto/tls/conn.go:920
#	0x7889f0	crypto/tls.(*Conn).writeRecordLocked+0x350		/usr/local/go/src/crypto/tls/conn.go:988
#	0x789cf3	crypto/tls.(*Conn).Write+0x3f3				/usr/local/go/src/crypto/tls/conn.go:1159
#	0x8176e8	net/http.http2stickyErrWriter.Write+0x148		/usr/local/go/src/net/http/h2_bundle.go:7133
#	0x51c5a1	bufio.(*Writer).Flush+0x61				/usr/local/go/src/bufio/bufio.go:628
#	0x825fdb	net/http.(*http2ClientConn).writeStreamReset+0x7b	/usr/local/go/src/net/http/h2_bundle.go:9576
#	0x81d544	net/http.(*http2clientStream).cleanupWriteRequest+0x2a4	/usr/local/go/src/net/http/h2_bundle.go:8145
#	0x81bfed	net/http.(*http2clientStream).doRequest+0x2d		/usr/local/go/src/net/http/h2_bundle.go:7905

1 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4d0a0c 0x4d09f9 0x695ab5 0x6ae528 0x6ad6fd 0x83f2e5 0x83ef1d 0x10069cf 0x100695b 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4d0a0b	internal/poll.(*pollDesc).waitRead+0x22b	/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4d09f8	internal/poll.(*FD).Accept+0x218		/usr/local/go/src/internal/poll/fd_unix.go:614
#	0x695ab4	net.(*netFD).accept+0x34			/usr/local/go/src/net/fd_unix.go:172
#	0x6ae527	net.(*TCPListener).accept+0x27			/usr/local/go/src/net/tcpsock_posix.go:139
#	0x6ad6fc	net.(*TCPListener).Accept+0x3c			/usr/local/go/src/net/tcpsock.go:288
#	0x83f2e4	net/http.(*Server).Serve+0x384			/usr/local/go/src/net/http/server.go:3039
#	0x83ef1c	net/http.(*Server).ListenAndServe+0x7c		/usr/local/go/src/net/http/server.go:2968
#	0x10069ce	net/http.ListenAndServe+0xce			/usr/local/go/src/net/http/server.go:3222
#	0x100695a	main.main.func1+0x5a				XXXXXX/main.go:286

1 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4d0a0c 0x4d09f9 0x695ab5 0x6ae528 0x6ad6fd 0x83f2e5 0xf37dff 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88				/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31				/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4d0a0b	internal/poll.(*pollDesc).waitRead+0x22b			/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4d09f8	internal/poll.(*FD).Accept+0x218				/usr/local/go/src/internal/poll/fd_unix.go:614
#	0x695ab4	net.(*netFD).accept+0x34					/usr/local/go/src/net/fd_unix.go:172
#	0x6ae527	net.(*TCPListener).accept+0x27					/usr/local/go/src/net/tcpsock_posix.go:139
#	0x6ad6fc	net.(*TCPListener).Accept+0x3c					/usr/local/go/src/net/tcpsock.go:288
#	0x83f2e4	net/http.(*Server).Serve+0x384					/usr/local/go/src/net/http/server.go:3039
#	0xf37dfe	XXXXXX/app.(*Server).Start.func1+0xbe	XXXXXX/app/server.go:146

1 @ 0x438936 0x448392 0x1006492 0x468741
#	0x1006491	main.main.func2+0x151	XXXXXX/main.go:356

1 @ 0x438936 0x448392 0xe9c8fb 0x468741
#	0xe9c8fa	github.com/eclipse/paho%2emqtt%2egolang.(*client).startCommsWorkers.func2+0xfa	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/client.go:663

1 @ 0x438936 0x448392 0xe9ce5c 0x468741
#	0xe9ce5b	github.com/eclipse/paho%2emqtt%2egolang.(*client).startCommsWorkers.func1+0x17b	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/client.go:628

1 @ 0x438936 0x448392 0xea51cd 0x468741
#	0xea51cc	github.com/eclipse/paho%2emqtt%2egolang.startIncomingComms.func1+0x18c	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/net.go:175

1 @ 0x438936 0x448392 0xea6565 0x468741
#	0xea6564	github.com/eclipse/paho%2emqtt%2egolang.startOutgoingComms.func1+0x1e4	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/net.go:277

1 @ 0x438936 0x448392 0xea9569 0x468741
#	0xea9568	github.com/eclipse/paho%2emqtt%2egolang.keepalive+0x1e8	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/ping.go:48

1 @ 0x438936 0x448392 0xead22a 0xf480d3 0xf47f0d 0xf48226 0x10017bf 0x100b2c5 0x10099af 0x1006073 0x468741
#	0xead229	github.com/eclipse/paho%2emqtt%2egolang.(*baseToken).WaitTimeout+0x69			/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/token.go:84
#	0xf480d2	XXXXXX/mqtt.(*MqttBuddy).SendMessage.func1+0x32	XXXXXX/mqtt/buddy.go:212
#	0xf47f0c	XXXXXX/mqtt.(*MqttBuddy).SendMessage+0x14c		XXXXXX/mqtt/buddy.go:222
#	0xf48225	XXXXXX/mqtt.BuddySendAndDone+0x105			XXXXXX/mqtt/buddy.go:244
#	0x10017be	main.(*server).sendAzureMessage+0x73e							XXXXXX/azure.go:98
#	0x100b2c4	main.(*server).doProcessPacket+0x17e4							XXXXXX/processpacket.go:157
#	0x10099ae	main.(*server).processPacket+0x18e							XXXXXX/processpacket.go:45
#	0x1006072	main.main.func3.1+0x252									XXXXXX/main.go:390

1 @ 0x438936 0x448392 0xf60738 0x468741
#	0xf60737	github.com/streadway/amqp.(*Connection).heartbeater+0x1b7	/var/go/go/pkg/mod/github.com/streadway/amqp@v0.0.0-20190404075320-75d898a42a94/connection.go:551

1 @ 0x438936 0x448392 0xf73aa5 0x468741
#	0xf73aa4	github.com/assembla/cony.(*Client).Loop.func1+0x104	/var/go/go/pkg/mod/github.com/assembla/cony@v0.3.2/client.go:156

1 @ 0x438936 0x448392 0xf742b9 0x468741
#	0xf742b8	github.com/assembla/cony.(*Consumer).serve+0x1d8	/var/go/go/pkg/mod/github.com/assembla/cony@v0.3.2/consumer.go:83

1 @ 0x438936 0x449473 0x44944d 0x464745 0x474172 0xea750a 0x468741
#	0x464744	sync.runtime_Semacquire+0x24					/usr/local/go/src/runtime/sema.go:56
#	0x474171	sync.(*WaitGroup).Wait+0x51					/usr/local/go/src/sync/waitgroup.go:136
#	0xea7509	github.com/eclipse/paho%2emqtt%2egolang.startComms.func3+0x29	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/net.go:438

1 @ 0x438936 0x449473 0x44944d 0x464865 0x4726e5 0x81ce1b 0x81cdf1 0x81c4c5 0x81bfde 0x468741
#	0x464864	sync.runtime_SemacquireMutex+0x24				/usr/local/go/src/runtime/sema.go:71
#	0x4726e4	sync.(*Mutex).lockSlow+0x164					/usr/local/go/src/sync/mutex.go:162
#	0x81ce1a	sync.(*Mutex).Lock+0x9a						/usr/local/go/src/sync/mutex.go:81
#	0x81cdf0	net/http.(*http2clientStream).encodeAndWriteHeaders+0x70	/usr/local/go/src/net/http/h2_bundle.go:8064
#	0x81c4c4	net/http.(*http2clientStream).writeRequest+0x4a4		/usr/local/go/src/net/http/h2_bundle.go:7986
#	0x81bfdd	net/http.(*http2clientStream).doRequest+0x1d			/usr/local/go/src/net/http/h2_bundle.go:7904

1 @ 0x462bc5 0xee1f75 0xee1d8d 0xedeccb 0xeed9a5 0xf38167 0x83b78f 0x83d129 0xefa162 0x83ed7b 0x83a237 0x468741
#	0x462bc4	runtime/pprof.runtime_goroutineProfileWithLabels+0x24		/usr/local/go/src/runtime/mprof.go:753
#	0xee1f74	runtime/pprof.writeRuntimeProfile+0xb4				/usr/local/go/src/runtime/pprof/pprof.go:725
#	0xee1d8c	runtime/pprof.writeGoroutine+0x4c				/usr/local/go/src/runtime/pprof/pprof.go:685
#	0xedecca	runtime/pprof.(*Profile).WriteTo+0x14a				/usr/local/go/src/runtime/pprof/pprof.go:332
#	0xeed9a4	net/http/pprof.handler.ServeHTTP+0x4a4				/usr/local/go/src/net/http/pprof/pprof.go:253
#	0xf38166	XXXXXX/app.pprofIndexOverride+0x206	XXXXXX/app/server.go:161
#	0x83b78e	net/http.HandlerFunc.ServeHTTP+0x2e				/usr/local/go/src/net/http/server.go:2084
#	0x83d128	net/http.(*ServeMux).ServeHTTP+0x148				/usr/local/go/src/net/http/server.go:2462
#	0xefa161	github.com/gorilla/mux.(*Router).ServeHTTP+0x201		/var/go/go/pkg/mod/github.com/gorilla/mux@v1.7.2/mux.go:212
#	0x83ed7a	net/http.serverHandler.ServeHTTP+0x43a				/usr/local/go/src/net/http/server.go:2916
#	0x83a236	net/http.(*conn).serve+0x5d6					/usr/local/go/src/net/http/server.go:1966

@dmitshur
Copy link
Contributor

CC @neild.

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 28, 2022
@dmitshur dmitshur added this to the Backlog milestone Dec 28, 2022
@rhysh
Copy link
Contributor

rhysh commented Jan 5, 2023

Thanks for splitting this out. Can you clarify what specifically the problem is? It sounds like the number of goroutines is the bug for you ("I expected all go routines that the std library creates to handle the HTTP request to be cleaned up"), but looking at the goroutines stacks I wonder if the larger problem is that the Transport continues to schedule new requests onto the stuck http2 connection even after many requests on that connection have all timed out.


I see goroutines doing the following outbound http/2 work:

16 outbound http/2 connections that exist in the app (rooted at net/http.(*http2ClientConn).readLoop)

11 requests the app is trying to send over http/2 right now (in net/http.(*http2ClientConn).RoundTrip)

1258 + 12 trying to clean up requests that were scheduled onto the stuck connection (in net/http.(*http2clientStream).cleanupWriteRequest)

9+2 in net/http.(*http2clientStream).doRequest calling net/http.(*http2clientStream).writeRequest, with 9 on line 7929 and 2 on line 8042. I think these correspond to the 11 goroutines in net/http.(*http2ClientConn).RoundTrip.

1 more in net/http.(*http2clientStream).doRequest, net/http.(*http2clientStream).writeRequest (line 7986), net/http.(*http2clientStream).encodeAndWriteHeaders, sync.(*Mutex).Lock on cc.wmu. This request holds cc.reqHeaderMu, but does not hold cc.wmu.

1 goroutine trying to write a frame to an http/2 client connection (net/http.(*http2clientStream).doRequest, net/http.(*http2clientStream).cleanupWriteRequest, net/http.(*http2ClientConn).writeStreamReset, bufio.(*Writer).Flush, ..., crypto/tls.(*Conn).Write, ..., internal/poll.(*pollDesc).waitWrite. This request holds cc.wmu.

The following outbound http/1.1 work (I think unrelated, but helpful to classify):
12 outbound http/1.1 connections that exist in the app (rooted at net/http.(*persistConn).writeLoop)
6+3 idle http/1.1 client connections, 6 using https and 3 using http (net/http.(*persistConn).readLoop leading to net/http.(*persistConn).Read)
3 active http/1.1 client connections, reading response body (net/http.(*body).Read leading to net/http.(*persistConn).Read), corresponding to another 3 goroutines in net/http.(*persistConn).readLoop without any Read calls on their stack.
2 outbound http/1.1 requests still waiting for response headers (net/http.(*persistConn).roundTrip).

The following inbound http/2 work (I think unrelated, but helpful to classify):
10 inbound http/2 connections for gRPC, google.golang.org/grpc/internal/transport.(*http2Client).reader


As I understand it:

  1. The request whose goroutine holds cc.wmu and is blocked in crypto/tls.(*Conn).Write remains blocked for a long time because there's a problem with the underlying TCP connection (maybe the network is dropping all packets, or all packets for that peer, or all packets for that five-tuple). This request has been waiting since the kernel's write buffer for the TCP connection filled up / writes to the file descriptor started blocking.
  2. The request whose goroutine holds cc.reqHeaderMu cannot proceed because it's trying to acquire cc.wmu. This request has been waiting since shortly after writes to the file descriptor started blocking.
  3. The 2 goroutines in writeRequest on line 8042 have managed to use cc.reqHeaderMu before their context values timed out. That's interesting .. maybe the TCP connection isn't completely broken, but is instead very, very slow? Or is it possible that these 2 requests didn't have a timeout set, and have been around since the start of the problem?
  4. The 9 goroutines in writeRequest on line 7929 are waiting for cc.reqHeaderMu (or for their context values to expire), but they can't get it because the http/2 connection is busy (items 2 and 1). The requests for these goroutines are very recent / current. The Transport chose this http/2 connection as the right one to use for putting those requests on the wire.
  5. The 1258 + 12 goroutines in cleanupWriteRequest (I haven't investigated the difference between them) are each for a request that was scheduled to a (the?) slow connection and have not yet had a chance to use cc.wmu. That gives an indication of how long the crypto/tls.(*Conn).Write call has been blocked (a long time).

@stampy88
Copy link
Author

stampy88 commented Jan 9, 2023

Hi @rhysh , well, the bug isn't necessarily the go routines themselves, but rather the potential for these resources, e.g. go routines, memory, etc. to keep growing because of the problem you stated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

3 participants