
x/net/http2: standard http client panics listing an evictCount #43051

sergazyyev opened this issue Dec 7, 2020 · 10 comments


@sergazyyev sergazyyev commented Dec 7, 2020

What version of Go are you using (go version)?

go1.15.4 linux/amd64

Does this issue reproduce with the latest release?


What operating system and processor architecture are you using (go env)?


What did you do?

I use the standard HTTP client to call a single-host service many times, configured as client = &http.Client{Timeout: timeOut}. On the server I sometimes get a panic from the http2 hpack package, and after that the service stops serving all subsequent requests.

What did you expect to see?

Work without panics or errors.

What did you see instead?

panic stack trace
runtime/debug.Stack(0xc0005edf38, 0x4e77ba, 0xc0000ea1e0)
/usr/local/go/src/runtime/debug/stack.go:24 +0x9f
/usr/local/go/src/runtime/debug/stack.go:16 +0x25
/app/main.go:75 +0x57
panic(0xba4ca0, 0xc0011e81f0)
/usr/local/go/src/runtime/panic.go:969 +0x1b9
vendor/golang.org/x/net/http2/hpack.(*headerFieldTable).idToIndex(0xc001290280, 0xa, 0xc0005ee1f0)
/usr/local/go/src/vendor/ +0x12d
vendor/golang.org/x/net/http2/hpack.(*headerFieldTable).search(0xc001290280, 0xc000582330, 0xd, 0xc00064e000, 0x439, 0x0, 0x17, 0x0)
/usr/local/go/src/vendor/ +0x150
vendor/golang.org/x/net/http2/hpack.(*Encoder).searchTable(0xc001290280, 0xc000582330, 0xd, 0xc00064e000, 0x439, 0x0, 0x0, 0x400)
/usr/local/go/src/vendor/ +0x92
vendor/golang.org/x/net/http2/hpack.(*Encoder).WriteField(0xc001290280, 0xc000582330, 0xd, 0xc00064e000, 0x439, 0x0, 0x0, 0x0)
/usr/local/go/src/vendor/ +0x110
net/http.(*http2ClientConn).writeHeader(0xc000ee9500, 0xc000582330, 0xd, 0xc00064e000, 0x439)
/usr/local/go/src/net/http/h2_bundle.go:8118 +0xbc
net/http.(*http2ClientConn).encodeHeaders.func3(0xc9d8ef, 0xd, 0xc00064e000, 0x439)
/usr/local/go/src/net/http/h2_bundle.go:8057 +0x93
/usr/local/go/src/net/http/h2_bundle.go:8023 +0x3d5
net/http.(*http2ClientConn).encodeHeaders(0xc000ee9500, 0xc001034200, 0x1, 0x0, 0x0, 0x0, 0xc0001447b0, 0x15, 0x1b1f1f8, 0x7fa5ab7ea1f8, ...)
/usr/local/go/src/net/http/h2_bundle.go:8055 +0x637
net/http.(*http2ClientConn).roundTrip(0xc000ee9500, 0xc001034200, 0x0, 0x0, 0x0, 0x0)
/usr/local/go/src/net/http/h2_bundle.go:7524 +0x2bb
net/http.(*http2Transport).RoundTripOpt(0xc000f000c0, 0xc001034200, 0xbd3200, 0xc0010302a0, 0xc000f2e1c0, 0x5)
/usr/local/go/src/net/http/h2_bundle.go:6948 +0x170
net/http.http2noDialH2RoundTripper.RoundTrip(0xc000f000c0, 0xc001034200, 0xd77fa0, 0xc000f000c0, 0x0)
/usr/local/go/src/net/http/h2_bundle.go:9119 +0x3e
net/http.(*Transport).roundTrip(0x1ae3a40, 0xc001034200, 0xc824c0, 0x7fa5abb35101, 0xc001034200)
/usr/local/go/src/net/http/transport.go:527 +0xd8c
net/http.(*Transport).RoundTrip(0x1ae3a40, 0xc001034200, 0x1ae3a40, 0xbfebaa82c332827f, 0x1b5cf1f8e91f)
/usr/local/go/src/net/http/roundtrip.go:17 +0x35
net/http.send(0xc001034000, 0xd76e40, 0x1ae3a40, 0xbfebaa82c332827f, 0x1b5cf1f8e91f, 0xc00100c2a0, 0xc000010208, 0xbfebaa82c332827f, 0x1, 0x0)
/usr/local/go/src/net/http/client.go:252 +0x453
net/http.(*Client).send(0xc0010301e0, 0xc001034000, 0xbfebaa82c332827f, 0x1b5cf1f8e91f, 0xc00100c2a0, 0xc000010208, 0x0, 0x1, 0x203000)
/usr/local/go/src/net/http/client.go:176 +0xff
net/http.(*Client).do(0xc0010301e0, 0xc001034000, 0x0, 0x0, 0x0)
/usr/local/go/src/net/http/client.go:718 +0x45f
subscriptions-service/models/subscriptions.(*ServicesRequest).postGetServices(0xc0001457a0, 0xc0001457d0, 0xc000fbc060, 0x0, 0x0, 0x0, 0x0, 0x0)
/app/models/subscriptions/services.go:174 +0x6e5
subscriptions-service/models/subscriptions.(*ServicesRequest).Services(0xc0001457a0, 0xc0005ef7d0, 0x4, 0xc0001457c0, 0x1)
/app/models/subscriptions/services.go:128 +0xd4
subscriptions-service/controllers/subscriptions.Services(0xc000e95180, 0xc0005a3290, 0x9)
/app/controllers/subscriptions/services.go:29 +0x3a6
*App).next(0xc0010005a0, 0xc000e95180, 0xc00064e000, 0x439, 0x480)
/go/pkg/mod/ +0x1b9
*Ctx).Next(0xc000e95180, 0xc9d8ef, 0xd)
/go/pkg/mod/ +0x88
subscriptions-service/controllers/middleware.CheckAuthHeaderMiddleware(0xc000e95180, 0xca7818, 0x1b)
/app/controllers/middleware/auth_header.go:18 +0x16f
*Ctx).Next(0xc000e95180, 0xca7818, 0x1b)
/go/pkg/mod/ +0x59
, 0xc000ffad80, 0xc0005a3290)
/go/pkg/mod/ +0x305
*Ctx).Next(0xc000e95180, 0x1, 0xc00105c6f8)
/go/pkg/mod/ +0x59
main.main.func2(0xc000e95180, 0x0, 0x0)
/app/main.go:82 +0x5f
*App).next(0xc0010005a0, 0xc000e95180, 0xc000e95180, 0x32e249f, 0x32e249f00000010)
/go/pkg/mod/ +0x1b9
*App).handler(0xc0010005a0, 0xc000eaaf00)
/go/pkg/mod/ +0x11a
*Server).serveConn(0xc000f5e900, 0xd92840, 0xc0005aa008, 0x0, 0x0)
/go/pkg/mod/ +0x11ac
*workerPool).workerFunc(0xc001008140, 0xc00128de00)
/go/pkg/mod/ +0xc2
*workerPool).getCh.func1(0xc001008140, 0xc00128de00, 0xb8a540, 0xc00128de00)
/go/pkg/mod/ +0x35
created by *workerPool).getCh
/go/pkg/mod/ +0x115
time="2020-12-07T19:47:25+06:00" level=error msg="id (10) <= evictCount (12)" function=func1 line=56 package=main.main


@mvdan mvdan commented Dec 7, 2020

Why did you duplicate #43048 (comment)? This is likely still not enough information to tell if this is a bug in Go or not.


@sergazyyev sergazyyev commented Dec 7, 2020

I closed my previous ticket. This bug occurs only on my prod server and I cannot reproduce it manually. What information can I provide for you to analyse this issue?

@cagedmantis cagedmantis changed the title x/net/http2: x/net/http2: standard http client panics listing an evictCount Dec 7, 2020

@cagedmantis cagedmantis commented Dec 7, 2020

As @mvdan stated in #43048, we need some manner of reproducing this failure in order to isolate the problem.


@sergazyyev sergazyyev commented Dec 8, 2020

I couldn't reproduce this issue manually. I tried on my local machine, in a Docker container, and under different loads. As I mentioned before, it happens in a Kubernetes cluster when I send outbound requests from the standard HTTP client. I have now set Transport.ForceAttemptHTTP2 to false and it seems to have helped, but I don't understand the root cause of this problem.


@sergazyyev sergazyyev commented Dec 29, 2020

I captured an http2 debug trace log when this panic occurred; maybe it will help to solve this issue.


@liandeliang liandeliang commented Feb 10, 2021

Hi, I got this error message in recent days (go 1.15.8):

panic: id (34) <= evictCount (47)

goroutine 2059472 [running]:
vendor/golang.org/x/net/http2/hpack.(*headerFieldTable).idToIndex(0x400247a180, 0x22, 0x40001d3958)
\tvendor/ +0x114
vendor/golang.org/x/net/http2/hpack.(*headerFieldTable).search(0x400247a180, 0x4337b3, 0xa, 0x40000e2408, 0x11, 0x0, 0x1, 0xc5100)
\tvendor/ +0x104
vendor/golang.org/x/net/http2/hpack.(*Encoder).searchTable(0x400247a180, 0x4337b3, 0xa, 0x40000e2408, 0x11, 0x0, 0x0, 0x200)
\tvendor/ +0x98
vendor/golang.org/x/net/http2/hpack.(*Encoder).WriteField(0x400247a180, 0x4337b3, 0xa, 0x40000e2408, 0x11, 0x0, 0x40000d2000, 0x40000d2050)
\tvendor/ +0xe8
net/http.(*http2ClientConn).writeHeader(0x4002820300, 0x4337b3, 0xa, 0x40000e2408, 0x11)
\tnet/http/h2_bundle.go:8124 +0x8c
net/http.(*http2ClientConn).encodeHeaders.func3(0x4337b3, 0xa, 0x40000e2408, 0x11)
\tnet/http/h2_bundle.go:8063 +0x7c
\tnet/http/h2_bundle.go:7962 +0x8c
net/http.(*http2ClientConn).encodeHeaders(0x4002820300, 0x4001c10b00, 0x1, 0x0, 0x0, 0x5d, 0xc9, 0x0, 0x0, 0x4b3, ...)
\tnet/http/h2_bundle.go:8061 +0x494
net/http.(*http2ClientConn).roundTrip(0x4002820300, 0x4001c10b00, 0x0, 0x0, 0x0, 0x0)
\tnet/http/h2_bundle.go:7524 +0x230
net/http.(*http2Transport).RoundTripOpt(0x40000e0720, 0x4001c10b00, 0x3a5200, 0x400019a690, 0x40000e2400, 0x5)
\tnet/http/h2_bundle.go:6948 +0x144
net/http.http2noDialH2RoundTripper.RoundTrip(0x40000e0720, 0x4001c10b00, 0x4b6bc0, 0x40000e0720, 0x0)
\tnet/http/h2_bundle.go:9127 +0x34
net/http.(*Transport).roundTrip(0x40000ca640, 0x4001c10a00, 0x4002a6a798, 0x1bff0, 0xffffba710e98)
\tnet/http/transport.go:527 +0xa44
net/http.(*Transport).RoundTrip(0x40000ca640, 0x4001c10a00, 0x40000ca640, 0x0, 0x0)
\tnet/http/roundtrip.go:17 +0x30
net/http.send(0x4001c10a00, 0x4b5f60, 0x40000ca640, 0x0, 0x0, 0x0, 0x0, 0x40017a63a0, 0x4002a6a9b8, 0x1)
\tnet/http/client.go:252 +0x34c
net/http.(*Client).send(0x400019a510, 0x4001c10a00, 0x0, 0x0, 0x0, 0x40017a63a0, 0x0, 0x1, 0x4002a6aa38)
\tnet/http/client.go:176 +0xc8
net/http.(*Client).do(0x400019a510, 0x4001c10a00, 0x0, 0x0, 0x0)
\tnet/http/client.go:718 +0x340
... my user code ...

goroutine 73 [select]:
net/http.(*http2ClientConn).roundTrip(0x4002820300, 0x4000189500, 0x0, 0x0, 0x0, 0x0)
\tnet/http/h2_bundle.go:7609 +0x7bc
net/http.(*http2Transport).RoundTripOpt(0x40000e0720, 0x4000189500, 0x3a5200, 0x400019a690, 0x40000ba900, 0x5)
\tnet/http/h2_bundle.go:6948 +0x144
net/http.http2noDialH2RoundTripper.RoundTrip(0x40000e0720, 0x4000189500, 0x4b6bc0, 0x40000e0720, 0x0)
\tnet/http/h2_bundle.go:9127 +0x34
net/http.(*Transport).roundTrip(0x40000ca640, 0x4000189500, 0x40001d1188, 0x1bff0, 0xffffba711c28)
\tnet/http/transport.go:527 +0xa44
net/http.(*Transport).RoundTrip(0x40000ca640, 0x4000189500, 0x40000ca640, 0x0, 0x0)
\tnet/http/roundtrip.go:17 +0x30
net/http.send(0x4000189500, 0x4b5f60, 0x40000ca640, 0x0, 0x0, 0x0, 0x0, 0x400188b9f0, 0x40001d13a8, 0x1)
\tnet/http/client.go:252 +0x34c
net/http.(*Client).send(0x400019a510, 0x4000189500, 0x0, 0x0, 0x0, 0x400188b9f0, 0x0, 0x1, 0x40001d1428)
\tnet/http/client.go:176 +0xc8
net/http.(*Client).do(0x400019a510, 0x4000189500, 0x0, 0x0, 0x0)
\tnet/http/client.go:718 +0x340
... my user code ...


@krishnakumar4a4 krishnakumar4a4 commented Jul 12, 2021

We have a similar issue, reproduced in both lower and prod environments. After a couple of days of trying different combinations of configuration, I was finally able to figure out a minimal configuration that fixed this panic:

defaultTransport := &http.Transport{}
defaultTransport.DialContext = (&net.Dialer{}).DialContext
client := &http.Client{Timeout: time.Second * 10, Transport: defaultTransport}

Although this doesn't look like a real fix, it helped in our scenario, where our traffic was plain HTTP rather than HTTP/2. Here is the documentation of the ForceAttemptHTTP2 flag, which supports this workaround:

	// ForceAttemptHTTP2 controls whether HTTP/2 is enabled when a non-zero
	// Dial, DialTLS, or DialContext func or TLSClientConfig is provided.
	// By default, use of any those fields conservatively disables HTTP/2.
	// To use a custom dialer or TLS config and still attempt HTTP/2
	// upgrades, set this to true.
	ForceAttemptHTTP2 bool

It says that if we override the DialContext function with a custom dialer, HTTP/2 is conservatively disabled, and once it is disabled we no longer see the panic. I would like to spend some effort reproducing the panic locally and will share it here if I can (I am doubtful at the moment).


@krishnakumar4a4 krishnakumar4a4 commented Sep 10, 2021

Hi @mvdan and @cagedmantis ,

Since I have personally faced this problem, I have done a rough analysis, and I see a potential problem in the http2 encoder that could be causing this panic intermittently.

All the panics reported so far print id (%v) <= evictCount (%v). The failing condition is id <= t.evictCount in the idToIndex function, which runs as part of the searchTable operation.

Here is what I understood about these fields after some quick reading:

  • id - a unique id assigned to every header table entry; ids are 1-indexed
  • evictCount - the total number of evictions performed so far on the headers table

The unique id of a new entry is always calculated as len(headers table) + evictCount + 1. This tells us that any id that is generated can never be less than or equal to the current evictCount. That holds as long as evictCount does not change.

But evictCount can change. Please refer to the two commented points in the code:

func (e *Encoder) WriteField(f HeaderField) error {
	e.buf = e.buf[:0]

	if e.tableSizeUpdate {
		e.tableSizeUpdate = false
		if e.minSize < e.dynTab.maxSize {
			e.buf = appendTableSize(e.buf, e.minSize)
		}
		e.minSize = uint32Max
		e.buf = appendTableSize(e.buf, e.dynTab.maxSize)
	}

	idx, nameValueMatch := e.searchTable(f) // This function throws the panic
	if nameValueMatch {
		e.buf = appendIndexed(e.buf, idx)
	} else {
		indexing := e.shouldIndex(f)
		if indexing {
			e.dynTab.add(f) // This function can trigger eviction
		}

		if idx == 0 {
			e.buf = appendNewName(e.buf, f, indexing)
		} else {
			e.buf = appendIndexedName(e.buf, f, idx, indexing)
		}
	}
	n, err := e.w.Write(e.buf)
	if err == nil && n != len(e.buf) {
		err = io.ErrShortWrite
	}
	return err
}

While encoding a duplicate header, the e.dynTab.add(f) call could have triggered an eviction on a previous run, which in turn increments evictCount. So when searchTable(f) is called on the current run, the idToIndex function throws this panic because id > t.evictCount no longer holds.

I may not have all the details of the protocol or implementation. I would be interested to learn more.


@neild neild commented Oct 15, 2021

Do you ever write requests with trailers? CL 349594 fixes a race condition with HPACK encoding, but only for requests which contain trailers.

If you can try the most recent golang.org/x/net to see if it fixes the problem, that would be helpful; there have been a number of recent client fixes. You can replace the vendored stdlib HTTP/2 implementation with the one from x/net with:

import "golang.org/x/net/http2"
import "net/http"


@neild neild commented Oct 15, 2021

@krishnakumar4a4 e.dynTab.add(f) can trigger an eviction, but eviction removes the evicted entry from the table so a future call to e.searchTable(f) won't find that entry. See (*headerFieldTable).evictOldest.

I've looked through the HPACK encoding code, and if there's a way for this state to get out of sync without racy calls to the encoder, I'm not seeing it. I also took a shot at fuzzing the encoder, but that hasn't turned up anything yet. (I tried introducing a couple of bugs to verify my fuzz function was working correctly, and the fuzzer found them instantly, so that was nice.)

So my best hypothesis at the moment is the known (and fixed) race condition in trailer encoding.
