Skip to content

x/net/http2: header modifications after WriteHeader(http.StatusEarlyHints) cause a map race #67940

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Timer opened this issue Jun 12, 2024 · 11 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@Timer
Copy link

Timer commented Jun 12, 2024

Go version

go version go1.22.1 linux/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/home/timer/.cache/go-build'
GOENV='/home/timer/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/timer/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/timer/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go1.22.1.linux-arm64'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go1.22.1.linux-arm64/pkg/tool/linux_arm64'
GOVCS=''
GOVERSION='go1.22.1'
GCCGO='gccgo'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/timer/Development/proxy/https-terminator/go.mod'
GOWORK='/home/timer/Development/proxy/go.work'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build1654315752=/tmp/go-build -gno-record-gcc-switches'

What did you do?

We run a HTTP/2 server that flushes an early headers map (status 103 - http.StatusEarlyHints) using func (http.ResponseWriter) WriteHeader(statusCode int).

After calling WriteHeader, we may need to mutate the Header map for the final 2xx response based on information we discover. In our application, we simply clear the Headers after flushing the early hint:

			r.W.Header().Set("Server", "...")
			r.W.Header().Set("Link", "...")

			r.W.WriteHeader(http.StatusEarlyHints)

			// remove the headers after sending the early hints to avoid
			// sending duplicate headers when the final response is sent:
			r.W.Header().Del("Server")
			r.W.Header().Del("Link")

What did you see happen?

After >60 minutes of runtime and millions of requests (this is hard to reproduce), we were faced with this crash:

fatal error: concurrent map read and map write
goroutine 44382232 [running]:
golang.org/x/net/http2.encodeHeaders(0xc09549f798?, 0x1412ef5?, {0x0?, 0xc0416e72b9?, 0xc07fbd2f18?})
golang.org/x/net@v0.24.0/http2/write.go:343 +0x156
golang.org/x/net/http2.(*writeResHeaders).writeFrame(0xc04d077c00, {0x15edb08, 0xc08528a000})
golang.org/x/net@v0.24.0/http2/write.go:217 +0xe9
golang.org/x/net/http2.(*serverConn).writeFrameAsync(0xc08528a000, {{0x15e9140?, 0xc04d077c00?}, 0xc01a73d860?, 0xc0326e0c60?}, 0xc070d666a8?)
golang.org/x/net@v0.24.0/http2/server.go:852 +0x74
created by golang.org/x/net/http2.(*serverConn).startFrameWrite in goroutine 44380821
golang.org/x/net@v0.24.0/http2/server.go:1266 +0x390
goroutine 1 [semacquire, 87 minutes]:

This appears to be due to writeFrameAsync running on a separate goroutine and the encodeHeaders function doing a naive iteration over the values (vv := h[k], line 343 from above).

What did you expect to see?

Encoding headers needs to be concurrent-safe now that a server can send and mutate its headers multiple times before sending the final response (e.g. 103, 103, 103, 200).

@gopherbot gopherbot added this to the Unreleased milestone Jun 12, 2024
@prattmic prattmic changed the title x/net: header modifications after WriteHeader(http.StatusEarlyHints) cause a map race x/net/http2: header modifications after WriteHeader(http.StatusEarlyHints) cause a map race Jun 12, 2024
@prattmic prattmic added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 12, 2024
@prattmic
Copy link
Member

cc @neild @tombergan

@ruyi789
Copy link

ruyi789 commented Jun 12, 2024

r.W.Header().Del("Server")
Header doesn't need to be deleted? Because Response creates a new Header each time.

@neild
Copy link
Contributor

neild commented Jun 13, 2024

WriteHeader should block until the headers have been sent, to avoid precisely this type of race. I've looked through this code path a few times now, and I don't see a way for the header map to be touched after WriteHeader returns, but I might be missing something.

Do you possibly have a middleware layer wrapping the ResponseWriter that might be deferring the WriteHeader call to a later time?

If you can run this under the race detector (go build -race), it might make the failure more reproducible.)

@neild neild added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jun 17, 2024
@gopherbot
Copy link
Contributor

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@gopherbot gopherbot closed this as not planned Won't fix, can't repro, duplicate, stale Jul 17, 2024
@Timer
Copy link
Author

Timer commented Aug 9, 2024

No, we are not wrapping the ResponseWriter with a middleware that'd delay the write. I'm enabling the race detector in hopes that it prints the information about the access and will report back today!

@Timer
Copy link
Author

Timer commented Aug 9, 2024

Got the race! Sharing it here while I continue to evaluate it.

2024-08-09T19:19:38.955255887Z ==================
2024-08-09T19:19:38.955262912Z WARNING: DATA RACE
2024-08-09T19:19:38.955265201Z Write at 0x00c00bfa5740 by goroutine 7675145:
2024-08-09T19:19:38.955267226Z   runtime.mapdelete_faststr()
2024-08-09T19:19:38.955269405Z       /usr/local/go/src/runtime/map_faststr.go:301 +0x0
2024-08-09T19:19:38.955271096Z   net/textproto.MIMEHeader.Del()
2024-08-09T19:19:38.955272820Z       /usr/local/go/src/net/textproto/header.go:55 +0x4d
2024-08-09T19:19:38.955274358Z   net/http.Header.Del()
2024-08-09T19:19:38.955275804Z       /usr/local/go/src/net/http/header.go:81 +0x27
2024-08-09T19:19:38.955277621Z   main.main.ProxyUpstream.func14()
2024-08-09T19:19:38.955279964Z       /usr/src/https-terminator/middlewares/proxyhttp/http_handler.go:562 +0x17cb
2024-08-09T19:19:38.955281075Z   main.main.REDACTED.func15()
2024-08-09T19:19:38.955282521Z       /usr/src/https-terminator/middlewares/proxyhttp/REDACTED.go:183 +0xc7e
2024-08-09T19:19:38.955283567Z   main.main.REDACTED.func16()
2024-08-09T19:19:38.955284607Z       /usr/src/https-terminator/middlewares/proxyhttp/REDACTED.go:26 +0xba4
2024-08-09T19:19:38.955285748Z   main.main.REDACTED.func17()
2024-08-09T19:19:38.955286753Z       /usr/src/https-terminator/middlewares/proxyhttp/REDACTED.go:24 +0x9cf
2024-08-09T19:19:38.955287988Z   main.main.REDACTED.func18()
2024-08-09T19:19:38.955289160Z       /usr/src/https-terminator/middlewares/proxyhttp/REDACTED.go:53 +0x2e1
2024-08-09T19:19:38.955290119Z   main.main.REDACTED.func19()
2024-08-09T19:19:38.955291243Z       /usr/src/https-terminator/middlewares/threats/REDACTED.go:23 +0x273
2024-08-09T19:19:38.955292407Z   main.main.REDACTED.func20()
2024-08-09T19:19:38.955295540Z       /usr/src/https-terminator/middlewares/proxyhttp/REDACTED.go:8 +0x19c
2024-08-09T19:19:38.955298336Z   main.main.REDACTED.func21()
2024-08-09T19:19:38.955299811Z       /usr/src/https-terminator/middlewares/proxyhttp/REDACTED.go:17 +0x116
2024-08-09T19:19:38.955301227Z   github.com/vercel/proxy/https-terminator/middlewares/proxyhttp.(*server).ServeHTTP()
2024-08-09T19:19:38.955302888Z       /usr/src/https-terminator/middlewares/proxyhttp/server.go:201 +0x571
2024-08-09T19:19:38.955305221Z   net/http.serverHandler.ServeHTTP()
2024-08-09T19:19:38.955306537Z       /usr/local/go/src/net/http/server.go:2943 +0x2a1
2024-08-09T19:19:38.955307491Z   net/http.initALPNRequest.ServeHTTP()
2024-08-09T19:19:38.955308475Z       /usr/local/go/src/net/http/server.go:3551 +0x35e
2024-08-09T19:19:38.955309699Z   net/http.(*initALPNRequest).ServeHTTP()
2024-08-09T19:19:38.955310705Z       <autogenerated>:1 +0x7b
2024-08-09T19:19:38.955311715Z   net/http.Handler.ServeHTTP-fm()
2024-08-09T19:19:38.955312916Z       <autogenerated>:1 +0x67
2024-08-09T19:19:38.955313934Z   golang.org/x/net/http2.(*serverConn).runHandler()
2024-08-09T19:19:38.955314966Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:2414 +0x25a
2024-08-09T19:19:38.955315980Z   golang.org/x/net/http2.(*serverConn).scheduleHandler.func1()
2024-08-09T19:19:38.955322658Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:2348 +0x5d
2024-08-09T19:19:38.955323910Z 
2024-08-09T19:19:38.955325457Z Previous read at 0x00c00bfa5740 by goroutine 7675260:
2024-08-09T19:19:38.955327005Z   runtime.mapiterinit()
2024-08-09T19:19:38.955328549Z       /usr/local/go/src/runtime/map.go:816 +0x0
2024-08-09T19:19:38.955329933Z   golang.org/x/net/http2.(*sorter).Keys()
2024-08-09T19:19:38.955331759Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/http2.go:344 +0x8f
2024-08-09T19:19:38.955333741Z   golang.org/x/net/http2.encodeHeaders()
2024-08-09T19:19:38.955334817Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/write.go:340 +0x13d
2024-08-09T19:19:38.955335846Z   golang.org/x/net/http2.(*writeResHeaders).writeFrame()
2024-08-09T19:19:38.955337101Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/write.go:217 +0x1cd
2024-08-09T19:19:38.955338182Z   golang.org/x/net/http2.(*serverConn).writeFrameAsync()
2024-08-09T19:19:38.955339244Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:890 +0x116
2024-08-09T19:19:38.955340382Z   golang.org/x/net/http2.(*serverConn).startFrameWrite.func2()
2024-08-09T19:19:38.955341415Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:1304 +0x6e
2024-08-09T19:19:38.955342347Z 
2024-08-09T19:19:38.955343398Z Goroutine 7675145 (running) created at:
2024-08-09T19:19:38.955356990Z   golang.org/x/net/http2.(*serverConn).scheduleHandler()
2024-08-09T19:19:38.955358243Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:2348 +0x28a
2024-08-09T19:19:38.955359222Z   golang.org/x/net/http2.(*serverConn).processHeaders()
2024-08-09T19:19:38.955360262Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:2070 +0xd30
2024-08-09T19:19:38.955361329Z   golang.org/x/net/http2.(*serverConn).processFrame()
2024-08-09T19:19:38.955362448Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:1571 +0x3f9
2024-08-09T19:19:38.955363452Z   golang.org/x/net/http2.(*serverConn).processFrameFromReader()
2024-08-09T19:19:38.955364574Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:1509 +0x324
2024-08-09T19:19:38.955365563Z   golang.org/x/net/http2.(*serverConn).serve()
2024-08-09T19:19:38.955366580Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:998 +0x17bc
2024-08-09T19:19:38.955367960Z   golang.org/x/net/http2.(*Server).serveConn()
2024-08-09T19:19:38.955369001Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:572 +0x1a90
2024-08-09T19:19:38.955370090Z   golang.org/x/net/http2.(*Server).ServeConn()
2024-08-09T19:19:38.955371192Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:436 +0x185
2024-08-09T19:19:38.955372317Z   golang.org/x/net/http2.ConfigureServer.func1()
2024-08-09T19:19:38.955373408Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:355 +0xbe
2024-08-09T19:19:38.955374353Z   net/http.(*conn).serve()
2024-08-09T19:19:38.955375307Z       /usr/local/go/src/net/http/server.go:1922 +0x1d41
2024-08-09T19:19:38.955376720Z   net/http.(*Server).Serve.func3()
2024-08-09T19:19:38.955377697Z       /usr/local/go/src/net/http/server.go:3091 +0x4f
2024-08-09T19:19:38.955378552Z 
2024-08-09T19:19:38.955379499Z Goroutine 7675260 (finished) created at:
2024-08-09T19:19:38.955380584Z   golang.org/x/net/http2.(*serverConn).startFrameWrite()
2024-08-09T19:19:38.955381555Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:1304 +0x77b
2024-08-09T19:19:38.955382677Z   golang.org/x/net/http2.(*serverConn).scheduleFrameWrite()
2024-08-09T19:19:38.955383614Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:1406 +0x3c8
2024-08-09T19:19:38.955384616Z   golang.org/x/net/http2.(*serverConn).writeFrame()
2024-08-09T19:19:38.955389980Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:1252 +0x3c7
2024-08-09T19:19:38.955391151Z   golang.org/x/net/http2.(*serverConn).serve()
2024-08-09T19:19:38.955392125Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:985 +0x15b1
2024-08-09T19:19:38.955393350Z   golang.org/x/net/http2.(*Server).serveConn()
2024-08-09T19:19:38.955394737Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:572 +0x1a90
2024-08-09T19:19:38.955396303Z   golang.org/x/net/http2.(*Server).ServeConn()
2024-08-09T19:19:38.955398538Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:436 +0x185
2024-08-09T19:19:38.955399815Z   golang.org/x/net/http2.ConfigureServer.func1()
2024-08-09T19:19:38.955401363Z       /go/pkg/mod/golang.org/x/net@v0.26.0/http2/server.go:355 +0xbe
2024-08-09T19:19:38.955402833Z   net/http.(*conn).serve()
2024-08-09T19:19:38.955404301Z       /usr/local/go/src/net/http/server.go:1922 +0x1d41
2024-08-09T19:19:38.955405880Z   net/http.(*Server).Serve.func3()
2024-08-09T19:19:38.955407377Z       /usr/local/go/src/net/http/server.go:3091 +0x4f
2024-08-09T19:19:38.955408869Z ==================

@Timer
Copy link
Author

Timer commented Aug 9, 2024

I may be missing it, but I don't see where WriteHeader blocks until the write is complete.

The http2 WriteHeader implementation (golang.org/x/net@v0.26.0/http2/server.go) uses the writeHeader impl in responseWriterState (:2943).

For 1xx status codes, it calls rws.conn.writeHeaders:

func (rws *responseWriterState) writeHeader(code int) {
	if rws.wroteHeader {
		return
	}

	checkWriteHeaderCode(code)

	// Handle informational headers
	if code >= 100 && code <= 199 {
		// Per RFC 8297 we must not clear the current header map
		h := rws.handlerHeader

		_, cl := h["Content-Length"]
		_, te := h["Transfer-Encoding"]
		if cl || te {
			h = h.Clone()
			h.Del("Content-Length")
			h.Del("Transfer-Encoding")
		}

		rws.conn.writeHeaders(rws.stream, &writeResHeaders{
			streamID:    rws.stream.id,
			httpResCode: code,
			h:           h,
			endStream:   rws.handlerDone && !rws.hasTrailers(),
		})

		return
	}

writeHeaders submits the call to writeFrameFromHandler (golang.org/x/net@v0.26.0/http2/server.go:2440). This function does not wait for any signals in the success case:

func (sc *serverConn) writeHeaders(st *stream, headerData *writeResHeaders) error {
	sc.serveG.checkNotOn() // NOT on
	var errc chan error
	if headerData.h != nil {
		// If there's a header map (which we don't own), so we have to block on
		// waiting for this frame to be written, so an http.Flush mid-handler
		// writes out the correct value of keys, before a handler later potentially
		// mutates it.
		errc = errChanPool.Get().(chan error)
	}
	if err := sc.writeFrameFromHandler(FrameWriteRequest{
		write:  headerData,
		stream: st,
		done:   errc,
	}); err != nil {
		return err
	}
	if errc != nil {
		select {
		case err := <-errc:
			errChanPool.Put(errc)
			return err
		case <-sc.doneServing:
			return errClientDisconnected
		case <-st.cw:
			return errStreamClosed
		}
	}
	return nil
}

And the writeFrameFromHandler impl simply drops it on the wantWriteFrameCh channel for writing (a buffered channel).

// writeFrameFromHandler sends wr to sc.wantWriteFrameCh, but aborts
// if the connection has gone away.
//
// This must not be run from the serve goroutine itself, else it might
// deadlock writing to sc.wantWriteFrameCh (which is only mildly
// buffered and is read by serve itself). If you're on the serve
// goroutine, call writeFrame instead.
func (sc *serverConn) writeFrameFromHandler(wr FrameWriteRequest) error {
	sc.serveG.checkNotOn() // NOT
	select {
	case sc.wantWriteFrameCh <- wr:
		return nil
	case <-sc.doneServing:
		// Serve loop is gone.
		// Client has closed their connection to the server.
		return errClientDisconnected
	}
}

Perhaps the header map sent in to rws.conn.writeHeaders needs to clone h since it all seems to happen async? Or am I missing what causes the blocking write?

@Timer
Copy link
Author

Timer commented Aug 12, 2024

cc @neild in case you unsubscribed from the issue after being closed 🙏, hope you don't mind.

@neild neild reopened this Aug 12, 2024
@seankhliao seankhliao removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Aug 12, 2024
@neild
Copy link
Contributor

neild commented Aug 13, 2024

Thanks for the race detector output; that was very useful.

The problem is that while writeHeaders usually waits for the write to complete before returning, it stops waiting if the stream is reset by the peer. The race occurs when the header write has already started at the time the peer's reset is processed. (This is the <-st.cw path in writeHeaders.)

I'm still trying to figure out the right fix; there's a fairly complex set of interactions between the different goroutines involved in writing a frame. Most of this complexity is in service of efficiency, and I suspect the more obvious fixes leave us with all the complexity and none of the efficiency.

@Timer
Copy link
Author

Timer commented Aug 14, 2024

If it's of any anecdotal value, for latency/efficiency sake, we'd prefer WriteHeader to be async so we can continue processing the request without introducing a blocking operation.

Also, I did not catch that errc was a chan and thus responsible for the blocking operation. Makes sense!

In either case, the most trivial fix appears to be moving the h = h.Clone() out of the special condition that activates for CL & TE... but perhaps not the most performant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

7 participants