Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: Errors from persistConn.readLoop not returned to client still writing a request #62447

Open
mtrmac opened this issue Sep 4, 2023 · 1 comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@mtrmac
Copy link

mtrmac commented Sep 4, 2023

What version of Go are you using (go version)?

$ go version
go version go1.21.0 darwin/arm64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/mitr/Library/Caches/go-build'
GOENV='/Users/mitr/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/mitr/Go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/mitr/Go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.21.0'
GCCGO='gccgo'
AR='ar'
CC='clang'
CXX='clang++'
CGO_ENABLED='1'
GOMOD='/Users/mitr/Go/src/github.com/containers/image/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/gx/746zv1352fl7clbfh5q24grc0000gn/T/go-build66149756=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

Consider a HTTP server facing a client which is slow to upload a request body, and the HTTP server times out and closes the socket.

https://go.dev/play/p/zu0cZuiGxDZ
package main

import (
	"context"
	"fmt"
	"io"
	"log"
	"net"
	"net/http"
	"time"
)

func dieOnError(err error) {
	if err != nil {
		panic(err)
	}
}

var timeStep = time.Second

func server(ln net.Listener) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		log.Printf("%v: Accepted", conn.RemoteAddr())
		time.Sleep(1 * timeStep)
		if false {
			_, err = conn.Write([]byte("HTTP/1.1 200 OK\r\n\r\n")) // Just enough for readLoop to return a response with some body
			dieOnError(err)
		}
		err = conn.Close()
		dieOnError(err)
		log.Printf("%v: Closed", conn.RemoteAddr())
	}
}

func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	dieOnError(err)
	go server(ln)

	sourceReader, sourceWriter := io.Pipe()
	go func() {
		_, err := sourceWriter.Write([]byte("line 1\n"))
		dieOnError(err)

		time.Sleep(2 * timeStep)
		_, err = sourceWriter.Write([]byte("line 2\n"))
		dieOnError(err)
		err = sourceWriter.Close()
		dieOnError(err)
		log.Printf("Producer finished")
	}()

	req, err := http.NewRequestWithContext(context.Background(), http.MethodPatch, fmt.Sprintf("http://%s/", ln.Addr().String()), io.NopCloser(sourceReader))
	dieOnError(err)
	res, err := http.DefaultClient.Do(req)
	dieOnError(err)
	body, err := io.ReadAll(res.Body)
	log.Printf("Reading body: %#v, %v", body, err)

}

What did you expect to see?

On the client side, a real error cause of some kind; such as the one in https://go.dev/play/p/fKwUQwOjTDW (with s/if false/if true):

2009/11/10 23:00:00 127.0.0.1:34699: Accepted
2009/11/10 23:00:01 127.0.0.1:34699: Closed
2009/11/10 23:00:01 Reading body: []byte{}, read tcp 127.0.0.1:34699->127.0.0.1:44764: read: connection reset by peer

contains a useful “connection reset by peer” error.

What did you see instead?

2009/11/10 23:00:00 127.0.0.1:47676: Accepted
2009/11/10 23:00:01 127.0.0.1:47676: Closed
panic: Patch "http://127.0.0.1:22356/": write tcp 127.0.0.1:47676->127.0.0.1:22356: use of closed network connection

contains net.ErrClosed, with no indication of the actual underlying root cause. And I think it's possibly controversial but plausible point that net.ErrClosed is considered an indication of a programming bug.

The reports in #4373 obliquely hint at net/http encountering this issue, but never quite spell it out, hence this report.

The underlying cause, at least for this specific reproducer, is that persistConn.readLoop and persistConn.writeLoop are quite independent; in this case, readLoop encounters a socket read error, terminates, and closes persistConn.conn. Meanwhile, writeLoop is deep in transferWriter.doBodyCopy, continuing to issue writes and flushes to persistConn.conn, and that triggers net.ErrClosed on the write side. Arguably that’s the primary bug, issuing a write on a closed connection.

And then, persistConn.roundTrip actually sees the read-side error first (re := <-resc:); but it proceeds to mapRoundTripError, and sees req.err set, and returns that value. Alternatively, if issuing a write on a closed connection is considered fine (or the performance cost to avoid that would be too great), the logic in mapRoundTripError could see that there is both a read-side and write-side error, and decide that net.ErrClosed is not the one that should be returned.


There also seems to be a dual of this issue, where persistConn.writeLoop can encounter an error and close the connection, causing readLoop to fail with net.ErrClosed instead, and in that case, again, an error with the “use of closed network connection” text is returned. I’m afraid I lost the reproducer I had for that one.

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 5, 2023
@bcmills
Copy link
Contributor

bcmills commented Sep 5, 2023

(attn @neild)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

2 participants