
net: UnixListener blocks forever in Close() if File() is used to get the file descriptor #29277

@prashantv

What version of Go are you using (go version)?

$ go version
go version go1.11.3 linux/amd64

Does this issue reproduce with the latest release?

Yes, it also reproduces with the latest tip.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/prashant/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/prashant/go"
GOPROXY=""
GORACE=""
GOROOT="/home/prashant/.gimme/versions/go1.11.3.linux.amd64"
GOTMPDIR=""
GOTOOLDIR="/home/prashant/.gimme/versions/go1.11.3.linux.amd64/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build250167096=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Created a Unix listener, called File() to get the underlying file descriptor, then ran Accept in a goroutine. After a short while, called Close on the listener, but Close blocks indefinitely.

Repro:
https://github.com/prashantv/unix-close-race/blob/master/main.go

What did you expect to see?

Close returns and causes the Accept call (in a separate goroutine) to unblock and return an error.

What did you see instead?

In Go 1.11.3, Close blocks indefinitely:

goroutine 1 [semacquire]:
internal/poll.runtime_Semacquire(0xc0000a4028)
        /home/prashant/.gimme/versions/go1.11.3.linux.amd64/src/runtime/sema.go:61 +0x39
internal/poll.(*FD).Close(0xc0000a4000, 0xc0000a4000, 0x0)
        /home/prashant/.gimme/versions/go1.11.3.linux.amd64/src/internal/poll/fd_unix.go:109 +0xdb
net.(*netFD).Close(0xc0000a4000, 0xc0000b5e58, 0xc00008a030)
        /home/prashant/.gimme/versions/go1.11.3.linux.amd64/src/net/fd_unix.go:184 +0x4f
net.(*UnixListener).close(0xc000080540, 0xc0000d0020, 0x5)
        /home/prashant/.gimme/versions/go1.11.3.linux.amd64/src/net/unixsock_posix.go:186 +0x65
net.(*UnixListener).Close(0xc000080540, 0x1, 0x1)
        /home/prashant/.gimme/versions/go1.11.3.linux.amd64/src/net/unixsock.go:270 +0x47
main.testCloseAfterAccept(0x511ec0, 0xc000080540)
        /home/prashant/go/src/github.com/prashantv/unix-close-race/main.go:66 +0x111
main.main()
        /home/prashant/go/src/github.com/prashantv/unix-close-race/main.go:43 +0x1ce

In Go 1.10.6, Close returns, but the Accept goroutine is not unblocked:

2018/12/14 14:58:02 Got fd from unix ln5
2018/12/14 14:58:02 wait for accept
2018/12/14 14:58:02 about to accept
2018/12/14 14:58:02 close
2018/12/14 14:58:02 wait for accept to end
<blocks>

Other Notes

It looks like the behaviour change between 1.10 and 1.11 may be caused by https://go-review.googlesource.com/c/go/+/119955/ (fix for #24942)

I added a flag to the repro: if you pass --use-new-fd, then instead of calling Accept and Close on the original Unix listener (the one File() was called on), it uses a new listener created from the copied file descriptor. This mitigates the issue in both Go 1.10 and Go 1.11. It seems that calling File() duplicates the file descriptor but somehow also affects the original file descriptor, causing the issues with Accept + Close.

cc @witriew who originally ran into this issue.

Labels: NeedsInvestigation, help wanted
