net/http: Hijack() deadlocks with Unix domain sockets #69741

Closed as not planned
@depau

Go version

go version go1.23.1 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/davide.depau/.cache/go-build'
GOENV='/home/davide.depau/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/davide.depau/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/davide.depau/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/lib/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/lib/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.23.1'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/davide.depau/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/davide.depau/Projects/unix-socket-ws-hang/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build3677999444=/tmp/go-build -gno-record-gcc-switches'

What did you do?

I was writing a simple Unix socket HTTP + WebSocket proxy when I stumbled upon this issue. Roughly one time in three, the server gets stuck when upgrading the connection to WebSocket. The culprit seems to be Hijack(): it calls abortPendingRead(), but the background read goroutine doesn't seem to receive the signal, so the two goroutines deadlock.

I wrote a reproducer; the steps to reproduce are in the README: https://github.com/depau/golang-unix-websocket-deadlock

In short:

  • Run the server
  • Run the client over and over until the server deadlocks (on my machine this happens at least once every 5 tries)

Although the sample code uses gorilla websockets, the issue doesn't seem to be related to it. To demonstrate this, I also wrote a basic WebSocket implementation that you can enable on the server by passing the --no-gorilla command-line argument.
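
For readers who don't want to open the repository, here is a minimal sketch of the code path involved: an HTTP server on a Unix domain socket whose handler hijacks the connection through http.ResponseController. It is not the reproducer itself. The names upgradeHandler and roundTrip, the temporary socket path, and the hand-written 101 response are assumptions made for illustration, and this sketch alone is not expected to trigger the hang.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"net/http"
	"os"
	"path/filepath"
)

// upgradeHandler (hypothetical name) takes over the connection and
// writes a raw 101 response, as a WebSocket upgrade would.
func upgradeHandler(w http.ResponseWriter, r *http.Request) {
	conn, rw, err := http.NewResponseController(w).Hijack()
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	defer conn.Close()
	rw.WriteString("HTTP/1.1 101 Switching Protocols\r\n" +
		"Connection: Upgrade\r\nUpgrade: websocket\r\n\r\n")
	rw.Flush()
}

// roundTrip (hypothetical name) serves on a temporary Unix socket,
// sends one upgrade request, and returns the response status line.
func roundTrip() (string, error) {
	dir, err := os.MkdirTemp("", "hijack")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	sock := filepath.Join(dir, "s.sock")
	ln, err := net.Listen("unix", sock)
	if err != nil {
		return "", err
	}
	srv := &http.Server{Handler: http.HandlerFunc(upgradeHandler)}
	go srv.Serve(ln)
	defer srv.Close()

	c, err := net.Dial("unix", sock)
	if err != nil {
		return "", err
	}
	defer c.Close()
	fmt.Fprint(c, "GET / HTTP/1.1\r\nHost: x\r\n"+
		"Connection: Upgrade\r\nUpgrade: websocket\r\n\r\n")
	return bufio.NewReader(c).ReadString('\n')
}

func main() {
	line, err := roundTrip()
	fmt.Printf("%q %v\n", line, err)
}
```

In the failing case described above, the call to Hijack() in the handler is the one that never returns.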

What did you see happen?

This is a goroutine dump taken on the server while the issue was occurring. Note how the hijacking goroutine is stuck in abortPendingRead, while the backgroundRead goroutine remains stuck in the read syscall.

This looks similar to #19747, but it's not the same issue since my system time is up to date.
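
A dump in the format shown below can be produced with runtime/pprof, which is what the main.dumpGoroutinesAfter frame in the trace calls into. The following is a minimal sketch, not the reproducer's code: the function name dumpGoroutines is an assumption, and debug level 1 is chosen because it yields the "goroutine profile: total N" layout.

```go
package main

import (
	"fmt"
	"runtime/pprof"
	"strings"
)

// dumpGoroutines (hypothetical name) returns the current goroutine
// profile in the human-readable debug=1 text format.
func dumpGoroutines() (string, error) {
	var sb strings.Builder
	if err := pprof.Lookup("goroutine").WriteTo(&sb, 1); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	s, err := dumpGoroutines()
	if err != nil {
		fmt.Println("dump failed:", err)
		return
	}
	fmt.Print(s)
}
```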

goroutine profile: total 6
1 @ 0x40dc09 0x471969 0x665633 0x477b41
#       0x471968        os/signal.signal_recv+0x28      /usr/lib/go/src/runtime/sigqueue.go:152
#       0x665632        os/signal.loop+0x12             /usr/lib/go/src/os/signal/signal_unix.go:23

1 @ 0x431151 0x46eafd 0x66ba71 0x66b8a5 0x6686cb 0x6804ce 0x477b41
#       0x66ba70        runtime/pprof.writeRuntimeProfile+0xb0  /usr/lib/go/src/runtime/pprof/pprof.go:793
#       0x66b8a4        runtime/pprof.writeGoroutine+0x44       /usr/lib/go/src/runtime/pprof/pprof.go:752
#       0x6686ca        runtime/pprof.(*Profile).WriteTo+0x14a  /usr/lib/go/src/runtime/pprof/pprof.go:374
#       0x6804cd        main.dumpGoroutinesAfter+0x4d           /home/davide.depau/Projects/unix-socket-ws-hang/cmd/server/main.go:99

1 @ 0x46fbce 0x4084bc 0x408092 0x680345 0x477b41
#       0x680344        main.main.func1+0xe4    /home/davide.depau/Projects/unix-socket-ws-hang/cmd/server/main.go:79

1 @ 0x46fbce 0x4345f7 0x46eec5 0x4cb127 0x4cca35 0x4cca23 0x56ee29 0x581e56 0x581270 0x649bec 0x68015d 0x43bc2b 0x477b41
#       0x46eec4        internal/poll.runtime_pollWait+0x84             /usr/lib/go/src/runtime/netpoll.go:351
#       0x4cb126        internal/poll.(*pollDesc).wait+0x26             /usr/lib/go/src/internal/poll/fd_poll_runtime.go:84
#       0x4cca34        internal/poll.(*pollDesc).waitRead+0x294        /usr/lib/go/src/internal/poll/fd_poll_runtime.go:89
#       0x4cca22        internal/poll.(*FD).Accept+0x282                /usr/lib/go/src/internal/poll/fd_unix.go:620
#       0x56ee28        net.(*netFD).accept+0x28                        /usr/lib/go/src/net/fd_unix.go:172
#       0x581e55        net.(*UnixListener).accept+0x15                 /usr/lib/go/src/net/unixsock_posix.go:172
#       0x58126f        net.(*UnixListener).Accept+0x2f                 /usr/lib/go/src/net/unixsock.go:260
#       0x649beb        net/http.(*Server).Serve+0x30b                  /usr/lib/go/src/net/http/server.go:3330
#       0x68015c        main.main+0x2dc                                 /home/davide.depau/Projects/unix-socket-ws-hang/cmd/server/main.go:90
#       0x43bc2a        runtime.main+0x28a                              /usr/lib/go/src/runtime/proc.go:272

1 @ 0x46fbce 0x471199 0x471179 0x47f725 0x6404c5 0x63edef 0x646a77 0x63d314 0x680e92 0x6521ce 0x645890 0x477b41
#       0x471178        sync.runtime_notifyListWait+0x138               /usr/lib/go/src/runtime/sema.go:587
#       0x47f724        sync.(*Cond).Wait+0x84                          /usr/lib/go/src/sync/cond.go:71
#       0x6404c4        net/http.(*connReader).abortPendingRead+0xa4    /usr/lib/go/src/net/http/server.go:738
#       0x63edee        net/http.(*conn).hijackLocked+0x2e              /usr/lib/go/src/net/http/server.go:321
#       0x646a76        net/http.(*response).Hijack+0xd6                /usr/lib/go/src/net/http/server.go:2170
#       0x63d313        net/http.(*ResponseController).Hijack+0x193     /usr/lib/go/src/net/http/responsecontroller.go:71
#       0x680e91        main.(*server).ServeHTTP+0x991                  /home/davide.depau/Projects/unix-socket-ws-hang/cmd/server/main.go:170
#       0x6521cd        net/http.serverHandler.ServeHTTP+0x8d           /usr/lib/go/src/net/http/server.go:3210
#       0x64588f        net/http.(*conn).serve+0x5cf                    /usr/lib/go/src/net/http/server.go:2092

1 @ 0x489c05 0x488778 0x4cbe6e 0x4cbe56 0x4cbcf1 0x56dc65 0x577505 0x6402d7 0x477b41
#       0x489c04        syscall.Syscall+0x24                            /usr/lib/go/src/syscall/syscall_linux.go:73
#       0x488777        syscall.read+0x37                               /usr/lib/go/src/syscall/zsyscall_linux_amd64.go:736
#       0x4cbe6d        syscall.Read+0x2ad                              /usr/lib/go/src/syscall/syscall_unix.go:183
#       0x4cbe55        internal/poll.ignoringEINTRIO+0x295             /usr/lib/go/src/internal/poll/fd_unix.go:745
#       0x4cbcf0        internal/poll.(*FD).Read+0x130                  /usr/lib/go/src/internal/poll/fd_unix.go:161
#       0x56dc64        net.(*netFD).Read+0x24                          /usr/lib/go/src/net/fd_posix.go:55
#       0x577504        net.(*conn).Read+0x44                           /usr/lib/go/src/net/net.go:189
#       0x6402d6        net/http.(*connReader).backgroundRead+0x36      /usr/lib/go/src/net/http/server.go:690

What did you expect to see?

I expected it to not deadlock :)
