Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: deadlock in runtime_pollWait #30957

Open
dieselburner opened this Issue Mar 20, 2019 · 3 comments

Comments

Projects
None yet
2 participants
@dieselburner
Copy link

commented Mar 20, 2019

What version of Go are you using (go version)?

$ go version
go version go1.12.1 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="arm"
GOBIN=""
GOCACHE="/home/andrejs/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/andrejs/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
GOARM="7"
CC="gcc"
CXX="g++"
CGO_ENABLED="0"
GOMOD="/data/dev/andrejs-tools/imud/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -marm -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build038904013=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Following function executes in a TCP connection handler. This code is triggered every 10ms by a ticker (time.NewTicker(10 * time.Millisecond)).

func get_imu_data(f *os.File) (data, error) {

	var d data

	err := f.SetDeadline(time.Now().Add(time.Millisecond))
	if err != nil {
		return d, err
	}

	b := make([]byte, 6)
	n, err := f.Read(b)
	if err != nil {
		return d, err
	}
	if n != 6 {
		return d, errors.New("Read error: bad read length")
	}

	r := bytes.NewReader(b)
	err = binary.Read(r, binary.LittleEndian, &d)
	if err != nil {
		return d, err
	}

	return d, nil
}

What did you expect to see?

No deadlock.

What did you see instead?

Deadlock. Hard to reproduce (~5 times in 2 days of spontaneous testing).

kill -3 <pid> gives following:

# ./imud 
Listening on 0.0.0.0:2948
Connection accepted: 172.16.71.100:48764
SIGQUIT: quit
PC=0x6a8f0 m=0 sigcode=0

goroutine 0 [idle]:
runtime.futex(0x312cb4, 0x80, 0x0, 0x0, 0x0, 0x0, 0x19701, 0x0, 0x0, 0x435a0, ...)
        /usr/local/go/src/runtime/sys_linux_arm.s:334 +0x1c
runtime.futexsleep(0x312cb4, 0x0, 0xffffffff, 0xffffffff)
        /usr/local/go/src/runtime/os_linux.go:46 +0x78
runtime.notesleep(0x312cb4)
        /usr/local/go/src/runtime/lock_futex.go:151 +0xc4
runtime.stopm()
        /usr/local/go/src/runtime/proc.go:1936 +0x7c
runtime.findrunnable(0x416000, 0x0)
        /usr/local/go/src/runtime/proc.go:2399 +0x2a0
runtime.schedule()
        /usr/local/go/src/runtime/proc.go:2525 +0x1e4
runtime.park_m(0x4009a0)
        /usr/local/go/src/runtime/proc.go:2605 +0x5c
runtime.mcall(0x3129f8)
        /usr/local/go/src/runtime/asm_arm.s:289 +0x5c

goroutine 1 [IO wait, 15 minutes]:
internal/poll.runtime_pollWait(0xa6c4beb8, 0x72, 0x2094f8)
        /usr/local/go/src/runtime/netpoll.go:182 +0x4c
internal/poll.(*pollDesc).wait(0x474014, 0x72, 0x0, 0x0, 0x0)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x88
internal/poll.(*pollDesc).waitRead(0x474014, 0xffffff00, 0x0, 0x0)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:92 +0x3c
internal/poll.(*FD).Accept(0x474000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/local/go/src/internal/poll/fd_unix.go:384 +0x2f4
net.(*netFD).accept(0x474000, 0x0, 0x0, 0x0)
        /usr/local/go/src/net/fd_unix.go:238 +0x5c
net.(*TCPListener).accept(0x40e440, 0x0, 0x0, 0x0)
        /usr/local/go/src/net/tcpsock_posix.go:139 +0x50
net.(*TCPListener).Accept(0x40e440, 0x0, 0x0, 0x0, 0x0)
        /usr/local/go/src/net/tcpsock.go:260 +0x70
main.main()
        /data/dev/andrejs-tools/imud/main.go:165 +0xb6c

goroutine 6 [runnable]:
internal/poll.runtime_pollWait(0xa6c4bfc0, 0x72, 0xb)
        /usr/local/go/src/runtime/netpoll.go:182 +0x4c
internal/poll.(*pollDesc).wait(0x438594, 0x72, 0x209401, 0x0, 0x0)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x88
internal/poll.(*pollDesc).waitRead(0x438594, 0x4bb201, 0x0, 0x0)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:92 +0x3c
internal/poll.(*FD).Read(0x438580, 0x4bb2a6, 0x6, 0x6, 0x0, 0x0, 0x0)
        /usr/local/go/src/internal/poll/fd_unix.go:169 +0x320
os.(*File).read(0x40e368, 0x4bb2a6, 0x6, 0x6, 0x0, 0x0, 0x0)
        /usr/local/go/src/os/file_unix.go:263 +0x64
os.(*File).Read(0x40e368, 0x4bb2a6, 0x6, 0x6, 0x0, 0x0, 0x0)
        /usr/local/go/src/os/file.go:108 +0xb0
main.get_imu_data(0x40e368, 0x0, 0x0, 0x0, 0x0)
        /data/dev/andrejs-tools/imud/main.go:378 +0x15c
main.prepare_message(0x0, 0x0, 0x0, 0x0)
        /data/dev/andrejs-tools/imud/main.go:330 +0x48
main.connection_handler(0x20a7e8, 0x40e450)
        /data/dev/andrejs-tools/imud/main.go:300 +0x164
created by main.main
        /data/dev/andrejs-tools/imud/main.go:170 +0xc94

trap    0x0
error   0x0
oldmask 0x0
r0      0x312cb4
r1      0x80
r2      0x0
r3      0x0
r4      0x0
r5      0x0
r6      0x0
r7      0xf0
r8      0x7
r9      0x0
r10     0x3129f8
fp      0x7
ip      0x448c6d
sp      0xbe9d8cc0
lr      0x3b6e0
pc      0x6a8f0
cpsr    0x20000010
fault   0x0

@ianlancetaylor ianlancetaylor changed the title Deadlock in runtime_pollWait runtime: deadlock in runtime_pollWait Mar 20, 2019

@ianlancetaylor ianlancetaylor added this to the Go1.13 milestone Mar 20, 2019

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Mar 20, 2019

What is GOMAXPROCS on your system?

Can you share a complete program that demonstrates the problem?

@dieselburner

This comment has been minimized.

Copy link
Author

commented Mar 20, 2019

Hi Ian,

As you can see from go env, I'm cross-compiling. Target is a single core ARM CPU.

I will prepare a minimal working example, and will try to reproduce this on my workstation.

@dieselburner

This comment has been minimized.

Copy link
Author

commented Mar 22, 2019

@ianlancetaylor I believe I have made a code that has 100% reproducibility on my machine. Deadlock is instant.

Code:

package main

import (
	"fmt"
	"net"
	"os"
	"runtime"
	"time"
)

var tty *os.File

func main() {

	var err error

	// Open TTY

	tty, err = os.Open("/dev/tty")
	if err != nil {
		fmt.Println("Error opening TTY:", err.Error())
		os.Exit(1)
	}
	defer tty.Close()

	// Listen for incoming connections

	l, err := net.Listen("tcp", "0.0.0.0:2019")
	if err != nil {
		fmt.Println("Error listening:", err.Error())
		os.Exit(1)
	}

	fmt.Println("GOMAXPROCS =", runtime.GOMAXPROCS(0))

	// Close the listener when the application closes

	defer l.Close()
	fmt.Println("Listening on port 2019")

	// Listen for an incoming connection

	for {
		conn, err := l.Accept()
		if err != nil {
			fmt.Println("Error accepting:", err.Error())
			os.Exit(1)
		}
		go connection_handler(conn)
	}
}

func connection_handler(conn net.Conn) {

	fmt.Println("Connection accepted:", conn.RemoteAddr())

	// Close connection on completion

	defer conn.Close()

	// Use ticker to send data at fixed intervals

	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()

	// Connection loop

	for {
		// Read data

		msg, err := read_tty(tty)
		if err != nil {
			if os.IsTimeout(err) {
				continue
			}
			fmt.Println("Error preparing message:", err.Error())
			break
		}

		// Send data

		_, err = conn.Write([]byte(msg))
		if err != nil {
			break
		}

		fmt.Print(".")

		// Wait for ticket trigger

		<-ticker.C
	}

	// Done

	fmt.Println("Connection closed:", conn.RemoteAddr())
}

func read_tty(f *os.File) ([]byte, error) {

	// Set tiny read timeout

	err := f.SetDeadline(time.Now().Add(time.Millisecond))
	if err != nil {
		return nil, err
	}

	// Read few bytes

	data := make([]byte, 10)

	_, err = f.Read(data)
	if err != nil {
		return nil, err
	}

	// Done

	return data, nil
}

Instructions:

Terminal 1:

$ go build
$ ./pollwait 
GOMAXPROCS = 1
Listening on port 2019
Connection accepted: 127.0.0.1:37586

Terminal 2:

nc 127.0.0.1 2019

This results to immediate deadlock. Killing application from terminal 3 with kill -3 <pid> gives following:

SIGQUIT: quit
PC=0x458da1 m=0 sigcode=0

goroutine 0 [idle]:
runtime.futex(0x605728, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7fff0ba8f4d8, 0x40b0d1, ...)
	/usr/local/go/src/runtime/sys_linux_amd64.s:535 +0x21
runtime.futexsleep(0x605728, 0x7fff00000000, 0xffffffffffffffff)
	/usr/local/go/src/runtime/os_linux.go:46 +0x4b
runtime.notesleep(0x605728)
	/usr/local/go/src/runtime/lock_futex.go:151 +0xa1
runtime.stopm()
	/usr/local/go/src/runtime/proc.go:1936 +0xc1
runtime.findrunnable(0xc000022000, 0x0)
	/usr/local/go/src/runtime/proc.go:2399 +0x54a
runtime.schedule()
	/usr/local/go/src/runtime/proc.go:2525 +0x21c
runtime.park_m(0xc000001500)
	/usr/local/go/src/runtime/proc.go:2605 +0xa1
runtime.mcall(0x0)
	/usr/local/go/src/runtime/asm_amd64.s:299 +0x5b

goroutine 1 [IO wait]:
internal/poll.runtime_pollWait(0x7fed68f51e38, 0x72, 0x0)
	/usr/local/go/src/runtime/netpoll.go:182 +0x56
internal/poll.(*pollDesc).wait(0xc00007c018, 0x72, 0x0, 0x0, 0x51e295)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x9b
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0xc00007c000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:384 +0x1ba
net.(*netFD).accept(0xc00007c000, 0xc000010001, 0x0, 0x0)
	/usr/local/go/src/net/fd_unix.go:238 +0x42
net.(*TCPListener).accept(0xc000010040, 0xc000047e80, 0xc000047e88, 0x10)
	/usr/local/go/src/net/tcpsock_posix.go:139 +0x32
net.(*TCPListener).Accept(0xc000010040, 0x525ed0, 0x541780, 0xc000010050, 0x0)
	/usr/local/go/src/net/tcpsock.go:260 +0x48
main.main()
	/home/test/pollwait/pollwait.go:45 +0x2a6

goroutine 5 [runnable]:
internal/poll.runtime_pollWait(0x7fed68f51f08, 0x72, 0xffffffffffffffff)
	/usr/local/go/src/runtime/netpoll.go:182 +0x56
internal/poll.(*pollDesc).wait(0xc000056198, 0x72, 0x1, 0xa, 0xffffffffffffffff)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x9b
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc000056180, 0xc0000ee540, 0xa, 0xa, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:169 +0x19b
os.(*File).read(...)
	/usr/local/go/src/os/file_unix.go:263
os.(*File).Read(0xc000010030, 0xc0000ee540, 0xa, 0xa, 0x0, 0x0, 0x604dc0)
	/usr/local/go/src/os/file.go:108 +0x70
main.read_tty(0xc000010030, 0xc0000ed080, 0x1, 0x0, 0x53fc00, 0xc0000ed080)
	/home/test/pollwait/pollwait.go:113 +0xc5
main.connection_handler(0x541780, 0xc000010050)
	/home/test/pollwait/pollwait.go:72 +0x157
created by main.main
	/home/test/pollwait/pollwait.go:50 +0x292

rax    0xca
rbx    0x6055e0
rcx    0x458da3
rdx    0x0
rdi    0x605728
rsi    0x80
rbp    0x7fff0ba8f4a0
rsp    0x7fff0ba8f458
r8     0x0
r9     0x0
r10    0x0
r11    0x286
r12    0x55
r13    0x54
r14    0x200
r15    0x0
rip    0x458da1
rflags 0x286
cs     0x33
fs     0x0
gs     0x0

$ go env:

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/test/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/test/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build981214745=/tmp/go-build -gno-record-gcc-switches"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.