runtime: Deadlock while writing to UDP socket when laptop goes to sleep #61555

Closed
nbrownus opened this issue Jul 24, 2023 · 7 comments
Labels: compiler/runtime, NeedsInvestigation, WaitingForInfo

Comments

@nbrownus

What version of Go are you using (go version)?

$ go version
go version go1.20.4 darwin/arm64

Does this issue reproduce with the latest release?

Have not tried yet, difficult to reproduce reliably.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOOS="darwin"
GOARCH=arm64
CGO_ENABLED="0"

What did you do?

Ran nebula, put the laptop to sleep, and woke it after a while. The goroutine handling UDP writes was hung.

What did you expect to see?

UDP writes happening as expected

What did you see instead?

We have had a few instances of this and I am not sure where else to look. After sending SIGQUIT to the process in each instance, they all appear to have a goroutine blocked in the same way.

goroutine 36 [IO wait, 352 minutes]:
runtime.gopark(0x140003a9238?, 0x102fd3198?, 0x38?, 0x92?, 0x102fd345c?)
	runtime/proc.go:381 +0xe4 fp=0x140003a9200 sp=0x140003a91e0 pc=0x102f35e24
runtime.netpollblock(0x140003a9298?, 0x2fd3d48?, 0x1?)
	runtime/netpoll.go:527 +0x158 fp=0x140003a9240 sp=0x140003a9200 pc=0x102f2f2c8
internal/poll.runtime_pollWait(0x10b178e00, 0x77)
	runtime/netpoll.go:306 +0xa0 fp=0x140003a9270 sp=0x140003a9240 pc=0x102f616c0
internal/poll.(*pollDesc).wait(0x140002aa000?, 0x1400018ec60?, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x140003a92a0 sp=0x140003a9270 pc=0x102fd0498
internal/poll.(*pollDesc).waitWrite(...)
	internal/poll/fd_poll_runtime.go:93
internal/poll.(*FD).WriteToInet6(0x140002aa000, {0x1400018ec60, 0x101, 0x120}, 0x0?)
	internal/poll/fd_unix.go:479 +0x1dc fp=0x140003a9340 sp=0x140003a92a0 pc=0x102fd3ddc
net.(*netFD).writeToInet6(0x140002aa000, {0x1400018ec60?, 0xffff000000000000?, 0xdf278303?}, 0x0?)
	net/fd_posix.go:114 +0x28 fp=0x140003a9390 sp=0x140003a9340 pc=0x1030539d8
net.(*UDPConn).writeTo(0x140002a8018, {0x1400018ec60, 0x101, 0x120}, 0x30?)
	net/udpsock_posix.go:133 +0xec fp=0x140003a94d0 sp=0x140003a9390 pc=0x10306d1bc
net.(*UDPConn).WriteToUDP(0x140002a8018, {0x1400018ec60?, 0x140003a9608?, 0x5?}, 0x14000336930)
	net/udpsock.go:215 +0x30 fp=0x140003a9520 sp=0x140003a94d0 pc=0x10306b8b0
github.com/slackhq/nebula/udp.(*Conn).WriteTo(...)
	github.com/slackhq/nebula/udp/udp_generic.go:39
github.com/slackhq/nebula.(*HandshakeManager).handleOutbound.func1(0x140000ca120, 0x0?)
	github.com/slackhq/nebula/handshake_manager.go:173 +0x138 fp=0x140003a9b20 sp=0x140003a9520 pc=0x1034180c8
github.com/slackhq/nebula.(*RemoteList).ForEach(0x140002aa200, {0x0?, 0x0, 0x140000b2048?}, 0x140003a9f18)
	github.com/slackhq/nebula/remote_list.go:248 +0xd0 fp=0x140003a9ba0 sp=0x140003a9b20 pc=0x103440de0
github.com/slackhq/nebula.(*HandshakeManager).handleOutbound(0x140002ba000, 0xa80050d, {0x1036e8c18, 0x140002ac7e0}, 0x0)
	github.com/slackhq/nebula/handshake_manager.go:171 +0x260 fp=0x140003abe90 sp=0x140003a9ba0 pc=0x103415ff0
github.com/slackhq/nebula.(*HandshakeManager).NextOutboundHandshakeTimerTick(0x140002ba000, {0x140003abf14?, 0x1036e8c18?, 0x103aef1c0?}, {0x1036e8c18, 0x140002ac7e0})
	github.com/slackhq/nebula/handshake_manager.go:99 +0x70 fp=0x140003abed0 sp=0x140003abe90 pc=0x103415d30
github.com/slackhq/nebula.(*HandshakeManager).Run(0x140002ba000, {0x1036e8978, 0x14000036190}, {0x1036e8c18, 0x140002ac7e0})
	github.com/slackhq/nebula/handshake_manager.go:87 +0x11c fp=0x140003abf90 sp=0x140003abed0 pc=0x103415bfc
github.com/slackhq/nebula.Main.func4()
	github.com/slackhq/nebula/main.go:323 +0x38 fp=0x140003abfd0 sp=0x140003abf90 pc=0x10342f278
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140003abfd0 sp=0x140003abfd0 pc=0x102f67bb4
created by github.com/slackhq/nebula.Main
	github.com/slackhq/nebula/main.go:323 +0x1ee8
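
Two arguments in this trace can be decoded by hand, which helps confirm what the goroutine is waiting for. The sketch below assumes the usual convention that the second argument of `runtime_pollWait` is the poll mode, passed as the character `'r'` or `'w'`, and that a byte slice argument is printed as {pointer, len, cap}. The helper name `describePollMode` is made up for illustration.

```go
package main

import "fmt"

// describePollMode interprets the mode argument shown for
// internal/poll.runtime_pollWait in a goroutine dump.
// (Hypothetical helper, not part of the Go standard library.)
func describePollMode(mode int) string {
	switch mode {
	case 'r':
		return "read"
	case 'w':
		return "write"
	default:
		return "unknown"
	}
}

func main() {
	// Values copied from the trace: runtime_pollWait(0x10b178e00, 0x77)
	// and WriteToInet6(0x140002aa000, {0x1400018ec60, 0x101, 0x120}, ...).
	fmt.Println("poll mode:", describePollMode(0x77))
	fmt.Println("buffer len:", 0x101, "cap:", 0x120)
}
```

Read this way, the goroutine has been parked for 352 minutes waiting for the socket to become writable, with a 257-byte payload pending.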
@gopherbot added the compiler/runtime label Jul 24, 2023
@dr2chase added the NeedsInvestigation label Jul 31, 2023
@dr2chase
Contributor

@ianlancetaylor @neild
A headscratcher for you.

@brad-defined

Working with a user who more reliably encountered this issue, we've identified a Mac program called Little Snitch (https://www.obdev.at/products/littlesnitch/index.html) which seems to be the source of weirdness.

I wrote a stub program that sends UDP packets to many different port numbers, on the theory that each of these packets looks like a new connection to Little Snitch. In one of its modes, Little Snitch triggers when it detects these new connections and prompts me to allow or deny the connection. While the prompt is open, the UDP writes block until the dialog is answered. If I either deny the connection or don't allow it quickly enough, the program remains frozen.

Reproduction steps:

  1. Install Little Snitch on your macOS host
  2. Run the included stub program
  3. Little Snitch prompts to allow/deny the packets being sent by the stub program (around packet 36). Program freezes at this point.
  4. Do not interact with the Little Snitch pop-up for 10 seconds
  5. Click 'allow' in the Little Snitch pop-up
  6. Program remains frozen
  7. Send a QUIT signal to the stub program and observe that the stack trace shows runtime.gopark called by runtime.netpollblock
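
Step 7 relies on the Go runtime's default SIGQUIT behavior, which prints all goroutine stacks and exits. Where sending a signal is inconvenient, roughly the same information can be collected from inside the process. This is a minimal sketch using `runtime.Stack`; the helper name `allGoroutineStacks` is made up for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// allGoroutineStacks returns the stack traces of every goroutine in the
// process, growing the buffer until the whole dump fits.
// (Hypothetical helper name, chosen for this example.)
func allGoroutineStacks() string {
	buf := make([]byte, 64<<10)
	for {
		n := runtime.Stack(buf, true) // true = include all goroutines
		if n < len(buf) {
			return string(buf[:n])
		}
		// The dump may have been truncated; retry with a larger buffer.
		buf = make([]byte, 2*len(buf))
	}
}

func main() {
	fmt.Print(allGoroutineStacks())
}
```

To capture a stalled write, it would have to be called from a goroutine other than the blocked one, for example from a timer that fires when no packet has been sent for some time.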

Stub program (edit the code with your favorite IP address to control which host you spam with the UDP packets):

package main

import (
	"fmt"
	"net"
)

func main() {
	uconn, err := net.ListenUDP("udp", &net.UDPAddr{IP: net.IPv4(0, 0, 0, 0), Port: 0, Zone: ""})
	if err != nil {
		fmt.Printf("err=%v", err)
		return
	}
	ipDest := net.IPv4(192, 0, 2, 1) // placeholder (TEST-NET-1 documentation address), not from the original report: replace with the host you want to spam with UDP packets
	for i := 0; i < 50000; i++ {
		fmt.Printf("(%v) Write crappy UDP packet to destination: %v:%v\n", i, ipDest, i)
		v, err := uconn.WriteToUDP([]byte("hello, jello"), &net.UDPAddr{IP: ipDest, Port: i, Zone: ""})
		if err != nil || v != len("hello, jello") {
			fmt.Printf("Failed to write to UDP: %v %v", v, err)
		}
	}
}

We have emailed the authors of Little Snitch to follow up on this issue. At this point I'm not sure that this is a Go problem. I suspect that Little Snitch, or the macOS API it uses, is not marking the fd as writable again after blocking it. (This could be a Little Snitch bug or a macOS bug.)

@ianlancetaylor
Contributor

@nbrownus Are you working with @brad-defined ? Does the Little Snitch problem apply to your case as well? Thanks.

@ianlancetaylor added the WaitingForInfo label Jul 31, 2023
@brad-defined

@ianlancetaylor yes - @nbrownus and I work at the same place. The user who reported the initial issue to us which generated the originally reported stack trace in this ticket was running Little Snitch.

@dr2chase
Contributor

I'm a former Little Snitch user, might try to repro on my old personal laptop.

@nbrownus
Author

@ianlancetaylor yes, @brad-defined and I work together. Little Snitch was installed in all cases that we've seen so far.

@ianlancetaylor
Contributor

Thanks, I'm going to optimistically close this as something that can't be fixed by changes to Go. Please comment if you disagree.

@ianlancetaylor closed this as not planned Jul 31, 2023