Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: fatal error: all goroutines are asleep - deadlock! since 2023-07-18 #61454

Closed
prattmic opened this issue Jul 19, 2023 · 4 comments
Closed
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Milestone

Comments

@prattmic
Copy link
Member

prattmic commented Jul 19, 2023

#!watchflakes
post <- `fatal error: all goroutines are asleep - deadlock!` && date >= "2023-07-18" && date <= "2023-07-19"

cc @ianlancetaylor

@prattmic prattmic added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker labels Jul 19, 2023
@prattmic prattmic added this to the Go1.22 milestone Jul 19, 2023
@prattmic prattmic self-assigned this Jul 19, 2023
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 19, 2023
@gopherbot
Copy link
Contributor

Found new dashboard test flakes for:

#!watchflakes
post <- `fatal error: all goroutines are asleep - deadlock!` && date >= "2023-07-18"
2023-07-18 19:54 openbsd-386-72 tools@03275ec1 go@813aad51 x/tools/go/packages.TestParseFileModifyAST (log)
--- FAIL: TestParseFileModifyAST (33.41s)
    --- FAIL: TestParseFileModifyAST/Modules (1.32s)
        packages_test.go:917: err: exit status 2: stderr: fatal error: all goroutines are asleep - deadlock!

panic: runtime error: index out of range [0] with length 0 [recovered]
	panic: runtime error: index out of range [0] with length 0

goroutine 4024 [running]:
testing.tRunner.func1.2({0x82e67a0, 0x6deda3e0})
	/tmp/workdir/go/src/testing/testing.go:1545 +0x2a5
testing.tRunner.func1()
	/tmp/workdir/go/src/testing/testing.go:1548 +0x438
panic({0x82e67a0, 0x6deda3e0})
	/tmp/workdir/go/src/runtime/panic.go:914 +0x1e0
golang.org/x/tools/go/packages_test.testParseFileModifyAST(0x6aed42d0, {0x83696b8, 0x84e56e0})
	/tmp/workdir/gopath/src/golang.org/x/tools/go/packages/packages_test.go:921 +0x2f1
golang.org/x/tools/go/packages_test.testAllOrModulesParallel.func1(0x6aed42d0, {0x83696b8, 0x84e56e0})
	/tmp/workdir/gopath/src/golang.org/x/tools/go/packages/packages_test.go:81 +0x185
golang.org/x/tools/go/packages/packagestest.TestAll.func1(0x6aed42d0)
	/tmp/workdir/gopath/src/golang.org/x/tools/go/packages/packagestest/export.go:162 +0x51
testing.tRunner(0x6aed42d0, 0x6a6a9e30)
	/tmp/workdir/go/src/testing/testing.go:1595 +0x11a
created by testing.(*T).Run in goroutine 77
	/tmp/workdir/go/src/testing/testing.go:1648 +0x3d7
2023-07-18 19:55 linux-ppc64le-buildlet go@5d481abc runtime.TestFakeTime (log)
--- FAIL: TestFakeTime (0.31s)
    crash_test.go:141: running /workdir/go/bin/go build -o /workdir/tmp/go-build1342463730/testfaketime_-tags=faketime.exe -tags=faketime
    time_test.go:31: building testfaketime [-tags=faketime]: exit status 2
        fatal error: all goroutines are asleep - deadlock!
2023-07-18 21:24 darwin-amd64-13 go@813aad51 cmd/cgo/internal/test/issue24161e2 [build] (log)
fatal error: all goroutines are asleep - deadlock!

runtime stack:
runtime.fatal({0x11ecc3c?, 0x700005633d58?})
	runtime/panic.go:1096 +0x5c fp=0x700005633d20 sp=0x700005633cf0 pc=0x10339fc
runtime.checkdead()
	runtime/proc.go:5498 +0x236 fp=0x700005633d80 sp=0x700005633d20 pc=0x1040f96
runtime.mput(0x1042659?)
	runtime/proc.go:5933 +0x2b fp=0x700005633d90 sp=0x700005633d80 pc=0x10423ab
runtime.stopm()
	runtime/proc.go:2534 +0x67 fp=0x700005633dc0 sp=0x700005633d90 pc=0x103a367
runtime.findRunnable()
	runtime/proc.go:3229 +0xb9c fp=0x700005633ed0 sp=0x700005633dc0 pc=0x103bcbc
runtime.schedule()
	runtime/proc.go:3582 +0xb1 fp=0x700005633f08 sp=0x700005633ed0 pc=0x103cab1
runtime.park_m(0xc0000061a0?)
	runtime/proc.go:3745 +0x11f fp=0x700005633f50 sp=0x700005633f08 pc=0x103cfbf
runtime.mcall()
	runtime/asm_amd64.s:458 +0x4e fp=0x700005633f68 sp=0x700005633f50 pc=0x10622ce
2023-07-18 23:25 darwin-amd64-10_15 tools@84f829e2 go@5fe3f0a2 x/tools/gopls/internal/regtest/diagnostics.TestNoMod (log)
#### Start Gopls Test Logs for "TestNoMod/initialized/default"
[Trace - 08:48:05.340 AM] Sending request 'initialize - (1)'.
Params: {"processId":0,"rootUri":"","capabilities":{"workspace":{"workspaceEdit":{"resourceOperations":["rename"]},"didChangeConfiguration":{},"didChangeWatchedFiles":{"dynamicRegistration":true},"configuration":true},"textDocument":{"completion":{"completionItem":{"snippetSupport":true,"tagSupport":{"valueSet":[1]}}},"documentSymbol":{"hierarchicalDocumentSymbolSupport":true},"codeAction":{"codeActionLiteralSupport":{"codeActionKind":{"valueSet":null}}},"publishDiagnostics":{},"semanticTokens":{"requests":{"range":null,"full":true},"tokenTypes":["namespace","type","class","enum","interface","struct","typeParameter","parameter","variable","property","enumMember","event","function","method","macro","keyword","modifier","comment","string","number","regexp","operator"],"tokenModifiers":["declaration","definition","readonly","static","deprecated","abstract","async","modification","documentation","defaultLibrary"],"formats":null}},"window":{"workDoneProgress":true}},"initializationOptions":{"completionBudget":"10s","diagnosticsDelay":"10ms","env":{"GO111MODULE":"","GOMODCACHE":"","GOPACKAGESDRIVER":"off","GOPATH":"/tmp/buildlet/tmp/gopls-regtest-3223691568/TestNoMod/initialized/default/gopath","GOPROXY":"file:///tmp/buildlet/tmp/gopls-regtest-3223691568/TestNoMod/initialized/default/proxy","GOSUMDB":"off"},"verboseWorkDoneProgress":true},"trace":"messages","workspaceFolders":[{"uri":"file:///tmp/buildlet/tmp/gopls-regtest-3223691568/TestNoMod/initialized/default/work","name":"work"}]}


[Trace - 08:48:05.341 AM] Received response 'initialize - (1)' in 0ms.
Result: {"capabilities":{"textDocumentSync":{"openClose":true,"change":2,"save":{}},"completionProvider":{"triggerCharacters":["."]},"hoverProvider":true,"signatureHelpProvider":{"triggerCharacters":["(",","]},"definitionProvider":true,"typeDefinitionProvider":true,"implementationProvider":true,"referencesProvider":true,"documentHighlightProvider":true,"documentSymbolProvider":true,"codeActionProvider":true,"codeLensProvider":{},"documentLinkProvider":{},"workspaceSymbolProvider":true,"documentFormattingProvider":true,"renameProvider":true,"foldingRangeProvider":true,"selectionRangeProvider":true,"executeCommandProvider":{"commands":["gopls.add_dependency","gopls.add_import","gopls.apply_fix","gopls.check_upgrades","gopls.edit_go_directive","gopls.fetch_vulncheck_result","gopls.gc_details","gopls.generate","gopls.go_get_package","gopls.list_imports","gopls.list_known_packages","gopls.mem_stats","gopls.regenerate_cgo","gopls.remove_dependency","gopls.reset_go_mod_diagnostics","gopls.run_go_work_command","gopls.run_govulncheck","gopls.run_tests","gopls.start_debugging","gopls.start_profile","gopls.stop_profile","gopls.test","gopls.tidy","gopls.toggle_gc_details","gopls.update_go_sum","gopls.upgrade_dependency","gopls.vendor","gopls.workspace_stats"]},"callHierarchyProvider":true,"semanticTokensProvider":{"legend":{"tokenTypes":["namespace","type","class","enum","interface","struct","typeParameter","parameter","variable","property","enumMember","event","function","method","macro","keyword","modifier","comment","string","number","regexp","operator"],"tokenModifiers":["declaration","definition","readonly","static","deprecated","abstract","async","modification","documentation","defaultLibrary"]},"range":true,"full":true},"inlayHintProvider":{},"workspace":{"workspaceFolders":{"supported":true,"changeNotifications":"workspace/didChangeWorkspaceFolders"}}},"serverInfo":{"name":"gopls","version":"{\"GoVersion\":\"devel 5fe3f0a265c90a9c0346403742c6cafeb154503b\",\"Path\":\"\",\"Main\":{\"Path\":\"\",\"Version\":\"\",\"Sum\":\"\",\"Replace\":null},\"Deps\":null,\"Settings\":null,\"Version\":\"master\"}"}}


[Trace - 08:48:05.341 AM] Sending notification 'initialized'.
...
--- FAIL: TestNoMod (8.29s)
    --- FAIL: TestNoMod/initialized (2.32s)
        --- FAIL: TestNoMod/initialized/default (2.32s)
            diagnostics_test.go:288: waiting on:
                Unmeetable: once "All of:\ncompleted work \"diagnosing files changed on disk\" at least 1 time(s)" is met, must have:
                no diagnostics for file "main.go"
                any diagnostics at the first position matching `x` in "bob/bob.go"

                err:condition has final verdict Unmeetable

...
                	main.go (version 0):
                	go.mod (version 0):
                		(0, 0) [go list]: packages.Load error: err: exit status 2: stderr: fatal error: all goroutines are asleep - deadlock!

                #### outstanding work:
                	Error loading workspace: 0.00
                #### completed work:
                	diagnosing files changed on disk: 1
                	diagnosing initial workspace load: 1
                	Setting up workspace: 1
2023-07-18 23:25 darwin-amd64-12_0 tools@84f829e2 go@70072715 x/tools/gopls/internal/regtest/completion.TestPackageCompletion (log)
#### Start Gopls Test Logs for "TestPackageCompletion/package_completion_for_empty_file/default"
[Trace - 20:06:23.684 PM] Sending request 'initialize - (1)'.
Params: {"processId":0,"rootUri":"","capabilities":{"workspace":{"workspaceEdit":{"resourceOperations":["rename"]},"didChangeConfiguration":{},"didChangeWatchedFiles":{"dynamicRegistration":true},"configuration":true},"textDocument":{"completion":{"completionItem":{"snippetSupport":true,"tagSupport":{"valueSet":[1]}}},"documentSymbol":{"hierarchicalDocumentSymbolSupport":true},"codeAction":{"codeActionLiteralSupport":{"codeActionKind":{"valueSet":null}}},"publishDiagnostics":{},"semanticTokens":{"requests":{"range":null,"full":true},"tokenTypes":["namespace","type","class","enum","interface","struct","typeParameter","parameter","variable","property","enumMember","event","function","method","macro","keyword","modifier","comment","string","number","regexp","operator"],"tokenModifiers":["declaration","definition","readonly","static","deprecated","abstract","async","modification","documentation","defaultLibrary"],"formats":null}},"window":{"workDoneProgress":true}},"initializationOptions":{"completionBudget":"10s","diagnosticsDelay":"10ms","env":{"GO111MODULE":"","GOMODCACHE":"","GOPACKAGESDRIVER":"off","GOPATH":"/tmp/buildlet/tmp/gopls-regtest-649311811/TestPackageCompletion/package_completion_for_empty_file/default/gopath","GOPROXY":"file:///tmp/buildlet/tmp/gopls-regtest-649311811/TestPackageCompletion/package_completion_for_empty_file/default/proxy","GOSUMDB":"off"},"verboseWorkDoneProgress":true},"trace":"messages","workspaceFolders":[{"uri":"file:///tmp/buildlet/tmp/gopls-regtest-649311811/TestPackageCompletion/package_completion_for_empty_file/default/work","name":"work"}]}


[Trace - 20:06:23.684 PM] Received response 'initialize - (1)' in 0ms.
Result: {"capabilities":{"textDocumentSync":{"openClose":true,"change":2,"save":{}},"completionProvider":{"triggerCharacters":["."]},"hoverProvider":true,"signatureHelpProvider":{"triggerCharacters":["(",","]},"definitionProvider":true,"typeDefinitionProvider":true,"implementationProvider":true,"referencesProvider":true,"documentHighlightProvider":true,"documentSymbolProvider":true,"codeActionProvider":true,"codeLensProvider":{},"documentLinkProvider":{},"workspaceSymbolProvider":true,"documentFormattingProvider":true,"renameProvider":true,"foldingRangeProvider":true,"selectionRangeProvider":true,"executeCommandProvider":{"commands":["gopls.add_dependency","gopls.add_import","gopls.apply_fix","gopls.check_upgrades","gopls.edit_go_directive","gopls.fetch_vulncheck_result","gopls.gc_details","gopls.generate","gopls.go_get_package","gopls.list_imports","gopls.list_known_packages","gopls.mem_stats","gopls.regenerate_cgo","gopls.remove_dependency","gopls.reset_go_mod_diagnostics","gopls.run_go_work_command","gopls.run_govulncheck","gopls.run_tests","gopls.start_debugging","gopls.start_profile","gopls.stop_profile","gopls.test","gopls.tidy","gopls.toggle_gc_details","gopls.update_go_sum","gopls.upgrade_dependency","gopls.vendor","gopls.workspace_stats"]},"callHierarchyProvider":true,"semanticTokensProvider":{"legend":{"tokenTypes":["namespace","type","class","enum","interface","struct","typeParameter","parameter","variable","property","enumMember","event","function","method","macro","keyword","modifier","comment","string","number","regexp","operator"],"tokenModifiers":["declaration","definition","readonly","static","deprecated","abstract","async","modification","documentation","defaultLibrary"]},"range":true,"full":true},"inlayHintProvider":{},"workspace":{"workspaceFolders":{"supported":true,"changeNotifications":"workspace/didChangeWorkspaceFolders"}}},"serverInfo":{"name":"gopls","version":"{\"GoVersion\":\"devel 700727151fe3772ecc2315af101d2e5d93269c0c\",\"Path\":\"\",\"Main\":{\"Path\":\"\",\"Version\":\"\",\"Sum\":\"\",\"Replace\":null},\"Deps\":null,\"Settings\":null,\"Version\":\"master\"}"}}


[Trace - 20:06:23.686 PM] Sending notification 'initialized'.
...
--- FAIL: TestPackageCompletion (12.80s)
    --- FAIL: TestPackageCompletion/package_completion_for_empty_file (1.30s)
        --- FAIL: TestPackageCompletion/package_completion_for_empty_file/default (1.30s)
            completion_test.go:207: completion item mismatch (-want +got):
                  []string(
                - 	{
                - 		"package apple", "package apple_test", "package fruits", "package fruits_test",
                - 		"package main",
                - 	},
                + 	nil,
                  )
2023-07-18 23:25 openbsd-amd64-72 tools@84f829e2 go@cd6c4e05 x/tools/go/packages.TestConfigFlags (log)
--- FAIL: TestConfigFlags (0.00s)
    --- FAIL: TestConfigFlags/Modules (1.95s)
        packages_test.go:537: err: exit status 2: stderr: fatal error: all goroutines are asleep - deadlock!
2023-07-18 23:51 darwin-amd64-longtest go@8e1ec1cb cmd/vet.TestVet (log)
--- FAIL: TestVet (0.00s)
    --- FAIL: TestVet/assign (1.74s)
        vet_test.go:147: error check failed: 
            assign.go:18: missing error "self-assignment of x to x"
            assign.go:20: missing error "self-assignment of s.x to s.x"
            assign.go:22: missing error "self-assignment of s.l.0. to s.l.0."
            Unmatched Errors:
            fatal error: all goroutines are asleep - deadlock!
            runtime stack:
            runtime.fatal({0x1693bfb?, 0x700002e7bd58?})
...
            cmd/go/internal/work.ActorFunc.Act(0x178ab90?, 0x1b2dbe0?, {0x178ab90?, 0x1b2dbe0?}, 0x103b6bc?)
            	cmd/go/internal/work/action.go:78 +0x2d fp=0xc0005cbd70 sp=0xc0005cbd40 pc=0x14daeed
            cmd/go/internal/work.(*Builder).Do.func3({0x178ab90, 0x1b2dbe0}, 0xc0001474a0)
            	cmd/go/internal/work/exec.go:151 +0x7e4 fp=0xc0005cbf20 sp=0xc0005cbd70 pc=0x14e9724
            cmd/go/internal/work.(*Builder).Do.func4()
            	cmd/go/internal/work/exec.go:219 +0xa8 fp=0xc0005cbfe0 sp=0xc0005cbf20 pc=0x14e8d68
            runtime.goexit()
            	runtime/asm_amd64.s:1650 +0x1 fp=0xc0005cbfe8 sp=0xc0005cbfe0 pc=0x106e561
            created by cmd/go/internal/work.(*Builder).Do in goroutine 1
            	cmd/go/internal/work/exec.go:205 +0x405
2023-07-18 23:51 freebsd-arm-paulzhol go@8e1ec1cb cmd/cgo/internal/test/issue30527 [build] (log)
fatal error: all goroutines are asleep - deadlock!

runtime stack:
runtime.fatal({0x20273b, 0x25})
	runtime/panic.go:1096 +0x4c fp=0x28c91ea4 sp=0x28c91e90 pc=0x4bbec
runtime.checkdead()
	runtime/proc.go:5498 +0x334 fp=0x28c91ef0 sp=0x28c91ea4 pc=0x5b5c4
runtime.mput(0x28c80000)
	runtime/proc.go:5933 +0x44 fp=0x28c91ef4 sp=0x28c91ef0 pc=0x5d1c0
runtime.stopm()
	runtime/proc.go:2534 +0x60 fp=0x28c91f08 sp=0x28c91ef4 pc=0x53708
runtime.findRunnable()
	runtime/proc.go:3229 +0xd78 fp=0x28c91fa4 sp=0x28c91f08 pc=0x55328
runtime.schedule()
	runtime/proc.go:3582 +0xac fp=0x28c91fc0 sp=0x28c91fa4 pc=0x56644
runtime.park_m(0x28c02100)
	runtime/proc.go:3745 +0x160 fp=0x28c91fdc sp=0x28c91fc0 pc=0x56e8c
runtime.mcall()
	runtime/asm_arm.s:265 +0x48 fp=0x28c91fe4 sp=0x28c91fdc pc=0x7f5f4
2023-07-18 23:51 openbsd-386-72 go@8e1ec1cb cmd/go.TestScript (log)
vcs-test.golang.org rerouted to http://127.0.0.1:6906
https://vcs-test.golang.org rerouted to https://127.0.0.1:27007
go test proxy running at GOPROXY=http://127.0.0.1:13589/mod
--- FAIL: TestScript (0.49s)
    --- FAIL: TestScript/doc (8.59s)
        script_test.go:132: 2023-07-19T00:18:47Z
        script_test.go:134: $WORK=/tmp/workdir/tmp/cmd-go-test-1792972832/tmpdir3603819314/doc2905635927
        script_test.go:156: 
            # go doc --help (0.790s)
            # go help doc (0.133s)
...
            fatal error: all goroutines are asleep - deadlock!

            runtime stack:
            runtime.fatal({0x81b9055, 0x25})
            	runtime/panic.go:1096 +0x4d fp=0xcf7edb14 sp=0xcf7edb00 pc=0x807d3ed
            runtime.checkdead()
            	runtime/proc.go:5498 +0x294 fp=0xcf7edb60 sp=0xcf7edb14 pc=0x808a784
            runtime.mput(0x82b74a0)
            	runtime/proc.go:5933 +0x2e fp=0xcf7edb64 sp=0xcf7edb60 pc=0x808bf2e
            runtime.stopm()
            	runtime/proc.go:2534 +0x68 fp=0xcf7edb78 sp=0xcf7edb64 pc=0x80839e8
            runtime.findRunnable()
            	runtime/proc.go:3229 +0xb6d fp=0xcf7edc14 sp=0xcf7edb78 pc=0x808524d
            runtime.schedule()
            	runtime/proc.go:3582 +0xb1 fp=0xcf7edc30 sp=0xcf7edc14 pc=0x8086381
            runtime.park_m(0x62408100)
            	runtime/proc.go:3745 +0x155 fp=0xcf7edc4c sp=0xcf7edc30 pc=0x8086a65
            runtime.mcall()
            	runtime/asm_386.s:329 +0x40 fp=0xcf7edc54 sp=0xcf7edc4c pc=0x80aaf00

watchflakes

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/511356 mentions this issue: Revert "runtime: decrement netpollWaiters in netpollunblock"

@ianlancetaylor
Copy link
Contributor

This was most likely triggered by https://go.dev/cl/503923 for #60782.

However, my current thinking is that that CL is just exposing a pre-existing bug and making it more likely to occur. Before that change, netpollWaiters was rarely zero. The CL fixed the code so that netpollWaiters would be zero if there were in fact no waiters.

The scheduler has an optimization: it it doesn't bother to poll the network if netpollWaiters is zero.

The netpollWaiters count indicates the number of goroutines waiting for the network. However, the count is currently adjusted before the goroutines are actually marked ready to run. So there is a period of time when netpollWaiters is zero but the scheduler doesn't see any ready goroutines. If checkdead is run during that period of time, it may think that the program is deadlocked.

Before the CL, when netpollWaiters was almost never zero, the last step in findRunnable is to call the network. In the cases that fail there are no timers running, so findRunnable would block until there was some network activity. That blocking is enough to prevent the deadlock detector from firing. That gives the scheduler the time to collect the ready goroutines and mark them as ready to run.

Now that netpollWaiters can be zero, that last findRunnable step is not executed. That opens up the gap in time when the deadlock detector can think that no goroutines are ready--because they haven't yet been marked as runnable.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/511455 mentions this issue: runtime: adjust netpollWaiters after goroutines are ready

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jul 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Projects
Archived in project
Development

No branches or pull requests

4 participants