runtime: TestGdbBacktrace failures due to GDB "internal-error: wait returned unexpected status 0x0" #43068

Closed
bcmills opened this issue Dec 8, 2020 · 12 comments
Labels
NeedsInvestigation release-blocker Testing

@bcmills bcmills commented Dec 8, 2020

2020-12-07T21:01:46-7ad6596/linux-mips64le-mengzhuo
2020-10-23T15:11:15-646531c/linux-mips64le-mengzhuo

--- FAIL: TestGdbBacktrace (1.51s)
    runtime-gdb_test.go:77: gdb version 8.1
    runtime-gdb_test.go:437: gdb output:
        Loading Go Runtime support.
        Breakpoint 1 at 0x7ece0: file /tmp/farm/tmp/go-build405377235/main.go, line 17.
        [New LWP 18663]
        [New LWP 18664]
        [New LWP 18665]
        [New LWP 18666]
        
        Thread 1 "a.exe" hit Breakpoint 1, 0x000000000007ece0 in main.eee (~r0=<optimized out>) at /tmp/farm/tmp/go-build405377235/main.go:17
        17	func eee() bool { return true }
        #0  0x000000000007ece0 in main.eee (~r0=<optimized out>) at /tmp/farm/tmp/go-build405377235/main.go:17
        #1  0x000000000007ecbc in main.ddd (~r0=<optimized out>) at /tmp/farm/tmp/go-build405377235/main.go:14
        #2  0x000000000007ec5c in main.ccc (~r0=<optimized out>) at /tmp/farm/tmp/go-build405377235/main.go:11
        #3  0x000000000007ec0c in main.bbb (~r0=<optimized out>) at /tmp/farm/tmp/go-build405377235/main.go:8
        #4  0x000000000007ebbc in main.aaa (~r0=<optimized out>) at /tmp/farm/tmp/go-build405377235/main.go:5
        #5  0x000000000007ed1c in main.main () at /tmp/farm/tmp/go-build405377235/main.go:22
        ../../gdb/linux-nat.c:2081: internal-error: wait returned unexpected status 0x0
        A problem internal to GDB has been detected,
        further debugging may prove unreliable.
        Quit this debugging session? (y or n) [answered Y; input not from terminal]
        
        This is a bug, please report it.  For instructions, see:
        <http://www.gnu.org/software/gdb/bugs/>.
        
        ../../gdb/linux-nat.c:2081: internal-error: wait returned unexpected status 0x0
        A problem internal to GDB has been detected,
        further debugging may prove unreliable.
        Create a core file of GDB? (y or n) [answered Y; input not from terminal]
    runtime-gdb_test.go:439: gdb exited with error: signal: aborted (core dumped)
FAIL
FAIL	runtime	44.687s

See previously #37405 (TestGdbBacktrace hanging on the same builder), #39228 (TestGdbBacktrace failures on Linux), #39204 (meta-bug about runtime GDB test flakiness).

CC @mengzhuo @ianlancetaylor

@bcmills bcmills added Builders NeedsInvestigation labels Dec 8, 2020
@bcmills bcmills added this to the Backlog milestone Dec 8, 2020

@mengzhuo mengzhuo commented Dec 9, 2020

https://sourceware.org/bugzilla/show_bug.cgi?id=20301

Can finally reproduce after a few tries.

What happens is that another thread calls exit() while I "next" this thread.
Most of the time gdb handles it nicely, but it also happens that I get the internal-error immediately:

@bcmills bcmills commented May 11, 2021

Looks like the same underlying GDB bug on linux-riscv64-jsing:

2021-05-10T19:19:34-dc50683/linux-riscv64-jsing

--- FAIL: TestGdbBacktrace (6.72s)
    runtime-gdb_test.go:76: gdb version 9.2
    runtime-gdb_test.go:428: gdb output:
        Loading Go Runtime support.
        Breakpoint 1 at 0x719a8: file /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go, line 17.
        [New LWP 2737319]
        [New LWP 2737320]
        [New LWP 2737321]
        
        Thread 1 "a.exe" hit Breakpoint 1, main.eee (~r0=<optimized out>) at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:17
        17	func eee() bool { return true }
        #0  main.eee (~r0=<optimized out>) at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:17
        #1  0x0000000000071994 in main.ddd (~r0=<optimized out>) at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:14
        #2  0x0000000000071954 in main.ccc (~r0=<optimized out>) at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:11
        #3  0x000000000007191c in main.bbb (~r0=<optimized out>) at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:8
        #4  0x00000000000718e4 in main.aaa (~r0=<optimized out>) at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:5
        #5  0x00000000000719dc in main.main () at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:21
        /build/gdb-D4eJJR/gdb-9.2/gdb/linux-nat.c:1963: internal-error: wait returned unexpected status 0x0
        A problem internal to GDB has been detected,
        further debugging may prove unreliable.
        Quit this debugging session? (y or n) [answered Y; input not from terminal]
        
        This is a bug, please report it.  For instructions, see:
        <http://www.gnu.org/software/gdb/bugs/>.
        
    runtime-gdb_test.go:430: gdb exited with error: signal: aborted
FAIL
FAIL	runtime	294.989s

CC @4a6f656c

@bcmills bcmills changed the title from 'runtime: TestGdbBacktrace failures due to GDB internal error on linux-mips64le-mengzhuo builder' to 'runtime: TestGdbBacktrace failures due to GDB "internal-error: wait returned unexpected status 0x0"' May 11, 2021

@bcmills bcmills commented Jan 26, 2022

greplogs --dashboard -md -l -e 'FAIL: TestGdbBacktrace .*(?:\n .*)*: wait returned unexpected status' --since=2021-05-12

2022-01-12T00:01:48-8070e70/linux-riscv64-jsing
2021-10-26T22:05:53-80be4a4/linux-riscv64-unmatched
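
To illustrate what the search expression above selects, here is a minimal sketch that applies the same pattern to a shortened log excerpt. It assumes the pattern is interpreted with Go's regexp (RE2) syntax; the helper name matchesKnownFailure and the sample log are illustrative only and are not part of greplogs.

```go
package main

import (
	"fmt"
	"regexp"
)

// failurePattern is the expression passed to -e in the command above.
// Compiling it with Go's regexp package is an assumption made for this sketch.
// It matches the FAIL header line, then any run of following lines that begin
// with a space (the indented test output), ending at the GDB error text.
var failurePattern = regexp.MustCompile(`FAIL: TestGdbBacktrace .*(?:\n .*)*: wait returned unexpected status`)

// matchesKnownFailure (hypothetical helper) reports whether a builder log
// contains a TestGdbBacktrace failure whose indented output includes the error.
func matchesKnownFailure(log string) bool {
	return failurePattern.MatchString(log)
}

func main() {
	// Shortened excerpt modeled on the first log in this thread.
	log := "--- FAIL: TestGdbBacktrace (1.51s)\n" +
		"    runtime-gdb_test.go:437: gdb output:\n" +
		"        ../../gdb/linux-nat.c:2081: internal-error: wait returned unexpected status 0x0\n" +
		"FAIL\n"
	fmt.Println(matchesKnownFailure(log)) // prints "true"
}
```

Because every continuation line must start with a space, the match cannot run past the unindented FAIL line, so an unrelated message later in the log is not attributed to this test.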

@bcmills bcmills commented Feb 8, 2022

greplogs --dashboard -md -l -e 'FAIL: TestGdbBacktrace .*(?:\n .*)*: wait returned unexpected status' --since=2022-01-27

2022-02-07T21:57:29-911c78f/linux-386-longtest

@bcmills bcmills commented Feb 8, 2022

Note that the most recent failure is on linux/386, which is a first-class port.

Given how little time is left in the Go 1.18 release cycle, marking as release-blocker for Go 1.19.

@bcmills bcmills removed this from the Backlog milestone Feb 8, 2022
@bcmills bcmills added this to the Go1.19 milestone Feb 8, 2022
@bcmills bcmills added release-blocker and removed Builders labels Feb 8, 2022

@bcmills bcmills commented Feb 8, 2022

Given that the failure is within GDB itself, we could resolve this issue by doing one or more of the following:

  • Getting an upstream fix in GDB and updating the builders to pull it in.
  • Identifying affected GDB versions and skipping the test on those versions.
  • Modifying the test to skip itself if it detects this failure mode.
  • Determining that the scenario under test cannot work reliably and removing the test for it.
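
As a rough sketch of the second option (skipping on affected GDB versions), the version could be read from the "gdb version X.Y" line that the test already logs. The names below are hypothetical, and the version table contains only the two versions that appear in the failing logs in this thread (8.1 and 9.2); the real set of affected versions has not been identified here.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// gdbVersion is a major.minor pair such as 8.1 or 9.2.
type gdbVersion struct{ major, minor int }

// versionLine matches the "gdb version X.Y" text seen in the logs above.
var versionLine = regexp.MustCompile(`gdb version (\d+)\.(\d+)`)

// parseGDBVersion (hypothetical helper) extracts the version from a log line.
func parseGDBVersion(s string) (gdbVersion, bool) {
	m := versionLine.FindStringSubmatch(s)
	if m == nil {
		return gdbVersion{}, false
	}
	major, err1 := strconv.Atoi(m[1])
	minor, err2 := strconv.Atoi(m[2])
	if err1 != nil || err2 != nil {
		return gdbVersion{}, false
	}
	return gdbVersion{major, minor}, true
}

// observedFailing is an assumption for illustration: it lists only the
// versions seen in this thread's failing logs and is certainly incomplete.
var observedFailing = map[gdbVersion]bool{
	gdbVersion{8, 1}: true,
	gdbVersion{9, 2}: true,
}

// shouldSkipForVersion reports whether the logged version is in the table.
func shouldSkipForVersion(logLine string) bool {
	v, ok := parseGDBVersion(logLine)
	return ok && observedFailing[v]
}

func main() {
	fmt.Println(shouldSkipForVersion("runtime-gdb_test.go:77: gdb version 8.1")) // prints "true"
}
```

A version table like this goes stale as soon as the failure shows up on another GDB release, which is one reason detecting the failure mode itself may be the more robust choice.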

@bcmills bcmills commented Feb 8, 2022

There are at least two upstream bugs reported against GDB for this symptom:
https://sourceware.org/bugzilla/show_bug.cgi?id=24628
https://sourceware.org/bugzilla/show_bug.cgi?id=28551

@gopherbot gopherbot commented Feb 8, 2022

Change https://go.dev/cl/384234 mentions this issue: runtime: skip TestGdbBacktrace flakes matching a known GDB internal error

@bcmills bcmills commented Feb 8, 2022

> Modifying the test to skip itself if it detects this failure mode.

Actually, we can do that one now, and we should: otherwise, this test may flake for Go users when they run go test all in their own module.
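
A minimal sketch of that idea, not the submitted change itself: classify a GDB run by checking its output for the internal-error message from the logs above. The names classify, outcome, and knownGDBError are hypothetical; in a real test the skip branch would presumably call t.Skip or a similar helper rather than return a value.

```go
package main

import (
	"bytes"
	"fmt"
)

// knownGDBError is the message GDB prints in the failing logs in this thread.
var knownGDBError = []byte("internal-error: wait returned unexpected status 0x0")

// outcome is a hypothetical result type used only for this sketch.
type outcome int

const (
	pass outcome = iota
	skip
	fail
)

func (o outcome) String() string {
	return [...]string{"pass", "skip", "fail"}[o]
}

// classify decides how to treat a GDB run, given its combined output and
// whether the gdb process exited with an error.
func classify(output []byte, gdbFailed bool) outcome {
	if !gdbFailed {
		return pass
	}
	if bytes.Contains(output, knownGDBError) {
		// The failure is inside GDB itself, so it says nothing about Go.
		return skip
	}
	return fail
}

func main() {
	out := []byte("../../gdb/linux-nat.c:2081: internal-error: wait returned unexpected status 0x0\n")
	fmt.Println(classify(out, true)) // prints "skip"
}
```

Checking the message only when gdb has already exited with an error keeps genuine backtrace regressions visible: a run that fails for any other reason is still reported as a failure.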

@bcmills bcmills removed this from the Go1.19 milestone Feb 8, 2022
@bcmills bcmills added this to the Go1.18 milestone Feb 8, 2022
@bcmills bcmills self-assigned this Feb 8, 2022
@bcmills bcmills added the Testing label Feb 8, 2022

@bcmills bcmills commented May 24, 2022

@gopherbot, please backport to Go 1.17. This test still fails intermittently on the release branch, and the patch to skip for that failure mode is small and test-only.

@gopherbot gopherbot commented May 24, 2022

Backport issue(s) opened: #53049 (for 1.17).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

@gopherbot gopherbot commented May 24, 2022

Change https://go.dev/cl/408054 mentions this issue: [release-branch.go1.17] runtime: skip TestGdbBacktrace flakes matching a known GDB internal error

gopherbot pushed a commit that referenced this issue May 25, 2022
…g a known GDB internal error

TestGdbBacktrace occasionally fails due to a GDB internal error.
We have observed the error on various linux builders since at least
October 2020, and it has been reported upstream at least twice.¹²

Since the bug is external to the Go project and does not appear to be
fixed upstream, this failure mode can only add noise.

¹https://sourceware.org/bugzilla/show_bug.cgi?id=24628
²https://sourceware.org/bugzilla/show_bug.cgi?id=28551

Fixes #53049
Updates #43068

Change-Id: I6c92006a5d730f1c4df54b0307f080b3d643cc6b
Reviewed-on: https://go-review.googlesource.com/c/go/+/384234
Trust: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
(cherry picked from commit 275aedc)
Reviewed-on: https://go-review.googlesource.com/c/go/+/408054
Reviewed-by: Alex Rakoczy <alex@golang.org>
@rsc rsc unassigned bcmills Jun 23, 2022