runtime: move tests for runtime-gdb.py to a separate package and do not run them during all.bash #39204

bcmills · 2020-05-21T21:28:59Z

The tests in runtime-gdb_test.go, for the GDB bindings in runtime-gdb.py, have historically been very flaky. They are currently skipped on a large fraction of platforms due to:

runtime: TestGdbBacktrace fails on netbsd/amd64 #15603
runtime: re-enable TestGdbPython and TestGdbBacktrace on linux/ppc64 (BE) if/when it gets cgo support #17366
runtime: TestGdbPythonCgo fails on MIPS32 #18784
runtime: TestGdbPython* fails on solaris #20821
cmd/compile: gdb 8 on windows crashes reading Go binary #21380 / x/build: Add gdb to Windows Builders #22021 / all.bat: tests fails in windows 10 at tip #22687
runtime: gdb tests fail on NetBSD #22893
cmd/link: dwarf compression broke mips builder #25939
runtime: gdb isn't available on aix/ppc64 #28558
runtime: GDB tests fail on freebsd/arm builder #29508
runtime: timeout with GDB tests on aix/ppc64 #35710
and what appears to be an untracked issue for 'darwin': a skip was added in CL 20193, but neither the code itself nor any of the commentary I could find on the code review explains why gdb does or did not work on that platform.

If we had an incoming change to add a test with this many skips today, I suspect that we would turn it down: it tests what is supposed to be cross-platform code, but only checks the actual behavior of that code for a narrow subset of users — because that behavior does not work reliably across platforms.

But it doesn't work reliably on the remaining tested platforms either: despite all of those skips, the tests are still flaky today:

#39021 in particular describes a regression during the 1.15 cycle that is interfering with the SlowBots and TryBots on pending CLs, will likely interfere with release testing as well, and so far has resisted attempts at bisection because the test (and its failure mode) is highly nondeterministic.

If we consider the runtime “incorrect” if it does not work with gdb, then we should take the time to fix and maintain the Go bindings properly and portably — not just Skip their tests! — in the same way that we fix and maintain other features of the runtime across platforms. Otherwise, it is not appropriate to run the tests for those bindings as part of the tests for the runtime package, which is a dependency of most Go binaries (and is therefore likely to have its tests run in users' CI systems).

Most of the other Go project tests for integration with third-party tools are in the misc subdirectory. I propose that we move the gdb integration tests there.

I further propose that we should not run these tests as part of all.bash, run.bash, or on the Go project builders until such time as they can be made reliable, with only one Skip based on a check for a gdb executable at a sufficiently recent version.

The text was updated successfully, but these errors were encountered:

ianlancetaylor · 2020-05-21T21:44:53Z

I don't think this has to be a proposal, in that I don't think it has to go through the proposal committee. This does not affect any user visible API. I think the runtime package maintainers can decide.

mundaym · 2020-05-25T19:44:17Z

The tests in runtime-gdb_test.go, for the GDB bindings in runtime-gdb.py ...

TestGdbBacktrace doesn't actually load the python script. It just gets GDB to execute a Go program, sets a breakpoint and triggers a backtrace. It is one of the few places where we test that the debug information in a Go executable can be consumed by a debugger like GDB.

I'm not sure what the solution to the flakiness is but I think having basic GDB support is valuable and should be tested on platforms where GDB is available. In particular I think we should continue to run the tests on the builders wherever possible - even if it is just a subset of platforms.

bcmills · 2020-05-26T17:46:45Z

@mundaym

I'm not sure what the solution to the flakiness is but I think having basic GDB support is valuable and should be tested on platforms where GDB is available.

Note that we already do not test basic GDB support on most platforms for which GDB is available, due to the bugs mentioned in the first section.¹ If we agree that that support is valuable, then we should remove those skips and fix the behavior on those platforms. If we don't fix the tests when they are broken, what's the point in running them?

Either way, I still don't think these tests belong with the runtime package: they are testing integration with a third-party tool (gdb), not the Go-observable behavior of the runtime package.

¹

go/src/runtime/runtime-gdb_test.go

Lines 22 to 50 in 5c802c1

    
           func checkGdbEnvironment(t *testing.T) { 
        
           	testenv.MustHaveGoBuild(t) 
        
           	switch runtime.GOOS { 
        
           	case "darwin": 
        
           		t.Skip("gdb does not work on darwin") 
        
           	case "netbsd": 
        
           		t.Skip("gdb does not work with threads on NetBSD; see https://golang.org/issue/22893 and https://gnats.netbsd.org/52548") 
        
           	case "windows": 
        
           		t.Skip("gdb tests fail on Windows: https://golang.org/issue/22687") 
        
           	case "linux": 
        
           		if runtime.GOARCH == "ppc64" { 
        
           			t.Skip("skipping gdb tests on linux/ppc64; see https://golang.org/issue/17366") 
        
           		} 
        
           		if runtime.GOARCH == "mips" { 
        
           			t.Skip("skipping gdb tests on linux/mips; see https://golang.org/issue/25939") 
        
           		} 
        
           	case "freebsd": 
        
           		t.Skip("skipping gdb tests on FreeBSD; see https://golang.org/issue/29508") 
        
           	case "aix": 
        
           		if testing.Short() { 
        
           			t.Skip("skipping gdb tests on AIX; see https://golang.org/issue/35710") 
        
           		} 
        
           	case "plan9": 
        
           		t.Skip("there is no gdb on Plan 9") 
        
           	} 
        
           	if final := os.Getenv("GOROOT_FINAL"); final != "" && runtime.GOROOT() != final { 
        
           		t.Skip("gdb test can fail with GOROOT_FINAL pending") 
        
           	} 
        
           }

ianlancetaylor · 2020-06-26T00:37:13Z

This didn't happen for 1.15 (and the gdb tests seem to have gotten a bit more reliable, too). Changing milestone to 1.16.

mengzhuo · 2020-12-09T01:57:49Z

FYI, maybe related to
https://sourceware.org/bugzilla/show_bug.cgi?id=20301

changkun · 2021-03-19T08:08:22Z

I just encountered one on tip:

##### GOMAXPROCS=2 runtime -cpu=1,2,4 -quick
--- FAIL: TestGdbPythonCgo (2.21s)
    runtime-gdb_test.go:76: gdb version 9.2
    runtime-gdb_test.go:269: gdb output:
        Loading Go Runtime support.
        Loading Go Runtime support.
        Breakpoint 1 at 0x49a289: file /tmp/go-build247077086/main.go, line 27.
        [Thread debugging using libthread_db enabled]
        Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
        hi
        
        Thread 1 "a.exe" hit Breakpoint 1, main.main () at /tmp/go-build247077086/main.go:27
        27              gslice = slicevar
        BEGIN info goroutines
        * 1 running  syscall.Syscall
          2 waiting  runtime.gopark
          3 waiting  runtime.gopark
          4 waiting  runtime.gopark
          18 waiting  runtime.gopark
        END
        BEGIN print mapvar
        $1 = map[string]string = {["abc"] = "def", ["ghi"] = "jkl"}
        END
        BEGIN print slicemap
        $2 = map[string][]string = {["e"] = []string = {"f", "g", "h"}, ["a"] = []string = {"b", "c", "d"}}
        END
        BEGIN print strvar
        $3 = "abc"
        END
        BEGIN print chanint
        $4 = chan int = {99, 11}
        END
        BEGIN print chanstr
        $5 = chan string = {"spongepants", "squarebob"}
        END
        BEGIN info locals
        mapvar = map[string]string = {["abc"] = "def", ["ghi"] = "jkl"}
        chanstr = chan string = {"spongepants", "squarebob"}
        chanint = chan int = {99, 11}
        strvar = "abc"
        slicemap = map[string][]string = {["e"] = []string = {"f", "g", "h"}, ["a"] = []string = {"b", "c", "d"}}
        ptrvar = <optimized out>
        slicevar = []string
        END
        BEGIN goroutine 1 bt
        #0  main.main () at /tmp/go-build247077086/main.go:27
        END
        BEGIN goroutine all bt
        #0  main.main () at /tmp/go-build247077086/main.go:27
        No such goroutine:  17
        #0  runtime.gopark (unlockf={void (runtime.g *, void *, bool *)} 0xc00003afb0, lock=0x554130 <runtime.forcegc>, reason=17 '\021', traceEv=20 '\024', traceskip=1) at /home/changkun/dev/godev/go-gerrit/src/runtime/proc.go:337
        #1  0x00000000004385e5 in runtime.goparkunlock (lock=<optimized out>, reason=<optimized out>, traceEv=<optimized out>, traceskip=<optimized out>) at /home/changkun/dev/godev/go-gerrit/src/runtime/proc.go:342
        #2  runtime.forcegchelper () at /home/changkun/dev/godev/go-gerrit/src/runtime/proc.go:276
        #3  0x0000000000464e61 in runtime.goexit () at /home/changkun/dev/godev/go-gerrit/src/runtime/asm_amd64.s:1445
        #4  0x0000000000000000 in ?? ()
        #0  runtime.gopark (unlockf={void (runtime.g *, void *, bool *)} 0xc00003b7a8, lock=0x554280 <runtime.sweep>, reason=12 '\f', traceEv=20 '\024', traceskip=1) at /home/changkun/dev/godev/go-gerrit/src/runtime/proc.go:337
        #1  0x0000000000423e88 in runtime.goparkunlock (lock=<optimized out>, reason=<optimized out>, traceEv=<optimized out>, traceskip=<optimized out>) at /home/changkun/dev/godev/go-gerrit/src/runtime/proc.go:342
        #2  runtime.bgsweep (c=chan int) at /home/changkun/dev/godev/go-gerrit/src/runtime/mgcsweep.go:163
        #3  0x0000000000464e61 in runtime.goexit () at /home/changkun/dev/godev/go-gerrit/src/runtime/asm_amd64.s:1445
        #4  0x000000c00001c070 in ?? ()
        #5  0x0000000000000000 in ?? ()
        #0  runtime.gopark (unlockf={void (runtime.g *, void *, bool *)} 0xc00003bf78, lock=0x554320 <runtime.scavenge>, reason=13 '\r', traceEv=20 '\024', traceskip=1) at /home/changkun/dev/godev/go-gerrit/src/runtime/proc.go:337
        #1  0x0000000000421d36 in runtime.goparkunlock (lock=<optimized out>, reason=<optimized out>, traceEv=<optimized out>, traceskip=<optimized out>) at /home/changkun/dev/godev/go-gerrit/src/runtime/proc.go:342
        #2  runtime.bgscavenge (c=chan int) at /home/changkun/dev/godev/go-gerrit/src/runtime/mgcscavenge.go:265
        #3  0x0000000000464e61 in runtime.goexit () at /home/changkun/dev/godev/go-gerrit/src/runtime/asm_amd64.s:1445
        #4  0x000000c00001c070 in ?? ()
        #5  0x0000000000000000 in ?? ()
        #0  runtime.gopark (unlockf={void (runtime.g *, void *, bool *)} 0xc00003a740, lock=0x581530 <runtime.finlock>, reason=16 '\020', traceEv=20 '\024', traceskip=1) at /home/changkun/dev/godev/go-gerrit/src/runtime/proc.go:337
        #1  0x0000000000418c66 in runtime.goparkunlock (lock=<optimized out>, reason=<optimized out>, traceEv=<optimized out>, traceskip=<optimized out>) at /home/changkun/dev/godev/go-gerrit/src/runtime/proc.go:342
        #2  runtime.runfinq () at /home/changkun/dev/godev/go-gerrit/src/runtime/mfinal.go:177
        #3  0x0000000000464e61 in runtime.goexit () at /home/changkun/dev/godev/go-gerrit/src/runtime/asm_amd64.s:1445
        #4  0x0000000000000000 in ?? ()
        END
        No breakpoint at main.go:15.
        Breakpoint 2 at 0x49a41f: file /tmp/go-build247077086/main.go, line 30.
    runtime-gdb_test.go:271: gdb exited with error: signal: aborted (core dumped)
FAIL
FAIL    runtime 12.139s
FAIL
2021/03/19 08:47:46 Failed: exit status 1
go tool dist: FAILED

ianlancetaylor · 2021-04-28T00:02:36Z

Moving to Backlog.

bcmills · 2023-06-01T14:40:17Z

@golang/runtime, can you please take another look at this in triage? I keep hitting #58698 in TryBots.

I expect that external Go users may hit these test flakes in their own projects when running go test all (because every go project depends on the runtime package), and go test all ought to be a reasonable thing for them to do. Given how flaky the GDB tests are, I would really rather not have them standing in the way of Go users improving their testing practices.

gopherbot · 2023-06-01T19:34:10Z

Change https://go.dev/cl/499975 mentions this issue: runtime: update skips for TestGdbBacktrace

One issue simply has a reworded message, probably from a new version of GDB. Another is a new issue. Fixes #60553. Fixes #58698. Updates #39204. Change-Id: I8389aa981fab5421f57ee761bfb5e1dd237709ed Reviewed-on: https://go-review.googlesource.com/c/go/+/499975 Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Michael Pratt <mpratt@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>

gopherbot added this to the Proposal milestone May 21, 2020

gopherbot added the Proposal label May 21, 2020

bcmills changed the title ~~proposal: runtime: move tests for runtime-gdb.py to misc/ and do not run them during all.bash~~ runtime: move tests for runtime-gdb.py to misc/ and do not run them during all.bash May 21, 2020

bcmills modified the milestones: Proposal, Go1.15 May 21, 2020

bcmills added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Debugging Testing An issue that has been verified to require only test changes, not just a test failure. and removed Proposal labels May 21, 2020

ianlancetaylor modified the milestones: Go1.15, Go1.16 Jun 26, 2020

bcmills mentioned this issue Dec 8, 2020

runtime: TestGdbBacktrace failures due to GDB "internal-error: wait returned unexpected status 0x0" #43068

Closed

aclements modified the milestones: Go1.16, Go1.17 Dec 8, 2020

bcmills mentioned this issue Jan 21, 2021

runtime: hang in TestGdbAutotmpTypes on linux-mips64le-mengzhuo builder #43825

Closed

ianlancetaylor modified the milestones: Go1.17, Backlog Apr 28, 2021

agarciamontoro mentioned this issue Oct 8, 2021

runtime: TestGdbPythonCgo failure with "'runtime.allgs' has unknown type" on linux/amd64 #48874

Closed

bcmills mentioned this issue Mar 17, 2022

runtime: GDB tests failing on linux-mips64-rtrk since CL 380755 #51769

Closed

jonathan-albrecht-ibm mentioned this issue Jun 20, 2022

runtime: TestGdbPython failure with "PC not saved" on linux-s390x-ibm #53431

Open

bcmills added the compiler/runtime Issues related to the Go compiler and/or runtime. label May 30, 2023

bcmills changed the title ~~runtime: move tests for runtime-gdb.py to misc/ and do not run them during all.bash~~ runtime: move tests for runtime-gdb.py to a separate package and do not run them during all.bash Jul 21, 2023

bcmills mentioned this issue Jan 2, 2024

runtime: TestGdbCoreCrashThreadBacktrace failures #64752

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

runtime: move tests for runtime-gdb.py to a separate package and do not run them during all.bash #39204

runtime: move tests for runtime-gdb.py to a separate package and do not run them during all.bash #39204

bcmills commented May 21, 2020 •

edited

Loading

ianlancetaylor commented May 21, 2020

mundaym commented May 25, 2020

bcmills commented May 26, 2020

ianlancetaylor commented Jun 26, 2020

mengzhuo commented Dec 9, 2020

changkun commented Mar 19, 2021

ianlancetaylor commented Apr 28, 2021

bcmills commented Jun 1, 2023 •

edited

Loading

gopherbot commented Jun 1, 2023

runtime: move tests for runtime-gdb.py to a separate package and do not run them during all.bash #39204

runtime: move tests for runtime-gdb.py to a separate package and do not run them during all.bash #39204

Comments

bcmills commented May 21, 2020 • edited Loading

ianlancetaylor commented May 21, 2020

mundaym commented May 25, 2020

bcmills commented May 26, 2020

ianlancetaylor commented Jun 26, 2020

mengzhuo commented Dec 9, 2020

changkun commented Mar 19, 2021

ianlancetaylor commented Apr 28, 2021

bcmills commented Jun 1, 2023 • edited Loading

gopherbot commented Jun 1, 2023

bcmills commented May 21, 2020 •

edited

Loading

bcmills commented Jun 1, 2023 •

edited

Loading