Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: TestGdbPython flaky on linux #24616

Open
bradfitz opened this issue Mar 30, 2018 · 7 comments

Comments

@bradfitz
Copy link
Member

commented Mar 30, 2018

Just saw this TestGdbPython flake on linux-amd64:

https://storage.googleapis.com/go-build-log/149e81d1/linux-amd64_fd28318f.log

--- FAIL: TestGdbPython (3.49s)
	runtime-gdb_test.go:59: gdb version 7.7
	runtime-gdb_test.go:193: gdb output: Loading Go Runtime support.
		Loaded  Script                                                                 
		Yes     /tmp/workdir/go/src/runtime/runtime-gdb.py                             
		Breakpoint 1 at 0x47c190: file /tmp/workdir/go/src/fmt/print.go, line 263.
		
		Breakpoint 1, fmt.Println (a=..., err=..., n=<optimized out>) at /tmp/workdir/go/src/fmt/print.go:263
		263	func Println(a ...interface{}) (n int, err error) {
		BEGIN info goroutines
		* 1 running  runtime.systemstack_switch
		* 2 running  runtime.forcegchelper
		  3 waiting  runtime.gopark
		  4 runnable runtime.runfinq
		END
		#1  0x00000000004828a0 in main.main () at /tmp/go-build461513624/main.go:14
		14		fmt.Println("hi")
		BEGIN print mapvar
		$1 = map[string]string = {["ghi"] = "jkl", ["abc"] = "def"}
		END
		BEGIN print strvar
		$2 = "abc"
		END
		BEGIN info locals
		mapvar = map[string]string = {["ghi"] = "jkl", ["abc"] = "def"}
		slicevar =  []string = {"def"}
		strvar = "abc"
		END
		#0  fmt.Println (a=..., err=..., n=<optimized out>) at /tmp/workdir/go/src/fmt/print.go:263
		263	func Println(a ...interface{}) (n int, err error) {
		BEGIN goroutine 1 bt
		#0  fmt.Println (a=..., err=..., n=<optimized out>) at /tmp/workdir/go/src/fmt/print.go:263
		#1  0x00000000004828a0 in main.main () at /tmp/go-build461513624/main.go:14
		END
		BEGIN goroutine 2 bt
		No such goroutine:  2
		END
		Breakpoint 2 at 0x4828cd: file /tmp/go-build461513624/main.go, line 18.
		hi
		
		Breakpoint 2, main.main () at /tmp/go-build461513624/main.go:19
		19	}  // END_OF_PROGRAM
		BEGIN goroutine 1 bt at the end
		#0  main.main () at /tmp/go-build461513624/main.go:19
		END
		
	runtime-gdb_test.go:258: goroutine 2 bt failed: No such goroutine:  2
FAIL
FAIL	runtime	28.524s

/cc @aclements

@bradfitz bradfitz added this to the Go1.11 milestone Mar 30, 2018
@aclements

This comment has been minimized.

Copy link
Member

commented Apr 3, 2018

@hyangah, is this similar to the bug you fixed recently about getting the state of goroutines?

@hyangah

This comment has been minimized.

Copy link
Contributor

commented Apr 4, 2018

@aclements do you mean https://go-review.googlesource.com/c/go/+/49691?
I don't know. Maybe.

If when the test passes the gdb output should look like the following

--- PASS: TestGdbPython (0.44s)
	runtime-gdb_test.go:59: gdb version 7.7
	runtime-gdb_test.go:193: gdb output: Loading Go Runtime support.
		Loaded  Script                                                                 
		Yes     /tmp/workdir/go/src/runtime/runtime-gdb.py                             
		Breakpoint 1 at 0x47c1c0: file /tmp/workdir/go/src/fmt/print.go, line 263.
		
		Breakpoint 1, fmt.Println (a=..., err=..., n=) at /tmp/workdir/go/src/fmt/print.go:263
		263	func Println(a ...interface{}) (n int, err error) {
		BEGIN info goroutines
		* 1 running  runtime.systemstack_switch
		  2 waiting  runtime.gopark
		  17 waiting  runtime.gopark
		  33 runnable runtime.runfinq
		END
		#1  0x00000000004828d0 in main.main () at /tmp/go-build668958384/main.go:14
		14		fmt.Println("hi")
		BEGIN print mapvar
		$1 = map[string]string = {["abc"] = "def", ["ghi"] = "jkl"}
		END
		BEGIN print strvar
		$2 = "abc"
		END
		BEGIN info locals
		mapvar = map[string]string = {["abc"] = "def", ["ghi"] = "jkl"}
		slicevar =  []string = {"def"}
		strvar = "abc"
		END
		#0  fmt.Println (a=..., err=..., n=) at /tmp/workdir/go/src/fmt/print.go:263
		263	func Println(a ...interface{}) (n int, err error) {
		BEGIN goroutine 1 bt
		#0  fmt.Println (a=..., err=..., n=) at /tmp/workdir/go/src/fmt/print.go:263
		#1  0x00000000004828d0 in main.main () at /tmp/go-build668958384/main.go:14
		END
		BEGIN goroutine 2 bt
		#0  runtime.gopark (lock=0x528af0 , reason="force gc (idle)", traceEv=20 '\024', traceskip=1, unlockf=) at /tmp/workdir/go/src/runtime/proc.go:292
		#1  0x0000000000428c5e in runtime.goparkunlock (lock=, reason=..., traceEv=, traceskip=) at /tmp/workdir/go/src/runtime/proc.go:297
		#2  0x00000000004289ea in runtime.forcegchelper () at /tmp/workdir/go/src/runtime/proc.go:248
		#3  0x000000000044ee01 in runtime.goexit () at /tmp/workdir/go/src/runtime/asm_amd64.s:1385
		#4  0x0000000000000000 in ?? ()
		END
		Breakpoint 2 at 0x4828fd: file /tmp/go-build668958384/main.go, line 18.
		hi
		
		Breakpoint 2, main.main () at /tmp/go-build668958384/main.go:19
		19	}  // END_OF_PROGRAM
		BEGIN goroutine 1 bt at the end
		#0  main.main () at /tmp/go-build668958384/main.go:19
		END

Note the difference in the output of 'info goroutine' (2 running goroutines vs 1 running goroutine).

Is there any way to reliably reproduce the failure case?
I tried to gomote and test in the linux-amd64 buildlet, but failed to reproduce the failure with -count=1000 (too ~450s). Not from my linux either. I tried larger than 1000 in gomote and the run was SIGQUIT.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Go1.12 Jul 7, 2018
@andybons andybons modified the milestones: Go1.12, Go1.13 Feb 12, 2019
@andybons andybons modified the milestones: Go1.13, Go1.14 Jul 8, 2019
@bcmills

This comment has been minimized.

Copy link
Member

commented Aug 27, 2019

Here's a flake with very similar output on linux-ppc64le-buildlet:
https://build.golang.org/log/f5c124d74c9e6a71da5614b8b13db9328ec08910

@bcmills bcmills changed the title runtime: TestGdbPython flake on linux-amd64 runtime: TestGdbPython flaky on linux Aug 27, 2019
@bcmills bcmills added the OS-Linux label Aug 27, 2019
@bcmills

This comment has been minimized.

Copy link
Member

commented Sep 11, 2019

@laboger

This comment has been minimized.

Copy link
Contributor

commented Sep 11, 2019

The failure here happens because at the time the gdb script tries to display information about goroutine 2, that goroutine no longer exists.

Is that a bug because the goroutine has ended, or just normal behavior? Someone who knows more about what causes goroutines to come and go would have to answer that.

It seems like the testcase should have a goroutine that won't go away soon and reference that in the gdb script.

@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@bcmills

This comment has been minimized.

@bcmills

This comment has been minimized.

Copy link
Member

commented Oct 15, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.