New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: TestGdbPython* fails on solaris #20821

Open
binarycrusader opened this Issue Jun 27, 2017 · 13 comments

Comments

Projects
None yet
5 participants
@binarycrusader
Contributor

binarycrusader commented Jun 27, 2017

On a development build of Solaris, TestGdbPythonCgo fails:

$ uname -a
SunOS darwin-sunw 5.12 s12_127 i86pc i386 i86pc
$ gdb --version
GNU gdb (GDB) 7.12.1
$ go version
go version devel +33b3cc1568 Tue Jun 27 20:45:38 2017 +0000 solaris/amd64
$ go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="solaris"
GOOS="solaris"
GOPATH="/builds/srwalker/gocode"
GORACE=""
GOROOT="/builds/srwalker/golang/go-trunk"
GOTOOLDIR="/builds/srwalker/golang/go/pkg/tool/solaris_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build788895360=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
$ go test -v -run TestGdbPythonCgo
=== RUN   TestGdbPythonCgo
--- FAIL: TestGdbPythonCgo (2.51s)
        runtime-gdb_test.go:55: gdb version 7.12
        runtime-gdb_test.go:218: goroutine 1 bt failed: #0  0x00007fffbf25574a in __nanosleep () from /lib/64/libc.so.1
                #1  0x00007fffbf23d8e3 in nanosleep () from /lib/64/libc.so.1
                Backtrace stopped: previous frame inner to this frame (corrupt stack?)
FAIL
exit status 1
FAIL    runtime 2.520s

At the moment, it's unclear if this is due to GDB 7.12 or because of something else as all other runtime tests pass. I intend to investigate and root cause if possible soon.

@binarycrusader

This comment has been minimized.

Contributor

binarycrusader commented Jun 27, 2017

Modifying the test to print the contents of blocks at the point of failure produces:

=== RUN   TestGdbPythonCgo
map[info locals:slicevar.len = 1
slicevar.cap = 16
slicevar.ptr = "def"
mapvar = map[string]string = {["abc"] = "def", ["ghi"] = "jkl"}
strvar = "abc" goroutine 1 bt:#0  0x00007fffbf25574a in __nanosleep () from /lib/64/libc.so.1
#1  0x00007fffbf23d8e3 in nanosleep () from /lib/64/libc.so.1
Backtrace stopped: previous frame inner to this frame (corrupt stack?) goroutine 2 bt:#0  runtime.gopark (unlockf={void (struct runtime.g *, void *, bool *)} 0xc420030768, lock=0x6576e0 <runtime.forcegc>, reason="force gc (idle)", traceEv=20 '\024', traceskip=1) at /builds/srwalker/golang/go/src/runtime/proc.go:278
#1  0x00000000004daf5e in runtime.goparkunlock (lock=0x6576e0 <runtime.forcegc>, reason="force gc (idle)", traceEv=20 '\024', traceskip=1) at /builds/srwalker/golang/go/src/runtime/proc.go:283
#2  0x00000000004dac8c in runtime.forcegchelper () at /builds/srwalker/golang/go/src/runtime/proc.go:235
#3  0x0000000000501b51 in runtime.goexit () at /builds/srwalker/golang/go/src/runtime/asm_amd64.s:2337
#4  0x0000000000000000 in ?? () info goroutines:* 1 running  runtime.systemstack_switch
  2 waiting  runtime.gopark
  3 waiting  runtime.gopark
  4 runnable runtime.runfinq print mapvar:$1 = map[string]string = {["abc"] = "def", ["ghi"] = "jkl"} print strvar:$2 = "abc"]
--- FAIL: TestGdbPythonCgo (2.50s)
        runtime-gdb_test.go:55: gdb version 7.12
        runtime-gdb_test.go:219: goroutine 1 bt failed: #0  0x00007fffbf25574a in __nanosleep () from /lib/64/libc.so.1
                #1  0x00007fffbf23d8e3 in nanosleep () from /lib/64/libc.so.1
                Backtrace stopped: previous frame inner to this frame (corrupt stack?)
FAIL
exit status 1
FAIL    runtime 2.508s

@binarycrusader binarycrusader changed the title from runtime: TestGdbPythonCgo fails on solaris to runtime: TestGdbPython* fails on solaris Jun 27, 2017

@binarycrusader

This comment has been minimized.

Contributor

binarycrusader commented Jun 27, 2017

The non-cgo variant fails the same way:

=== RUN   TestGdbPython
map[info goroutines:* 1 running  runtime.systemstack_switch
  2 waiting  runtime.gopark
  3 waiting  runtime.gopark
  4 runnable runtime.runfinq print mapvar:$1 = map[string]string = {["abc"] = "def", ["ghi"] = "jkl"} print strvar:$2 = "abc" info locals:slicevar.len = 1
slicevar.cap = 16
slicevar.ptr = "def"
mapvar = map[string]string = {["abc"] = "def", ["ghi"] = "jkl"}
strvar = "abc" goroutine 1 bt:#0  0x00007fffbf25574a in __nanosleep () from /lib/64/libc.so
#1  0x00007fffbf23d8e3 in nanosleep () from /lib/64/libc.so
Backtrace stopped: previous frame inner to this frame (corrupt stack?) goroutine 2 bt:#0  runtime.gopark (unlockf={void (struct runtime.g *, void *, bool *)} 0xc420030768, lock=0x53efb0 <runtime.forcegc>, reason="force gc (idle)", traceEv=20 '\024', traceskip=1) at /builds/srwalker/golang/go/src/runtime/proc.go:278
#1  0x000000000042a06e in runtime.goparkunlock (lock=0x53efb0 <runtime.forcegc>, reason="force gc (idle)", traceEv=20 '\024', traceskip=1) at /builds/srwalker/golang/go/src/runtime/proc.go:283
#2  0x0000000000429d9c in runtime.forcegchelper () at /builds/srwalker/golang/go/src/runtime/proc.go:235
#3  0x0000000000450be1 in runtime.goexit () at /builds/srwalker/golang/go/src/runtime/asm_amd64.s:2337
#4  0x0000000000000000 in ?? ()]
--- FAIL: TestGdbPython (1.87s)
        runtime-gdb_test.go:55: gdb version 7.12
        runtime-gdb_test.go:219: goroutine 1 bt failed: #0  0x00007fffbf25574a in __nanosleep () from /lib/64/libc.so
                #1  0x00007fffbf23d8e3 in nanosleep () from /lib/64/libc.so
                Backtrace stopped: previous frame inner to this frame (corrupt stack?)
FAIL
exit status 1
FAIL    runtime 1.878s
@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Jun 27, 2017

At first glance it looks like gdb is unable to do a stack backtrace for a thread stopped in __nanosleep in the standard library. This may be a lack of debug info combined with a failure to understand the function prologue. Try writing a C function that calls nanosleep and see whether gdb can do a backtrace when the process is sleeping there.

@bradfitz bradfitz added this to the Go1.10 milestone Jun 27, 2017

@bradfitz

This comment has been minimized.

Member

bradfitz commented Jun 27, 2017

Marking Go 1.10 but if you want to add a t.Skip for Go 1.9, that's fine. Send a CL soon.

@binarycrusader

This comment has been minimized.

Contributor

binarycrusader commented Jun 27, 2017

So I manually created 'hello.go' using the sample source from the test:

$ cat hello.go 
package main

import "fmt"
import "runtime"
var gslice []string
func main() {
        mapvar := make(map[string]string,5)
        mapvar["abc"] = "def"
        mapvar["ghi"] = "jkl"
        strvar := "abc"
        ptrvar := &strvar
        slicevar := make([]string, 0, 16)
        slicevar = append(slicevar, mapvar["abc"])
        fmt.Println("hi") // line 13
        _ = ptrvar
        gslice = slicevar
        runtime.KeepAlive(mapvar)
}

Then built that and tried to run it through gdb roughly the same as the test does:

$ go build hello.go 
$ ./hello 
hi
$ gdb hello
GNU gdb (GDB) 7.12.1
....
Reading symbols from hello...done.
Loading Go Runtime support.
(gdb) add-auto-load-safe-path /builds/srwalker/golang/go/src/runtime
(gdb) info auto-load python-scripts
Loaded  Script                                                                 
Yes     /builds/srwalker/golang/go/src/runtime/runtime-gdb.py                  
        full name: /builds/srwalker/golang/go-trunk/src/runtime/runtime-gdb.py
(gdb) set python print-stack full
(gdb) br fmt.Println
Breakpoint 1 at 0x48bd90: file /builds/srwalker/golang/go/src/fmt/print.go, line 256.
(gdb) run
Starting program: /tmp/hello 
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[New LWP    2        ]
[New LWP    3        ]
[New LWP    4        ]
[New LWP    5        ]
[New Thread 2 (LWP 2)]
[New Thread 3        ]
[New Thread 4        ]
[New Thread 5 (LWP 5)]
[Switching to Thread 1 (LWP 1)]

Thread 2 hit Breakpoint 1, fmt.Println (a= []interface {} = {...}, n=3, err=...) at /builds/srwalker/golang/go/src/fmt/print.go:256
256     func Println(a ...interface{}) (n int, err error) {
(gdb) where
#0  fmt.Println (a= []interface {} = {...}, n=3, err=...) at /builds/srwalker/golang/go/src/fmt/print.go:256
#1  0x0000000000493026 in main.main () at /tmp/hello.go:14
(gdb) info goroutines
* 1 running  runtime.systemstack_switch
  2 waiting  runtime.gopark
  3 waiting  runtime.gopark
  4 runnable runtime.runfinq
(gdb) up
#1  0x0000000000493026 in main.main () at /tmp/hello.go:14
14              fmt.Println("hi") // line 13
(gdb) print mapvar
$1 = map[string]string = {["abc"] = "def", ["ghi"] = "jkl"}
(gdb) print strvar
$2 = "abc"
(gdb) info locals
slicevar.len = 1
slicevar.cap = 16
slicevar.ptr = "def"
mapvar = map[string]string = {["abc"] = "def", ["ghi"] = "jkl"}
strvar = "abc"
(gdb) down
#0  fmt.Println (a= []interface {} = {...}, n=3, err=...) at /builds/srwalker/golang/go/src/fmt/print.go:256
256     func Println(a ...interface{}) (n int, err error) {
(gdb) goroutine 1 bt
#0  0x00007fffbf25574a in __nanosleep () from /lib/64/libc.so
#1  0x00007fffbf23d8e3 in nanosleep () from /lib/64/libc.so
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) goroutine 2 bt
#0  runtime.gopark (unlockf={void (struct runtime.g *, void *, bool *)} 0xc420032768, lock=0x53efb0 <runtime.forcegc>, reason="force gc (idle)", traceEv=20 '\024', traceskip=1)
    at /builds/srwalker/golang/go/src/runtime/proc.go:278
#1  0x000000000042a06e in runtime.goparkunlock (lock=0x53efb0 <runtime.forcegc>, reason="force gc (idle)", traceEv=20 '\024', traceskip=1) at /builds/srwalker/golang/go/src/runtime/proc.go:283
#2  0x0000000000429d9c in runtime.forcegchelper () at /builds/srwalker/golang/go/src/runtime/proc.go:235
#3  0x0000000000450be1 in runtime.goexit () at /builds/srwalker/golang/go/src/runtime/asm_amd64.s:2337
#4  0x0000000000000000 in ?? ()
(gdb) 
@binarycrusader

This comment has been minimized.

Contributor

binarycrusader commented Jun 27, 2017

A simple C program:

$ cat sleep.c 
#include <time.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
        struct timespec t, t2;
        t.tv_sec = 10;
        t.tv_nsec = 0;
        if (nanosleep(&t, &t2) < 0) {
                printf("nanosleep call failed\n");
                return (-1);
        }
        return (0);
}
$ gcc sleep.c
$ gdb ./a.out 
GNU gdb (GDB) 7.12.1
...
Reading symbols from ./a.out...(no debugging symbols found)...done.
(gdb) br nanosleep
Breakpoint 1 at 0x400e20
(gdb) run
Starting program: /tmp/a.out 
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[Switching to Thread 1 (LWP 1)]

Thread 2 hit Breakpoint 1, 0x00007fffbf23d824 in nanosleep () from /lib/64/libc.so.1
(gdb) where
#0  0x00007fffbf23d824 in nanosleep () from /lib/64/libc.so.1
#1  0x000000000040102a in main ()
@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Jun 27, 2017

Any idea why the Go program shows __nanosleep where the C program shows only nanosleep?

@binarycrusader

This comment has been minimized.

Contributor

binarycrusader commented Jun 27, 2017

My guess is the difference is where the breakpoint was set; in Solaris libc, nanosleep is just a stub that almost immediately calls __nanosleep:

$ LD_BIND_NOW=1 gdb ./a.out 
GNU gdb (GDB) 7.12.1
...
Reading symbols from ./a.out...(no debugging symbols found)...done.
(gdb) br __nanosleep
Function "__nanosleep" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (__nanosleep) pending.
(gdb) run
Starting program: /tmp/a.out 
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[Switching to Thread 1 (LWP 1)]

Thread 2 hit Breakpoint 1, 0x00007fffbf255740 in __nanosleep () from /lib/64/libc.so.1
(gdb) where
#0  0x00007fffbf255740 in __nanosleep () from /lib/64/libc.so.1
#1  0x00007fffbf23d89a in nanosleep () from /lib/64/libc.so.1
#2  0x000000000040102a in main ()
(gdb) 
@binarycrusader

This comment has been minimized.

Contributor

binarycrusader commented Jun 27, 2017

Here's a minimal Go program that calls nanosleep via a C wrapper:

$ cat sleep.go
package main
// #include <time.h>
// #include <stdio.h>
// void mysleep(void) {
//      struct timespec t, t2;
//      t.tv_sec = 1;
//      t.tv_nsec = 0;
//      if (nanosleep(&t, &t2) < 0) {
//              printf("nanosleep call failed\n");
//              return;
//      }
// }
import "C"
import "fmt"

func main() {
        C.mysleep();
        fmt.Println("slept")
}

Built and run in gdb similar to failing test:

$ LD_BIND_NOW=1 gdb ./sleep
GNU gdb (GDB) 7.12.1
...
Reading symbols from ./sleep...done.
Loading Go Runtime support.
(gdb) add-auto-load-safe-path /builds/srwalker/golang/go/src/runtime
(gdb) info auto-load python-scripts
Loaded  Script                                                                 
Yes     /builds/srwalker/golang/go/src/runtime/runtime-gdb.py                  
        full name: /builds/srwalker/golang/go-trunk/src/runtime/runtime-gdb.py
(gdb) set python print-stack full
(gdb) br mysleep
Thread 2 hit Breakpoint 1, mysleep () at /tmp/sleep.go:8
8       //      if (nanosleep(&t, &t2) < 0) {
(gdb) where
#0  mysleep () at /tmp/sleep.go:8
#1  0x00000000005003c9 in runtime.asmcgocall () at /builds/srwalker/golang/go/src/runtime/asm_amd64.s:624
#2  0x00007fffbffff3c8 in ?? ()
#3  0x00000000004fc222 in runtime.(*mcache).nextFree.func1 () at /builds/srwalker/golang/go/src/runtime/malloc.go:553
#4  0x000000c420000180 in ?? ()
#5  0x00000000004dd1e0 in ?? () at /builds/srwalker/golang/go/src/runtime/proc.go:1060
#6  0x0000000000657000 in runtime.sched ()
#7  0x00007fffbffff420 in ?? ()
#8  0x0000000000657060 in ?? ()
#9  0x00007fffbffff410 in ?? ()
#10 0x00000000004dd244 in runtime.mstart () at /builds/srwalker/golang/go/src/runtime/proc.go:1142
#11 0x00000000004fe9b5 in runtime.rt0_go () at /builds/srwalker/golang/go/src/runtime/asm_amd64.s:186
#12 0x0000000000000001 in ?? ()
#13 0x00007fffbffff468 in ?? ()
#14 0x0000000000000001 in ?? ()
#15 0x00007fffbffff468 in ?? ()
#16 0x00007fffbffff450 in ?? ()
#17 0x00000000004b1873 in _start ()
(gdb) info goroutines
* 1 syscall  runtime.asmcgocall
  2 waiting  runtime.gopark
  3 waiting  runtime.gopark
  4 runnable runtime.runfinq
(gdb) br __nanosleep
Breakpoint 2 at 0x7fffbf255740
(gdb) cont
Continuing.
Thread 2 hit Breakpoint 2, 0x00007fffbf255740 in __nanosleep () from /lib/64/libc.so.1
(gdb) where
#0  0x00007fffbf255740 in __nanosleep () from /lib/64/libc.so.1
#1  0x00007fffbf23d8e3 in nanosleep () from /lib/64/libc.so.1
#2  0x0000000000543655 in mysleep () at /tmp/sleep.go:8
#3  0x00000000005003c9 in runtime.asmcgocall () at /builds/srwalker/golang/go/src/runtime/asm_amd64.s:624
#4  0x00007fffbffff3c8 in ?? ()
#5  0x00000000004fc222 in runtime.(*mcache).nextFree.func1 () at /builds/srwalker/golang/go/src/runtime/malloc.go:553
#6  0x000000c420000180 in ?? ()
#7  0x00000000004dd1e0 in ?? () at /builds/srwalker/golang/go/src/runtime/proc.go:1060
#8  0x0000000000657000 in runtime.sched ()
#9  0x00007fffbffff420 in ?? ()
#10 0x0000000000657060 in ?? ()
#11 0x00007fffbffff410 in ?? ()
#12 0x00000000004dd244 in runtime.mstart () at /builds/srwalker/golang/go/src/runtime/proc.go:1142
#13 0x00000000004fe9b5 in runtime.rt0_go () at /builds/srwalker/golang/go/src/runtime/asm_amd64.s:186
#14 0x0000000000000001 in ?? ()
#15 0x00007fffbffff468 in ?? ()
#16 0x0000000000000001 in ?? ()
#17 0x00007fffbffff468 in ?? ()
#18 0x00007fffbffff450 in ?? ()
#19 0x00000000004b1873 in _start ()
(gdb) info goroutines
* 1 syscall  runtime.asmcgocall
  2 waiting  runtime.gopark
  3 waiting  runtime.gopark
  4 runnable runtime.runfinq
(gdb) goroutine 1 bt
#0  runtime.asmcgocall () at /builds/srwalker/golang/go/src/runtime/asm_amd64.s:610
#1  0x00000000004b2dfd in runtime.cgocall (fn=0x543670 <_cgo_dab3f36848c5_Cfunc_mysleep>, arg=0xc420053f30, ~r2=537214832) at /builds/srwalker/golang/go/src/runtime/cgocall.go:133
#2  0x0000000000543531 in main._Cfunc_mysleep (r1=...) at command-line-arguments/_obj/_cgo_gotypes.go:41
#3  0x0000000000543572 in main.main () at /tmp/sleep.go:17
(gdb) goroutine 2 bt
#0  runtime.gopark (unlockf={void (struct runtime.g *, void *, bool *)} 0xc420030768, lock=0x656b80 <runtime.forcegc>, reason="force gc (idle)", traceEv=20 '\024', traceskip=1)
    at /builds/srwalker/golang/go/src/runtime/proc.go:278
#1  0x00000000004da68e in runtime.goparkunlock (lock=0x656b80 <runtime.forcegc>, reason="force gc (idle)", traceEv=20 '\024', traceskip=1) at /builds/srwalker/golang/go/src/runtime/proc.go:283
#2  0x00000000004da3bc in runtime.forcegchelper () at /builds/srwalker/golang/go/src/runtime/proc.go:235
#3  0x0000000000501281 in runtime.goexit () at /builds/srwalker/golang/go/src/runtime/asm_amd64.s:2337
#4  0x0000000000000000 in ?? ()
@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Jun 27, 2017

Thanks. I have no idea what is going on.

@binarycrusader

This comment has been minimized.

Contributor

binarycrusader commented Jun 27, 2017

I built and tried gdb 7.11 as well; same problem. As such, I'm going to just add a t.Skip() for the moment until I have time to figure out what's gone wrong. The other test failures are more important at the moment.

@gopherbot

This comment has been minimized.

gopherbot commented Jun 27, 2017

CL https://golang.org/cl/46913 mentions this issue.

gopherbot pushed a commit that referenced this issue Jun 28, 2017

runtime: temporarily skip gdb python-related tests on solaris
Updates #20821

Change-Id: I77a5b9a3bbb931845ef52a479549d71069af9540
Reviewed-on: https://go-review.googlesource.com/46913
Run-TryBot: Shawn Walker-Salas <shawn.walker@oracle.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>

@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017

@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Jul 7, 2018

This may have been affected by https://golang.org/cl/99617, which is in the 1.11beta1 release. It would be interesting to see whether the test passes now.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Go1.12 Jul 7, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment