runtime: in core dump generated for a signal while executing C code, gdb reports a corrupt stack #57698

Open
MariappanBalraj opened this issue Jan 9, 2023 · 20 comments
@MariappanBalraj

What version of Go are you using (go version)?

$ go version
go version go1.19.4 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/ubuntu/.cache/go-build"
GOENV="/home/ubuntu/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/ubuntu/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/ubuntu/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.19.4"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1839558467=/tmp/go-build -gno-record-gcc-switches"

What did you do?

The following is a working Go program that uses cgo. The Go code calls the C function test3(), which calls test2(), which calls test1(). test1() writes through a NULL pointer, which causes a segmentation fault. To get a core dump, I ran 'ulimit -c unlimited' and 'echo "core" > /proc/sys/kernel/core_pattern'. The C code is compiled without optimization by setting the environment variable CGO_CFLAGS to -g. Running the program as "GOTRACEBACK=crash ./test" produces a core dump. When I open that core dump in gdb, I do not get the complete C stack, and gdb reports a corrupt stack. Note that when I run the program directly under gdb, I do get the complete stack.

package main

/*
#include <stdio.h>

void test1(void) {
	int *p = (int *)NULL;
	*p = 30;
}

void test2(void) {
	int val = 2;
	test1();
}

void test3(void) {
	int val = 3;
	test2();
}
*/
import "C"

func main() {
	C.test3()
}
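
The core-dump setup described above can also be sketched as a small Go launcher, which may be handy when the shell's ulimit cannot be changed. This is only an illustration under assumptions: the names withCrashTraceback and raiseCoreLimit are hypothetical helpers, the program to run is taken from the command line, and the kernel's core_pattern still has to be configured separately.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
	"syscall"
)

// withCrashTraceback (hypothetical helper) returns a copy of env in which
// GOTRACEBACK is set to "crash", replacing any existing GOTRACEBACK entry.
func withCrashTraceback(env []string) []string {
	out := make([]string, 0, len(env)+1)
	for _, kv := range env {
		if !strings.HasPrefix(kv, "GOTRACEBACK=") {
			out = append(out, kv)
		}
	}
	return append(out, "GOTRACEBACK=crash")
}

// raiseCoreLimit (hypothetical helper) raises the soft core-file size limit
// to the hard limit. This matches "ulimit -c unlimited" only when the hard
// limit is itself unlimited.
func raiseCoreLimit() error {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_CORE, &lim); err != nil {
		return err
	}
	lim.Cur = lim.Max
	return syscall.Setrlimit(syscall.RLIMIT_CORE, &lim)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: runcrash <program> [args...]")
		return
	}
	if err := raiseCoreLimit(); err != nil {
		fmt.Fprintln(os.Stderr, "could not raise core limit:", err)
	}
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Env = withCrashTraceback(os.Environ()) // child inherits the raised limit
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	// A non-nil error is expected: the child should die from a signal.
	fmt.Println("child finished:", cmd.Run())
}
```

Built as, say, runcrash, it would be invoked as "./runcrash ./test"; the child process inherits both the raised limit and GOTRACEBACK=crash.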

What did you expect to see?

I expected the same C stack that gdb displays when the program is run directly under gdb.

gdb ./test
GNU gdb (GDB) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./test...
warning: Unsupported auto-load script at offset 0 in section .debug_gdb_scripts
of file /home/ubuntu/mbalraj/GO/TEST/test.
Use `info auto-load python-scripts [REGEXP]' to list them.
(gdb) run
Starting program: /home/ubuntu/mbalraj/GO/TEST/test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffd08f7640 (LWP 193717)]
[New Thread 0x7fffcbfff640 (LWP 193718)]
[New Thread 0x7fffcb7fe640 (LWP 193719)]
[New Thread 0x7fffcaffd640 (LWP 193720)]
[New Thread 0x7fffca7fc640 (LWP 193721)]

Thread 1 "test" received signal SIGSEGV, Segmentation fault.
0x000000000045abe6 in test1 () at /home/ubuntu/GO/TEST/test.go:8
8 *p = 30;
(gdb) bt
#0 0x000000000045abe6 in test1 () at /home/ubuntu/GO/TEST/test.go:8
#1 0x000000000045ac07 in test2 () at /home/ubuntu/GO/TEST/test.go:13
#2 0x000000000045ac22 in test3 () at /home/ubuntu/GO/TEST/test.go:18
#3 0x000000000045ac42 in _cgo_6ac9acc88c92_Cfunc_test3 (v=0xc000056f70) at /tmp/go-build/cgo-gcc-prolog:49
#4 0x00000000004567e4 in runtime.asmcgocall () at /usr/local/go/src/runtime/asm_amd64.s:844
#5 0x00000000004e15e0 in ?? ()
#6 0x0000000000000001 in ?? ()
#7 0x000000c000100a00 in ?? ()
#8 0x00007fffffffe2d8 in ?? ()
#9 0x0000000000439fa5 in runtime.malg.func1 () at /usr/local/go/src/runtime/proc.go:4080
#10 0x0000000000456629 in runtime.systemstack () at /usr/local/go/src/runtime/asm_amd64.s:492
#11 0x0000000000458fe5 in runtime.newproc (fn=0x1) at <autogenerated>:1
#12 0x00000000004c9780 in runtime[scavenger] ()
#13 0x0000000000000001 in ?? ()
#14 0x0000000000456525 in runtime.mstart () at /usr/local/go/src/runtime/asm_amd64.s:390
#15 0x00000000004564af in runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:354
#16 0x0000000000000001 in ?? ()
#17 0x00007fffffffe458 in ?? ()
#18 0x0000000000000000 in ?? ()

What did you see instead?

gdb ./test core.193685
GNU gdb (GDB) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./test...
[New LWP 193685]
[New LWP 193687]
[New LWP 193686]
[New LWP 193689]
[New LWP 193688]
[New LWP 193690]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./test'.
Program terminated with signal SIGABRT, Aborted.
#0 runtime.raise () at /usr/local/go/src/runtime/sys_linux_amd64.s:159
159 RET
[Current thread is 1 (Thread 0x7fdf0cfb7740 (LWP 193685))]
warning: Unsupported auto-load script at offset 0 in section .debug_gdb_scripts
of file /home/ubuntu/mbalraj/GO/TEST/test.
Use `info auto-load python-scripts [REGEXP]' to list them.
(gdb) bt
#0 runtime.raise () at /usr/local/go/src/runtime/sys_linux_amd64.s:159
#1 0x0000000000443345 in runtime.dieFromSignal (sig=6) at /usr/local/go/src/runtime/signal_unix.go:870
#2 0x00000000004438de in runtime.sigfwdgo (sig=6, info=<optimized out>, ctx=<optimized out>, ~r0=<optimized out>)
    at /usr/local/go/src/runtime/signal_unix.go:1086
#3 0x0000000000442027 in runtime.sigtrampgo (sig=0, info=0x0, ctx=0x4583c1 <runtime.raise+33>) at /usr/local/go/src/runtime/signal_unix.go:432
#4 0x00000000004586a6 in runtime.sigtramp () at /usr/local/go/src/runtime/sys_linux_amd64.s:359
#5 <signal handler called>
#6 runtime.raise () at /usr/local/go/src/runtime/sys_linux_amd64.s:159
#7 0x0000000000443345 in runtime.dieFromSignal (sig=6) at /usr/local/go/src/runtime/signal_unix.go:870
#8 0x0000000000443558 in runtime.crash () at /usr/local/go/src/runtime/signal_unix.go:962
#9 0x000000000042f891 in runtime.fatalthrow.func1 () at /usr/local/go/src/runtime/panic.go:1129
#10 0x000000000042f80c in runtime.fatalthrow (t=<optimized out>) at /usr/local/go/src/runtime/panic.go:1122
#11 0x000000000042f4bd in runtime.throw (s=...) at /usr/local/go/src/runtime/panic.go:1047
#12 0x0000000000443289 in runtime.sigpanic () at /usr/local/go/src/runtime/signal_unix.go:819
#13 0x000000000045abe6 in test1 () at /home/ubuntu/GO/TEST/test.go:8
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

@bcmills added the compiler/runtime label Jan 9, 2023
@cagedmantis changed the title from "affected/package:" to "cmd/gco: core dump reports a corrupt stack" Jan 9, 2023
@cagedmantis added this to the Backlog milestone Jan 9, 2023
@ianlancetaylor changed the title from "cmd/gco: core dump reports a corrupt stack" to "runtime: in core dump generated for a signal while executing C code, gdb reports a corrupt stack" Jan 9, 2023
@ianlancetaylor added the NeedsInvestigation label Jan 9, 2023
@ianlancetaylor
Contributor

CC @golang/runtime

Looks like gdb can't unwind past the call to sigpanic. It gets to test1 but can't get any farther. This might be difficult to solve, as the call to sigpanic is dynamically generated.

@MariappanBalraj
Author

MariappanBalraj commented Jan 10, 2023 via email

@ianlancetaylor
Contributor

I'm suggesting that the problem is that gdb can't unwind past a call to sigpanic. When you call the C function assert, no call to sigpanic is involved.

@MariappanBalraj
Author

MariappanBalraj commented Jan 11, 2023 via email

@cherrymui
Member

Maybe we could consider not injecting a sigpanic if a faulting signal lands on a non-user G. In that case we are either in the runtime or running some C code, and either way I think we cannot recover.

@cherrymui self-assigned this Jan 11, 2023
@MariappanBalraj
Author

MariappanBalraj commented Jan 12, 2023 via email

@ianlancetaylor
Contributor

There is no API to avoid calling sigpanic, and we are not going to add one.

@cherrymui has pointed out that it is currently impossible to recover a panic that occurs while executing C code, so it may be reasonable to avoid injecting a sigpanic call in C code. We could, instead, simply reraise the signal, as we do for other signals that occur while executing C code. See the badsignal function.

However, making that change is not going to be a high priority for us. I don't know of anybody else who has reported problems getting a stack backtrace in a core dump generated while executing C code called by Go code. Go is an open source project, and anybody is welcome to contribute. See https://go.dev/doc/contribute.

@MariappanBalraj
Author

MariappanBalraj commented Jan 12, 2023 via email

@MariappanBalraj
Author

MariappanBalraj commented Jan 12, 2023 via email

@ianlancetaylor
Contributor

Just removing the _Sigpanic flag will mean that dereferencing a nil pointer in Go will not cause a panic as it should. I expect that if you do that some tests will fail. You can run "all.bash" to build the entire tree and run all the tests, as documented at https://go.dev/doc/contribute#testing .
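
For context, here is a minimal sketch of the behavior that depends on that flag: in pure Go code a nil pointer dereference becomes a run-time panic that can be recovered. The helper name derefNil is hypothetical and exists only for this illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// derefNil (hypothetical helper) writes through a nil pointer and converts
// the resulting run-time panic into an ordinary error value.
func derefNil() (err error) {
	defer func() {
		if r := recover(); r != nil {
			re, ok := r.(runtime.Error)
			if !ok {
				panic(r) // not a run-time error; re-panic
			}
			err = re
		}
	}()
	var p *int
	*p = 30 // faults; the runtime turns the fault into a Go panic
	return nil
}

func main() {
	fmt.Println("recovered:", derefNil())
}
```

If the fault were no longer converted into a panic, the deferred recover above would never run and the program would simply crash.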

@MariappanBalraj
Author

MariappanBalraj commented Jan 13, 2023 via email

@cherrymui
Member

My thinking is to set it to _SigThrow if the signal does not arrive on a user Go stack. I plan to send a CL, but I need to check a few things first to make sure it works as expected. Thanks.
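
As a rough illustration of that idea, the decision could be modeled as below. This is a toy sketch and not the runtime's code: the constants sigThrow and sigPanic are hypothetical stand-ins whose names echo _SigThrow and _SigPanic, their values are made up, and effectiveFlags is an invented helper.

```go
package main

import "fmt"

// Hypothetical stand-ins for the runtime's signal-table flags; the values
// are invented for this sketch and do not match the runtime.
const (
	sigThrow uint32 = 1 << iota // treat the signal as a fatal throw
	sigPanic                    // turn the signal into a Go panic via sigpanic
)

// effectiveFlags (hypothetical helper) models the proposal: keep sigPanic
// only when the fault happened on a user goroutine stack; otherwise replace
// it with sigThrow so that no sigpanic call is injected.
func effectiveFlags(flags uint32, onUserG bool) uint32 {
	if flags&sigPanic != 0 && !onUserG {
		return (flags &^ sigPanic) | sigThrow
	}
	return flags
}

func main() {
	// Fault in Go code on a user goroutine: still a recoverable panic.
	fmt.Println(effectiveFlags(sigPanic, true) == sigPanic)
	// Fault in C code or in the runtime: treated as a throw instead.
	fmt.Println(effectiveFlags(sigPanic, false) == sigThrow)
}
```

With no injected sigpanic frame above test1, gdb would presumably have a better chance of unwinding the C frames in the core dump.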

@MariappanBalraj
Author

MariappanBalraj commented Jan 16, 2023 via email

@gopherbot

Change https://go.dev/cl/462437 mentions this issue: runtime: don't inject a sigpanic if not on user G stack

@cherrymui
Member

https://go.dev/cl/462437 is the CL. I still need to do more testing. You're welcome to check whether the CL fixes your case. Thanks.

@MariappanBalraj
Author

MariappanBalraj commented Jan 18, 2023 via email

@MariappanBalraj
Author

MariappanBalraj commented Jan 18, 2023 via email

@cherrymui
Member

The Go functions and C functions run on different stacks. When test3 calls Test4, internally there is a stack switch, and GDB cannot unwind through that. You may want to try Delve (https://github.com/go-delve/delve), which knows Go's stack switch convention and probably handles this better.

@MariappanBalraj
Author

MariappanBalraj commented Jan 19, 2023 via email

@MariappanBalraj
Author

MariappanBalraj commented Jan 23, 2023 via email
