-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cgo: use of libc's backtrace from a C++ signal handler invoked for Go code crashes on ppc64le #46374
Comments
Does this reproduce with older versions of Go as well, or is this new? Judging by your analysis, this is true for older Go versions as well, hence why I put it in the backlog. If it's new, we probably need to deal with it this release. |
Backlog is fine, I don't think this has ever worked. I'm not expecting this to get fixed for this release but trying to understand if this should work as is. To me it seems like they are doing some things that might not work with Go. And if that is the case is there another way this could be written to make it work, or at least not SEGV for platforms where this would be a problem. Here is a summary as I understand their code where this started failing: There is some C++ code to dump out all the stacks for a process. A few things that seem suspicious (although this "works" on x86) and I don't know if they directly contribute to the problem:
If I write a program to call backtrace using cgo, this does not SEGV. The Go code will call asmcgocall before calling the C code and that will clear out the location for the backchain that backtrace expects. Since it is nil, backrace will stop at that point. In the scenario where Go code is running and is interrupted so the C++ signal handler gets control, there is nothing like asmcgocall in between to indicate that backtrace should stop. I'm trying to create a smaller, simpler reproducer for this. |
A simple reproducer of the backtrace crash for ppc64le can be done with some minor changes to the example in #46286 by replacing the obvious segfaulting code with a call to backtrace, and replacing the deadlock with a sufficiently long sleep. |
I meant a reproducer that has multiple Go and non-Go threads where a signal is sent to all to generate backtrace-type information like in the original issue. |
The I agree that using cgosymbolizer ought to work. |
If we could do this, that would be great. There are other tools the expect the PPC64 ABI so don't work correctly. I was always under the impression this was not an option due to compatibility issues. Something to investigate for the next release due to the work and risk.
Based on my understanding, it would be better to have a Go signal handler to dump out the stacks for threads running Go and use cgosymbolizer from that signal handler. Does that seem like the right approach? |
I think that should work, yes. |
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
yes
This original failure is documented here cockroachdb/cockroach#62979
I'm opening this issue to get some feedback on this problem and possible solutions. I know there have been other problems reported in the community related to cgo stack unwinding and this falls into that category.
In this case there is a problem because there is a C++ signal handler which calls backtrace from libc, and the interrupt can happen in Go or non-Go code. The code where this feature was added can be found here cockroachdb/cockroach@957b4bd.
The call to backtrace gets a SEGV on ppc64le which is not unexpected since backtrace expects the stackframe to contain the backchain pointer (according to the PPC64 ABI). But since Go doesn't maintain the backchain pointer, when backtrace tries to unwind the stack it eventually tries to use something as a pointer which is not a pointer and SEGVs.
I've seen several other issues related to unwinding stacks with cgo, and in particular one that mentions the use of cgosymbolizer by as a solution. It seems like cgosymbolizer could be used in this case from within a signal handler to give the desired stack information when Go code is involved.
@ianlancetaylor @cherrymui Does this seem like a possible solution, to create a Go signal handler which could then use cgosymbolizer to dump the stacks?
@pmur
The text was updated successfully, but these errors were encountered: