runtime: process crash instead of panic on SIGBUS with SetPanicOnFault(true) #41155
Comments
I'm not sure how to replicate this failure, but I'd like to give this a shot. Do we think that the posted workaround could also be a long-term solution?
Please avoid looking at the workaround (and, everyone, please avoid posting patches through the issue tracker). We want patches to come in only as Gerrit code reviews or GitHub pull requests, because then we have automation that confirms that the copyright assignments are in order. Thanks. To put it another way, I can't answer your question about the posted workaround because I'm not going to look at it. Sorry. I think you might be able to write a test that gets a
Thanks for the pointers, I'll try to get a repro done, and then see how the issue can be fixed!
Thank you for looking into this. I thought I should open a ticket for discussion before creating a PR. Sorry if I didn't respect the rules by adding a link to my workaround commit in the ticket. If desired, I would be happy to contribute a fix for this issue and make a PR. For now, I am trying to find a way to write a test that could be integrated with the regular test suite and reproduces this issue without our embedded FPGA platform. I created a test doing what @ianlancetaylor suggested. Doing this doesn't reproduce the issue. This results in the expected
This would be a good first step; I hope I can assist in that as well. (And also, thanks for having a positive attitude to getting to the bottom of this!) The following code uses cgo and triggers a SIGBUS. I tried it on darwin and linux, but could not get the same error. This happens both with and without the Code: https://play.golang.org/p/vWdhf2mtuEq
EDIT: Here's the same using
Output without debug.SetPanicOnFault
I tried the code using mmap on our embedded platform, and see the same behavior. Then I modified the runtime to print the flags and the sigcode when a SIGBUS is received. Output of SIGBUS generated by sample from previous comment
Output with SIGBUS generated by a bad register access
Since sigcode 0 matches with
Here is minimal code that reproduces the issue on armv7. The same code on amd64 doesn't reproduce the issue, as mmap simply refuses to map bad addresses. https://play.golang.org/p/Zbi9pBZ3rKu
Punting to Go 1.17. Thank you all for the patience and for the discussion; please keep it going.
I don't understand why the kernel would send a signal with |
Since this has been around forever, I'd just like to add that you can reliably trigger a crashing SIGBUS, even when trying to recover the panic, by writing to PROT_READ mmap'd memory. I can trigger it 100% of the time using gommap on Darwin arm64. Not sure if that helps with debugging and finding a handler.
For linux/arm64, torvalds/linux@526c3dd and torvalds/linux@af40ff6 come to mind (which suggest to me that this used to be a problem that got fixed.) I can't speak to linux/arm32 or darwin/arm64. I cannot reproduce the problem on linux/amd64. |
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
We are using Go for some embedded development (cross-compiled to linux arm32). We access various FPGA registers from the Go process. To access those registers, we mmap /dev/mem at the address range of those registers.
When we access registers that are not defined or not accessible in the FPGA, the process crashes with the error reported below.
We use defer debug.SetPanicOnFault(debug.SetPanicOnFault(true)) in the call stack that performs the register read, as we expect this to make the runtime panic instead of crashing on this kind of memory fault.
What did you expect to see?
A panic where the bad access happened. This way, with a recover call, it would be possible to handle the case where some registers are not available.
What did you see instead?
The process crashes, in an unrecoverable way, with the following output:
Workaround
I built a custom runtime with this commit, which makes the call panic as expected.