runtime: corrupt gp.syscallsp after cgo call #9875
Comments
Replace

    if getcallersp(unsafe.Pointer(&dummy)) > _g_.syscallsp {
    	throw("exitsyscall: syscall frame is no longer valid")
    }

with

    if getcallersp(unsafe.Pointer(&dummy)) > _g_.syscallsp {
    	systemstack(func() {
    		println("exitsyscall: syscall frame is no longer valid", unsafe.Pointer(&dummy), _g_.syscallsp)
    		throw("exitsyscall: syscall frame is no longer valid")
    	})
    }

What does it print? |
@dvyukov, thanks for the quick reply! I get:
|
OK, so g.syscallsp is 0. That's already something. |
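To illustrate why a zero syscallsp is enough to trip the check, here is a minimal sketch of the comparison quoted above. It is a simplified model, not the runtime's code: frameNoLongerValid and the addresses are hypothetical, and plain uint64 values stand in for stack pointers.

```go
package main

import "fmt"

// frameNoLongerValid models the comparison from the exitsyscall check quoted
// in this thread: the check fires when the caller's stack pointer is
// numerically above the stack pointer saved on syscall entry.
// Hypothetical helper; the real check lives inside the Go runtime.
func frameNoLongerValid(callerSP, syscallSP uint64) bool {
	return callerSP > syscallSP
}

func main() {
	// Hypothetical addresses, chosen only for illustration.
	const callerSP = uint64(0xc208041e00)

	// Saved SP at or above the caller SP: the check passes.
	fmt.Println(frameNoLongerValid(callerSP, 0xc208041e40)) // false

	// Saved SP of 0: any real caller SP is greater, so the check always fires.
	fmt.Println(frameNoLongerValid(callerSP, 0)) // true
}
```

So a zeroed syscallsp by itself explains the throw; it does not say what zeroed the field.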
Here is my OS:
The machine has around 23G of RAM and 16 cores like this:
And I’m trying to use them all. So I have 16 goroutines calling a C function; the CPU utilization is around 1600%. The function just does some linear algebra and nothing of what you mentioned. |
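For readers who want to picture the setup described above, here is a rough sketch of 16 goroutines each calling the same compute routine. It is an assumption-laden stand-in, not the program from this issue: dot is a pure-Go placeholder for the C linear-algebra function (the real code goes through cgo), and runWorkers is a hypothetical name.

```go
package main

import (
	"fmt"
	"sync"
)

// dot is a pure-Go placeholder for the C linear-algebra routine.
// In the setup described above this would be a cgo call instead.
func dot(a, b []float64) float64 {
	sum := 0.0
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// runWorkers starts n goroutines, each calling dot once, and waits for all
// of them. Each goroutine writes only to its own slot, so no lock is needed.
func runWorkers(n int, a, b []float64) []float64 {
	results := make([]float64, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = dot(a, b)
		}(i)
	}
	wg.Wait()
	return results
}

func main() {
	a := []float64{1, 2, 3}
	b := []float64{4, 5, 6}
	results := runWorkers(16, a, b)
	fmt.Println(len(results), results[0]) // 16 32
}
```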
Please try the following patch.
|
Patched. Strange things started to happen. I got another “stack split at bad time,” but this time in a completely different part of my program where I spin off my goroutines by
Also,
And sometimes everything works just fine. Maybe it’s not specific to cgo after all. I guess I’m doing something horrible somewhere without realizing it. I need to carefully go through my code and to spend more time experimenting with my setup. Right now it looks rather random. @dvyukov, thank you for your help! |
I agree that it looks like a heap corruption.
|
I’ve tried running my code on another machine, and I haven’t observed anything similar there. Everything works as it should. Might be related to #9906. |
I suspect this is the same as #9906 - bad hardware. |
Hello,
I’m experiencing a panic when calling a C function. It happens only occasionally, and, unfortunately, I don’t have a sensibly small piece of code that could reproduce it. I’ve been trying to trace what happens and have even switched to master (from 1.4.1).
Let me please explain it the way I see it.
1. cgocall_errno is called after each invocation of a C function.
2. cgocall_errno finishes by calling exitsyscall.
3. exitsyscall throws an exception “syscall frame is no longer valid.”

However, I don’t see that message when the program panics. I discovered those calls in the stack trace. What actually happens next is as follows.

1. throw tries to print the error message.
2. morestack gets in the way and calls newstack.
3. newstack throws “stack split at bad time.”

And this is the error message that is printed in the terminal.
I would be grateful for any feedback. Thank you.
Regards,
Ivan