runtime: fatal error: mcall called on m->g0 stack on Windows #67108
Comments
Below is my go.mod file (with a few redactions)
|
This seems to be related to #56774. In that issue, the poster said updating to 1.21.3 fixed it, but we're using 1.21.4 and still seeing the problem. |
Slightly different panic:
|
A new type of panic:
|
Just to be clear, you're cross-compiling inside docker, but running the program on real Windows machines? Do you see it only on a specific type of Windows machine? Thanks. |
Correct. We cross compile in docker. But we're running in native Windows.
Yes. We only see it on our AMD servers. Example specs of the machines are listed above in the original post. None of our Intel-based servers are seeing these panics. Let me know if you have any other questions. :) |
Thanks. Have you tried running the program under the race detector? Is the program using cgo? Is there a way we can reproduce the issue ourselves? |
cc @golang/windows |
No, but that's a good idea. I'll give that a shot today
No, we disable it with
Unfortunately, it's a proprietary program. But given that it's panicking during init() and in standard libraries, I'm going to see if I can create a small "hello world" test program and run it continuously on the machines to try to reproduce it. |
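A "hello world" test program of the kind described above might look like the following minimal sketch. The blank import of net/http is an assumption, chosen only because it pulls in many standard-library package initializers. Nothing in this thread indicates which packages are involved, and the helper name status is hypothetical.

```go
package main

import (
	"fmt"
	"runtime"

	// Assumption: imported only for its side effect of running many
	// standard-library package initializers at start-up.
	_ "net/http"
)

// initRan records that this package's own init() executed.
var initRan bool

func init() {
	initRan = true
}

// status (hypothetical helper) reports the platform and Go version so that
// output collected from many machines can be compared.
func status() string {
	return fmt.Sprintf("ok %s/%s %s init=%v",
		runtime.GOOS, runtime.GOARCH, runtime.Version(), initRan)
}

func main() {
	fmt.Println(status())
}
```

If the runtime crashes during start-up, the program never reaches main, so the absence of the "ok" line together with a "fatal error:" line in the output would mark a failing run.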
I'm wondering whether the Go runtime makes some small incorrect assumption about AMD hardware. I see this other issue, which doesn't have the same stack trace but is also limited to AMD hardware: #62440 |
Some more callstacks from our servers:
|
Another callstack:
|
|
It could be. It could also be a kernel issue. A reproducer would be very helpful. Thanks. |
I'm attempting to get a minimal repro. I created something and I'm running it on all our workers periodically to try to get a failure. I'll report back here if I can get it to trigger. |
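Running a test binary periodically and tallying the crashes could be sketched roughly as below. This is an illustration under assumptions, not the reporter's actual setup: the binary path comes from the command line, the iteration count is arbitrary, and the helper names fatalLines and runOnce are hypothetical. The only detail taken from the report is the "fatal error:" prefix on the crash message.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// fatalLines returns every line of output that starts with "fatal error:",
// the prefix seen on the crash message in this report.
func fatalLines(output string) []string {
	var found []string
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "fatal error:") {
			found = append(found, line)
		}
	}
	return found
}

// runOnce starts the binary at path and returns its combined stdout and stderr.
func runOnce(path string) (string, error) {
	var buf bytes.Buffer
	cmd := exec.Command(path)
	cmd.Stdout = &buf
	cmd.Stderr = &buf
	err := cmd.Run()
	return buf.String(), err
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: repro <path-to-test-binary>")
		return
	}
	const iterations = 1000 // assumption: arbitrary count for illustration
	counts := map[string]int{}
	for i := 0; i < iterations; i++ {
		// A non-zero exit is expected when the child crashes, so the
		// error is ignored and only the captured output is inspected.
		out, _ := runOnce(os.Args[1])
		for _, msg := range fatalLines(out) {
			counts[msg]++
		}
	}
	for msg, n := range counts {
		fmt.Printf("%d\t%s\n", n, msg)
	}
}
```

Grouping by the exact message keeps distinct failure types separate in the tally, which matters here because more than one kind of panic was observed.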
Go version
go1.21.4
Output of go env in your module/workspace:
NOTE: We compile inside docker. The output below is from running go env within the docker container that we use. We cross-compile for linux, windows, and darwin. The workers having a problem in this case are running windows/amd64.
What did you do?
We have a CLI program that runs on many thousands of VMs and bare-metal workers as part of our larger CI setup. Recently, we've been seeing a non-trivial number of panics on a specific set of workers. These workers are bare-metal AMD Threadripper machines running Windows 10. Below is some info from DXDiag. (I can get additional information if needed.)
We're seeing two types of panics:
fatal error: runtime: mcall called on m->g0 stack
I have included multiple call stacks of these panics below. NOTE: these callstacks all happen on different machines. But they only happen on the same "series" of machine, which is documented above.
All the panics seem to happen during the init() phase of runtime start-up. The panics are not easily reproducible; I'm only really seeing them because of our scale.
What did you see happen?
Below are the callstacks from a number of panics across multiple machines.
Then here are some callstacks from init functions that seemingly do not follow the expected init order:
What did you expect to see?
The program finishes the init() phase without panics.