We're seeing stack overflows this week in various tests on the
This may be related to #35349, but the stack traces on
I can imagine one possibility. Suppose there are both a synchronous preemption request (made by clobbering the stack guard) and an asynchronous one (made by signal). The goroutine, in a function prologue, first sees the clobbered stack guard, so it is going to call morestack. If the signal lands after the CMP instruction but before the call to morestack, the goroutine is asynchronously preempted and enters the scheduler. When it is resumed, the scheduler clears the preemption request and restores the stack guard. But the resumed goroutine still calls morestack, because it has already passed the CMP instruction. Since there is no longer a preemption request, morestack doubles the stack unnecessarily. If this happens repeatedly, the stack can grow too big even though only a small amount of it is actually used.
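To make the sequence concrete, here is a toy simulation of the race. It is only a sketch, not runtime code: `toyG`, `prologue`, `reschedule`, `simulate`, and the sizes are all hypothetical names and arbitrary values chosen for illustration, and the real prologue and morestack are considerably more involved.

```go
package main

import "fmt"

// All names and sizes here are assumptions for the model, not runtime values.
const (
	stackPreempt = ^uintptr(0) // sentinel guard: any SP compares below it
	guardBytes   = 928         // arbitrary guard offset for the model
)

// toyG models just enough of a goroutine to show the race.
type toyG struct {
	lo, hi uintptr // stack bounds
	sp     uintptr // current stack pointer (stack grows down)
	guard  uintptr // stack guard checked by the prologue
}

// requestPreempt models a synchronous preemption request:
// the stack guard is clobbered with the sentinel.
func (g *toyG) requestPreempt() { g.guard = stackPreempt }

// reschedule models a trip through the scheduler: the request is
// cleared and the real stack guard is restored.
func (g *toyG) reschedule() { g.guard = g.lo + guardBytes }

// morestack yields if a preemption is pending; otherwise it assumes
// the stack is really too small and doubles it.
func (g *toyG) morestack() {
	if g.guard == stackPreempt {
		g.reschedule()
		return
	}
	used := g.hi - g.sp
	g.hi = g.lo + 2*(g.hi-g.lo)
	g.sp = g.hi - used
	g.guard = g.lo + guardBytes
}

// prologue models the function prologue. If signalInWindow is true, an
// async preemption lands after the CMP but before the call to morestack.
func (g *toyG) prologue(signalInWindow bool) {
	if g.sp > g.guard { // the CMP: enough stack, nothing to do
		return
	}
	if signalInWindow {
		g.reschedule() // async preemption: request is already handled here
	}
	g.morestack() // still called, because the CMP has already been passed
}

// simulate issues `rounds` preemption requests and returns the final stack size.
func simulate(rounds int, race bool) uintptr {
	g := &toyG{lo: 0, hi: 2048}
	g.sp = g.hi - 256 // only 256 bytes are ever in use
	g.guard = g.lo + guardBytes
	for i := 0; i < rounds; i++ {
		g.requestPreempt()
		g.prologue(race)
	}
	return g.hi - g.lo
}

func main() {
	fmt.Println("without race:", simulate(10, false))
	fmt.Println("with race:   ", simulate(10, true))
}
```

In this model the stack stays at its initial 2048 bytes when the signal never lands in the window, and doubles on every request when it does, reaching 1024 times the initial size after ten rounds while still using only 256 bytes.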
I changed the stack-too-large error message to also print the current stack bounds, and the stack is indeed quite large, with only a small amount in use:
In theory this can happen on other platforms. Not sure why this is only seen on the ARM builder.
@cherrymui That sounds like a good idea.
I think we might have to start at the load of the stack guard, since the result of the CMP is already determined at that point. But then maybe we only need to prevent async preemption if the loaded value is in fact the preemption guard.
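The rule suggested above could be written as a predicate like the one below. This is a hedged sketch only: `step`, its values, and `canAsyncPreempt` are hypothetical names for this illustration and do not correspond to anything in the runtime, which decides async-safe points by other means.

```go
package main

import "fmt"

// stackPreempt is an assumed sentinel value for the clobbered guard.
const stackPreempt = ^uintptr(0)

// step is a hypothetical position of the goroutine within a function.
type step int

const (
	beforeLoad step = iota // stack guard not yet loaded
	afterLoad              // guard loaded into a register, CMP not yet executed
	afterCmp               // CMP executed, call to morestack pending
	inBody                 // past the prologue
)

// canAsyncPreempt reports whether the toy rule would allow a signal to
// preempt at this step. Preemption is refused only inside the window that
// starts at the guard load, and only if the loaded value is the sentinel.
func canAsyncPreempt(at step, loadedGuard uintptr) bool {
	inWindow := at == afterLoad || at == afterCmp
	return !(inWindow && loadedGuard == stackPreempt)
}

func main() {
	fmt.Println(canAsyncPreempt(afterLoad, stackPreempt)) // window + sentinel
	fmt.Println(canAsyncPreempt(afterCmp, 4096))          // window, real guard
	fmt.Println(canAsyncPreempt(inBody, stackPreempt))    // outside the window
}
```

Refusing the signal only when the loaded value is the sentinel keeps ordinary stack growth preemptible, since in that case morestack really does need to grow the stack.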