This is the exact same failure mode as in #34693, which was believed to be fixed in December 2019. This is the first such failure in the logs since CL 210217, which suggests either a recent regression in the runtime (perhaps the GC pacer?) or a lingering very-low-probability flake in the test.
FYI, we had this failure a lot toward the end of 2019, starting with 2019-09-04T17:56:17-0607cdd and ending with 2019-12-05T22:08:26-e751af1, mostly on openbsd-arm though also on a smattering of other platforms. The one failure @bcmills linked is the only other time we've seen this.
I haven't been able to reproduce yet, but this test seems rather unstable to me.
I don't think there's necessarily a GC pacing bug; there's a lot that could be going wrong here. For instance, if a test that runs just before this one has a large-ish heap (we have some tests that do this, O(100 MiB), but 50 MiB will do) and doesn't clean up by running runtime.GC a few times, then this test could easily fail. It's possible that on Windows the set of tests runs in a different order than on other platforms, hence the failure.
But more fundamentally, if the goal of the test is to just trigger a concurrent GC, why try to do so indirectly? We have runtime.GC and the stacks are going to get shrunk there too. That completely avoids this failure.
I think what I want to do here is just switch the test to calling runtime.GC directly. I'll try to induce a bug in stack shrinking to make sure it actually triggers.
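To illustrate the direct approach, here is a rough sketch (not the actual test; `growStack` and `gcCyclesAfterDirectCall` are hypothetical names, and the recursion depth and frame size are arbitrary). It grows a goroutine's stack, calls runtime.GC, and uses `MemStats.NumGC` to confirm that at least one cycle completed during the call, with no dependence on the heap size left behind by earlier tests. It does not check that the stack was actually shrunk.

```go
package main

import "runtime"

// growStack recurses so that the calling goroutine's stack has to grow.
// The depth and the 128-byte frame padding are arbitrary illustrative values.
func growStack(depth int) int {
	var pad [128]byte
	pad[0] = byte(depth)
	if depth == 0 {
		return int(pad[0])
	}
	return growStack(depth-1) + int(pad[0])
}

// gcCyclesAfterDirectCall grows a goroutine stack, parks that goroutine,
// forces a collection with runtime.GC, and returns how many GC cycles
// completed between the two MemStats snapshots.
func gcCyclesAfterDirectCall() uint32 {
	ready := make(chan struct{})
	done := make(chan struct{})

	go func() {
		growStack(1000)
		close(ready) // stack has been grown; now sit idle
		<-done
	}()
	<-ready

	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	runtime.GC() // blocks until the collection is complete
	runtime.ReadMemStats(&after)

	close(done)
	return after.NumGC - before.NumGC
}
```

Since runtime.GC blocks until the collection finishes, `gcCyclesAfterDirectCall()` should return at least 1 regardless of what ran earlier in the same process.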