runtime: eliminate stack rescanning #17503
One of the largest remaining contributors to GC STW time is stack rescanning. I have an approach for eliminating this entirely. This is a tracking bug for implementing this approach.
I will upload a design document and proof soon, and I have a working implementation that I plan to have cleaned up and mailed out in a day or two.
I'm marking this Go 1.9. My current plan is to get the change in for Go 1.8, but have a GODEBUG flag to fall back to the current algorithm for debugging purposes (and in case something goes wrong). Assuming things go smoothly, we'll actually rip out the stack rescanning code when Go 1.9 opens.
Edit: Design doc
Edit: Things to follow up on in Go 1.9+:
We used the "double barrier" to address stack scanning issues in the IBM J9 implementation of Metronome. It's in section 4.3 of "Design and implementation of a comprehensive real-time Java virtual machine" by Auerbach et al. In that case we were incrementalizing over many threads' stacks, as opposed to the individual stack, but it solves the same problem.
It worked well, although the extra barrier overhead was annoying. Let me know if you have any questions.
@rasky, it's about a 1.7% performance hit on the x/benchmarks garbage benchmark (which, as the name suggests, is designed to hammer the garbage collector). I haven't checked, but I suspect we've gained more than that from other optimizations since Go 1.7.
Write barriers are harder to eliminate at compile time with the hybrid barrier. I completely disabled the optimizations that don't carry over directly to the hybrid barrier and binaries got about 1% larger. We could eliminate some of these, but it requires flow analysis and the current insertion code doesn't do any flow analysis. OTOH, the places where we can eliminate write barriers with the current write barrier aren't all that common anyway (how often do you write the address of a global to something?), so we're not losing much.
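For anyone following along without the design doc open, here is a rough toy model of the hybrid barrier as I understand its pseudocode: shade the pointer being overwritten, and additionally shade the pointer being installed while the current goroutine's stack is still grey (not yet scanned). The `object` and `gc` types and the `work` queue are made up for illustration and look nothing like the runtime's real data structures; treat this as a sketch, not the implementation.

```go
package main

import "fmt"

// object is a hypothetical heap object: a mark bit plus one pointer field.
type object struct {
	name   string
	marked bool
	field  *object
}

// gc holds only the state this toy barrier consults (illustrative, not runtime code).
type gc struct {
	stackGrey bool      // true until the current goroutine's stack has been scanned
	work      []*object // objects that were shaded and still need scanning
}

// shade marks o and queues it for scanning if it was not already marked.
func (g *gc) shade(o *object) {
	if o != nil && !o.marked {
		o.marked = true
		g.work = append(g.work, o)
	}
}

// writePointer models the hybrid barrier around the store *slot = ptr.
func (g *gc) writePointer(slot **object, ptr *object) {
	g.shade(*slot) // deletion half: the old referent stays visible to the GC
	if g.stackGrey {
		g.shade(ptr) // insertion half: only needed while the stack is unscanned
	}
	*slot = ptr
}

func main() {
	g := &gc{stackGrey: true}
	a := &object{name: "a"}
	b := &object{name: "b"}
	c := &object{name: "c"}
	holder := &object{name: "holder", field: a}

	// Stack not yet scanned: both the old and the new pointer get shaded.
	g.writePointer(&holder.field, b)
	fmt.Println(a.marked, b.marked)

	// Stack scanned: only the overwritten pointer is shaded, so c is left alone.
	g.stackGrey = false
	g.writePointer(&holder.field, c)
	fmt.Println(c.marked)
}
```

The unconditional `shade(*slot)` is the part that makes compile-time elimination harder: it depends on the old value of the slot, so knowing that the new value is, say, the address of a global no longer lets the compiler drop the barrier.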