We have a lot of intermittent failures on the builders, which are hard to reproduce. Evidently the Firefox project had similar struggles, so they built https://rr-project.org/, which allows recording and replaying program executions on linux/x86. Perhaps we should look into using it for our builders too?
Quickly skimming build.golang.org, it's not clear to me if there are enough flaky failures on linux/x86 to warrant the ~20% slowdown (according to their website). It seems like most flakiness is on non-linux or non-x86, unfortunately, and rr doesn't currently support any other platforms.
Maybe the regabi builder could benefit from it at least. It seems to have had some flakiness a couple weeks ago.
I'm not sure whether this is of any help, but I stumbled upon this issue, and the description on rr-project.org reminded me a lot of a technique I'd heard about before. @mdempsky stated that most flaky tests are on non-Linux platforms. It may be of interest that Microsoft has a similar technique called Time Travel Debugging (TTD). I have no experience with it, but it may be worth evaluating whether TTD can help with some of the non-Linux flakiness.