cmd/compile: SSA performance regression on polygon code #15532
clone this repo: https://github.com/nkovacs/polygonperf
Results for BenchmarkContains (ns/op) on a Core 2 Q6600:
Results for BenchmarkStructContains (ns/op) on a Core 2 Q6600:
(last line is average)
I've seen a similar 30% increase in ns/op on an AMD Athlon II X2 270, but on that CPU the 1 cpu benchmark had the same result as the 2 cpu benchmark.
On the two more modern Intel CPUs I briefly tested, this simple polygon does not show a difference between master and 1.6.2. I added a second polygon (BenchmarkContains2 and BenchmarkStructContains2) that does show a difference, with 1.6.2 again being faster. On the Q6600, Go 1.6.2 runs these benchmarks twice as fast; on a Xeon server, Go 1.6.2 is about 100-200 ns/op faster.
Looks like we're making unnecessary copies as a leftover from inlining. Here's a small repro:
This copy is not needed. The old compiler can get rid of the copy, but SSA doesn't.
I don't see any obvious fix. I will ponder.
This isn't going to get fixed for 1.7; it's too late.
As a workaround, you can replace
SSA does better with structs than arrays.
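To make the struct-versus-array distinction concrete, here is a rough sketch of the same point-in-polygon test written both ways. It assumes a standard ray-casting test; the types `PointArr` and `PointStruct` and both functions are hypothetical and are not taken from the polygonperf repo. The two versions compute the same result; only the point representation differs.

```go
package main

import "fmt"

// PointArr and PointStruct are hypothetical types for this sketch.
type PointArr [2]float64
type PointStruct struct{ X, Y float64 }

// containsArr is a ray-casting test using array-based points.
func containsArr(poly []PointArr, p PointArr) bool {
	inside := false
	j := len(poly) - 1
	for i := range poly {
		a, b := poly[i], poly[j]
		// Toggle when the edge a-b crosses the horizontal ray from p.
		if (a[1] > p[1]) != (b[1] > p[1]) &&
			p[0] < (b[0]-a[0])*(p[1]-a[1])/(b[1]-a[1])+a[0] {
			inside = !inside
		}
		j = i
	}
	return inside
}

// containsStruct is the same test using struct-based points.
func containsStruct(poly []PointStruct, p PointStruct) bool {
	inside := false
	j := len(poly) - 1
	for i := range poly {
		a, b := poly[i], poly[j]
		if (a.Y > p.Y) != (b.Y > p.Y) &&
			p.X < (b.X-a.X)*(p.Y-a.Y)/(b.Y-a.Y)+a.X {
			inside = !inside
		}
		j = i
	}
	return inside
}

func main() {
	sqA := []PointArr{{0, 0}, {4, 0}, {4, 4}, {0, 4}}
	sqS := []PointStruct{{0, 0}, {4, 0}, {4, 4}, {0, 4}}
	fmt.Println(containsArr(sqA, PointArr{2, 2}), containsStruct(sqS, PointStruct{2, 2}))
	fmt.Println(containsArr(sqA, PointArr{5, 2}), containsStruct(sqS, PointStruct{5, 2}))
}
```

Whether the struct version is actually faster depends on the compiler version and CPU, so it is worth benchmarking both on the target machine before switching.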
We'll look again into fixing this for 1.8.