Commit
cmd/compile: boost inlining into FORs
As Than McIntosh has already mentioned, it is common practice to boost inlining into FORs, since a call site inside a loop could be "hotter". This patch implements that.

The implementation uses a stack of FORs to recognise calls that are inside a loop. The stack is maintained alongside the inlnode traversal and holds information about the ancestor FORs of the node currently being processed by inlnode.

A FOR whose cost is >= inlineBigForCost (47) is considered "big". No boost is applied in such FORs.

Updates golang#17566

Results on the go1 benchmarks follow. Binary size did not increase significantly: 10441232 -> 10465920, which is less than 0.3%.

goos: linux
goarch: amd64
pkg: test/bench/go1
cpu: Intel(R) Xeon(R) Gold 6230N CPU @ 2.30GHz

name                      old time/op    new time/op    delta
BinaryTree17-8            2.15s ± 1%     2.17s ± 1%     +0.86%  (p=0.041 n=6+6)
Fannkuch11-8              2.70s ± 0%     2.72s ± 0%     +0.71%  (p=0.002 n=6+6)
FmtFprintfEmpty-8         31.9ns ± 0%    31.6ns ± 0%    -1.06%  (p=0.008 n=5+5)
FmtFprintfString-8        57.0ns ± 0%    58.3ns ± 0%    +2.26%  (p=0.004 n=6+5)
FmtFprintfInt-8           65.2ns ± 0%    64.1ns ± 0%    -1.65%  (p=0.000 n=5+4)
FmtFprintfIntInt-8        103ns ± 0%     102ns ± 0%     -0.91%  (p=0.000 n=5+6)
FmtFprintfPrefixedInt-8   119ns ± 0%     118ns ± 0%     -0.60%  (p=0.008 n=5+5)
FmtFprintfFloat-8         169ns ± 0%     171ns ± 0%     +1.50%  (p=0.004 n=5+6)
FmtManyArgs-8             445ns ± 0%     445ns ± 0%       ~     (p=0.506 n=6+5)
GobDecode-8               4.37ms ± 1%    4.41ms ± 0%    +0.79%  (p=0.009 n=6+6)
GobEncode-8               3.07ms ± 0%    3.05ms ± 0%    -0.42%  (p=0.004 n=5+6)
Gzip-8                    195ms ± 0%     194ms ± 0%     -0.40%  (p=0.009 n=5+6)
Gunzip-8                  28.2ms ± 0%    28.9ms ± 0%    +2.22%  (p=0.004 n=5+6)
HTTPClientServer-8        45.0µs ± 1%    45.4µs ± 0%    +0.97%  (p=0.030 n=6+5)
JSONEncode-8              8.01ms ± 0%    7.95ms ± 0%    -0.78%  (p=0.008 n=5+5)
JSONDecode-8              35.3ms ± 1%    35.0ms ± 0%    -1.04%  (p=0.004 n=5+6)
Mandelbrot200-8           4.50ms ± 0%    4.50ms ± 0%      ~     (p=0.662 n=6+5)
GoParse-8                 3.03ms ± 1%    2.96ms ± 0%    -2.41%  (p=0.004 n=6+5)
RegexpMatchEasy0_32-8     55.4ns ± 0%    53.8ns ± 0%    -2.83%  (p=0.004 n=5+6)
RegexpMatchEasy0_1K-8     178ns ± 0%     162ns ± 1%     -8.76%  (p=0.004 n=5+6)
RegexpMatchEasy1_32-8     50.1ns ± 0%    49.6ns ± 0%    -0.92%  (p=0.004 n=5+6)
RegexpMatchEasy1_1K-8     271ns ± 1%     268ns ± 0%     -1.15%  (p=0.002 n=6+6)
RegexpMatchMedium_32-8    949ns ± 0%     862ns ± 0%     -9.20%  (p=0.008 n=5+5)
RegexpMatchMedium_1K-8    27.1µs ± 7%    27.4µs ± 7%      ~     (p=0.589 n=6+6)
RegexpMatchHard_32-8      1.28µs ± 2%    1.27µs ± 1%      ~     (p=0.065 n=6+6)
RegexpMatchHard_1K-8      38.5µs ± 0%    38.5µs ± 0%      ~     (p=0.132 n=6+6)
Revcomp-8                 397ms ± 0%     397ms ± 0%       ~     (p=1.000 n=6+6)
Template-8                48.1ms ± 1%    47.8ms ± 0%    -0.48%  (p=0.016 n=5+5)
TimeParse-8               213ns ± 0%     213ns ± 0%       ~     (p=0.467 n=4+6)
TimeFormat-8              295ns ± 1%     294ns ± 0%       ~     (p=0.554 n=6+5)
[Geo mean]                40.5µs         40.2µs         -0.81%

name                      old speed      new speed      delta
GobDecode-8               176MB/s ± 1%   174MB/s ± 0%   -0.79%  (p=0.009 n=6+6)
GobEncode-8               250MB/s ± 0%   251MB/s ± 0%   +0.42%  (p=0.004 n=5+6)
Gzip-8                    100MB/s ± 0%   100MB/s ± 0%   +0.40%  (p=0.009 n=5+6)
Gunzip-8                  687MB/s ± 0%   672MB/s ± 0%   -2.17%  (p=0.004 n=5+6)
JSONEncode-8              242MB/s ± 0%   244MB/s ± 0%   +0.78%  (p=0.008 n=5+5)
JSONDecode-8              54.9MB/s ± 1%  55.5MB/s ± 0%  +1.05%  (p=0.004 n=5+6)
GoParse-8                 19.1MB/s ± 1%  19.6MB/s ± 0%  +2.48%  (p=0.004 n=6+5)
RegexpMatchEasy0_32-8     578MB/s ± 0%   594MB/s ± 0%   +2.89%  (p=0.008 n=5+5)
RegexpMatchEasy0_1K-8     5.74GB/s ± 1%  6.31GB/s ± 1%  +9.95%  (p=0.002 n=6+6)
RegexpMatchEasy1_32-8     639MB/s ± 0%   645MB/s ± 0%   +0.93%  (p=0.004 n=5+6)
RegexpMatchEasy1_1K-8     3.78GB/s ± 1%  3.82GB/s ± 0%  +1.15%  (p=0.002 n=6+6)
RegexpMatchMedium_32-8    33.7MB/s ± 0%  37.1MB/s ± 0%  +10.15% (p=0.008 n=5+5)
RegexpMatchMedium_1K-8    37.9MB/s ± 6%  37.5MB/s ± 7%    ~     (p=0.697 n=6+6)
RegexpMatchHard_32-8      24.9MB/s ± 2%  25.1MB/s ± 1%    ~     (p=0.058 n=6+6)
RegexpMatchHard_1K-8      26.6MB/s ± 0%  26.6MB/s ± 0%    ~     (p=0.195 n=6+6)
Revcomp-8                 640MB/s ± 0%   641MB/s ± 0%     ~     (p=1.000 n=6+6)
Template-8                40.4MB/s ± 1%  40.6MB/s ± 0%  +0.47%  (p=0.016 n=5+5)
[Geo mean]                175MB/s        178MB/s        +1.56%
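
The FOR stack described in this message can be pictured with a small standalone sketch. It is not the compiler's code: the `node` type, `boostedCalls`, and `exampleTree` are hypothetical stand-ins for the real IR and for inlnode, and the sketch assumes that only the innermost enclosing FOR decides whether a call is boosted, which the message does not spell out. Only the name `inlineBigForCost` and its value 47 come from the message.

```go
package main

import "fmt"

// kind and node are hypothetical stand-ins for the compiler's IR nodes.
type kind int

const (
	kindOther kind = iota
	kindFor
	kindCall
)

type node struct {
	kind     kind
	name     string // callee name, used for kindCall only
	cost     int    // precomputed loop cost, used for kindFor only
	children []*node
}

// inlineBigForCost is the threshold named in the commit message.
const inlineBigForCost = 47

// boostedCalls walks the tree and returns the callees that sit inside a
// non-"big" FOR. Assumption: only the innermost enclosing FOR is consulted.
func boostedCalls(root *node) []string {
	var forStack []bool // one entry per enclosing FOR; true means "big"
	var boosted []string

	var walk func(n *node)
	walk = func(n *node) {
		if n.kind == kindFor {
			// Push on entry so every descendant sees this FOR as an ancestor.
			forStack = append(forStack, n.cost >= inlineBigForCost)
		}
		if n.kind == kindCall && len(forStack) > 0 && !forStack[len(forStack)-1] {
			boosted = append(boosted, n.name)
		}
		for _, c := range n.children {
			walk(c)
		}
		if n.kind == kindFor {
			// Pop on exit so siblings of this FOR do not see it.
			forStack = forStack[:len(forStack)-1]
		}
	}
	walk(root)
	return boosted
}

// exampleTree builds: call a; for(cost 10){ call b }; for(cost 47){ call c; for(cost 5){ call d } }.
func exampleTree() *node {
	call := func(name string) *node { return &node{kind: kindCall, name: name} }
	loop := func(cost int, body ...*node) *node {
		return &node{kind: kindFor, cost: cost, children: body}
	}
	return &node{kind: kindOther, children: []*node{
		call("a"),
		loop(10, call("b")),
		loop(47, call("c"), loop(5, call("d"))),
	}}
}

func main() {
	fmt.Println(boostedCalls(exampleTree())) // prints [b d]
}
```

In this sketch, `a` is outside any loop and `c` sits directly in a big FOR, so neither is boosted. Whether a small FOR nested in a big one (the `d` case) should still be boosted is a policy choice left open here.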