-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: expand TestIntendedInlining to more packages and funcs #21851
Comments
You can modify that test and move it to cmd/compile if you want. |
Yes - this is partly why I mentioned focusing on certain packages first, e.g. the ones closest to |
We can start with individual packages and switch to "build -a std" once enough packages are covered |
@TocarIP I would assume that even after adding all the funcs we want, we still wouldn't need all of |
Change https://golang.org/cl/63550 mentions this issue: |
Move it from the runtime package, as we will soon add more packages and functions for it to check. The test used the testEnv func, which cleaned certain environment variables from a command, so it was moved to internal/testenv under a more descriptive (and less ambiguous) name. Add a simple godoc to it too. For #21851. Change-Id: I6f39c1f23b45377718355fafe66ffd87047d8ab6 Reviewed-on: https://go-review.googlesource.com/63550 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Change https://golang.org/cl/63611 mentions this issue: |
Rework the test to work with any number of std packages. This was done to include a few funcs from unicode/utf8. Adding more will be much simpler too. While at it, add more runtime funcs by searching for "inlined" or "inlining" in the git log of its directory. These are: addb, subtractb, fastrand and noescape. Updates #21851. Change-Id: I4fb2bd8aa6a5054218f9b36cb19d897ac533710e Reviewed-on: https://go-review.googlesource.com/63611 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Change https://golang.org/cl/63810 mentions this issue: |
If we use -gcflags='-m -m', the compiler should give us a reason why a func couldn't be inlined. Add the extra -m necessary for that extra info and use it to give better test failures. For example, for the func in the TODO: --- FAIL: TestIntendedInlining (1.53s) inl_test.go:104: runtime.nextFreeFast was not inlined: function too complex We might increase the number of -m flags to get more information at some later point, such as getting details on how close the func was to the inlining budget. Also started using regexes, as the output parsing is getting a bit too complex for manual string handling. While at it, also refactored the test to not buffer the entire output into memory. This is fine in practice, but it won't scale well as we add more packages or we depend more on the compiler's debugging output. For example, "go build -a -gcflags='-m -m' std" prints nearly 40MB of plaintext - and we only need to see the output line by line anyway. Updates #21851. Change-Id: I00986ff360eb56e4e9737b65a6be749ef8540643 Reviewed-on: https://go-review.googlesource.com/63810 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Change https://golang.org/cl/65351 mentions this issue: |
This is based from a list that Keith Randall provided in mid-2016. These are all funcs that, at the time, were important and small enough that they should be clearly inlined. The runtime has changed a bit since then. Ctz16 and Ctz8 were removed, so don't add them. stringtoslicebytetmp was moved to the backend, so it's no longer a Go function. And itabhash was moved to itabHashFunc. The only other outlier is adjustctxt, which is not inlineable at the moment. I've added a TODO and will address it myself in a separate commit. While at it, error if any funcs in the input table are duplicated. They're never useful and typos could lead to unintentionally thinking a function is inlineable when it actually isn't. And, since the lists are getting long, start sorting alphabetically. Finally, rotl_31 is only defined on 64-bit architectures, and the added runtime/internal/sys funcs are assembly on 386 and thus non-inlineable in that case. Updates #21851. Change-Id: Ib99ab53d777860270e8fd4aefc41adb448f13662 Reviewed-on: https://go-review.googlesource.com/65351 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Keith Randall <khr@golang.org>
Change https://golang.org/cl/65654 mentions this issue: |
In https://golang.org/cl/42813 we made a change to have a method of bytes.Buffer to be inlined. We also added a test for that. Would this be a candidate here, and therefore remove the test in bytes package? |
Yes, that would be good. There is no need to duplicate the logic to determine whether or not a func is inlineable (and all the extras, such as reporting why it isn't so). Then again, technically the tests are checking for different things. TestIntendedInlining checks if a func is inlineable, which is a decision made at compile-time. Your bytes test checks the binary to see if the symbol exists, so it checks if the func was inlined. Are all funcs that are inlineable always inlined? I honestly don't know the answer to that question. If the answer is yes, then it's safe to merge with this test. |
I also grepped the Go repository, and did not find any other occurences of |
I know of at least |
Well, I did see that one, but that checks for linker behavior - and it checks whether or not some funcs were linked into a binary, I assume because they should never be part of the binary in the first place (not even inlined). |
Inlineable functions might still appear in the binary if a closure is made out of them. |
Thanks, @randall77. I'll assume it is fine to move the check then. CL incoming. |
Change https://golang.org/cl/65658 mentions this issue: |
In golang.org/cl/42813, a test was added in the bytes package to check if a Buffer method was being inlined, using 'go tool nm'. Now that we have a compiler test that verifies that certain funcs are inlineable, merge it there. Knowing whether the funcs are inlineable is also more reliable than whether or not their symbol appears in the binary, too. For example, under some circumstances, inlineable funcs can't be inlined, such as if closures are used. While at it, add a few more bytes.Buffer methods that are currently inlined and should clearly stay that way. Updates #21851. Change-Id: I62066e32ef5542d37908bd64f90bda51276da4de Reviewed-on: https://go-review.googlesource.com/65658 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
Change https://golang.org/cl/65491 mentions this issue: |
By refactoring job.arg from int with 0/1 as the only valid values into bool and simplifying (*bitState).push, we reduce the number of nodes below the inlining threshold. This improves backtracking regexp performance by 5-10% and go1 geomean by 1.7% Full performance data below: name old time/op new time/op delta Find-6 510ns ± 0% 480ns ± 1% -5.90% (p=0.000 n=10+10) FindString-6 504ns ± 1% 479ns ± 1% -5.10% (p=0.000 n=10+10) FindSubmatch-6 689ns ± 1% 659ns ± 1% -4.27% (p=0.000 n=9+10) FindStringSubmatch-6 659ns ± 0% 628ns ± 1% -4.69% (p=0.000 n=8+10) Literal-6 174ns ± 1% 171ns ± 1% -1.50% (p=0.000 n=10+10) NotLiteral-6 2.89µs ± 1% 2.72µs ± 0% -5.84% (p=0.000 n=10+9) MatchClass-6 4.65µs ± 1% 4.28µs ± 1% -7.96% (p=0.000 n=10+10) MatchClass_InRange-6 4.15µs ± 1% 3.80µs ± 0% -8.61% (p=0.000 n=10+8) ReplaceAll-6 2.72µs ± 1% 2.60µs ± 1% -4.68% (p=0.000 n=10+10) AnchoredLiteralShortNonMatch-6 158ns ± 1% 153ns ± 1% -3.03% (p=0.000 n=10+10) AnchoredLiteralLongNonMatch-6 176ns ± 1% 176ns ± 0% ~ (p=1.000 n=10+9) AnchoredShortMatch-6 260ns ± 0% 255ns ± 1% -1.84% (p=0.000 n=9+10) AnchoredLongMatch-6 456ns ± 0% 455ns ± 0% -0.19% (p=0.008 n=8+10) OnePassShortA-6 1.13µs ± 1% 1.12µs ± 0% -0.57% (p=0.046 n=10+8) NotOnePassShortA-6 1.14µs ± 1% 1.14µs ± 1% ~ (p=0.162 n=10+10) OnePassShortB-6 908ns ± 0% 893ns ± 0% -1.60% (p=0.000 n=8+9) NotOnePassShortB-6 857ns ± 0% 803ns ± 1% -6.34% (p=0.000 n=8+10) OnePassLongPrefix-6 190ns ± 0% 190ns ± 1% ~ (p=0.059 n=8+10) OnePassLongNotPrefix-6 722ns ± 1% 722ns ± 1% ~ (p=0.451 n=10+10) MatchParallelShared-6 810ns ± 2% 807ns ± 2% ~ (p=0.643 n=10+10) MatchParallelCopied-6 72.1ns ± 1% 69.4ns ± 1% -3.81% (p=0.000 n=10+10) QuoteMetaAll-6 213ns ± 2% 216ns ± 3% ~ (p=0.284 n=10+10) QuoteMetaNone-6 89.7ns ± 1% 89.8ns ± 1% ~ (p=0.616 n=10+10) Match/Easy0/32-6 127ns ± 1% 127ns ± 1% ~ (p=0.977 n=10+10) Match/Easy0/1K-6 566ns ± 0% 566ns ± 0% ~ (p=1.000 n=8+8) Match/Easy0/32K-6 9.30µs ± 1% 9.28µs ± 1% ~ (p=0.529 n=10+10) Match/Easy0/1M-6 460µs ± 1% 460µs ± 1% ~ (p=0.853 n=10+10) Match/Easy0/32M-6 15.0ms ± 0% 15.1ms ± 0% +0.77% (p=0.000 n=9+8) Match/Easy0i/32-6 2.10µs ± 1% 1.98µs ± 0% -6.02% (p=0.000 n=10+8) Match/Easy0i/1K-6 61.5µs ± 0% 57.2µs ± 0% -6.97% (p=0.000 n=10+9) Match/Easy0i/32K-6 2.75ms ± 0% 2.72ms ± 0% -1.10% (p=0.000 n=9+9) Match/Easy0i/1M-6 88.0ms ± 0% 86.9ms ± 1% -1.29% (p=0.000 n=8+10) Match/Easy0i/32M-6 2.82s ± 0% 2.77s ± 1% -1.81% (p=0.000 n=8+10) Match/Easy1/32-6 123ns ± 1% 124ns ± 1% +0.90% (p=0.001 n=10+10) Match/Easy1/1K-6 1.70µs ± 1% 1.65µs ± 0% -3.18% (p=0.000 n=9+10) Match/Easy1/32K-6 69.1µs ± 0% 68.4µs ± 1% -0.95% (p=0.000 n=8+10) Match/Easy1/1M-6 2.46ms ± 1% 2.42ms ± 1% -1.66% (p=0.000 n=10+10) Match/Easy1/32M-6 78.4ms ± 1% 77.5ms ± 0% -1.08% (p=0.000 n=10+9) Match/Medium/32-6 2.07µs ± 1% 1.91µs ± 1% -7.69% (p=0.000 n=10+10) Match/Medium/1K-6 62.8µs ± 0% 58.0µs ± 1% -7.70% (p=0.000 n=8+10) Match/Medium/32K-6 2.63ms ± 1% 2.58ms ± 1% -2.14% (p=0.000 n=10+10) Match/Medium/1M-6 84.6ms ± 0% 82.5ms ± 0% -2.37% (p=0.000 n=8+9) Match/Medium/32M-6 2.71s ± 0% 2.64s ± 0% -2.46% (p=0.000 n=10+9) Match/Hard/32-6 3.26µs ± 1% 2.98µs ± 1% -8.49% (p=0.000 n=10+10) Match/Hard/1K-6 100µs ± 0% 90µs ± 1% -9.55% (p=0.000 n=9+10) Match/Hard/32K-6 3.82ms ± 0% 3.82ms ± 1% ~ (p=0.515 n=8+10) Match/Hard/1M-6 122ms ± 1% 123ms ± 0% +0.66% (p=0.000 n=10+8) Match/Hard/32M-6 3.89s ± 1% 3.91s ± 1% ~ (p=0.105 n=10+10) Match/Hard1/32-6 18.1µs ± 1% 16.1µs ± 1% -11.31% (p=0.000 n=10+10) Match/Hard1/1K-6 565µs ± 0% 493µs ± 1% -12.65% (p=0.000 n=8+10) Match/Hard1/32K-6 18.8ms ± 0% 18.8ms ± 1% ~ (p=0.905 n=9+10) Match/Hard1/1M-6 602ms ± 1% 602ms ± 1% ~ (p=0.278 n=9+10) Match/Hard1/32M-6 19.1s ± 1% 19.2s ± 1% +0.31% (p=0.035 n=9+10) Match_onepass_regex/32-6 6.32µs ± 1% 6.34µs ± 1% ~ (p=0.060 n=10+10) Match_onepass_regex/1K-6 204µs ± 1% 204µs ± 1% ~ (p=0.842 n=9+10) Match_onepass_regex/32K-6 6.53ms ± 0% 6.55ms ± 1% +0.36% (p=0.005 n=10+10) Match_onepass_regex/1M-6 209ms ± 0% 208ms ± 1% -0.65% (p=0.034 n=8+10) Match_onepass_regex/32M-6 6.72s ± 0% 6.68s ± 1% -0.74% (p=0.000 n=9+10) CompileOnepass/^(?:(?:(?:.(?:$))?))...-6 7.02µs ± 1% 7.02µs ± 1% ~ (p=0.671 n=10+10) CompileOnepass/^abcd$-6 5.65µs ± 1% 5.65µs ± 1% ~ (p=0.411 n=10+9) CompileOnepass/^(?:(?:a{0,})*?)$-6 7.06µs ± 1% 7.06µs ± 1% ~ (p=0.912 n=10+10) CompileOnepass/^(?:(?:a+)*)$-6 6.40µs ± 1% 6.41µs ± 1% ~ (p=0.699 n=10+10) CompileOnepass/^(?:(?:a|(?:aa)))$-6 8.18µs ± 2% 8.16µs ± 1% ~ (p=0.529 n=10+10) CompileOnepass/^(?:[^\s\S])$-6 5.08µs ± 1% 5.17µs ± 1% +1.77% (p=0.000 n=9+10) CompileOnepass/^(?:(?:(?:a*)+))$-6 6.86µs ± 1% 6.85µs ± 0% ~ (p=0.190 n=10+9) CompileOnepass/^[a-c]+$-6 5.14µs ± 1% 5.11µs ± 0% -0.53% (p=0.041 n=10+10) CompileOnepass/^[a-c]*$-6 5.62µs ± 1% 5.63µs ± 1% ~ (p=0.382 n=10+10) CompileOnepass/^(?:a*)$-6 5.76µs ± 1% 5.73µs ± 1% -0.41% (p=0.008 n=9+10) CompileOnepass/^(?:(?:aa)|a)$-6 7.89µs ± 1% 7.84µs ± 1% -0.66% (p=0.020 n=10+10) CompileOnepass/^...$-6 5.38µs ± 1% 5.38µs ± 1% ~ (p=0.857 n=9+10) CompileOnepass/^(?:a|(?:aa))$-6 7.80µs ± 2% 7.82µs ± 1% ~ (p=0.342 n=10+10) CompileOnepass/^a((b))c$-6 7.75µs ± 1% 7.78µs ± 1% ~ (p=0.172 n=10+10) CompileOnepass/^a.[l-nA-Cg-j]?e$-6 8.39µs ± 1% 8.42µs ± 1% ~ (p=0.138 n=10+10) CompileOnepass/^a((b))$-6 6.92µs ± 1% 6.95µs ± 1% ~ (p=0.159 n=10+10) CompileOnepass/^a(?:(b)|(c))c$-6 10.0µs ± 1% 10.0µs ± 1% ~ (p=0.896 n=10+10) CompileOnepass/^a(?:b|c)$-6 5.62µs ± 1% 5.66µs ± 1% +0.71% (p=0.023 n=10+10) CompileOnepass/^a(?:b?|c)$-6 8.49µs ± 1% 8.43µs ± 1% -0.69% (p=0.010 n=10+10) CompileOnepass/^a(?:b?|c+)$-6 9.26µs ± 1% 9.28µs ± 1% ~ (p=0.448 n=10+10) CompileOnepass/^a(?:bc)+$-6 6.52µs ± 1% 6.46µs ± 2% -1.02% (p=0.003 n=10+10) CompileOnepass/^a(?:[bcd])+$-6 6.29µs ± 1% 6.32µs ± 1% ~ (p=0.256 n=10+10) CompileOnepass/^a((?:[bcd])+)$-6 7.77µs ± 1% 7.79µs ± 1% ~ (p=0.105 n=10+10) CompileOnepass/^a(:?b|c)*d$-6 14.0µs ± 1% 13.9µs ± 1% -0.69% (p=0.003 n=10+10) CompileOnepass/^.bc(d|e)*$-6 8.96µs ± 1% 9.06µs ± 1% +1.20% (p=0.000 n=10+9) CompileOnepass/^loooooooooooooooooo...-6 219µs ± 1% 220µs ± 1% +0.63% (p=0.006 n=9+10) [Geo mean] 31.6µs 31.1µs -1.82% name old speed new speed delta QuoteMetaAll-6 65.5MB/s ± 2% 64.8MB/s ± 3% ~ (p=0.315 n=10+10) QuoteMetaNone-6 290MB/s ± 1% 290MB/s ± 1% ~ (p=0.755 n=10+10) Match/Easy0/32-6 250MB/s ± 0% 251MB/s ± 1% ~ (p=0.277 n=8+9) Match/Easy0/1K-6 1.81GB/s ± 0% 1.81GB/s ± 0% ~ (p=0.408 n=8+10) Match/Easy0/32K-6 3.52GB/s ± 1% 3.53GB/s ± 1% ~ (p=0.529 n=10+10) Match/Easy0/1M-6 2.28GB/s ± 1% 2.28GB/s ± 1% ~ (p=0.853 n=10+10) Match/Easy0/32M-6 2.24GB/s ± 0% 2.23GB/s ± 0% -0.76% (p=0.000 n=9+8) Match/Easy0i/32-6 15.2MB/s ± 1% 16.2MB/s ± 0% +6.43% (p=0.000 n=10+9) Match/Easy0i/1K-6 16.6MB/s ± 0% 17.9MB/s ± 0% +7.48% (p=0.000 n=10+9) Match/Easy0i/32K-6 11.9MB/s ± 0% 12.0MB/s ± 0% +1.11% (p=0.000 n=9+9) Match/Easy0i/1M-6 11.9MB/s ± 0% 12.1MB/s ± 1% +1.31% (p=0.000 n=8+10) Match/Easy0i/32M-6 11.9MB/s ± 0% 12.1MB/s ± 1% +1.84% (p=0.000 n=8+10) Match/Easy1/32-6 260MB/s ± 1% 258MB/s ± 1% -0.91% (p=0.001 n=10+10) Match/Easy1/1K-6 601MB/s ± 1% 621MB/s ± 0% +3.28% (p=0.000 n=9+10) Match/Easy1/32K-6 474MB/s ± 0% 479MB/s ± 1% +0.96% (p=0.000 n=8+10) Match/Easy1/1M-6 426MB/s ± 1% 433MB/s ± 1% +1.68% (p=0.000 n=10+10) Match/Easy1/32M-6 428MB/s ± 1% 433MB/s ± 0% +1.09% (p=0.000 n=10+9) Match/Medium/32-6 15.4MB/s ± 1% 16.7MB/s ± 1% +8.23% (p=0.000 n=10+9) Match/Medium/1K-6 16.3MB/s ± 1% 17.7MB/s ± 1% +8.43% (p=0.000 n=9+10) Match/Medium/32K-6 12.5MB/s ± 1% 12.7MB/s ± 1% +2.15% (p=0.000 n=10+10) Match/Medium/1M-6 12.4MB/s ± 0% 12.7MB/s ± 0% +2.44% (p=0.000 n=8+9) Match/Medium/32M-6 12.4MB/s ± 0% 12.7MB/s ± 0% +2.52% (p=0.000 n=10+9) Match/Hard/32-6 9.82MB/s ± 1% 10.73MB/s ± 1% +9.29% (p=0.000 n=10+10) Match/Hard/1K-6 10.2MB/s ± 0% 11.3MB/s ± 1% +10.56% (p=0.000 n=9+10) Match/Hard/32K-6 8.58MB/s ± 0% 8.58MB/s ± 1% ~ (p=0.554 n=8+10) Match/Hard/1M-6 8.59MB/s ± 1% 8.53MB/s ± 0% -0.70% (p=0.000 n=10+8) Match/Hard/32M-6 8.62MB/s ± 1% 8.59MB/s ± 1% ~ (p=0.098 n=10+10) Match/Hard1/32-6 1.77MB/s ± 1% 1.99MB/s ± 1% +12.40% (p=0.000 n=10+8) Match/Hard1/1K-6 1.81MB/s ± 1% 2.08MB/s ± 1% +14.55% (p=0.000 n=10+10) Match/Hard1/32K-6 1.74MB/s ± 0% 1.74MB/s ± 0% ~ (p=0.108 n=9+10) Match/Hard1/1M-6 1.74MB/s ± 0% 1.74MB/s ± 1% ~ (p=1.000 n=9+10) Match/Hard1/32M-6 1.75MB/s ± 0% 1.75MB/s ± 1% ~ (p=0.157 n=9+10) Match_onepass_regex/32-6 5.05MB/s ± 0% 5.05MB/s ± 1% ~ (p=0.262 n=8+10) Match_onepass_regex/1K-6 5.02MB/s ± 1% 5.02MB/s ± 1% ~ (p=0.677 n=9+10) Match_onepass_regex/32K-6 5.02MB/s ± 0% 4.99MB/s ± 0% -0.47% (p=0.000 n=10+9) Match_onepass_regex/1M-6 5.01MB/s ± 0% 5.04MB/s ± 1% +0.68% (p=0.017 n=8+10) Match_onepass_regex/32M-6 4.99MB/s ± 0% 5.03MB/s ± 1% +0.74% (p=0.000 n=10+10) [Geo mean] 29.1MB/s 29.8MB/s +2.44% go1 data for reference name old time/op new time/op delta BinaryTree17-6 4.39s ± 1% 4.37s ± 0% -0.58% (p=0.006 n=9+9) Fannkuch11-6 5.13s ± 0% 5.18s ± 0% +0.87% (p=0.000 n=8+8) FmtFprintfEmpty-6 74.2ns ± 0% 71.7ns ± 3% -3.41% (p=0.000 n=10+10) FmtFprintfString-6 120ns ± 1% 122ns ± 2% ~ (p=0.333 n=10+10) FmtFprintfInt-6 127ns ± 1% 127ns ± 1% ~ (p=0.809 n=10+10) FmtFprintfIntInt-6 186ns ± 0% 188ns ± 1% +1.02% (p=0.002 n=8+10) FmtFprintfPrefixedInt-6 223ns ± 1% 222ns ± 2% ~ (p=0.421 n=10+10) FmtFprintfFloat-6 374ns ± 0% 376ns ± 1% +0.43% (p=0.030 n=8+10) FmtManyArgs-6 795ns ± 0% 788ns ± 1% -0.79% (p=0.000 n=8+9) GobDecode-6 10.9ms ± 1% 10.9ms ± 0% ~ (p=0.079 n=10+9) GobEncode-6 8.60ms ± 1% 8.56ms ± 0% -0.52% (p=0.004 n=10+10) Gzip-6 378ms ± 1% 386ms ± 1% +2.28% (p=0.000 n=10+10) Gunzip-6 63.7ms ± 0% 62.3ms ± 0% -2.22% (p=0.000 n=9+8) HTTPClientServer-6 120µs ± 3% 114µs ± 3% -4.99% (p=0.000 n=10+10) JSONEncode-6 20.3ms ± 1% 19.9ms ± 0% -1.90% (p=0.000 n=9+10) JSONDecode-6 84.3ms ± 0% 83.7ms ± 0% -0.76% (p=0.000 n=8+8) Mandelbrot200-6 6.91ms ± 0% 6.89ms ± 0% -0.31% (p=0.000 n=9+8) GoParse-6 5.49ms ± 0% 5.47ms ± 1% ~ (p=0.101 n=8+10) RegexpMatchEasy0_32-6 130ns ± 0% 128ns ± 0% -1.54% (p=0.002 n=8+10) RegexpMatchEasy0_1K-6 322ns ± 1% 322ns ± 0% ~ (p=0.525 n=10+9) RegexpMatchEasy1_32-6 124ns ± 0% 124ns ± 0% -0.32% (p=0.046 n=8+10) RegexpMatchEasy1_1K-6 570ns ± 0% 548ns ± 1% -3.76% (p=0.000 n=10+10) RegexpMatchMedium_32-6 196ns ± 0% 183ns ± 1% -6.61% (p=0.000 n=8+10) RegexpMatchMedium_1K-6 64.3µs ± 0% 59.0µs ± 1% -8.31% (p=0.000 n=8+10) RegexpMatchHard_32-6 3.08µs ± 0% 2.80µs ± 0% -8.96% (p=0.000 n=8+9) RegexpMatchHard_1K-6 93.0µs ± 0% 84.5µs ± 1% -9.17% (p=0.000 n=8+9) Revcomp-6 647ms ± 2% 646ms ± 1% ~ (p=0.720 n=10+9) Template-6 92.3ms ± 0% 91.7ms ± 0% -0.65% (p=0.000 n=8+8) TimeParse-6 490ns ± 0% 488ns ± 0% -0.43% (p=0.000 n=10+10) TimeFormat-6 513ns ± 0% 513ns ± 1% ~ (p=0.144 n=9+10) [Geo mean] 79.1µs 77.7µs -1.73% name old speed new speed delta GobDecode-6 70.1MB/s ± 1% 70.3MB/s ± 0% ~ (p=0.078 n=10+9) GobEncode-6 89.2MB/s ± 1% 89.7MB/s ± 0% +0.52% (p=0.004 n=10+10) Gzip-6 51.4MB/s ± 1% 50.2MB/s ± 1% -2.23% (p=0.000 n=10+10) Gunzip-6 304MB/s ± 0% 311MB/s ± 0% +2.27% (p=0.000 n=9+8) JSONEncode-6 95.8MB/s ± 1% 97.7MB/s ± 0% +1.93% (p=0.000 n=9+10) JSONDecode-6 23.0MB/s ± 0% 23.2MB/s ± 0% +0.76% (p=0.000 n=8+8) GoParse-6 10.6MB/s ± 0% 10.6MB/s ± 1% ~ (p=0.111 n=8+10) RegexpMatchEasy0_32-6 244MB/s ± 0% 249MB/s ± 0% +2.06% (p=0.000 n=9+10) RegexpMatchEasy0_1K-6 3.18GB/s ± 1% 3.17GB/s ± 0% ~ (p=0.211 n=10+9) RegexpMatchEasy1_32-6 257MB/s ± 0% 258MB/s ± 0% +0.37% (p=0.000 n=8+8) RegexpMatchEasy1_1K-6 1.80GB/s ± 0% 1.87GB/s ± 1% +3.91% (p=0.000 n=10+10) RegexpMatchMedium_32-6 5.08MB/s ± 0% 5.43MB/s ± 1% +7.03% (p=0.000 n=8+10) RegexpMatchMedium_1K-6 15.9MB/s ± 0% 17.4MB/s ± 1% +9.08% (p=0.000 n=8+10) RegexpMatchHard_32-6 10.4MB/s ± 0% 11.4MB/s ± 0% +9.82% (p=0.000 n=8+9) RegexpMatchHard_1K-6 11.0MB/s ± 0% 12.1MB/s ± 1% +10.10% (p=0.000 n=8+9) Revcomp-6 393MB/s ± 2% 394MB/s ± 1% ~ (p=0.720 n=10+9) Template-6 21.0MB/s ± 0% 21.2MB/s ± 0% +0.66% (p=0.000 n=8+8) [Geo mean] 74.2MB/s 76.2MB/s +2.70% Updates #21851 Change-Id: Ie88455db925f422a828f8528293790726a9c036b Reviewed-on: https://go-review.googlesource.com/65491 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
This is based on a list that Austin Clements provided in mid-2016. It is mostly untouched, except for the fact that the wbufptr funcs were removed from the runtime thus removed from the lits here too. Add a section for these GC funcs, since there are quite a lot of them and the runtime has tons of funcs that we want to inline. As before, sort this section too. Also place some of these funcs out of the GC section, as they are not directly related to the GC. Updates #21851. Change-Id: I35eb777a4c50b5f655618920dc2bc568c7c30ff5 Reviewed-on: https://go-review.googlesource.com/65654 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
Change https://golang.org/cl/66990 mentions this issue: |
Add the package to the table and start it off with a few small, basic functions. Inspired by CL 66331, which added flag.ro. Updates #21851. Change-Id: I3995cde1ff7bb09a718110473bed8b193c2232a5 Reviewed-on: https://go-review.googlesource.com/66990 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Change https://golang.org/cl/68650 mentions this issue: |
Change https://golang.org/cl/72190 mentions this issue: |
I have run out of ideas to add to the table, so I'm going to close this issue for now. Of course, feel free to still add more to it - but an umbrella issue to expand the table is no longer useful. I believe it was @aclements who suggested that we could use benchmarks to identify what runtime functions should always be inlineable, as opposed to human guesses. That could be a way forward if someone wants to look into it. |
#17566 is about improving the inlining cost model. Making changes to the compiler, especially touching the inlining heuristics, always has the risk of unintentionally making some funcs non-inlineable, potentially resulting in noticeable performance regressions.
That's why @josharian added
TestIntendedInlining
: https://go-review.googlesource.com/c/go/+/57410It basically runs
go build -a -gcflags=-m runtime
and checks that certain funcs are printed in a line along withcan inline
.However, the test is limited to the runtime. There are other small functions in the standard library that we want to be inlineable, and which we even tweak to appease the current heuristic. A recent example: https://go-review.googlesource.com/c/go/+/63332
I propose that we add a way to cover more funcs in other packages within
std
.Thoughts:
unicode/utf8
andbytes
TestIntendedInlining
, or add a new test? (given how it usesbuild -a
, I presume a single test would be the fastest)runtime
, I presume we would want the test out of theruntime
package (where? cmd/compile?)Suggestions of funcs/packages to add welcome. Happy to work on it if pointers on the above are given.
CC @TocarIP @josharian @martisch
The text was updated successfully, but these errors were encountered: