-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: support inlining non-escaping closures #15561
Comments
I'd very much like to see this happen. An idiom that's useful for dealing with lots of returns from a function with long-winded return values is this:
It would be great if this generated exactly the same code as the equivalent:
|
Change https://golang.org/cl/65071 mentions this issue: |
Taking a stab at this. I have a CL that can handle the basic test case above, with serious limitations/bugs. I'm not sure what's going on with captured variables (getting invariant violations in the SSA phase) and would appreciate some guidance from someone more familiar with the compiler. Also, there doesn't seem to be any easy way to tell if a variable is assigned more than once. Am I missing something? |
Calls to a closure held in a local, non-escaping, variable can be inlined, provided the closure body can be inlined and the variable is never written to. The current implementation has the following limitations: - closures with captured variables are not inlined because doing so naively triggers invariant violation in the SSA phase - re-assignment check is currently approximated by checking the Addrtaken property of the variable which should be safe but may miss optimization opportunities if the address is not used for a write before the invocation Updates #15561 Change-Id: I508cad5d28f027bd7e933b1f793c14dcfef8b5a1 Reviewed-on: https://go-review.googlesource.com/65071 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Hugues Bruant <hugues.bruant@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>
Change https://golang.org/cl/72490 mentions this issue: |
Change https://golang.org/cl/75770 mentions this issue: |
CL 65071 enabled inlining for local closures with no captures. To determine safety of inlining a call sites, we check whether the variable holding the closure has any assignments after its original definition. Unfortunately, that check did not catch OAS2MAPR and OAS2DOTTYPE, leading to incorrect inlining when a variable holding a closure was subsequently reassigned through a type conversion or a 2-valued map access. There was another more subtle issue wherein reassignment check would always return a false positive for closure calls inside other closures. This was caused by the Name.Curfn field of local variables pointing to the OCLOSURE node instead of the corresponding ODCLFUNC, which resulted in reassigned walking an empty Nbody and thus never seeing any reassignments. This CL fixes these oversights and adds many more tests for closure inlining which ensure not only that inlining triggers but also the correctness of the resulting code. Updates #15561 Change-Id: I74bdae849c4ecfa328546d6d62b512e8d54d04ce Reviewed-on: https://go-review.googlesource.com/75770 Reviewed-by: Hugues Bruant <hugues.bruant@gmail.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
When inlining a closure with captured variables, walk up the param chain to find the one that is defined inside the scope into which the function is being inlined, and map occurrences of the captures to temporary inlvars, similarly to what is done for function parameters. No noticeable impact on compilation speed and binary size. Minor improvements to go1 benchmarks on darwin/amd64 name old time/op new time/op delta BinaryTree17-4 2.59s ± 3% 2.58s ± 1% ~ (p=0.470 n=19+19) Fannkuch11-4 3.15s ± 2% 3.15s ± 1% ~ (p=0.647 n=20+19) FmtFprintfEmpty-4 43.7ns ± 3% 43.4ns ± 4% ~ (p=0.178 n=18+20) FmtFprintfString-4 74.0ns ± 2% 77.1ns ± 7% +4.13% (p=0.000 n=20+20) FmtFprintfInt-4 77.2ns ± 3% 79.2ns ± 6% +2.53% (p=0.000 n=20+20) FmtFprintfIntInt-4 112ns ± 4% 112ns ± 2% ~ (p=0.672 n=20+19) FmtFprintfPrefixedInt-4 136ns ± 1% 135ns ± 2% ~ (p=0.827 n=16+20) FmtFprintfFloat-4 232ns ± 2% 233ns ± 1% ~ (p=0.194 n=20+20) FmtManyArgs-4 490ns ± 2% 484ns ± 2% -1.28% (p=0.001 n=20+20) GobDecode-4 6.68ms ± 2% 6.72ms ± 2% ~ (p=0.113 n=20+19) GobEncode-4 5.62ms ± 2% 5.71ms ± 2% +1.64% (p=0.000 n=20+19) Gzip-4 235ms ± 3% 236ms ± 2% ~ (p=0.607 n=20+19) Gunzip-4 37.1ms ± 2% 36.8ms ± 3% ~ (p=0.060 n=20+20) HTTPClientServer-4 61.9µs ± 2% 62.7µs ± 4% +1.24% (p=0.007 n=18+19) JSONEncode-4 12.5ms ± 2% 12.4ms ± 3% ~ (p=0.192 n=20+20) JSONDecode-4 51.6ms ± 3% 51.0ms ± 3% -1.19% (p=0.008 n=20+19) Mandelbrot200-4 4.12ms ± 6% 4.06ms ± 5% ~ (p=0.063 n=20+20) GoParse-4 3.12ms ± 5% 3.10ms ± 2% ~ (p=0.402 n=19+19) RegexpMatchEasy0_32-4 80.7ns ± 2% 75.1ns ± 9% -6.94% (p=0.000 n=17+20) RegexpMatchEasy0_1K-4 197ns ± 2% 186ns ± 2% -5.43% (p=0.000 n=20+20) RegexpMatchEasy1_32-4 77.5ns ± 4% 71.9ns ± 7% -7.25% (p=0.000 n=20+18) RegexpMatchEasy1_1K-4 341ns ± 3% 341ns ± 3% ~ (p=0.732 n=20+20) RegexpMatchMedium_32-4 113ns ± 2% 112ns ± 3% ~ (p=0.102 n=20+20) RegexpMatchMedium_1K-4 36.6µs ± 2% 35.8µs ± 2% -2.26% (p=0.000 n=18+20) RegexpMatchHard_32-4 1.75µs ± 3% 1.74µs ± 2% ~ (p=0.473 n=20+19) RegexpMatchHard_1K-4 52.6µs ± 2% 52.0µs ± 3% -1.15% (p=0.005 n=20+20) Revcomp-4 381ms ± 4% 377ms ± 2% ~ (p=0.067 n=20+18) Template-4 57.3ms ± 2% 57.7ms ± 2% ~ (p=0.108 n=20+20) TimeParse-4 291ns ± 3% 292ns ± 2% ~ (p=0.585 n=20+20) TimeFormat-4 314ns ± 3% 315ns ± 1% ~ (p=0.681 n=20+20) [Geo mean] 47.4µs 47.1µs -0.73% name old speed new speed delta GobDecode-4 115MB/s ± 2% 114MB/s ± 2% ~ (p=0.115 n=20+19) GobEncode-4 137MB/s ± 2% 134MB/s ± 2% -1.63% (p=0.000 n=20+19) Gzip-4 82.5MB/s ± 3% 82.4MB/s ± 2% ~ (p=0.612 n=20+19) Gunzip-4 523MB/s ± 2% 528MB/s ± 3% ~ (p=0.060 n=20+20) JSONEncode-4 155MB/s ± 2% 156MB/s ± 3% ~ (p=0.192 n=20+20) JSONDecode-4 37.6MB/s ± 3% 38.1MB/s ± 3% +1.21% (p=0.007 n=20+19) GoParse-4 18.6MB/s ± 4% 18.7MB/s ± 2% ~ (p=0.405 n=19+19) RegexpMatchEasy0_32-4 396MB/s ± 2% 426MB/s ± 8% +7.56% (p=0.000 n=17+20) RegexpMatchEasy0_1K-4 5.18GB/s ± 2% 5.48GB/s ± 2% +5.79% (p=0.000 n=20+20) RegexpMatchEasy1_32-4 413MB/s ± 4% 444MB/s ± 6% +7.46% (p=0.000 n=20+19) RegexpMatchEasy1_1K-4 3.00GB/s ± 3% 3.00GB/s ± 3% ~ (p=0.678 n=20+20) RegexpMatchMedium_32-4 8.82MB/s ± 2% 8.90MB/s ± 3% +0.99% (p=0.044 n=20+20) RegexpMatchMedium_1K-4 28.0MB/s ± 2% 28.6MB/s ± 2% +2.32% (p=0.000 n=18+20) RegexpMatchHard_32-4 18.3MB/s ± 3% 18.4MB/s ± 2% ~ (p=0.482 n=20+19) RegexpMatchHard_1K-4 19.5MB/s ± 2% 19.7MB/s ± 3% +1.18% (p=0.004 n=20+20) Revcomp-4 668MB/s ± 4% 674MB/s ± 2% ~ (p=0.066 n=20+18) Template-4 33.8MB/s ± 2% 33.6MB/s ± 2% ~ (p=0.104 n=20+20) [Geo mean] 124MB/s 126MB/s +1.54% Updates #15561 Updates #18270 Change-Id: I980086efe28b36aa27f81577065e2a729ff03d4e Reviewed-on: https://go-review.googlesource.com/72490 Reviewed-by: Hugues Bruant <hugues.bruant@gmail.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
Change https://golang.org/cl/83538 mentions this issue: |
We can't currently inline functions that contain closures anyway, so just delete this budgeting code for now. Re-enable once we can (if ever) inline functions with nested closures. Updates #15561. Fixes #23093. Change-Id: Idc5f8e042ccfcc8921022e58d3843719d4ab821e Reviewed-on: https://go-review.googlesource.com/83538 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>
Hi @rogpeppe, is your specific request in #15561 (comment) addressed at this point, or still pending? For your example, it looks like Separately, I'm not sure sure if this overall issue is about addressing inlining
(And sorry, I just looked at a bunch of examples, so non-trivial chance I might just be confused here). |
The SearchInts, SearchFloat64s, and SearchStrings functions are convenience wrappers around the generic Search function, which takes a function parameter to determine truthfulness. However, since this passed function is utilized within a for loop, it cannot currently be inlined by the Go compiler, resulting in some degree of performance overhead (see golang#15561). This trivial commit manually inlines the Search function itself to these convenience wrappers, avoiding the function call overhead in the hot loop. It replaces 1 line of copy-pasted code with 10 lines of copy-pasted code, however it has a roughly 2x beneficial effect on performance as seen below: $ benchstat before.txt after.txt name old time/op new time/op delta SearchWrappers-16 84.7ns ± 1% 43.8ns ± 2% -48.34% (p=0.000 n=19+19) In the future, generics may enable similar perf gains while avoiding the minimal copypasta, but for now this small change can improve standard library performance and give the existing search "convenience wrappers" performance benefits as well as convenience. Updates golang#15561
The SearchInts, SearchFloat64s, and SearchStrings functions are convenience wrappers around the generic Search function, which takes a function parameter to determine truthfulness. However, since this passed function is utilized within a for loop, it cannot currently be inlined by the Go compiler, resulting in some degree of performance overhead (see golang#15561). This trivial commit manually inlines the Search function itself to these convenience wrappers, avoiding the function call overhead in the hot loop. It replaces 1 line of copy-pasted code with 10 lines of copy-pasted code, however it has a roughly 2x beneficial effect on performance as seen below: $ benchstat before.txt after.txt name old time/op new time/op delta SearchWrappers-16 84.7ns ± 1% 43.8ns ± 2% -48.34% (p=0.000 n=19+19) In the future, generics may enable similar perf gains while avoiding the minimal copypasta, but for now this small change can improve standard library performance and give the existing search "convenience wrappers" performance benefits as well as convenience. Updates golang#15561
Change https://golang.org/cl/260018 mentions this issue: |
This appears to be fixed on the latest Go https://go.godbolt.org/z/MdGb366c5 @josharian @rogpeppe am I correct? |
Split out from #9337 and #15537.
The text was updated successfully, but these errors were encountered: