-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve the quality of unoptimized code #36409
Comments
Rationale: The "unoptimized code" representation is really a stack machine that pushes and pops arguments. The intermediate representation did not show the correct labeling for "temporaries", in particular around push arguments. This CL fixes that bug, making the human readable form of the intermediate much easier to understand. Also, it allows for proper stack depth computations. First use: Sharpen condition on OSR points a bit. Simple control flow collections will now allow OSR. #36421 #36409 Change-Id: I8be707af08462a22687b355b747ba85e42431ad5 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/98500 Commit-Queue: Aart Bik <ajcbik@google.com> Reviewed-by: Vyacheslav Egorov <vegorov@google.com> Reviewed-by: Martin Kustermann <kustermann@google.com> Reviewed-by: Alexander Markov <alexmarkov@google.com>
Bug: #36409 Change-Id: Id1a6a65b26c95fb3f56bf844943aa09cc4f13a2a Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/99728 Reviewed-by: Alexander Markov <alexmarkov@google.com> Reviewed-by: Aart Bik <ajcbik@google.com> Commit-Queue: Ryan Macnak <rmacnak@google.com>
… calls. This is preparation for switching the call's stub when polymorphism gets too high. Bug: #26780 Bug: #29294 Bug: #36409 Change-Id: If601ea97a1942a567bee0847afeb3b4e9c11481a Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/99705 Commit-Queue: Ryan Macnak <rmacnak@google.com> Reviewed-by: Alexander Markov <alexmarkov@google.com>
…res. Bug: #26780 Bug: #36409 Change-Id: I7865e6fb033ad2a4b1fd185d38c1b5a4c96dd6e7 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/101921 Reviewed-by: Alexander Markov <alexmarkov@google.com> Commit-Queue: Ryan Macnak <rmacnak@google.com>
I have started exploring some ideas. The first is a basic peephole optimizer, which does not add any significant overhead to compile-time, but reduces code size and possibly runtime. For example, ;; t0 <- Constant(#0)
0x7f45d0cb84e4 6a00 push 0
;; StoreLocal(i @-2, t0)
0x7f45d0cb84e6 58 pop rax
0x7f45d0cb84e7 488945d8 movq [rbp-0x28],rax becomes the more "compact" (saving one byte in this case, but when combined with others, it may add up) ;; t0 <- Constant(#0)
;; StoreLocal(i @-2, t0)
0x7f29804f83a3 33c0 xorl rax,rax
0x7f29804f83a5 488945d8 movq [rbp-0x28],rax On ARM, the unoptimized code is ;; t0 <- Constant(#0)
0x7f4de1286408 d2800000 movz r0, 0x0
0x7f4de128640c f81f8de0 strx r0, [r15, #-8] !
;; StoreLocal(k @-1, t0)
0x7f4de1286410 f84085e0 ldrx r0, [r15], #8 !
0x7f4de1286414 f81e03a0 strx r0, [fp, #-32] with peephole, saving 8 bytes ;; t0 <- Constant(#0)
0x7f1ff8bc2888 d2800000 movz r0, 0x0
;; Peephole
;; StoreLocal(k @-1, t0)
0x7f1ff8bc288c f81e03a0 strx r0, [fp, #-32]
|
Performance gains look promising too, even for very basic peephole optimization with window size 1. On a toy function with a loop and some operations, I get the following timings (with unoptimized referring to "--optimization_counter_threshold=-1"):
So full optimization is about 7x faster, but peephole optimizations yields about 20% faster performance on unoptimized code. |
Remove the optimization function stub and self function from the constant pool of every unoptimized function. Remove one PP move from unoptimized frame entry. Bug: #36409 Change-Id: Idbe37ce36b57dc316a131e6c83742ee2b3df4a0d Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/102660 Commit-Queue: Ryan Macnak <rmacnak@google.com> Reviewed-by: Aart Bik <ajcbik@google.com>
Function entry points have been subject to block scheduling since 06f9a9e. Block scheduling in AOT never reorders the entries. Bug: #36409 Bug: #36731 Change-Id: Id6725fed4d9bbd381631fa88c5397435a35acc9a Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/102463 Commit-Queue: Ryan Macnak <rmacnak@google.com> Reviewed-by: Alexander Markov <alexmarkov@google.com>
Rationale: A basic peephole optimizer with window size one that avoids redundant push-pop sequences already results is substantial savings in code size and runtime, without any noticeable impact on compile time (the peephole is very, very fast). Performance: Golem unoptimized code (which typically runs -90% compared to optimized code, sees 5-60% improvements #36409 Change-Id: I08db4b3dbc92377d89340a4969db6e664e54bceb Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/102980 Commit-Queue: Aart Bik <ajcbik@google.com> Reviewed-by: Vyacheslav Egorov <vegorov@google.com> Reviewed-by: Ryan Macnak <rmacnak@google.com>
Next idea to explore is using a very cheap analysis to lower some of the instance calls for basic arithmetic, mainly to make e.g. loop logic faster. |
When an instance call in unoptimized code creates more than FLAG_max_polymorphic_checks cases, switch the call to use a MegamorphicCache instead of ICData. The prevents unbounded collection of type feedback, and gives improvements on microbenchmarks in the 3-8% range for unoptimized code. It also leads to a loss of target frequency information for the optimizer, leading to different ordering for range checks in polymorphic inlining. This leads to changes on megamorphic microbenchmarks from -31% to +60%, weighted toward the negative end. In practice the frequency information seems unimportant, as dart2js has 4.01% geomean improvement. This is a step toward direct monomorphic calls in unoptimized code, which will also make use of the patching and type feedback extraction added here. Bug: #26780 Bug: #36409 Bug: #36731 Change-Id: I29f53f23b6794c5f5f0db8b8184788cee16fd9c5 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/99270 Reviewed-by: Alexander Markov <alexmarkov@google.com>
This reverts commit fde6a59. Reason for revert: This commit looks to be causing new test failures on dart2js. While there are some odd results on some builders that look like infra failures, some builders (for example, dart2js-minified-strong-linux-x64-d8) show the new failing tests clearly. Original change's description: > [vm, compiler] Unoptimized megamorphic calls. > > When an instance call in unoptimized code creates more than FLAG_max_polymorphic_checks cases, switch the call to use a MegamorphicCache instead of ICData. The prevents unbounded collection of type feedback, and gives improvements on microbenchmarks in the 3-8% range for unoptimized code. > > It also leads to a loss of target frequency information for the optimizer, leading to different ordering for range checks in polymorphic inlining. This leads to changes on megamorphic microbenchmarks from -31% to +60%, weighted toward the negative end. > > In practice the frequency information seems unimportant, as dart2js has 4.01% geomean improvement. > > This is a step toward direct monomorphic calls in unoptimized code, which will also make use of the patching and type feedback extraction added here. > > Bug: #26780 > Bug: #36409 > Bug: #36731 > Change-Id: I29f53f23b6794c5f5f0db8b8184788cee16fd9c5 > Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/99270 > Reviewed-by: Alexander Markov <alexmarkov@google.com> TBR=rmacnak@google.com,alexmarkov@google.com,ajcbik@google.com Change-Id: Icad46b93cdf8541a00563f49da6b4ac0a4df1ba1 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: #26780, #36409, #36731 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/103440 Reviewed-by: Teagan Strickland <sstrickl@google.com> Commit-Queue: Teagan Strickland <sstrickl@google.com>
dart-bytecode, arm64: +4.742% geomean dart-bytecode-jit-unopt, arm64: +12.73% geomean dart2js-compile, x64: +3.635% geomean In the polymorphic and unlinked cases, call to a stub the does a linear scan against an ICData. In the monomorphic case, call to a prologue of the expected target function that checks the expected receiver class. There is additional indirection in the JIT version compared to the AOT version to also tick a usage counter so the inliner can make good decisions. In the megamorphic case, call to a stub that does a hash table lookup against a MegamorphicCache. Megamorphic call sites face a loss of precision in usage counts. The call site count is not recorded and the usage counter of the target function is used as an approximation. Monomorphic and megamorphic calls sites are reset to the polymorphic/unlinked state on hot reload. Monomorphic and megamorphic calls sites do not check the stepping state, so they are reset to the polymorphic/unlinked state when stepping begins and disabled. Back-edges now increment the usage counter in addition to checking it. This ensures function with loops containing monomorphic calls will eventually cross the optimization threshold. Fixed backwards use of kMonomorphicEntryOffset and kPolymorphicEntryOffset. Fixed C stack overflow when bouncing between the KBC interpreter and a simulator. Bug: #26780 Bug: #36409 Bug: #36731 Change-Id: I78a49cccd962703a459288e71ce246ed845df474 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/102820 Commit-Queue: Ryan Macnak <rmacnak@google.com> Reviewed-by: Alexander Markov <alexmarkov@google.com>
…tion logic Rationale: 1) Centralize constant expression evaluation logic. 2) Avoid code duplication with constant folding in unoptimized code. Bug: #36409 Change-Id: I0ee4916e9e731df3dd8eb30dd5f182bc40e1dfa3 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/122172 Reviewed-by: Aart Bik <ajcbik@google.com> Commit-Queue: Fizaa Luthra <fizaaluthra@google.com>
… calls. This is preparation for switching the call's stub when polymorphism gets too high. Bug: dart-lang#26780 Bug: dart-lang#29294 Bug: dart-lang#36409 Change-Id: If601ea97a1942a567bee0847afeb3b4e9c11481a Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/99705 Commit-Queue: Ryan Macnak <rmacnak@google.com> Reviewed-by: Alexander Markov <alexmarkov@google.com>
Remove the optimization function stub and self function from the constant pool of every unoptimized function. Remove one PP move from unoptimized frame entry. Bug: dart-lang#36409 Change-Id: Idbe37ce36b57dc316a131e6c83742ee2b3df4a0d Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/102660 Commit-Queue: Ryan Macnak <rmacnak@google.com> Reviewed-by: Aart Bik <ajcbik@google.com>
Function entry points have been subject to block scheduling since 06f9a9e. Block scheduling in AOT never reorders the entries. Bug: dart-lang#36409 Bug: dart-lang#36731 Change-Id: Id6725fed4d9bbd381631fa88c5397435a35acc9a Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/102463 Commit-Queue: Ryan Macnak <rmacnak@google.com> Reviewed-by: Alexander Markov <alexmarkov@google.com>
dart-bytecode, arm64: +4.742% geomean dart-bytecode-jit-unopt, arm64: +12.73% geomean dart2js-compile, x64: +3.635% geomean In the polymorphic and unlinked cases, call to a stub the does a linear scan against an ICData. In the monomorphic case, call to a prologue of the expected target function that checks the expected receiver class. There is additional indirection in the JIT version compared to the AOT version to also tick a usage counter so the inliner can make good decisions. In the megamorphic case, call to a stub that does a hash table lookup against a MegamorphicCache. Megamorphic call sites face a loss of precision in usage counts. The call site count is not recorded and the usage counter of the target function is used as an approximation. Monomorphic and megamorphic calls sites are reset to the polymorphic/unlinked state on hot reload. Monomorphic and megamorphic calls sites do not check the stepping state, so they are reset to the polymorphic/unlinked state when stepping begins and disabled. Back-edges now increment the usage counter in addition to checking it. This ensures function with loops containing monomorphic calls will eventually cross the optimization threshold. Fixed backwards use of kMonomorphicEntryOffset and kPolymorphicEntryOffset. Fixed C stack overflow when bouncing between the KBC interpreter and a simulator. Bug: dart-lang#26780 Bug: dart-lang#36409 Bug: dart-lang#36731 Change-Id: I78a49cccd962703a459288e71ce246ed845df474 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/102820 Commit-Queue: Ryan Macnak <rmacnak@google.com> Reviewed-by: Alexander Markov <alexmarkov@google.com>
I am going to close this as the original issue of ephemeral apps is not being worked on. |
Focus on improving the quality of unoptimized code. Look at code quality and look for quick ways to improve the code
The text was updated successfully, but these errors were encountered: