Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Field guards from unreachable code can be moved to reachable code and can cause repeated deoptimization #50245

Closed
alexmarkov opened this issue Oct 18, 2022 · 6 comments
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. type-performance Issue relates to performance or code size

Comments

@alexmarkov
Copy link
Contributor

Consider the following example:

class A {
  List<int> foo;
  A(this.foo);
}

A obj = A([1, 2, 3]);

main() {
  // Make sure A.foo= is compiled.
  obj.foo = [];
  int sum = 0;
  for (int i = 0; i < 1000000000; ++i) {
    if (int.parse('1') != 1) {
      // Field guard from this unreachable code is moved up
      // and causes repeated deoptimization.
      obj.foo = const [];
    }
    sum += i;
  }
  print(sum);
}

IL for the unreachable code obj.foo = const []; after ApplyClassIds pass:

      v17 <- Constant(#_ImmutableList len:0) T{_ImmutableList}
    CheckClass:50(v16 Cids[1: A etc.  cid 1180])
    GuardFieldClass:50(foo <not-nullable _GrowableList@0150898 {trivially-exact(1)}>, v17)
    GuardFieldType:50(foo <not-nullable _GrowableList@0150898 {trivially-exact(1)}>, v17)
    StoreField(v16 . foo = v17)

LICM moves GuardFieldClass and GuardFieldType instructions to the beginning of the method:

B1[osr entry]:2 stack_depth=0 {
      v2 <- Parameter(0)
      v3 <- Parameter(1) T{*?}
      v4 <- Parameter(2) T{*?}
}
    CheckSmi:64(v3)
    CheckSmi:64(v4)
    GuardFieldClass:64(foo <not-nullable _GrowableList@0150898 {trivially-exact(1)}>, v17)
    GuardFieldType:64(foo <not-nullable _GrowableList@0150898 {trivially-exact(1)}>, v17 T{_GrowableList})
    goto:64 B8

These instructions always trigger deoptimization as _ImmutableList doesn't match guarded field class _GrowableList.
As a result, this causes OSR->deoptimization->OSR loop, wasted JIT compilations and unoptimized code running all the time.

/cc @mraleph @mkustermann @rmacnak-google

@alexmarkov alexmarkov added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. type-performance Issue relates to performance or code size labels Oct 18, 2022
@rmacnak-google
Copy link
Contributor

I'm surprised the unreachable setter is inlined. The IC should have a usage counter of 0 that blocks the inlining. Maybe some AOT optimizations have broken the old invariant the never inline calls that have not executed?

@alexmarkov
Copy link
Contributor Author

I need to clarify that in this issue by "unreachable code" I mean code which was not executed so far. It is still reachable from compiler perspective.

I'm surprised the unreachable setter is inlined. The IC should have a usage counter of 0 that blocks the inlining.

Call specializer (ApplyClassIds pass) doesn't look at IC usage counters at all - it just tries to recognize and replace certain calls with more specific IL instructions. This includes "inlining" implicit getters and setters.

Even in case of the normal inliner FlowGraphInliner::AlwaysInline heuristic might be able to override usage counter of 0 to inline tiny methods and methods annotated with vm:prefer-inline pragma.

While fully blocking inlining of the calls with usage counter 0 might solve this particular problem, it will probably regress performance in other cases, as re-optimization happens rarely and it is not uncommon to execute code paths which were not executed before optimized compilation.

@mraleph
Copy link
Member

mraleph commented Oct 19, 2022

There are some, ahm, workarounds in the JIT that try to work around this issue. We try to track if a hoisted check has deoptimized and then prohibit LICM. Look for SetProhibitsHoistingCheckClass (the name is somewhat of a misnomer because it completely inhibits LICM). So we can put this issue under the carpet by adding {set_,}licm_hoisted property on guards and then adding necessary flags to deopt stub. You can copy the handling of this from CheckClass.

There is more to this - where we try to track which exactly check needs to be pinned instead of disabling LICM completely, but I am not sure this code works as intended because once the check is moved it looses information about its original deopt id. We should consider fixing this but it requires adding more information to checks/guards and then feeding this information through deopt stubs. Unclear if we want to invest into that.

@alexmarkov do you have IL from _Uri that hits this problem? I would like to see if we can statically predict that hoisted guard will always deopt or is it a dynamic property.

@alexmarkov
Copy link
Contributor Author

@mraleph In the case I saw in #50239 we can predict this statically, because rhs of the store is a constant (const []), similarly to the reproduction above. However, in general case this would be unpredictable.

Maybe we should somehow attribute guard instructions which were never executed (from ICData usage counters) and disallow/limit LICM for such code?

@alexmarkov
Copy link
Contributor Author

As a very controversial idea we can also replace code which was never executed with an unconditional deoptimization (in JIT). This might even result in a better performance as we might be able to do more optimizations such as CSE/LICM in the hot code paths. But this could also result in much more deoptimizations/recompilations compared to current setup.

@rmacnak-google
Copy link
Contributor

it is not uncommon to execute code paths which were not executed before optimized compilation.

That will cause bugs. Code coverage relies on at least the first execution going through unoptimized code to increment counters (#42061). Similarly, breakpoints rely on the unoptimized function being compiled first to resolve to a code position.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. type-performance Issue relates to performance or code size
Projects
None yet
Development

No branches or pull requests

3 participants