Skip to content

[vm/aot] Bad code motion with @pragma('vm:unsafe:no-bounds-checks') #56808

@rakudrama

Description

@rakudrama

In both foo1 and foo2 generate code that crashes:

===== CRASH =====
si_signo=Segmentation fault(11), si_code=SEGV_MAPERR(1), si_addr=0x7f68355141e0
Aborted
@pragma('vm:unsafe:no-bounds-checks')
@pragma('vm:never-inline')
int foo1(String s) {
  int sum = 0;
  for (int i = 0; i < s.length; i++) {
    if (s.codeUnitAt(i) == 0) {
      sum += s.codeUnitAt(1_000_000_000);
    }
  }
  return sum; 
}

@pragma('vm:unsafe:no-bounds-checks')
@pragma('vm:never-inline')
int foo2(String s) {
  int sum = 0;
  for (int i = 1; i < s.length; i++) {
      sum += s.codeUnitAt(1_000_000_000);
  }
  return sum; 
}


main() {
  print(foo1(''));
  print(foo1('x'));
  print(foo2(''));
  print(foo2('x'));
}

The problem is that the position of the memory access is no longer constrained by the bounds check, which can throw, so is hoisted out of the loop. The programmer, even one with a licence to use the unsafe pragma, probably does not expect this result.

I worry that there are more subtle versions of this problem. For example, a bounds check would prevent partial redundancy elimination from moving a load a[i] at the top of the loop into the continue block when the same load a[0] occurs before the loop.

After bounds checks are removed there needs to be a mechanism to prevent subsequent unsound code motion. I'm not sure how this is currently done. One mechanism might be to arrange phase ordering so that there are no code motion phases after bounds check elimination optimization. If phase-ordering is mechanism, it implies that bounds check elimination cannot have been done on inlining candidates. We might know that a bounds check is safe, but it cannot be removed early.

One remedy might be to split bounds check elimination into two phases. The first, which can be run at any time, marks bounds checks as to-be-removed, but still considers them capable of throwing. This keeps the loads constrained. The second phase, after all code motion, does cleanup to remove the bounds checks and any pure transitive inputs. Inlining heuristics could discount the size of to-be-removed bounds checks. The unsafe annotation would generate the bounds check with the to-be-removed flag already set. Perhaps there could a vestigial form of the bound check without a length input to release the non-index inputs early.

Metadata

Metadata

Assignees

Labels

P3A lower priority bug or feature requestarea-vmUse area-vm for VM related issues, including code coverage, and the AOT and JIT backends.triagedIssue has been triaged by sub teamtype-bugIncorrect behavior (everything from a crash to more subtle misbehavior)type-performanceIssue relates to performance or code size

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions