Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GuardWidening] Miscompile due to widening into loop-invariant WC #60234

Closed
xortator opened this issue Jan 23, 2023 · 2 comments
Closed

[GuardWidening] Miscompile due to widening into loop-invariant WC #60234

xortator opened this issue Jan 23, 2023 · 2 comments

Comments

@xortator
Copy link
Contributor

xortator commented Jan 23, 2023

Alive2 repro: https://godbolt.org/z/P7fb4P9ab

The bug is in guard widening. However, to demonstrate why it is a bug, I will also run indvars. Consider case:

declare i32 @llvm.experimental.deoptimize.i32(...)

define i32 @test(i32 %start) {
entry:
  %wc1 = call i1 @llvm.experimental.widenable.condition()
  br label %loop

loop:
  %iv = phi i32 [ %start, %entry ], [ %iv.next, %backedge ]
  br i1 %wc1, label %guard_block, label %exit_by_wc

exit_by_wc:
  %rval1 = call i32(...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %iv) ]
  ret i32 %rval1

guard_block:
  %start_plus_1 = add i32 %start, 1
  %cond = icmp ne i32 %start_plus_1, %iv
  %wc2 = call i1 @llvm.experimental.widenable.condition()
  %guard = and i1 %cond, %wc2
  br i1 %guard, label %backedge, label %failure

backedge:
  call void @side_effect()
  %iv.next = add i32 %iv, 1
  br label %loop

exit:
  ret i32 -1

failure:
  %rval2 = call i32(...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %iv) ]
  ret i32 %rval2
}

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(inaccessiblemem: readwrite)
declare i1 @llvm.experimental.widenable.condition()

declare void @side_effect()

Let start = 0. The possible scenarios here are the following:

  1. wc1 = false. In this case, @side_effect is never called, we deopt with iv = 0.
  2. wc1 = true, wc2 = false (on 1st iteration). In this case, @side_effect is never called, we deopt with iv = 0.
  3. wc1 = true, wc2 = true (on 1st iteration). In this case, @side_effect will be called once. Deopt after loop will not happen (because wc1 = true and is loop-invariant), however cond will be false on 1st iteration (iv will be equal to start + 1). So regardless on wc2, we will then deopt with iv = 1.

As you can see, in all cases the number of calls of @side_effect matches the value we deoptimize with. We can expect that this fact stays true, as long as we do the right things.

Now, indvars comes. Invars notices that branch by wc1 is loop-invariant. Therefore, deopt there can only happen on 1st iteration or never. It means it is safe to replace the iv in deopt value with start. Result: https://godbolt.org/z/Wq7EMT58M

declare i32 @llvm.experimental.deoptimize.i32(...)

define i32 @test(i32 %start) {
entry:
  %wc1 = call i1 @llvm.experimental.widenable.condition()
  br label %loop

loop:                                             ; preds = %backedge, %entry
  %iv = phi i32 [ %start, %entry ], [ %iv.next, %backedge ]
  br i1 %wc1, label %guard_block, label %exit_by_wc

exit_by_wc:                                       ; preds = %loop
  %iv.lcssa = phi i32 [ %start, %loop ]
  %rval1 = call i32 (...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %iv.lcssa) ]
  ret i32 %rval1

guard_block:                                      ; preds = %loop
  %start_plus_1 = add i32 %start, 1
  %cond = icmp ne i32 %start_plus_1, %iv
  %wc2 = call i1 @llvm.experimental.widenable.condition()
  %guard = and i1 %cond, %wc2
  br i1 %guard, label %backedge, label %failure

backedge:                                         ; preds = %guard_block
  call void @side_effect()
  %iv.next = add i32 %iv, 1
  br label %loop

  ret i32 -1

failure:                                          ; preds = %guard_block
  %iv.lcssa1 = phi i32 [ %iv, %guard_block ]
  %rval2 = call i32 (...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %iv.lcssa1) ]
  ret i32 %rval2
}

declare i1 @llvm.experimental.widenable.condition() #0

declare void @side_effect()

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(inaccessiblemem: readwrite) }

Note that this transform is completely legal and corresponds to LangRef for widenable condition, saying:

While this may appear similar in semantics to undef, it is very different in that an invocation produces a particular, singular value. It is also intended to be lowered late, and remain available for specific optimizations and transforms that can benefit from its special properties

(when it was written, there is no such thing as freeze, but you can think that widenable condition outside the loop behaves exactly like freeze(undef) outside the loop, so it's a loop invariant).

And now how the bug happens. Let's run guard-widening on top of it: https://godbolt.org/z/P7fb4P9ab

declare i32 @llvm.experimental.deoptimize.i32(...)

define i32 @test(i32 %start) {
entry:
  %wc1 = call i1 @llvm.experimental.widenable.condition()
  br label %loop

loop:                                             ; preds = %backedge, %entry
  %iv = phi i32 [ %start, %entry ], [ %iv.next, %backedge ]
  %start_plus_1 = add i32 %start, 1
  %cond = icmp ne i32 %start_plus_1, %iv
  %wide.chk = and i1 true, %cond
  %0 = and i1 %wide.chk, %wc1
  br i1 %0, label %guard_block, label %exit_by_wc

exit_by_wc:                                       ; preds = %loop
  %iv.lcssa = phi i32 [ %start, %loop ]
  %rval1 = call i32 (...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %iv.lcssa) ]
  ret i32 %rval1

guard_block:                                      ; preds = %loop
  %wc2 = call i1 @llvm.experimental.widenable.condition()
  %guard = and i1 %cond, %wc2
  br i1 true, label %backedge, label %failure

backedge:                                         ; preds = %guard_block
  call void @side_effect()
  %iv.next = add i32 %iv, 1
  br label %loop

  ret i32 -1

failure:                                          ; preds = %guard_block
  %iv.lcssa1 = phi i32 [ %iv, %guard_block ]
  %rval2 = call i32 (...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %iv.lcssa1) ]
  ret i32 %rval2
}

declare i1 @llvm.experimental.widenable.condition() #0

declare void @side_effect()

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(inaccessiblemem: readwrite) }

Now imagine that wc1 = true and wc2 = true. On the first iteration, wide_check is true (because iv = 0 and start + 1 = 1). So we will reach guard block and then backedge, calling @side_effect once. Then, on the 2nd iteration, cond = false and therefore wide.chk = false. It means that we must deoptimize in block exit_by_wc with lv.lcssa = 0.

So now we've called side effect once, but after deopt, we think that iv = 0. If it was used to re-execute the loop in interpreter, it will make one extra iteration.

My working theory is that widening into loop-invariant WCs is just wrong.

@xortator
Copy link
Contributor Author

In fact, widening is currently creating widened condition at the branch point, but it is only legal to do so in WC's location.

xortator added a commit that referenced this issue Jan 23, 2023
#60234 explains how widening
of a branch by loop-invariant condition is causing a miscompile.
CarlosAlbertoEnciso pushed a commit to SNSystems/llvm-debuginfo-analyzer that referenced this issue Jan 24, 2023
llvm/llvm-project#60234 explains how widening
of a branch by loop-invariant condition is causing a miscompile.
xortator added a commit that referenced this issue Jan 31, 2023
…branches. PR60234

When guards are represented as widenable branches, there is a tricky
situation when the branch stays in loop but widenable condition doesn't.
It means that the widenable condition is loop-invariant, and some other
optimizations could have done changes using this fact.

If widening is allowed to create widened condition inside this loop,
and join the loop-invariant wc with some non-invariant facts, it can
cause miscompile. See example of this at #60234.

The solution is to adjust the point of generationg the wide condition,
and therefore of hoisting all related parts there. It should not be before
the branch, but before the widenable_condition call. The fact that `wc()`
and the wide condition are in the same block guarantees that they will
not violate the invariance property for any loop.

Differential Revision: https://reviews.llvm.org/D142693
Reviewed By: apilipenko
@xortator
Copy link
Contributor Author

commit 5cb568a37a53a9b0fd8fc9c2c35870cad43623e9 (HEAD -> main, origin/main, origin/HEAD)
Author: Max Kazantsev <mkazantsev@azul.com>
Date:   Tue Jan 31 12:29:26 2023 +0700

    [GuardWidening] Choose right point for generating wide condition for branches. PR60234

    When guards are represented as widenable branches, there is a tricky
    situation when the branch stays in loop but widenable condition doesn't.
    It means that the widenable condition is loop-invariant, and some other
    optimizations could have done changes using this fact.

    If widening is allowed to create widened condition inside this loop,
    and join the loop-invariant wc with some non-invariant facts, it can
    cause miscompile. See example of this at https://github.com/llvm/llvm-project/issues/60234.

    The solution is to adjust the point of generationg the wide condition,
    and therefore of hoisting all related parts there. It should not be before
    the branch, but before the widenable_condition call. The fact that `wc()`
    and the wide condition are in the same block guarantees that they will
    not violate the invariance property for any loop.

    Differential Revision: https://reviews.llvm.org/D142693
    Reviewed By: apilipenko

CarlosAlbertoEnciso pushed a commit to SNSystems/llvm-debuginfo-analyzer that referenced this issue Feb 1, 2023
…branches. PR60234

When guards are represented as widenable branches, there is a tricky
situation when the branch stays in loop but widenable condition doesn't.
It means that the widenable condition is loop-invariant, and some other
optimizations could have done changes using this fact.

If widening is allowed to create widened condition inside this loop,
and join the loop-invariant wc with some non-invariant facts, it can
cause miscompile. See example of this at llvm/llvm-project#60234.

The solution is to adjust the point of generationg the wide condition,
and therefore of hoisting all related parts there. It should not be before
the branch, but before the widenable_condition call. The fact that `wc()`
and the wide condition are in the same block guarantees that they will
not violate the invariance property for any loop.

Differential Revision: https://reviews.llvm.org/D142693
Reviewed By: apilipenko
xortator added a commit that referenced this issue Mar 20, 2023
Despite the fact that it is legal, it is not profitable. It may prevent
Loop Guard Widening to happen. Because of bug described at
#60234, now the guard widening is
only possible when condtion we want to add is available at the point of the
widenable_condition() of dominating guard. It means that, if all such calls are
hoisted out of loop, and the loop conditions depend on loop-variants, we cannot
widen. Hoisting is otherwise not helpful, because it does not introduce any
optimization opportunities.

Differential Revision: https://reviews.llvm.org/D146274
Reviewed By: apilipenko
aleks-tmb pushed a commit to aleks-tmb/llvm-project that referenced this issue Nov 3, 2023
Turn GuardWidening and LoopPredication branch widening scheme when
always `br(cond && WC)` form is maintained:
```
%wc = call i1 @widenable.condition()
%guard = and i1 %cond0, %wc
br i1 %guard ...
```
-->
```
%wide.check = and i1 %cond0, %cond1
%wc = call i1 @widenable.condition()
%guard = and i1 %%wide.check, %wc
br i1 %guard ...
```

... into the scheme of widenable conditions widening when WC is being
replaced by `NewCheck && WC`:
```
%wc = call i1 @widenable.condition()
%guard = and i1 %cond0, %wc
br i1 %guard ...
```
-->
```
%wc = call i1 @widenable.condition()
%wide.check = and i1 %cond1, %wc
%guard = and i1 %cond0, %wide.check
br i1 %guard ...
```

The change aims to avoid cases when we turn loop-invarinat branch into a
loop-variant one, like in this issue:
llvm#60234
veselypeta pushed a commit to veselypeta/cherillvm that referenced this issue Jun 17, 2024
llvm/llvm-project#60234 explains how widening
of a branch by loop-invariant condition is causing a miscompile.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants