[GuardWidening] Miscompile due to widening into loop-invariant WC #60234

xortator · 2023-01-23T10:21:49Z

Alive2 repro: https://godbolt.org/z/P7fb4P9ab

The bug is in guard widening. However, to demonstrate why it is a bug, I will also run indvars. Consider case:

declare i32 @llvm.experimental.deoptimize.i32(...)

define i32 @test(i32 %start) {
entry:
  %wc1 = call i1 @llvm.experimental.widenable.condition()
  br label %loop

loop:
  %iv = phi i32 [ %start, %entry ], [ %iv.next, %backedge ]
  br i1 %wc1, label %guard_block, label %exit_by_wc

exit_by_wc:
  %rval1 = call i32(...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %iv) ]
  ret i32 %rval1

guard_block:
  %start_plus_1 = add i32 %start, 1
  %cond = icmp ne i32 %start_plus_1, %iv
  %wc2 = call i1 @llvm.experimental.widenable.condition()
  %guard = and i1 %cond, %wc2
  br i1 %guard, label %backedge, label %failure

backedge:
  call void @side_effect()
  %iv.next = add i32 %iv, 1
  br label %loop

exit:
  ret i32 -1

failure:
  %rval2 = call i32(...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %iv) ]
  ret i32 %rval2
}

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(inaccessiblemem: readwrite)
declare i1 @llvm.experimental.widenable.condition()

declare void @side_effect()

Let start = 0. The possible scenarios here are the following:

wc1 = false. In this case, @side_effect is never called, we deopt with iv = 0.
wc1 = true, wc2 = false (on 1st iteration). In this case, @side_effect is never called, we deopt with iv = 0.
wc1 = true, wc2 = true (on 1st iteration). In this case, @side_effect will be called once. Deopt after loop will not happen (because wc1 = true and is loop-invariant), however cond will be false on 1st iteration (iv will be equal to start + 1). So regardless on wc2, we will then deopt with iv = 1.

As you can see, in all cases the number of calls of @side_effect matches the value we deoptimize with. We can expect that this fact stays true, as long as we do the right things.

Now, indvars comes. Invars notices that branch by wc1 is loop-invariant. Therefore, deopt there can only happen on 1st iteration or never. It means it is safe to replace the iv in deopt value with start. Result: https://godbolt.org/z/Wq7EMT58M

declare i32 @llvm.experimental.deoptimize.i32(...)

define i32 @test(i32 %start) {
entry:
  %wc1 = call i1 @llvm.experimental.widenable.condition()
  br label %loop

loop:                                             ; preds = %backedge, %entry
  %iv = phi i32 [ %start, %entry ], [ %iv.next, %backedge ]
  br i1 %wc1, label %guard_block, label %exit_by_wc

exit_by_wc:                                       ; preds = %loop
  %iv.lcssa = phi i32 [ %start, %loop ]
  %rval1 = call i32 (...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %iv.lcssa) ]
  ret i32 %rval1

guard_block:                                      ; preds = %loop
  %start_plus_1 = add i32 %start, 1
  %cond = icmp ne i32 %start_plus_1, %iv
  %wc2 = call i1 @llvm.experimental.widenable.condition()
  %guard = and i1 %cond, %wc2
  br i1 %guard, label %backedge, label %failure

backedge:                                         ; preds = %guard_block
  call void @side_effect()
  %iv.next = add i32 %iv, 1
  br label %loop

  ret i32 -1

failure:                                          ; preds = %guard_block
  %iv.lcssa1 = phi i32 [ %iv, %guard_block ]
  %rval2 = call i32 (...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %iv.lcssa1) ]
  ret i32 %rval2
}

declare i1 @llvm.experimental.widenable.condition() #0

declare void @side_effect()

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(inaccessiblemem: readwrite) }

Note that this transform is completely legal and corresponds to LangRef for widenable condition, saying:

While this may appear similar in semantics to undef, it is very different in that an invocation produces a particular, singular value. It is also intended to be lowered late, and remain available for specific optimizations and transforms that can benefit from its special properties

(when it was written, there is no such thing as freeze, but you can think that widenable condition outside the loop behaves exactly like freeze(undef) outside the loop, so it's a loop invariant).

And now how the bug happens. Let's run guard-widening on top of it: https://godbolt.org/z/P7fb4P9ab

declare i32 @llvm.experimental.deoptimize.i32(...)

define i32 @test(i32 %start) {
entry:
  %wc1 = call i1 @llvm.experimental.widenable.condition()
  br label %loop

loop:                                             ; preds = %backedge, %entry
  %iv = phi i32 [ %start, %entry ], [ %iv.next, %backedge ]
  %start_plus_1 = add i32 %start, 1
  %cond = icmp ne i32 %start_plus_1, %iv
  %wide.chk = and i1 true, %cond
  %0 = and i1 %wide.chk, %wc1
  br i1 %0, label %guard_block, label %exit_by_wc

exit_by_wc:                                       ; preds = %loop
  %iv.lcssa = phi i32 [ %start, %loop ]
  %rval1 = call i32 (...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %iv.lcssa) ]
  ret i32 %rval1

guard_block:                                      ; preds = %loop
  %wc2 = call i1 @llvm.experimental.widenable.condition()
  %guard = and i1 %cond, %wc2
  br i1 true, label %backedge, label %failure

backedge:                                         ; preds = %guard_block
  call void @side_effect()
  %iv.next = add i32 %iv, 1
  br label %loop

  ret i32 -1

failure:                                          ; preds = %guard_block
  %iv.lcssa1 = phi i32 [ %iv, %guard_block ]
  %rval2 = call i32 (...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 %iv.lcssa1) ]
  ret i32 %rval2
}

declare i1 @llvm.experimental.widenable.condition() #0

declare void @side_effect()

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(inaccessiblemem: readwrite) }

Now imagine that wc1 = true and wc2 = true. On the first iteration, wide_check is true (because iv = 0 and start + 1 = 1). So we will reach guard block and then backedge, calling @side_effect once. Then, on the 2nd iteration, cond = false and therefore wide.chk = false. It means that we must deoptimize in block exit_by_wc with lv.lcssa = 0.

So now we've called side effect once, but after deopt, we think that iv = 0. If it was used to re-execute the loop in interpreter, it will make one extra iteration.

My working theory is that widening into loop-invariant WCs is just wrong.

The text was updated successfully, but these errors were encountered:

xortator · 2023-01-23T11:44:45Z

In fact, widening is currently creating widened condition at the branch point, but it is only legal to do so in WC's location.

#60234 explains how widening of a branch by loop-invariant condition is causing a miscompile.

llvm/llvm-project#60234 explains how widening of a branch by loop-invariant condition is causing a miscompile.

…branches. PR60234 When guards are represented as widenable branches, there is a tricky situation when the branch stays in loop but widenable condition doesn't. It means that the widenable condition is loop-invariant, and some other optimizations could have done changes using this fact. If widening is allowed to create widened condition inside this loop, and join the loop-invariant wc with some non-invariant facts, it can cause miscompile. See example of this at #60234. The solution is to adjust the point of generationg the wide condition, and therefore of hoisting all related parts there. It should not be before the branch, but before the widenable_condition call. The fact that `wc()` and the wide condition are in the same block guarantees that they will not violate the invariance property for any loop. Differential Revision: https://reviews.llvm.org/D142693 Reviewed By: apilipenko

xortator · 2023-01-31T06:13:51Z

commit 5cb568a37a53a9b0fd8fc9c2c35870cad43623e9 (HEAD -> main, origin/main, origin/HEAD)
Author: Max Kazantsev <mkazantsev@azul.com>
Date:   Tue Jan 31 12:29:26 2023 +0700

    [GuardWidening] Choose right point for generating wide condition for branches. PR60234

    When guards are represented as widenable branches, there is a tricky
    situation when the branch stays in loop but widenable condition doesn't.
    It means that the widenable condition is loop-invariant, and some other
    optimizations could have done changes using this fact.

    If widening is allowed to create widened condition inside this loop,
    and join the loop-invariant wc with some non-invariant facts, it can
    cause miscompile. See example of this at https://github.com/llvm/llvm-project/issues/60234.

    The solution is to adjust the point of generationg the wide condition,
    and therefore of hoisting all related parts there. It should not be before
    the branch, but before the widenable_condition call. The fact that `wc()`
    and the wide condition are in the same block guarantees that they will
    not violate the invariance property for any loop.

    Differential Revision: https://reviews.llvm.org/D142693
    Reviewed By: apilipenko

…branches. PR60234 When guards are represented as widenable branches, there is a tricky situation when the branch stays in loop but widenable condition doesn't. It means that the widenable condition is loop-invariant, and some other optimizations could have done changes using this fact. If widening is allowed to create widened condition inside this loop, and join the loop-invariant wc with some non-invariant facts, it can cause miscompile. See example of this at llvm/llvm-project#60234. The solution is to adjust the point of generationg the wide condition, and therefore of hoisting all related parts there. It should not be before the branch, but before the widenable_condition call. The fact that `wc()` and the wide condition are in the same block guarantees that they will not violate the invariance property for any loop. Differential Revision: https://reviews.llvm.org/D142693 Reviewed By: apilipenko

Despite the fact that it is legal, it is not profitable. It may prevent Loop Guard Widening to happen. Because of bug described at #60234, now the guard widening is only possible when condtion we want to add is available at the point of the widenable_condition() of dominating guard. It means that, if all such calls are hoisted out of loop, and the loop conditions depend on loop-variants, we cannot widen. Hoisting is otherwise not helpful, because it does not introduce any optimization opportunities. Differential Revision: https://reviews.llvm.org/D146274 Reviewed By: apilipenko

Turn GuardWidening and LoopPredication branch widening scheme when always `br(cond && WC)` form is maintained: ``` %wc = call i1 @widenable.condition() %guard = and i1 %cond0, %wc br i1 %guard ... ``` --> ``` %wide.check = and i1 %cond0, %cond1 %wc = call i1 @widenable.condition() %guard = and i1 %%wide.check, %wc br i1 %guard ... ``` ... into the scheme of widenable conditions widening when WC is being replaced by `NewCheck && WC`: ``` %wc = call i1 @widenable.condition() %guard = and i1 %cond0, %wc br i1 %guard ... ``` --> ``` %wc = call i1 @widenable.condition() %wide.check = and i1 %cond1, %wc %guard = and i1 %cond0, %wide.check br i1 %guard ... ``` The change aims to avoid cases when we turn loop-invarinat branch into a loop-variant one, like in this issue: llvm#60234

llvm/llvm-project#60234 explains how widening of a branch by loop-invariant condition is causing a miscompile.

github-actions bot added the new issue label Jan 23, 2023

xortator added a commit that referenced this issue Jan 23, 2023

[Test] Add test for PR60234

f7de5bb

#60234 explains how widening of a branch by loop-invariant condition is causing a miscompile.

EugeneZelenko added llvm:optimizations and removed new issue labels Jan 23, 2023

CarlosAlbertoEnciso pushed a commit to SNSystems/llvm-debuginfo-analyzer that referenced this issue Jan 24, 2023

[Test] Add test for PR60234

e704e34

llvm/llvm-project#60234 explains how widening of a branch by loop-invariant condition is causing a miscompile.

xortator closed this as completed Jan 31, 2023

xortator mentioned this issue Mar 24, 2023

[LoopPredication] Miscompile due to sinking of widenable condition into loop #61673

Closed

aleks-tmb mentioned this issue Nov 3, 2023

[GuardWidening] Widen widenable conditions instead of widenable branches #66501

Open

veselypeta pushed a commit to veselypeta/cherillvm that referenced this issue Jun 17, 2024

[Test] Add test for PR60234

2a9e124

llvm/llvm-project#60234 explains how widening of a branch by loop-invariant condition is causing a miscompile.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[GuardWidening] Miscompile due to widening into loop-invariant WC #60234

[GuardWidening] Miscompile due to widening into loop-invariant WC #60234

xortator commented Jan 23, 2023 •

edited by VoltrexKeyva

Loading

xortator commented Jan 23, 2023

xortator commented Jan 31, 2023

[GuardWidening] Miscompile due to widening into loop-invariant WC #60234

[GuardWidening] Miscompile due to widening into loop-invariant WC #60234

Comments

xortator commented Jan 23, 2023 • edited by VoltrexKeyva Loading

xortator commented Jan 23, 2023

xortator commented Jan 31, 2023

xortator commented Jan 23, 2023 •

edited by VoltrexKeyva

Loading