Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LoopIdiomRecognize] Miscompilation on bitmask loop can trigger UB #62873

Closed
ericliuu opened this issue May 22, 2023 · 1 comment
Closed

[LoopIdiomRecognize] Miscompilation on bitmask loop can trigger UB #62873

ericliuu opened this issue May 22, 2023 · 1 comment

Comments

@ericliuu
Copy link

The following IR is miscompiled by the LoopIdiomRecognize pass:

define i32 @src(i1 %a, i32 %b) {
  %z = zext i1 %a to i32
  %bitmask.i = shl i32 1, %z
  br label %loop.i

loop.i:                                           ; preds = %loop.i, %0
  %curr.i = phi i32 [ %b, %0 ], [ %next.i, %loop.i ]
  %curr.bitmasked.i = and i32 %curr.i, %bitmask.i
  %curr.isbitunset.i = icmp eq i32 %curr.bitmasked.i, 0
  %next.i = shl i32 %curr.i, 1
  br i1 %curr.isbitunset.i, label %loop.i, label %p2_different_liveout.SM.exit

p2_different_liveout.SM.exit:                     ; preds = %loop.i
  ret i32 %next.i
}

When optimized with -passes="loop-idiom", the output IR is:

define i32 @tgt(i1 %a, i32 %b) {
  %z = zext i1 %a to i32
  %bitmask.i = shl i32 1, %z
  %z.lowbitmask = add i32 %bitmask.i, -1
  %z.mask = or i32 %z.lowbitmask, %bitmask.i
  %b.masked = and i32 %b, %z.mask
  %b.masked.numleadingzeros = call i32 @llvm.ctlz.i32(i32 %b.masked, i1 true)
  %b.masked.numactivebits = sub nuw nsw i32 32, %b.masked.numleadingzeros
  %b.masked.leadingonepos = add nsw i32 %b.masked.numactivebits, -1
  %loop.i.backedgetakencount = sub nuw nsw i32 %z, %b.masked.leadingonepos
  %loop.i.tripcount = add nuw nsw i32 %loop.i.backedgetakencount, 1
  %curr.i = shl i32 %b, %loop.i.backedgetakencount
  %next.i = shl i32 %curr.i, 1
  br label %loop.i

loop.i:                                           ; preds = %loop.i, %0
  %loop.i.iv = phi i32 [ 0, %0 ], [ %loop.i.iv.next, %loop.i ]
  %1 = phi i32 [ %b, %0 ], [ %2, %loop.i ]
  %curr.bitmasked.i = and i32 %1, %bitmask.i
  %curr.isbitunset.i = icmp eq i32 %curr.bitmasked.i, 0
  %2 = shl i32 %1, 1
  %loop.i.iv.next = add nuw nsw i32 %loop.i.iv, 1
  %loop.i.ivcheck = icmp eq i32 %loop.i.iv.next, %loop.i.tripcount
  br i1 %loop.i.ivcheck, label %p2_different_liveout.SM.exit, label %loop.i

p2_different_liveout.SM.exit:                     ; preds = %loop.i
  %next.i.lcssa = phi i32 [ %next.i, %loop.i ]
  ret i32 %next.i.lcssa
}

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare i32 @llvm.ctlz.i32(i32, i1 immarg) #0

This optimized code can result in UB on the final branch to basic block p2_different_liveout.SM.exit. Here is Alive2 output that provides example input which triggers this error: https://alive2.llvm.org/ce/z/GcsFEn

@luxufan
Copy link
Contributor

luxufan commented May 23, 2023

The transformation becomes correct If adds noundef attribute to %a in the original IR. So it seems like the bug is caused by we don't check isGuaranteedNotToBeUndefOrPoison at some point. I am interested to solve this. And also welcome other people's solutions.

@luxufan luxufan closed this as completed in e9ddb58 Jun 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants