Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HardwareLoops] Add support for strictfp functions. #84531

Merged
merged 1 commit into from
Mar 11, 2024

Conversation

kpneal
Copy link
Member

@kpneal kpneal commented Mar 8, 2024

This pass was adding new function calls without adding the strictfp attribute as required by the rules laid out in the langref. With this change a make check has 4-5 fewer failing tests with the Verifier changes in D146845.

LangRef:
https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics

Test failures found with "https://reviews.llvm.org/D146845".

This pass was adding new function calls without adding the strictfp
attribute as required by the rules laid out in the langref. With
this change a make check has 4-5 fewer failing tests with the
Verifier changes in D146845.

LangRef:
https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics

Test failures found with "https://reviews.llvm.org/D146845".
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 8, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Kevin P. Neal (kpneal)

Changes

This pass was adding new function calls without adding the strictfp attribute as required by the rules laid out in the langref. With this change a make check has 4-5 fewer failing tests with the Verifier changes in D146845.

LangRef:
https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics

Test failures found with "https://reviews.llvm.org/D146845".


Patch is 23.83 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/84531.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/HardwareLoops.cpp (+8)
  • (added) llvm/test/Transforms/HardwareLoops/scalar-while-strictfp.ll (+428)
diff --git a/llvm/lib/CodeGen/HardwareLoops.cpp b/llvm/lib/CodeGen/HardwareLoops.cpp
index e7b14d700a44a1..c536ec9f79d625 100644
--- a/llvm/lib/CodeGen/HardwareLoops.cpp
+++ b/llvm/lib/CodeGen/HardwareLoops.cpp
@@ -503,6 +503,8 @@ Value *HardwareLoop::InitLoopCount() {
 
 Value* HardwareLoop::InsertIterationSetup(Value *LoopCountInit) {
   IRBuilder<> Builder(BeginBB->getTerminator());
+  if (BeginBB->getParent()->getAttributes().hasFnAttr(Attribute::StrictFP))
+    Builder.setIsFPConstrained(true);
   Type *Ty = LoopCountInit->getType();
   bool UsePhi = UsePHICounter || Opts.ForcePhi;
   Intrinsic::ID ID = UseLoopGuard
@@ -535,6 +537,9 @@ Value* HardwareLoop::InsertIterationSetup(Value *LoopCountInit) {
 
 void HardwareLoop::InsertLoopDec() {
   IRBuilder<> CondBuilder(ExitBranch);
+  if (ExitBranch->getParent()->getParent()->getAttributes().hasFnAttr(
+          Attribute::StrictFP))
+    CondBuilder.setIsFPConstrained(true);
 
   Function *DecFunc =
     Intrinsic::getDeclaration(M, Intrinsic::loop_decrement,
@@ -557,6 +562,9 @@ void HardwareLoop::InsertLoopDec() {
 
 Instruction* HardwareLoop::InsertLoopRegDec(Value *EltsRem) {
   IRBuilder<> CondBuilder(ExitBranch);
+  if (ExitBranch->getParent()->getParent()->getAttributes().hasFnAttr(
+          Attribute::StrictFP))
+    CondBuilder.setIsFPConstrained(true);
 
   Function *DecFunc =
       Intrinsic::getDeclaration(M, Intrinsic::loop_decrement_reg,
diff --git a/llvm/test/Transforms/HardwareLoops/scalar-while-strictfp.ll b/llvm/test/Transforms/HardwareLoops/scalar-while-strictfp.ll
new file mode 100644
index 00000000000000..951aacc0653628
--- /dev/null
+++ b/llvm/test/Transforms/HardwareLoops/scalar-while-strictfp.ll
@@ -0,0 +1,428 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-attributes
+; RUN: opt -passes='hardware-loops<force-hardware-loops;hardware-loop-decrement=1;hardware-loop-counter-bitwidth=32>' -S %s -o - | FileCheck %s --check-prefix=CHECK-DEC
+; RUN: opt -passes='hardware-loops<force-hardware-loops;hardware-loop-decrement=1;hardware-loop-counter-bitwidth=32;force-hardware-loop-phi>' -S %s -o - | FileCheck %s --check-prefix=CHECK-PHI
+
+define void @while_lt(i32 %i, i32 %N, ptr nocapture %A) strictfp {
+; CHECK-DEC: Function Attrs: strictfp
+; CHECK-DEC-LABEL: @while_lt(
+; CHECK-DEC-NEXT:  entry:
+; CHECK-DEC-NEXT:    [[CMP4:%.*]] = icmp ult i32 [[I:%.*]], [[N:%.*]]
+; CHECK-DEC-NEXT:    br i1 [[CMP4]], label [[WHILE_BODY_PREHEADER:%.*]], label [[WHILE_END:%.*]]
+; CHECK-DEC:       while.body.preheader:
+; CHECK-DEC-NEXT:    [[TMP0:%.*]] = sub i32 [[N]], [[I]]
+; CHECK-DEC-NEXT:    call void @llvm.set.loop.iterations.i32(i32 [[TMP0]]) #[[ATTR0:[0-9]+]]
+; CHECK-DEC-NEXT:    br label [[WHILE_BODY:%.*]]
+; CHECK-DEC:       while.body:
+; CHECK-DEC-NEXT:    [[I_ADDR_05:%.*]] = phi i32 [ [[INC:%.*]], [[WHILE_BODY]] ], [ [[I]], [[WHILE_BODY_PREHEADER]] ]
+; CHECK-DEC-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i32 [[I_ADDR_05]]
+; CHECK-DEC-NEXT:    store i32 [[I_ADDR_05]], ptr [[ARRAYIDX]], align 4
+; CHECK-DEC-NEXT:    [[INC]] = add nuw i32 [[I_ADDR_05]], 1
+; CHECK-DEC-NEXT:    [[TMP1:%.*]] = call i1 @llvm.loop.decrement.i32(i32 1) #[[ATTR0]]
+; CHECK-DEC-NEXT:    br i1 [[TMP1]], label [[WHILE_BODY]], label [[WHILE_END]]
+; CHECK-DEC:       while.end:
+; CHECK-DEC-NEXT:    ret void
+;
+; CHECK-PHI: Function Attrs: strictfp
+; CHECK-PHI-LABEL: @while_lt(
+; CHECK-PHI-NEXT:  entry:
+; CHECK-PHI-NEXT:    [[CMP4:%.*]] = icmp ult i32 [[I:%.*]], [[N:%.*]]
+; CHECK-PHI-NEXT:    br i1 [[CMP4]], label [[WHILE_BODY_PREHEADER:%.*]], label [[WHILE_END:%.*]]
+; CHECK-PHI:       while.body.preheader:
+; CHECK-PHI-NEXT:    [[TMP0:%.*]] = sub i32 [[N]], [[I]]
+; CHECK-PHI-NEXT:    [[TMP1:%.*]] = call i32 @llvm.start.loop.iterations.i32(i32 [[TMP0]]) #[[ATTR0:[0-9]+]]
+; CHECK-PHI-NEXT:    br label [[WHILE_BODY:%.*]]
+; CHECK-PHI:       while.body:
+; CHECK-PHI-NEXT:    [[I_ADDR_05:%.*]] = phi i32 [ [[INC:%.*]], [[WHILE_BODY]] ], [ [[I]], [[WHILE_BODY_PREHEADER]] ]
+; CHECK-PHI-NEXT:    [[TMP2:%.*]] = phi i32 [ [[TMP1]], [[WHILE_BODY_PREHEADER]] ], [ [[TMP3:%.*]], [[WHILE_BODY]] ]
+; CHECK-PHI-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i32 [[I_ADDR_05]]
+; CHECK-PHI-NEXT:    store i32 [[I_ADDR_05]], ptr [[ARRAYIDX]], align 4
+; CHECK-PHI-NEXT:    [[INC]] = add nuw i32 [[I_ADDR_05]], 1
+; CHECK-PHI-NEXT:    [[TMP3]] = call i32 @llvm.loop.decrement.reg.i32(i32 [[TMP2]], i32 1) #[[ATTR0]]
+; CHECK-PHI-NEXT:    [[TMP4:%.*]] = icmp ne i32 [[TMP3]], 0
+; CHECK-PHI-NEXT:    br i1 [[TMP4]], label [[WHILE_BODY]], label [[WHILE_END]]
+; CHECK-PHI:       while.end:
+; CHECK-PHI-NEXT:    ret void
+;
+entry:
+  %cmp4 = icmp ult i32 %i, %N
+  br i1 %cmp4, label %while.body, label %while.end
+
+while.body:
+  %i.addr.05 = phi i32 [ %inc, %while.body ], [ %i, %entry ]
+  %arrayidx = getelementptr inbounds i32, ptr %A, i32 %i.addr.05
+  store i32 %i.addr.05, ptr %arrayidx, align 4
+  %inc = add nuw i32 %i.addr.05, 1
+  %exitcond = icmp eq i32 %inc, %N
+  br i1 %exitcond, label %while.end, label %while.body
+
+while.end:
+  ret void
+}
+
+define void @while_gt(i32 %i, i32 %N, ptr nocapture %A) strictfp {
+; CHECK-DEC: Function Attrs: strictfp
+; CHECK-DEC-LABEL: @while_gt(
+; CHECK-DEC-NEXT:  entry:
+; CHECK-DEC-NEXT:    [[CMP4:%.*]] = icmp sgt i32 [[I:%.*]], [[N:%.*]]
+; CHECK-DEC-NEXT:    br i1 [[CMP4]], label [[WHILE_BODY_PREHEADER:%.*]], label [[WHILE_END:%.*]]
+; CHECK-DEC:       while.body.preheader:
+; CHECK-DEC-NEXT:    [[TMP0:%.*]] = sub i32 [[I]], [[N]]
+; CHECK-DEC-NEXT:    call void @llvm.set.loop.iterations.i32(i32 [[TMP0]]) #[[ATTR0]]
+; CHECK-DEC-NEXT:    br label [[WHILE_BODY:%.*]]
+; CHECK-DEC:       while.body:
+; CHECK-DEC-NEXT:    [[I_ADDR_05:%.*]] = phi i32 [ [[DEC:%.*]], [[WHILE_BODY]] ], [ [[I]], [[WHILE_BODY_PREHEADER]] ]
+; CHECK-DEC-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i32 [[I_ADDR_05]]
+; CHECK-DEC-NEXT:    store i32 [[I_ADDR_05]], ptr [[ARRAYIDX]], align 4
+; CHECK-DEC-NEXT:    [[DEC]] = add nsw i32 [[I_ADDR_05]], -1
+; CHECK-DEC-NEXT:    [[TMP1:%.*]] = call i1 @llvm.loop.decrement.i32(i32 1) #[[ATTR0]]
+; CHECK-DEC-NEXT:    br i1 [[TMP1]], label [[WHILE_BODY]], label [[WHILE_END]]
+; CHECK-DEC:       while.end:
+; CHECK-DEC-NEXT:    ret void
+;
+; CHECK-PHI: Function Attrs: strictfp
+; CHECK-PHI-LABEL: @while_gt(
+; CHECK-PHI-NEXT:  entry:
+; CHECK-PHI-NEXT:    [[CMP4:%.*]] = icmp sgt i32 [[I:%.*]], [[N:%.*]]
+; CHECK-PHI-NEXT:    br i1 [[CMP4]], label [[WHILE_BODY_PREHEADER:%.*]], label [[WHILE_END:%.*]]
+; CHECK-PHI:       while.body.preheader:
+; CHECK-PHI-NEXT:    [[TMP0:%.*]] = sub i32 [[I]], [[N]]
+; CHECK-PHI-NEXT:    [[TMP1:%.*]] = call i32 @llvm.start.loop.iterations.i32(i32 [[TMP0]]) #[[ATTR0]]
+; CHECK-PHI-NEXT:    br label [[WHILE_BODY:%.*]]
+; CHECK-PHI:       while.body:
+; CHECK-PHI-NEXT:    [[I_ADDR_05:%.*]] = phi i32 [ [[DEC:%.*]], [[WHILE_BODY]] ], [ [[I]], [[WHILE_BODY_PREHEADER]] ]
+; CHECK-PHI-NEXT:    [[TMP2:%.*]] = phi i32 [ [[TMP1]], [[WHILE_BODY_PREHEADER]] ], [ [[TMP3:%.*]], [[WHILE_BODY]] ]
+; CHECK-PHI-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i32 [[I_ADDR_05]]
+; CHECK-PHI-NEXT:    store i32 [[I_ADDR_05]], ptr [[ARRAYIDX]], align 4
+; CHECK-PHI-NEXT:    [[DEC]] = add nsw i32 [[I_ADDR_05]], -1
+; CHECK-PHI-NEXT:    [[TMP3]] = call i32 @llvm.loop.decrement.reg.i32(i32 [[TMP2]], i32 1) #[[ATTR0]]
+; CHECK-PHI-NEXT:    [[TMP4:%.*]] = icmp ne i32 [[TMP3]], 0
+; CHECK-PHI-NEXT:    br i1 [[TMP4]], label [[WHILE_BODY]], label [[WHILE_END]]
+; CHECK-PHI:       while.end:
+; CHECK-PHI-NEXT:    ret void
+;
+entry:
+  %cmp4 = icmp sgt i32 %i, %N
+  br i1 %cmp4, label %while.body, label %while.end
+
+while.body:
+  %i.addr.05 = phi i32 [ %dec, %while.body ], [ %i, %entry ]
+  %arrayidx = getelementptr inbounds i32, ptr %A, i32 %i.addr.05
+  store i32 %i.addr.05, ptr %arrayidx, align 4
+  %dec = add nsw i32 %i.addr.05, -1
+  %cmp = icmp sgt i32 %dec, %N
+  br i1 %cmp, label %while.body, label %while.end
+
+while.end:
+  ret void
+}
+
+define void @while_gte(i32 %i, i32 %N, ptr nocapture %A) strictfp {
+; CHECK-DEC: Function Attrs: strictfp
+; CHECK-DEC-LABEL: @while_gte(
+; CHECK-DEC-NEXT:  entry:
+; CHECK-DEC-NEXT:    [[CMP4:%.*]] = icmp slt i32 [[I:%.*]], [[N:%.*]]
+; CHECK-DEC-NEXT:    br i1 [[CMP4]], label [[WHILE_END:%.*]], label [[WHILE_BODY_PREHEADER:%.*]]
+; CHECK-DEC:       while.body.preheader:
+; CHECK-DEC-NEXT:    [[TMP0:%.*]] = add i32 [[I]], 1
+; CHECK-DEC-NEXT:    [[TMP1:%.*]] = sub i32 [[TMP0]], [[N]]
+; CHECK-DEC-NEXT:    call void @llvm.set.loop.iterations.i32(i32 [[TMP1]]) #[[ATTR0]]
+; CHECK-DEC-NEXT:    br label [[WHILE_BODY:%.*]]
+; CHECK-DEC:       while.body:
+; CHECK-DEC-NEXT:    [[I_ADDR_05:%.*]] = phi i32 [ [[DEC:%.*]], [[WHILE_BODY]] ], [ [[I]], [[WHILE_BODY_PREHEADER]] ]
+; CHECK-DEC-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i32 [[I_ADDR_05]]
+; CHECK-DEC-NEXT:    store i32 [[I_ADDR_05]], ptr [[ARRAYIDX]], align 4
+; CHECK-DEC-NEXT:    [[DEC]] = add nsw i32 [[I_ADDR_05]], -1
+; CHECK-DEC-NEXT:    [[TMP2:%.*]] = call i1 @llvm.loop.decrement.i32(i32 1) #[[ATTR0]]
+; CHECK-DEC-NEXT:    br i1 [[TMP2]], label [[WHILE_BODY]], label [[WHILE_END]]
+; CHECK-DEC:       while.end:
+; CHECK-DEC-NEXT:    ret void
+;
+; CHECK-PHI: Function Attrs: strictfp
+; CHECK-PHI-LABEL: @while_gte(
+; CHECK-PHI-NEXT:  entry:
+; CHECK-PHI-NEXT:    [[CMP4:%.*]] = icmp slt i32 [[I:%.*]], [[N:%.*]]
+; CHECK-PHI-NEXT:    br i1 [[CMP4]], label [[WHILE_END:%.*]], label [[WHILE_BODY_PREHEADER:%.*]]
+; CHECK-PHI:       while.body.preheader:
+; CHECK-PHI-NEXT:    [[TMP0:%.*]] = add i32 [[I]], 1
+; CHECK-PHI-NEXT:    [[TMP1:%.*]] = sub i32 [[TMP0]], [[N]]
+; CHECK-PHI-NEXT:    [[TMP2:%.*]] = call i32 @llvm.start.loop.iterations.i32(i32 [[TMP1]]) #[[ATTR0]]
+; CHECK-PHI-NEXT:    br label [[WHILE_BODY:%.*]]
+; CHECK-PHI:       while.body:
+; CHECK-PHI-NEXT:    [[I_ADDR_05:%.*]] = phi i32 [ [[DEC:%.*]], [[WHILE_BODY]] ], [ [[I]], [[WHILE_BODY_PREHEADER]] ]
+; CHECK-PHI-NEXT:    [[TMP3:%.*]] = phi i32 [ [[TMP2]], [[WHILE_BODY_PREHEADER]] ], [ [[TMP4:%.*]], [[WHILE_BODY]] ]
+; CHECK-PHI-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i32 [[I_ADDR_05]]
+; CHECK-PHI-NEXT:    store i32 [[I_ADDR_05]], ptr [[ARRAYIDX]], align 4
+; CHECK-PHI-NEXT:    [[DEC]] = add nsw i32 [[I_ADDR_05]], -1
+; CHECK-PHI-NEXT:    [[TMP4]] = call i32 @llvm.loop.decrement.reg.i32(i32 [[TMP3]], i32 1) #[[ATTR0]]
+; CHECK-PHI-NEXT:    [[TMP5:%.*]] = icmp ne i32 [[TMP4]], 0
+; CHECK-PHI-NEXT:    br i1 [[TMP5]], label [[WHILE_BODY]], label [[WHILE_END]]
+; CHECK-PHI:       while.end:
+; CHECK-PHI-NEXT:    ret void
+;
+entry:
+  %cmp4 = icmp slt i32 %i, %N
+  br i1 %cmp4, label %while.end, label %while.body
+
+while.body:
+  %i.addr.05 = phi i32 [ %dec, %while.body ], [ %i, %entry ]
+  %arrayidx = getelementptr inbounds i32, ptr %A, i32 %i.addr.05
+  store i32 %i.addr.05, ptr %arrayidx, align 4
+  %dec = add nsw i32 %i.addr.05, -1
+  %cmp = icmp sgt i32 %i.addr.05, %N
+  br i1 %cmp, label %while.body, label %while.end
+
+while.end:
+  ret void
+}
+
+define void @while_ne(i32 %N, ptr nocapture %A) strictfp {
+; CHECK-DEC: Function Attrs: strictfp
+; CHECK-DEC-LABEL: @while_ne(
+; CHECK-DEC-NEXT:  entry:
+; CHECK-DEC-NEXT:    [[CMP:%.*]] = icmp ne i32 [[N:%.*]], 0
+; CHECK-DEC-NEXT:    br i1 [[CMP]], label [[WHILE_BODY_PREHEADER:%.*]], label [[WHILE_END:%.*]]
+; CHECK-DEC:       while.body.preheader:
+; CHECK-DEC-NEXT:    call void @llvm.set.loop.iterations.i32(i32 [[N]]) #[[ATTR0]]
+; CHECK-DEC-NEXT:    br label [[WHILE_BODY:%.*]]
+; CHECK-DEC:       while.body:
+; CHECK-DEC-NEXT:    [[I_ADDR_05:%.*]] = phi i32 [ [[INC:%.*]], [[WHILE_BODY]] ], [ 0, [[WHILE_BODY_PREHEADER]] ]
+; CHECK-DEC-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i32 [[I_ADDR_05]]
+; CHECK-DEC-NEXT:    store i32 [[I_ADDR_05]], ptr [[ARRAYIDX]], align 4
+; CHECK-DEC-NEXT:    [[INC]] = add nuw i32 [[I_ADDR_05]], 1
+; CHECK-DEC-NEXT:    [[TMP0:%.*]] = call i1 @llvm.loop.decrement.i32(i32 1) #[[ATTR0]]
+; CHECK-DEC-NEXT:    br i1 [[TMP0]], label [[WHILE_BODY]], label [[WHILE_END]]
+; CHECK-DEC:       while.end:
+; CHECK-DEC-NEXT:    ret void
+;
+; CHECK-PHI: Function Attrs: strictfp
+; CHECK-PHI-LABEL: @while_ne(
+; CHECK-PHI-NEXT:  entry:
+; CHECK-PHI-NEXT:    [[CMP:%.*]] = icmp ne i32 [[N:%.*]], 0
+; CHECK-PHI-NEXT:    br i1 [[CMP]], label [[WHILE_BODY_PREHEADER:%.*]], label [[WHILE_END:%.*]]
+; CHECK-PHI:       while.body.preheader:
+; CHECK-PHI-NEXT:    [[TMP0:%.*]] = call i32 @llvm.start.loop.iterations.i32(i32 [[N]]) #[[ATTR0]]
+; CHECK-PHI-NEXT:    br label [[WHILE_BODY:%.*]]
+; CHECK-PHI:       while.body:
+; CHECK-PHI-NEXT:    [[I_ADDR_05:%.*]] = phi i32 [ [[INC:%.*]], [[WHILE_BODY]] ], [ 0, [[WHILE_BODY_PREHEADER]] ]
+; CHECK-PHI-NEXT:    [[TMP1:%.*]] = phi i32 [ [[TMP0]], [[WHILE_BODY_PREHEADER]] ], [ [[TMP2:%.*]], [[WHILE_BODY]] ]
+; CHECK-PHI-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i32 [[I_ADDR_05]]
+; CHECK-PHI-NEXT:    store i32 [[I_ADDR_05]], ptr [[ARRAYIDX]], align 4
+; CHECK-PHI-NEXT:    [[INC]] = add nuw i32 [[I_ADDR_05]], 1
+; CHECK-PHI-NEXT:    [[TMP2]] = call i32 @llvm.loop.decrement.reg.i32(i32 [[TMP1]], i32 1) #[[ATTR0]]
+; CHECK-PHI-NEXT:    [[TMP3:%.*]] = icmp ne i32 [[TMP2]], 0
+; CHECK-PHI-NEXT:    br i1 [[TMP3]], label [[WHILE_BODY]], label [[WHILE_END]]
+; CHECK-PHI:       while.end:
+; CHECK-PHI-NEXT:    ret void
+;
+entry:
+  %cmp = icmp ne i32 %N, 0
+  br i1 %cmp, label %while.body, label %while.end
+
+while.body:
+  %i.addr.05 = phi i32 [ %inc, %while.body ], [ 0, %entry ]
+  %arrayidx = getelementptr inbounds i32, ptr %A, i32 %i.addr.05
+  store i32 %i.addr.05, ptr %arrayidx, align 4
+  %inc = add nuw i32 %i.addr.05, 1
+  %exitcond = icmp eq i32 %inc, %N
+  br i1 %exitcond, label %while.end, label %while.body
+
+while.end:
+  ret void
+}
+
+define void @while_eq(i32 %N, ptr nocapture %A) strictfp {
+; CHECK-DEC: Function Attrs: strictfp
+; CHECK-DEC-LABEL: @while_eq(
+; CHECK-DEC-NEXT:  entry:
+; CHECK-DEC-NEXT:    [[CMP:%.*]] = icmp eq i32 [[N:%.*]], 0
+; CHECK-DEC-NEXT:    br i1 [[CMP]], label [[WHILE_END:%.*]], label [[WHILE_BODY_PREHEADER:%.*]]
+; CHECK-DEC:       while.body.preheader:
+; CHECK-DEC-NEXT:    call void @llvm.set.loop.iterations.i32(i32 [[N]]) #[[ATTR0]]
+; CHECK-DEC-NEXT:    br label [[WHILE_BODY:%.*]]
+; CHECK-DEC:       while.body:
+; CHECK-DEC-NEXT:    [[I_ADDR_05:%.*]] = phi i32 [ [[INC:%.*]], [[WHILE_BODY]] ], [ 0, [[WHILE_BODY_PREHEADER]] ]
+; CHECK-DEC-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i32 [[I_ADDR_05]]
+; CHECK-DEC-NEXT:    store i32 [[I_ADDR_05]], ptr [[ARRAYIDX]], align 4
+; CHECK-DEC-NEXT:    [[INC]] = add nuw i32 [[I_ADDR_05]], 1
+; CHECK-DEC-NEXT:    [[TMP0:%.*]] = call i1 @llvm.loop.decrement.i32(i32 1) #[[ATTR0]]
+; CHECK-DEC-NEXT:    br i1 [[TMP0]], label [[WHILE_BODY]], label [[WHILE_END]]
+; CHECK-DEC:       while.end:
+; CHECK-DEC-NEXT:    ret void
+;
+; CHECK-PHI: Function Attrs: strictfp
+; CHECK-PHI-LABEL: @while_eq(
+; CHECK-PHI-NEXT:  entry:
+; CHECK-PHI-NEXT:    [[CMP:%.*]] = icmp eq i32 [[N:%.*]], 0
+; CHECK-PHI-NEXT:    br i1 [[CMP]], label [[WHILE_END:%.*]], label [[WHILE_BODY_PREHEADER:%.*]]
+; CHECK-PHI:       while.body.preheader:
+; CHECK-PHI-NEXT:    [[TMP0:%.*]] = call i32 @llvm.start.loop.iterations.i32(i32 [[N]]) #[[ATTR0]]
+; CHECK-PHI-NEXT:    br label [[WHILE_BODY:%.*]]
+; CHECK-PHI:       while.body:
+; CHECK-PHI-NEXT:    [[I_ADDR_05:%.*]] = phi i32 [ [[INC:%.*]], [[WHILE_BODY]] ], [ 0, [[WHILE_BODY_PREHEADER]] ]
+; CHECK-PHI-NEXT:    [[TMP1:%.*]] = phi i32 [ [[TMP0]], [[WHILE_BODY_PREHEADER]] ], [ [[TMP2:%.*]], [[WHILE_BODY]] ]
+; CHECK-PHI-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i32 [[I_ADDR_05]]
+; CHECK-PHI-NEXT:    store i32 [[I_ADDR_05]], ptr [[ARRAYIDX]], align 4
+; CHECK-PHI-NEXT:    [[INC]] = add nuw i32 [[I_ADDR_05]], 1
+; CHECK-PHI-NEXT:    [[TMP2]] = call i32 @llvm.loop.decrement.reg.i32(i32 [[TMP1]], i32 1) #[[ATTR0]]
+; CHECK-PHI-NEXT:    [[TMP3:%.*]] = icmp ne i32 [[TMP2]], 0
+; CHECK-PHI-NEXT:    br i1 [[TMP3]], label [[WHILE_BODY]], label [[WHILE_END]]
+; CHECK-PHI:       while.end:
+; CHECK-PHI-NEXT:    ret void
+;
+entry:
+  %cmp = icmp eq i32 %N, 0
+  br i1 %cmp, label %while.end, label %while.body
+
+while.body:
+  %i.addr.05 = phi i32 [ %inc, %while.body ], [ 0, %entry ]
+  %arrayidx = getelementptr inbounds i32, ptr %A, i32 %i.addr.05
+  store i32 %i.addr.05, ptr %arrayidx, align 4
+  %inc = add nuw i32 %i.addr.05, 1
+  %exitcond = icmp eq i32 %inc, %N
+  br i1 %exitcond, label %while.end, label %while.body
+
+while.end:
+  ret void
+}
+
+define void @while_preheader_eq(i32 %N, ptr nocapture %A) strictfp {
+; CHECK-DEC: Function Attrs: strictfp
+; CHECK-DEC-LABEL: @while_preheader_eq(
+; CHECK-DEC-NEXT:  entry:
+; CHECK-DEC-NEXT:    br label [[PREHEADER:%.*]]
+; CHECK-DEC:       preheader:
+; CHECK-DEC-NEXT:    [[CMP:%.*]] = icmp eq i32 [[N:%.*]], 0
+; CHECK-DEC-NEXT:    br i1 [[CMP]], label [[WHILE_END:%.*]], label [[WHILE_BODY_PREHEADER:%.*]]
+; CHECK-DEC:       while.body.preheader:
+; CHECK-DEC-NEXT:    call void @llvm.set.loop.iterations.i32(i32 [[N]]) #[[ATTR0]]
+; CHECK-DEC-NEXT:    br label [[WHILE_BODY:%.*]]
+; CHECK-DEC:       while.body:
+; CHECK-DEC-NEXT:    [[I_ADDR_05:%.*]] = phi i32 [ [[INC:%.*]], [[WHILE_BODY]] ], [ 0, [[WHILE_BODY_PREHEADER]] ]
+; CHECK-DEC-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i32 [[I_ADDR_05]]
+; CHECK-DEC-NEXT:    store i32 [[I_ADDR_05]], ptr [[ARRAYIDX]], align 4
+; CHECK-DEC-NEXT:    [[INC]] = add nuw i32 [[I_ADDR_05]], 1
+; CHECK-DEC-NEXT:    [[TMP0:%.*]] = call i1 @llvm.loop.decrement.i32(i32 1) #[[ATTR0]]
+; CHECK-DEC-NEXT:    br i1 [[TMP0]], label [[WHILE_BODY]], label [[WHILE_END]]
+; CHECK-DEC:       while.end:
+; CHECK-DEC-NEXT:    ret void
+;
+; CHECK-PHI: Function Attrs: strictfp
+; CHECK-PHI-LABEL: @while_preheader_eq(
+; CHECK-PHI-NEXT:  entry:
+; CHECK-PHI-NEXT:    br label [[PREHEADER:%.*]]
+; CHECK-PHI:       preheader:
+; CHECK-PHI-NEXT:    [[CMP:%.*]] = icmp eq i32 [[N:%.*]], 0
+; CHECK-PHI-NEXT:    br i1 [[CMP]], label [[WHILE_END:%.*]], label [[WHILE_BODY_PREHEADER:%.*]]
+; CHECK-PHI:       while.body.preheader:
+; CHECK-PHI-NEXT:    [[TMP0:%.*]] = call i32 @llvm.start.loop.iterations.i32(i32 [[N]]) #[[ATTR0]]
+; CHECK-PHI-NEXT:    br label [[WHILE_BODY:%.*]]
+; CHECK-PHI:       while.body:
+; CHECK-PHI-NEXT:    [[I_ADDR_05:%.*]] = phi i32 [ [[INC:%.*]], [[WHILE_BODY]] ], [ 0, [[WHILE_BODY_PREHEADER]] ]
+; CHECK-PHI-NEXT:    [[TMP1:%.*]] = phi i32 [ [[TMP0]], [[WHILE_BODY_PREHEADER]] ], [ [[TMP2:%.*]], [[WHILE_BODY]] ]
+; CHECK-PHI-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i32 [[I_ADDR_05]]
+; CHECK-PHI-NEXT:    store i32 [[I_ADDR_05]], ptr [[ARRAYIDX]], align 4
+; CHECK-PHI-NEXT:    [[INC]] = add nuw i32 [[I_ADDR_05]], 1
+; CHECK-PHI-NEXT:    [[TMP2]] = call i32 @llvm.loop.decrement.reg.i32(i32 [[TMP1]], i32 1) #[[ATTR0]]
+; CHECK-PHI-NEXT:    [[TMP3:%.*]] = icmp ne i32 [[TMP2]], 0
+; CHECK-PHI-NEXT:    br i1 [[TMP3]], label [[WHILE_BODY]], label [[WHILE_END]]
+; CHECK-PHI:       while.end:
+; CHECK-PHI-NEXT:    ret void
+;
+entry:
+  br label %preheader
+
+preheader:
+  %cmp = icmp eq i32 %N, 0
+  br i1 %cmp, label %while.end, label %while.body
+
+while.body:
+  %i.addr.05 = phi i32 [ %inc, %while.body ], [ 0, %preheader ]
+  %arrayidx = getelementptr inbounds i32, ptr %A, i32 %i.addr.05
+  store i32 %i.addr.05, ptr %arrayidx, align 4
+  %inc = add nuw i32 %i.addr.05, 1
+  %exitcond = icmp eq i32 %inc, %N
+  br i1 %exitcond, label %while.end, label %while.body
+
+while.end:
+  ret void
+}
+
+define void @nested(ptr nocapture %A, i32 %N) strictfp {
+; CHECK-DEC: Function Attrs: strictfp
+; CHECK-DEC-LABEL: @nested(
+; CHECK-DEC-NEXT:  entry:
+; CHECK-DEC-NEXT:    [[CMP20:%.*]] = icmp eq i32 [[N:%.*]], 0
+; CHECK-DEC-NEXT:    br i1 [[CMP20]], label [[WHILE_END7:%.*]], label [[WHILE_COND1_PREHEADER_US:%.*]]
+; CHECK-DEC:       while.cond1.preheader.us:
+; CHECK-DEC-NEXT:    [[I_021_US:%.*]] = phi i32 [ [[INC6_US:%.*]], [[WHILE_COND1_WHILE_END_CRIT_EDGE_US:%.*]] ], [ 0, [[ENTRY:%.*]] ]
+; CHECK-DEC-NEXT:    [[MUL_US:%.*]] = mul i32 [[I_021_US]], [[N]]
+; CHECK-DEC-NEXT:    call void @llvm.set.loop.iterations.i32(i32 [[N...
[truncated]

Copy link
Contributor

@sparker-arm sparker-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks. LGTM

@kpneal kpneal merged commit 702e2da into llvm:main Mar 11, 2024
6 checks passed
@kpneal kpneal deleted the hardwareloops branch April 8, 2024 18:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants