[SCEV] Improve applyLoopGuards to support Mul #83428

komalon1 · 2024-02-29T13:51:39Z

Improve applyLoopGuards to preserve divisibility information of SCEVMul expressions.
Before this patch, the code only matched the more specific pattern ((Expr /u constant) * constant). Now it matches any (Expr * constant) and tries to prove that the whole SCEV is divisible by this constant.
For example, the SCEV ((2 * a) umin (4 * b)) is now known to be divisible by 2.
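
As a rough illustration of the umin example above (not code from this patch), the sketch below models ((2 * a) umin (4 * b)) on 8-bit unsigned values with wrapping multiplication and samples the input space to check that the result is always even. The helper names `uminOfMultiples` and `alwaysDivisibleByTwo` are hypothetical, and the 8-bit width is an assumption chosen to keep the check small.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): models ((2 * a) umin (4 * b)) on
// 8-bit unsigned values, with multiplication wrapping modulo 2^8.
constexpr std::uint8_t uminOfMultiples(std::uint8_t a, std::uint8_t b) {
  const std::uint8_t lhs = static_cast<std::uint8_t>(2u * a);
  const std::uint8_t rhs = static_cast<std::uint8_t>(4u * b);
  return lhs < rhs ? lhs : rhs;
}

// Samples every third value of a and b and reports whether each result
// is a multiple of 2.
constexpr bool alwaysDivisibleByTwo() {
  for (unsigned a = 0; a < 256; a += 3)
    for (unsigned b = 0; b < 256; b += 3)
      if (uminOfMultiples(static_cast<std::uint8_t>(a),
                          static_cast<std::uint8_t>(b)) % 2 != 0)
        return false;
  return true;
}
```

Both constants here are powers of two, so the result stays even when the multiplication wraps. This says nothing about other constants.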

llvmbot · 2024-02-29T13:52:14Z

@llvm/pr-subscribers-llvm-analysis

Author: None (komalon1)

Changes

Improve applyLoopGuards to preserve divisibility information of SCEVMul Expressions.

This fixes #82367.

Full diff: https://github.com/llvm/llvm-project/pull/83428.diff

3 Files Affected:

(modified) llvm/lib/Analysis/ScalarEvolution.cpp (+18-16)
(modified) llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll (+2-2)
(modified) llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll (+38)

diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 4b2db80bc1ec30..052ae4923d4a43 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -15000,6 +15000,13 @@ class SCEVLoopGuardRewriter : public SCEVRewriteVisitor<SCEVLoopGuardRewriter> {
       return SCEVRewriteVisitor<SCEVLoopGuardRewriter>::visitSMinExpr(Expr);
     return I->second;
   }
+
+  const SCEV *visitMulExpr(const SCEVMulExpr *Expr) {
+    auto I = Map.find(Expr);
+    if (I == Map.end())
+      return SCEVRewriteVisitor<SCEVLoopGuardRewriter>::visitMulExpr(Expr);
+    return I->second;
+  }
 };
 
 const SCEV *ScalarEvolution::applyLoopGuards(const SCEV *Expr, const Loop *L) {
@@ -15149,17 +15156,15 @@ const SCEV *ScalarEvolution::applyLoopGuards(const SCEV *Expr, const Loop *L) {
       const SCEV *URemLHS = nullptr;
       const SCEV *URemRHS = nullptr;
       if (matchURem(LHS, URemLHS, URemRHS)) {
-        if (const SCEVUnknown *LHSUnknown = dyn_cast<SCEVUnknown>(URemLHS)) {
-          auto I = RewriteMap.find(LHSUnknown);
-          const SCEV *RewrittenLHS =
-              I != RewriteMap.end() ? I->second : LHSUnknown;
-          RewrittenLHS = ApplyDivisibiltyOnMinMaxExpr(RewrittenLHS, URemRHS);
-          const auto *Multiple =
-              getMulExpr(getUDivExpr(RewrittenLHS, URemRHS), URemRHS);
-          RewriteMap[LHSUnknown] = Multiple;
-          ExprsToRewrite.push_back(LHSUnknown);
-          return;
-        }
+        auto I = RewriteMap.find(URemLHS);
+        const SCEV *RewrittenLHS =
+            I != RewriteMap.end() ? I->second : URemLHS;
+        RewrittenLHS = ApplyDivisibiltyOnMinMaxExpr(RewrittenLHS, URemRHS);
+        const auto *Multiple =
+            getMulExpr(getUDivExpr(RewrittenLHS, URemRHS), URemRHS);
+        RewriteMap[URemLHS] = Multiple;
+        ExprsToRewrite.push_back(URemLHS);
+        return;
       }
     }
 
@@ -15208,11 +15213,8 @@ const SCEV *ScalarEvolution::applyLoopGuards(const SCEV *Expr, const Loop *L) {
             auto *MulRHS = Mul->getOperand(1);
             if (isa<SCEVConstant>(MulLHS))
               std::swap(MulLHS, MulRHS);
-            if (auto *Div = dyn_cast<SCEVUDivExpr>(MulLHS))
-              if (Div->getOperand(1) == MulRHS) {
-                DividesBy = MulRHS;
-                return true;
-              }
+            DividesBy = MulRHS;
+            return true;
           }
           if (auto *MinMax = dyn_cast<SCEVMinMaxExpr>(Expr))
             return HasDivisibiltyInfo(MinMax->getOperand(0), DividesBy) ||
diff --git a/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll b/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll
index 7d4876baa9e5d9..482accc38cb391 100644
--- a/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll
+++ b/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll
@@ -65,7 +65,7 @@ define void @umin(i32 noundef %a, i32 noundef %b) {
 ; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + ((2 * %a) umin (4 * %b)))
 ; CHECK-NEXT:  Loop %for.body: Predicated backedge-taken count is (-1 + ((2 * %a) umin (4 * %b)))
 ; CHECK-NEXT:   Predicates:
-; CHECK-NEXT:  Loop %for.body: Trip multiple is 1
+; CHECK-NEXT:  Loop %for.body: Trip multiple is 2
 ;
 ; void umin(unsigned a, unsigned b) {
 ;   a *= 2;
@@ -165,7 +165,7 @@ define void @smin(i32 noundef %a, i32 noundef %b) {
 ; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + ((2 * %a)<nsw> smin (4 * %b)<nsw>))
 ; CHECK-NEXT:  Loop %for.body: Predicated backedge-taken count is (-1 + ((2 * %a)<nsw> smin (4 * %b)<nsw>))
 ; CHECK-NEXT:   Predicates:
-; CHECK-NEXT:  Loop %for.body: Trip multiple is 1
+; CHECK-NEXT:  Loop %for.body: Trip multiple is 2
 ;
 ; void smin(signed a, signed b) {
 ;   a *= 2;
diff --git a/llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll b/llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll
index a0a5158bdff160..25fe63e9d61bc0 100644
--- a/llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll
+++ b/llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll
@@ -607,5 +607,43 @@ exit:
   ret void
 }
 
+define void @test_trip_scevmul_multiple_5(i32 %num1, i32 %num2) {
+; CHECK-LABEL: 'test_trip_scevmul_multiple_5'
+; CHECK-NEXT:  Classifying expressions for: @test_trip_scevmul_multiple_5
+; CHECK-NEXT:    %num = mul i32 %num1, %num2
+; CHECK-NEXT:    --> (%num1 * %num2) U: full-set S: full-set
+; CHECK-NEXT:    %u = urem i32 %num, 5
+; CHECK-NEXT:    --> ((-5 * ((%num1 * %num2) /u 5)) + (%num1 * %num2)) U: full-set S: full-set
+; CHECK-NEXT:    %i.010 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
+; CHECK-NEXT:    --> {0,+,1}<nuw><nsw><%for.body> U: [0,-2147483648) S: [0,-2147483648) Exits: (-1 + (%num1 * %num2)) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    %inc = add nuw nsw i32 %i.010, 1
+; CHECK-NEXT:    --> {1,+,1}<nuw><nsw><%for.body> U: [1,-2147483648) S: [1,-2147483648) Exits: (%num1 * %num2) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:  Determining loop execution counts for: @test_trip_scevmul_multiple_5
+; CHECK-NEXT:  Loop %for.body: backedge-taken count is (-1 + (%num1 * %num2))
+; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is -2
+; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + (%num1 * %num2))
+; CHECK-NEXT:  Loop %for.body: Predicated backedge-taken count is (-1 + (%num1 * %num2))
+; CHECK-NEXT:   Predicates:
+; CHECK-NEXT:  Loop %for.body: Trip multiple is 5
+;
+entry:
+  %num = mul i32 %num1, %num2
+  %u = urem i32 %num, 5
+  %cmp = icmp eq i32 %u, 0
+  tail call void @llvm.assume(i1 %cmp)
+  %cmp.1 = icmp uge i32 %num, 5
+  tail call void @llvm.assume(i1 %cmp.1)
+  br label %for.body
+
+for.body:
+  %i.010 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
+  %inc = add nuw nsw i32 %i.010, 1
+  %cmp2 = icmp ult i32 %inc, %num
+  br i1 %cmp2, label %for.body, label %exit
+
+exit:
+  ret void
+}
+
 declare void @llvm.assume(i1)
 declare void @llvm.experimental.guard(i1, ...)

github-actions · 2024-02-29T13:54:06Z

✅ With the latest revision this PR passed the C/C++ code formatter.

fhahn · 2024-02-29T14:11:33Z

Could you update the description to explain how applyLoopGuards is improved?

komalon1 · 2024-03-03T06:44:19Z

> Could you update the description to explain how applyLoopGuards is improved?

Done

nikic

> It does so by preserving the information that a SCEVMul with a SCEVConstant operand divides by this constant.

Maybe I'm misunderstanding what you are doing, but I don't think that's correct due to overflow? E.g. consider https://alive2.llvm.org/ce/z/Ljc8Jj.

You'd have to restrict this to either power of two constants, or check for appropriate nowrap flags.
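
To make the overflow concern concrete, here is a minimal sketch; it does not reproduce the linked Alive2 example. It assumes a 32-bit multiply without nowrap flags, modeled by the hypothetical helper `wrappingMul`, and shows one value for which `x * 5` is no longer a multiple of 5 after wrapping, while `x * 8` remains a multiple of 8.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): a 32-bit multiply that wraps
// modulo 2^32, like an IR `mul i32` without nuw/nsw flags.
constexpr std::uint32_t wrappingMul(std::uint32_t x, std::uint32_t c) {
  return static_cast<std::uint32_t>(static_cast<std::uint64_t>(x) * c);
}

// 5 * 858993460 == 2^32 + 4, so the wrapped product is 4.
constexpr std::uint32_t kWrapsForFive = 858993460u;

// The wrapped product is not a multiple of 5 ...
constexpr bool kFiveBreaks = wrappingMul(kWrapsForFive, 5u) % 5u != 0u;
// ... but a power-of-two constant divides 2^32, so divisibility survives.
constexpr bool kEightHolds = wrappingMul(kWrapsForFive, 8u) % 8u == 0u;
```

This is why the restriction to power-of-two constants or to nowrap flags matters: divisibility by C carries through a wrapping multiply only when C divides 2^32.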

> It also generalized the assumption cache divisibility propagation to non-SCEVUnknowns.

This improvement sounds unrelated to the other one -- can it be split into a separate patch?

komalon1 · 2024-03-06T12:18:20Z

> > It does so by preserving the information that a SCEVMul with a SCEVConstant operand divides by this constant.
>
> Maybe I'm misunderstanding what you are doing, but I don't think that's correct due to overflow? E.g. consider https://alive2.llvm.org/ce/z/Ljc8Jj.
>
> You'd have to restrict this to either power of two constants, or check for appropriate nowrap flags.

This logic is not new; it already exists at ScalarEvolution.cpp:15144. It simply checks for an assumption of the form (X % constant == 0). Given that the assumption holds, it is safe to rewrite the SCEV X as (X / constant) * constant. Note that we divide first and then multiply, so overflow should not be an issue.
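
A small sketch of the rewrite described above, using plain 32-bit unsigned arithmetic rather than SCEV expressions. The helper name `rewriteAsMultiple` is hypothetical, and `c` is assumed to be non-zero.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLVM): computes (x /u c) * c.
// Assumes c != 0. Because (x / c) * c <= x, the multiply cannot wrap.
constexpr std::uint32_t rewriteAsMultiple(std::uint32_t x, std::uint32_t c) {
  return (x / c) * c;
}

// When x % c == 0 holds, the rewritten value equals x.
constexpr bool kPreservedUnderAssumption = rewriteAsMultiple(40u, 8u) == 40u;
// Without that assumption the value is rounded down to a multiple of c.
constexpr bool kRoundsDownOtherwise = rewriteAsMultiple(43u, 8u) == 40u;
```

The rewritten form is always a multiple of c by construction, so the identity is only valid where the guard `x % c == 0` is known to hold.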

> > It also generalized the assumption cache divisibility propagation to non-SCEVUnknowns.
>
> This improvement sounds unrelated to the other one -- can it be split into a separate patch?

No problem, I will split it into a separate patch.

llvm/lib/Analysis/ScalarEvolution.cpp

Improve applyLoopGuards to preserve divisibility information of SCEVMul expressions.

It does so by preserving the information that a SCEVMul with a SCEVConstant operand divides by this constant. It also generalizes the assumption cache divisibility propagation to non-SCEVUnknowns.

For example, given:
TC = TC1 * TC2;
__builtin_assume(TC % 8 == 0);
we now propagate the information that the SCEVMul divides by 8, by rewriting it to ((TC1 * TC2) /u 8) * 8.

This fixes llvm#82367.

komalon1 · 2024-03-17T08:30:00Z

> > It does so by preserving the information that a SCEVMul with a SCEVConstant operand divides by this constant.
>
> Maybe I'm misunderstanding what you are doing, but I don't think that's correct due to overflow? E.g. consider https://alive2.llvm.org/ce/z/Ljc8Jj.
>
> You'd have to restrict this to either power of two constants, or check for appropriate nowrap flags.
>
> > It also generalized the assumption cache divisibility propagation to non-SCEVUnknowns.
>
> This improvement sounds unrelated to the other one -- can it be split into a separate patch?

@nikic I split the patches. Any other comments?

komalon1 requested a review from nikic as a code owner February 29, 2024 13:51

llvmbot added the llvm:analysis label Feb 29, 2024

komalon1 requested a review from xortator February 29, 2024 13:53

komalon1 force-pushed the users/komalon1/scevmul-applyloopguards branch from 84b4d5a to 9e0555c Compare February 29, 2024 15:24

nikic requested changes Mar 6, 2024

nikic reviewed Mar 6, 2024

komalon1 added 2 commits March 10, 2024 10:13

Split patches, improve documentation

8c87ab0

komalon1 force-pushed the users/komalon1/scevmul-applyloopguards branch from 9e0555c to 8c87ab0 Compare March 10, 2024 09:36
