Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SCEV] Improve applyLoopGuards to support Mul #83428

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

komalon1
Copy link
Contributor

@komalon1 komalon1 commented Feb 29, 2024

Improve applyLoopGuards to preserve divisibility information of SCEVMul Expressions.
Before this patch, the code searched for a more complicated pattern, but now it simply searches for a pattern of (Expr * constant) and tries to prove that the whole SCEV divides by this constant.
For example, the SCEV ((2 * a) umin (4 * b)) is now known to divide by 2.

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 29, 2024

@llvm/pr-subscribers-llvm-analysis

Author: None (komalon1)

Changes

Improve applyLoopGuards to preserve divisibility information of SCEVMul Expressions.

This fixes #82367.


Full diff: https://github.com/llvm/llvm-project/pull/83428.diff

3 Files Affected:

  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+18-16)
  • (modified) llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll (+2-2)
  • (modified) llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll (+38)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 4b2db80bc1ec30..052ae4923d4a43 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -15000,6 +15000,13 @@ class SCEVLoopGuardRewriter : public SCEVRewriteVisitor<SCEVLoopGuardRewriter> {
       return SCEVRewriteVisitor<SCEVLoopGuardRewriter>::visitSMinExpr(Expr);
     return I->second;
   }
+
+  const SCEV *visitMulExpr(const SCEVMulExpr *Expr) {
+    auto I = Map.find(Expr);
+    if (I == Map.end())
+      return SCEVRewriteVisitor<SCEVLoopGuardRewriter>::visitMulExpr(Expr);
+    return I->second;
+  }
 };
 
 const SCEV *ScalarEvolution::applyLoopGuards(const SCEV *Expr, const Loop *L) {
@@ -15149,17 +15156,15 @@ const SCEV *ScalarEvolution::applyLoopGuards(const SCEV *Expr, const Loop *L) {
       const SCEV *URemLHS = nullptr;
       const SCEV *URemRHS = nullptr;
       if (matchURem(LHS, URemLHS, URemRHS)) {
-        if (const SCEVUnknown *LHSUnknown = dyn_cast<SCEVUnknown>(URemLHS)) {
-          auto I = RewriteMap.find(LHSUnknown);
-          const SCEV *RewrittenLHS =
-              I != RewriteMap.end() ? I->second : LHSUnknown;
-          RewrittenLHS = ApplyDivisibiltyOnMinMaxExpr(RewrittenLHS, URemRHS);
-          const auto *Multiple =
-              getMulExpr(getUDivExpr(RewrittenLHS, URemRHS), URemRHS);
-          RewriteMap[LHSUnknown] = Multiple;
-          ExprsToRewrite.push_back(LHSUnknown);
-          return;
-        }
+        auto I = RewriteMap.find(URemLHS);
+        const SCEV *RewrittenLHS =
+            I != RewriteMap.end() ? I->second : URemLHS;
+        RewrittenLHS = ApplyDivisibiltyOnMinMaxExpr(RewrittenLHS, URemRHS);
+        const auto *Multiple =
+            getMulExpr(getUDivExpr(RewrittenLHS, URemRHS), URemRHS);
+        RewriteMap[URemLHS] = Multiple;
+        ExprsToRewrite.push_back(URemLHS);
+        return;
       }
     }
 
@@ -15208,11 +15213,8 @@ const SCEV *ScalarEvolution::applyLoopGuards(const SCEV *Expr, const Loop *L) {
             auto *MulRHS = Mul->getOperand(1);
             if (isa<SCEVConstant>(MulLHS))
               std::swap(MulLHS, MulRHS);
-            if (auto *Div = dyn_cast<SCEVUDivExpr>(MulLHS))
-              if (Div->getOperand(1) == MulRHS) {
-                DividesBy = MulRHS;
-                return true;
-              }
+            DividesBy = MulRHS;
+            return true;
           }
           if (auto *MinMax = dyn_cast<SCEVMinMaxExpr>(Expr))
             return HasDivisibiltyInfo(MinMax->getOperand(0), DividesBy) ||
diff --git a/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll b/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll
index 7d4876baa9e5d9..482accc38cb391 100644
--- a/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll
+++ b/llvm/test/Analysis/ScalarEvolution/trip-count-minmax.ll
@@ -65,7 +65,7 @@ define void @umin(i32 noundef %a, i32 noundef %b) {
 ; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + ((2 * %a) umin (4 * %b)))
 ; CHECK-NEXT:  Loop %for.body: Predicated backedge-taken count is (-1 + ((2 * %a) umin (4 * %b)))
 ; CHECK-NEXT:   Predicates:
-; CHECK-NEXT:  Loop %for.body: Trip multiple is 1
+; CHECK-NEXT:  Loop %for.body: Trip multiple is 2
 ;
 ; void umin(unsigned a, unsigned b) {
 ;   a *= 2;
@@ -165,7 +165,7 @@ define void @smin(i32 noundef %a, i32 noundef %b) {
 ; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + ((2 * %a)<nsw> smin (4 * %b)<nsw>))
 ; CHECK-NEXT:  Loop %for.body: Predicated backedge-taken count is (-1 + ((2 * %a)<nsw> smin (4 * %b)<nsw>))
 ; CHECK-NEXT:   Predicates:
-; CHECK-NEXT:  Loop %for.body: Trip multiple is 1
+; CHECK-NEXT:  Loop %for.body: Trip multiple is 2
 ;
 ; void smin(signed a, signed b) {
 ;   a *= 2;
diff --git a/llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll b/llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll
index a0a5158bdff160..25fe63e9d61bc0 100644
--- a/llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll
+++ b/llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll
@@ -607,5 +607,43 @@ exit:
   ret void
 }
 
+define void @test_trip_scevmul_multiple_5(i32 %num1, i32 %num2) {
+; CHECK-LABEL: 'test_trip_scevmul_multiple_5'
+; CHECK-NEXT:  Classifying expressions for: @test_trip_scevmul_multiple_5
+; CHECK-NEXT:    %num = mul i32 %num1, %num2
+; CHECK-NEXT:    --> (%num1 * %num2) U: full-set S: full-set
+; CHECK-NEXT:    %u = urem i32 %num, 5
+; CHECK-NEXT:    --> ((-5 * ((%num1 * %num2) /u 5)) + (%num1 * %num2)) U: full-set S: full-set
+; CHECK-NEXT:    %i.010 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
+; CHECK-NEXT:    --> {0,+,1}<nuw><nsw><%for.body> U: [0,-2147483648) S: [0,-2147483648) Exits: (-1 + (%num1 * %num2)) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:    %inc = add nuw nsw i32 %i.010, 1
+; CHECK-NEXT:    --> {1,+,1}<nuw><nsw><%for.body> U: [1,-2147483648) S: [1,-2147483648) Exits: (%num1 * %num2) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT:  Determining loop execution counts for: @test_trip_scevmul_multiple_5
+; CHECK-NEXT:  Loop %for.body: backedge-taken count is (-1 + (%num1 * %num2))
+; CHECK-NEXT:  Loop %for.body: constant max backedge-taken count is -2
+; CHECK-NEXT:  Loop %for.body: symbolic max backedge-taken count is (-1 + (%num1 * %num2))
+; CHECK-NEXT:  Loop %for.body: Predicated backedge-taken count is (-1 + (%num1 * %num2))
+; CHECK-NEXT:   Predicates:
+; CHECK-NEXT:  Loop %for.body: Trip multiple is 5
+;
+entry:
+  %num = mul i32 %num1, %num2
+  %u = urem i32 %num, 5
+  %cmp = icmp eq i32 %u, 0
+  tail call void @llvm.assume(i1 %cmp)
+  %cmp.1 = icmp uge i32 %num, 5
+  tail call void @llvm.assume(i1 %cmp.1)
+  br label %for.body
+
+for.body:
+  %i.010 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
+  %inc = add nuw nsw i32 %i.010, 1
+  %cmp2 = icmp ult i32 %inc, %num
+  br i1 %cmp2, label %for.body, label %exit
+
+exit:
+  ret void
+}
+
 declare void @llvm.assume(i1)
 declare void @llvm.experimental.guard(i1, ...)

Copy link

github-actions bot commented Feb 29, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@fhahn
Copy link
Contributor

fhahn commented Feb 29, 2024

Could you update the description to explain how applyLoopGuards is improved?

@komalon1 komalon1 force-pushed the users/komalon1/scevmul-applyloopguards branch from 84b4d5a to 9e0555c Compare February 29, 2024 15:24
@komalon1
Copy link
Contributor Author

komalon1 commented Mar 3, 2024

Could you update the description to explain how applyLoopGuards is improved?

Done

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does so by preserving the information that a SCEVMul with a SCEVConstant operand divides by this constant.

Maybe I'm misunderstanding what you are doing, but I don't think that's correct due to overflow? E.g. consider https://alive2.llvm.org/ce/z/Ljc8Jj.

You'd have to restrict this to either power of two constants, or check for appropriate nowrap flags.

It also generalized the assumption cache divisibility propagation to non-SCEVUnknowns.

This improvement sounds unrelated to the other one -- can it be split into a separate patch?

@komalon1
Copy link
Contributor Author

komalon1 commented Mar 6, 2024

It does so by preserving the information that a SCEVMul with a SCEVConstant operand divides by this constant.

Maybe I'm misunderstanding what you are doing, but I don't think that's correct due to overflow? E.g. consider https://alive2.llvm.org/ce/z/Ljc8Jj.

You'd have to restrict this to either power of two constants, or check for appropriate nowrap flags.

This logic is not new, it already exists in ScalarEvolution.cpp:15144. It simply checks for the assumption of (X % constant == 0). Knowing that the assumption holds, it is fine to rewrite the SCEV X to (X / constant) * constant. Please notice that we first divide, and then multiply, so overflow shouldn't be an issue.

It also generalized the assumption cache divisibility propagation to non-SCEVUnknowns.

This improvement sounds unrelated to the other one -- can it be split into a separate patch?

no problem, I will separate.

Improve applyLoopGuards to preserve divisibility information
of SCEVMul Expressions.
It does so by preserving the information that a SCEVMul with a SCEVConstant operand divides by this constant.
It also generalized the assumption cache divisibility propagation to non-SCEVUnkinowns.
For example if:

TC = TC1 * TC2;
__builtin_assume(TC % 8 == 0);

We now propagate the information that SCEVMul divides by 8, by rewriting it to ((TC1 * TC2) \ 8) * 8

This fixes llvm#82367.
@komalon1 komalon1 force-pushed the users/komalon1/scevmul-applyloopguards branch from 9e0555c to 8c87ab0 Compare March 10, 2024 09:36
@komalon1
Copy link
Contributor Author

It does so by preserving the information that a SCEVMul with a SCEVConstant operand divides by this constant.

Maybe I'm misunderstanding what you are doing, but I don't think that's correct due to overflow? E.g. consider https://alive2.llvm.org/ce/z/Ljc8Jj.

You'd have to restrict this to either power of two constants, or check for appropriate nowrap flags.

It also generalized the assumption cache divisibility propagation to non-SCEVUnknowns.

This improvement sounds unrelated to the other one -- can it be split into a separate patch?

@nikic I split the patches. Any other comments?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants