Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LoopInterchange] Prevent interchange if one index is constant and the other has loop carried dependence #79123

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ShivaChen
Copy link
Collaborator

Consider the following loop from #54176.

  for (int j = 1; j < M; j++)     // loop j
    for (int i = 1; i < N; i++) { // loop i
      aa[1][j-1] += bb[i][j]; // statement 1
      cc[i][j] = aa[1][j]; 
    }
 

i = 2 | cc[2][1] = aa[1][1]
i = 1 |                              aa[1][1]+= bb[1][2]              
         j = 1                       j = 2

When loop i as inner most loop, cc[2][1] will be assigned to aa[1][1] before statement 1 update aa[1][1].
After loop interchange, cc[2][1] will be assigned to aa[1][1] after statement 1 update aa[1][1].

It seems when one index is constant and the other index has loop carried dependence, interchange may change the results.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 23, 2024

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: None (ShivaChen)

Changes

Consider the following loop from #54176.

  for (int j = 1; j &lt; M; j++)     // loop j
    for (int i = 1; i &lt; N; i++) { // loop i
      aa[1][j-1] += bb[i][j]; // statement 1
      cc[i][j] = aa[1][j]; 
    }
 

i = 2 | cc[2][1] = aa[1][1]
i = 1 |                              aa[1][1]+= bb[1][2]              
         j = 1                       j = 2

When loop i as inner most loop, cc[2][1] will be assigned to aa[1][1] before statement 1 update aa[1][1].
After loop interchange, cc[2][1] will be assigned to aa[1][1] after statement 1 update aa[1][1].

It seems when one index is constant and the other index has loop carried dependence, interchange may change the results.


Full diff: https://github.com/llvm/llvm-project/pull/79123.diff

4 Files Affected:

  • (modified) llvm/include/llvm/Analysis/DependenceAnalysis.h (+3-3)
  • (modified) llvm/lib/Analysis/DependenceAnalysis.cpp (+4-1)
  • (modified) llvm/lib/Transforms/Scalar/LoopInterchange.cpp (+9-1)
  • (added) llvm/test/Transforms/LoopInterchange/pr54176.ll (+54)
diff --git a/llvm/include/llvm/Analysis/DependenceAnalysis.h b/llvm/include/llvm/Analysis/DependenceAnalysis.h
index f0a09644e0f4b6..63aa04abc6df79 100644
--- a/llvm/include/llvm/Analysis/DependenceAnalysis.h
+++ b/llvm/include/llvm/Analysis/DependenceAnalysis.h
@@ -306,9 +306,9 @@ namespace llvm {
     /// The flag PossiblyLoopIndependent should be set by the caller
     /// if it appears that control flow can reach from Src to Dst
     /// without traversing a loop back edge.
-    std::unique_ptr<Dependence> depends(Instruction *Src,
-                                        Instruction *Dst,
-                                        bool PossiblyLoopIndependent);
+    std::unique_ptr<Dependence> depends(Instruction *Src, Instruction *Dst,
+                                        bool PossiblyLoopIndependent,
+                                        bool *HasConstantIndex = nullptr);
 
     /// getSplitIteration - Give a dependence that's splittable at some
     /// particular level, return the iteration that should be used to split
diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp
index 1bce9aae09bb26..392d5329a3a356 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -3588,7 +3588,7 @@ bool DependenceInfo::invalidate(Function &F, const PreservedAnalyses &PA,
 // up to date with respect to this routine.
 std::unique_ptr<Dependence>
 DependenceInfo::depends(Instruction *Src, Instruction *Dst,
-                        bool PossiblyLoopIndependent) {
+                        bool PossiblyLoopIndependent, bool *HasConstantIndex) {
   if (Src == Dst)
     PossiblyLoopIndependent = false;
 
@@ -3674,6 +3674,9 @@ DependenceInfo::depends(Instruction *Src, Instruction *Dst,
     LLVM_DEBUG(dbgs() << "\tclass = " << Pair[P].Classification << "\n");
     LLVM_DEBUG(dbgs() << "\tloops = ");
     LLVM_DEBUG(dumpSmallBitVector(Pair[P].Loops));
+    if (HasConstantIndex)
+      if (isa<SCEVConstant>(Pair[P].Src) || isa<SCEVConstant>(Pair[P].Dst))
+        *HasConstantIndex = true;
   }
 
   SmallBitVector Separable(Pairs);
diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index 277f530ee25fc1..d5825052bcdf2f 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -117,11 +117,12 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level,
       std::vector<char> Dep;
       Instruction *Src = cast<Instruction>(*I);
       Instruction *Dst = cast<Instruction>(*J);
+      bool HasConstantIndex = false;
       // Ignore Input dependencies.
       if (isa<LoadInst>(Src) && isa<LoadInst>(Dst))
         continue;
       // Track Output, Flow, and Anti dependencies.
-      if (auto D = DI->depends(Src, Dst, true)) {
+      if (auto D = DI->depends(Src, Dst, true, &HasConstantIndex)) {
         assert(D->isOrdered() && "Expected an output, flow or anti dep.");
         // If the direction vector is negative, normalize it to
         // make it non-negative.
@@ -150,6 +151,13 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level,
               Direction = '=';
             else
               Direction = '*';
+
+            // Bail out if there is constant index and the other has loop
+            // carried dependence.
+            if (HasConstantIndex && (Direction == '>' || Direction == '<')) {
+              dbgs() << "Has Constant Index and loop carried dependence\n";
+              return false;
+            }
             Dep.push_back(Direction);
           }
         }
diff --git a/llvm/test/Transforms/LoopInterchange/pr54176.ll b/llvm/test/Transforms/LoopInterchange/pr54176.ll
new file mode 100644
index 00000000000000..79bf712b2b6968
--- /dev/null
+++ b/llvm/test/Transforms/LoopInterchange/pr54176.ll
@@ -0,0 +1,54 @@
+; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -verify-dom-info -verify-loop-info \
+; RUN:     -S -debug 2>&1 | FileCheck %s
+
+@bb = global [1024 x [128 x float]] zeroinitializer, align 4
+@aa = global [1024 x [128 x float]] zeroinitializer, align 4
+@cc = global [1024 x [128 x float]] zeroinitializer, align 4
+
+
+;; Loops should not be interchanged in this case as it will change the
+;; cc array results.
+;;
+;;  for (int j = 1; j < M; j++)
+;;    for (int i = 1; i < N; i++) {
+;;      aa[1][j-1] += bb[i][j];
+;;      cc[i][j] = aa[1][j];
+;;    }
+
+; CHECK-NOT: Loops interchanged.
+
+define void @pr54176() {
+entry:
+  br label %for.cond1.preheader
+
+; Loop:
+for.cond1.preheader:                              ; preds = %entry, %for.cond.cleanup3
+  %indvars.iv28 = phi i64 [ 1, %entry ], [ %indvars.iv.next29, %for.cond.cleanup3 ]
+  %0 = add nsw i64 %indvars.iv28, -1
+  %arrayidx8 = getelementptr inbounds [1024 x [128 x float]], ptr @aa, i64 0, i64 1, i64 %0
+  %arrayidx10 = getelementptr inbounds [1024 x [128 x float]], ptr @aa, i64 0, i64 1, i64 %indvars.iv28
+  br label %for.body4
+
+for.cond.cleanup3:                                ; preds = %for.body4
+  %indvars.iv.next29 = add nuw nsw i64 %indvars.iv28, 1
+  %exitcond31 = icmp ne i64 %indvars.iv.next29, 128
+  br i1 %exitcond31, label %for.cond1.preheader, label %for.cond.cleanup
+
+for.body4:                                        ; preds = %for.cond1.preheader, %for.body4
+  %indvars.iv = phi i64 [ 1, %for.cond1.preheader ], [ %indvars.iv.next, %for.body4 ]
+  %arrayidx6 = getelementptr inbounds [1024 x [128 x float]], ptr @bb, i64 0, i64 %indvars.iv, i64 %indvars.iv28
+  %1 = load float, ptr %arrayidx6, align 4
+  %2 = load float, ptr %arrayidx8, align 4
+  %add = fadd float %1, %2
+  store float %add, ptr %arrayidx8, align 4
+  %3 = load float, ptr %arrayidx10, align 4
+  %arrayidx14 = getelementptr inbounds [1024 x [128 x float]], ptr @cc, i64 0, i64 %indvars.iv, i64 %indvars.iv28
+  store float %3, ptr %arrayidx14, align 4
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  %exitcond = icmp ne i64 %indvars.iv.next, 1024
+  br i1 %exitcond, label %for.body4, label %for.cond.cleanup3
+
+; Exit blocks
+for.cond.cleanup:                                 ; preds = %for.cond.cleanup3
+  ret void
+}

Copy link
Collaborator

@bmahjour bmahjour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look like the correct approach to me. A constant index should result in a scalar dependence, so there is no need to specify an output parameter to the depends interface.

@ShivaChen
Copy link
Collaborator Author

This doesn't look like the correct approach to me. A constant index should result in a scalar dependence, so there is no need to specify an output parameter to the depends interface.

It seems SIV test of the second index will dominate the direction results. I try to use another approach. That is using ZIV test to catch the scalar index and then compare SIV test result to catch the one scalar index and the other index has loop carried.
Does it sound reasonable?

@ShivaChen
Copy link
Collaborator Author

Use SCEV form to detect the constant index.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants