-
Notifications
You must be signed in to change notification settings - Fork 10.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LoopInterchange] Forbid interchange if load store type wider than element type #77885
Conversation
…ement type Consider the following case, interchange is not valid as the store type(int) is wider than the array element type(char). char p[7]; for (int j = 0; j < 2; ++j) for (int i = 0; i < 2; ++i) *((int*)&p[2*i+j]) = 2*i+j+1;
@llvm/pr-subscribers-llvm-transforms Author: None (ShivaChen) ChangesConsider the following case, interchange is not valid as the store type(int) is wider than the array element type(char).
Full diff: https://github.com/llvm/llvm-project/pull/77885.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index 277f530ee25fc1..443644c40cebab 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -82,6 +82,40 @@ static void printDepMatrix(CharMatrix &DepMatrix) {
}
#endif
+// Consider the following case, interchange is not valid as the store type(int)
+// is wider than the array element type(char).
+//
+// char p[7];
+// for (int j = 0; j < 2; ++j)
+// for (int i = 0; i < 2; ++i)
+// *((int*)&p[2*i+j]) = 2*i+j+1;
+//
+// Return true if the load/store type is wider than the element type.
+static bool isWiderLoadStore(Instruction *I) {
+ Value *Ptr = getLoadStorePointerOperand(I);
+ Type *ITy = getLoadStoreType(I);
+
+ if (!ITy->isSingleValueType())
+ return false;
+
+ if (auto *GEP = dyn_cast<GetElementPtrInst>(Ptr)) {
+ Type *ElementTy = GEP->getSourceElementType();
+
+ while (isa<ArrayType>(ElementTy)) {
+ ElementTy = ElementTy->getArrayElementType();
+ if (ElementTy->isSingleValueType())
+ if (ITy->getScalarSizeInBits() > ElementTy->getScalarSizeInBits()) {
+ LLVM_DEBUG(dbgs() << "Loads or Stores Type " << *ITy
+ << " is wider than the element type " << *ElementTy
+ << "\n");
+ return true;
+ }
+ }
+ }
+
+ return false;
+}
+
static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level,
Loop *L, DependenceInfo *DI,
ScalarEvolution *SE) {
@@ -98,10 +132,14 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level,
if (auto *Ld = dyn_cast<LoadInst>(&I)) {
if (!Ld->isSimple())
return false;
+ if (isWiderLoadStore(&I))
+ return false;
MemInstr.push_back(&I);
} else if (auto *St = dyn_cast<StoreInst>(&I)) {
if (!St->isSimple())
return false;
+ if (isWiderLoadStore(&I))
+ return false;
MemInstr.push_back(&I);
}
}
diff --git a/llvm/test/Transforms/LoopInterchange/not-interchanged-wider-store.ll b/llvm/test/Transforms/LoopInterchange/not-interchanged-wider-store.ll
new file mode 100644
index 00000000000000..a50751b1344c78
--- /dev/null
+++ b/llvm/test/Transforms/LoopInterchange/not-interchanged-wider-store.ll
@@ -0,0 +1,46 @@
+; REQUIRES: asserts
+; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -verify-dom-info -verify-loop-info \
+; RUN: -S -debug 2>&1 | FileCheck %s
+
+;; Loops should not be interchanged in this case as the store is wider than
+;; array element type
+;;
+;; char p[7];
+;; for (int j = 0; j < 2; ++j)
+;; for (int i = 0; i < 2; ++i)
+;; *((int*)&p[2*i+j]) = 2*i+j+1;
+
+; CHECK: Loads or Stores Type i32 is wider than the element type i8
+; CHECK: Populating dependency matrix failed
+
+define i32 @main() {
+entry:
+ %p = alloca [7 x i8], align 1
+ br label %for.cond1.preheader
+
+; Loop:
+for.cond1.preheader: ; preds = %entry, %for.cond.cleanup3
+ %indvars.iv29 = phi i64 [ 0, %entry ], [ %indvars.iv.next30, %for.cond.cleanup3 ]
+ br label %for.body4
+
+for.cond.cleanup3: ; preds = %for.body4
+ %indvars.iv.next30 = add nuw nsw i64 %indvars.iv29, 1
+ %cmp = icmp eq i64 %indvars.iv29, 0
+ br i1 %cmp, label %for.cond1.preheader, label %for.cond.cleanup
+
+for.body4: ; preds = %for.cond1.preheader, %for.body4
+ %indvars.iv = phi i64 [ 0, %for.cond1.preheader ], [ %indvars.iv.next, %for.body4 ]
+ %0 = shl nuw nsw i64 %indvars.iv, 1
+ %1 = add nuw nsw i64 %0, %indvars.iv29
+ %2 = add nuw nsw i64 %1, 1
+ %arrayidx = getelementptr inbounds [7 x i8], ptr %p, i64 0, i64 %1
+ %3 = trunc i64 %2 to i32
+ store i32 %3, ptr %arrayidx, align 1
+ %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+ %cmp2 = icmp eq i64 %indvars.iv, 0
+ br i1 %cmp2, label %for.body4, label %for.cond.cleanup3
+
+; Exit blocks
+for.cond.cleanup: ; preds = %for.cond.cleanup3
+ ret i32 0
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using different type widths is not a sufficient condition for testing overwrites. It could be the same type, but with offsets single bytes apart. There could be other conditions, e.g. function calls/atomics that only partially overwrite the target. Admittedly, DependenceAnalysis currently bails out on those, but in the very least the comment should mention all the necessary conditions and explain why they don't need to be checked. Your comments only explain the what, not the why.
return false; | ||
|
||
if (auto *GEP = dyn_cast<GetElementPtrInst>(Ptr)) { | ||
Type *ElementTy = GEP->getSourceElementType(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You cannot rely on the GEP data type for semantics, and will probably go away in the future. See https://llvm.org/devmtg/2022-04-03/slides/keynote.Opaque.Pointers.Are.Coming.pdf slide 68.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the information. I was mimicking how the delinearization extract the subscripts. I assume we should not increase the dependency to GEP. Thanks.
// Consider the following case, interchange is not valid as the store type(int) | ||
// is wider than the array element type(char). | ||
// | ||
// char p[7]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of char
, I could also have used int
and casted it to (char*)
before accessing. Or use array base pointers that are single bytes apart (PartialAlias).
Thanks for your kind explanation and bring more cases that could have overwritten. |
Yes, I suggest to abandon this approach. Delinearization by use the GEP data type, but would also have to change if the GEP type goes away. Better not add a dependency on it. However, it is "safe" for delinerization because passes are required to add runtime checks that check that verify that the indices fall into the expected range (e.g. no negative). In worst case when the delinerization heuristic is completely wrong the runtime check would always fail and the fallback (non-optimized) version to be executed. |
Abandon due to the case might be undefined behavior. |
Consider the following case, interchange is not valid as the store type(int) is wider than the array element type(char).