-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[SCEV] Consider non-volatile memory intrinsics as not having side-effect for forward progress #150916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SCEV] Consider non-volatile memory intrinsics as not having side-effect for forward progress #150916
Conversation
… in the outer loop Before loop-idiom pass, we have a store into the inner loop which is considered simple and one that does not have any side effects on the loop. Post loop-idiom pass, we get a memset into the outer loop that is considered to introduce side effects on the loop. This changes the backedge taken count before and after the pass and hence, the crash with verify-scev. Ideally, I would expect store to be considered as an instruction that introduces side effects like the memset but this generates lot of test failures. Fixes llvm#149377
I have by mistake pulled the test in AArch64. Will move it out |
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-llvm-transforms Author: Sushant Gokhale (sushgokh) Changes… in the outer loop Before loop-idiom pass, we have a store into the inner loop which is considered simple and one that does not have any side effects on the loop. Post loop-idiom pass, we get a memset into the outer loop that is considered to introduce side effects on the loop. This changes the backedge taken count before and after the pass and hence, the crash with verify-scev. Fixes #149377 Full diff: https://github.com/llvm/llvm-project/pull/150916.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
index 03b92d3338a98..7cee8a7525c42 100644
--- a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
@@ -1154,6 +1154,11 @@ bool LoopIdiomRecognize::processLoopStridedStore(
NewCall->setDebugLoc(TheStore->getDebugLoc());
+ auto *MemsetLoop = LI->getLoopFor(NewCall->getParent());
+ // FIXME: We should invalidate scev only if the loop already does not have any
+ // side effects.
+ SE->forgetLoop(MemsetLoop);
+
if (MSSAU) {
MemoryAccess *NewMemAcc = MSSAU->createMemoryAccessInBB(
NewCall, nullptr, NewCall->getParent(), MemorySSA::BeforeTerminator);
diff --git a/llvm/test/Transforms/LoopIdiom/AArch64/introduce-memset-in-outerloop.ll b/llvm/test/Transforms/LoopIdiom/AArch64/introduce-memset-in-outerloop.ll
new file mode 100644
index 0000000000000..924db26cd573e
--- /dev/null
+++ b/llvm/test/Transforms/LoopIdiom/AArch64/introduce-memset-in-outerloop.ll
@@ -0,0 +1,48 @@
+; REQUIRES: asserts
+; RUN: opt <%s -p "loop(loop-idiom)" -verify-scev -o /dev/null
+
+
+; IR corresponds to the following C test:
+; extern char a[];
+; void foo() {
+; for (long c = 0; c < 6; c += -2078836808675943215)
+; for (long d; d < 6; d++)
+; a[c + d] = 0;
+; }
+
+; Make sure we do not crash when verifying scev.
+
+@a = external global [0 x i8]
+
+define void @foo() {
+entry:
+ br label %outerL
+
+outerL: ; preds = %entry, %outerLatch
+ %e = phi i64 [ undef, %entry ], [ %lcssa, %outerLatch ]
+ %c = phi i64 [ 0, %entry ], [ %c.next, %outerLatch ]
+ %e.cmp = icmp slt i64 %e, 6
+ br i1 %e.cmp, label %innerL, label %outerLatch
+
+innerL: ; preds = %outerL, %innerL
+ %d = phi i64 [ %d.next, %innerL ], [ %e, %outerL ]
+ %add = add nsw i64 %d, %c
+ %arrayidx = getelementptr inbounds [0 x i8], ptr @a, i64 0, i64 %add
+ store i8 0, ptr %arrayidx
+ %d.next = add nsw i64 %d, 1
+ %d.cmp = icmp slt i64 %d, 5
+ br i1 %d.cmp, label %innerL, label %outerLatch, !llvm.loop !0
+
+outerLatch: ; preds = %innerL, %outerL
+ %lcssa = phi i64 [ %e, %outerL ], [ %d.next, %innerL ]
+ %c.next = add nsw i64 %c, -2078836808675943215
+ %c.cmp = icmp slt i64 %c, 2078836808675943221
+ br i1 %c.cmp, label %outerL, label %exit, !llvm.loop !1
+
+exit: ; preds = %outerLatch
+ ret void
+}
+
+!0 = distinct !{!0, !2}
+!1 = distinct !{!1, !2}
+!2 = !{!"llvm.loop.mustprogress"}
|
@@ -0,0 +1,48 @@ | |||
; REQUIRES: asserts | |||
; RUN: opt <%s -p "loop(loop-idiom)" -verify-scev -o /dev/null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by mistake, pulled this test in AArch64. Will move this out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might as well check the IR output and remove REQUIRES: asserts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, is checking with asserts expensive and one should go with IR output if its crash test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately, if I remove the assert, the crash wont be produced and secondly, there is no difference in IR pre/post application of the patch.
So, I am not sure I understood correctly and how removing assert will help here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
REQUIRES: asserts
only means that the test will only be run with builds with asserts enabled. There is no reason to not run it unconditionally and check the output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, so I was testing the crash here(which was only reproducible with the asserts build). But as you say, maybe I can check the backedge taken count pre/post loop-idiom transform. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might as well check the IR output and remove
REQUIRES: asserts
done
✅ With the latest revision this PR passed the undef deprecator. |
@@ -0,0 +1,48 @@ | |||
; REQUIRES: asserts | |||
; RUN: opt <%s -p "loop(loop-idiom)" -verify-scev -o /dev/null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asserts is not needed. Can you also use UTC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have used partial output for checking since only thats relevant
NewCall->setDebugLoc(TheStore->getDebugLoc()); | ||
|
||
auto *MemsetLoop = LI->getLoopFor(NewCall->getParent()); | ||
// FIXME: We should invalidate scev only if the loop already does not have any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/scev/SCEV/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed code.
%lcssa = phi i64 [ %e, %outerL ], [ %d.next, %innerL ] | ||
%c.next = add nsw i64 %c, -2078836808675943215 | ||
%c.cmp = icmp slt i64 %c, 2078836808675943221 | ||
br i1 %c.cmp, label %outerL, label %exit, !llvm.loop !1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need loop metadata for the patch to work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for the patch to work, but for reproducing the crash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would make more sense to adjust the loop properties calculation to not consider a non-volatile memset to be a side effect (in the sense of the forward progress guarantee).
The way to generalize that would probably to say that the call can't write inaccessible memory, assuming we always model synchronization as accessing inaccessible memory. But conservatively just special casing memory intrinsics would be ok.
Are the side effects limited to writing to inaccessible memory? |
return false; | ||
} | ||
|
||
return I->mayThrow() || I->mayWriteToMemory(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mayThrow() needs to be checked first, otherwise the early exit may bypass it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks OK to me, but I'd like to have a second opinion on only counting inaccessible memory as a side effect (for forward progress) here.
is there any criteria that what should be considered as side effect for a specific instruction? e.g. memset here can modify globals and hence, I am curious to know if even this should be considered as side effect |
Hm, it looks like FunctionAttrs currently does not infer inaccessible memory for atomic accesses: https://llvm.godbolt.org/z/cG6jn34ds That's not what I expected. Given that, I think it would be better if you did something like
In this context, "side effect" has a different meaning from what this term usually means in LLVM. See this part of the mustprogress spec:
Side effect here is exclusively about volatile accesses, synchronization and I/O, but not, for example, writing to plain memory. |
The clause "interact with the environment in an observable way" is quite subjective I think. why not consider a simple store or memset doing some observable interaction with the environment? For memset, the last sentence of "mustprogress" says:
which means memset must be considered as having side effect. In fact, what I dont understand is how volatile accesses interact in observable way ? Now, coming back, as per lang ref, if instruction is not volatile, thats not causing side effect. But this may be very narrow interpretation of the mustprogress clause. |
ping |
// Check if the function accesses inaccessible memory. | ||
if (auto *CI = dyn_cast<CallInst>(I)) { | ||
auto ME = CI->getMemoryEffects(); | ||
if (!isModSet(ME.getModRef(IRMemLocation::InaccessibleMem))) | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Check if the function accesses inaccessible memory. | |
if (auto *CI = dyn_cast<CallInst>(I)) { | |
auto ME = CI->getMemoryEffects(); | |
if (!isModSet(ME.getModRef(IRMemLocation::InaccessibleMem))) | |
return false; | |
} | |
// Non-volatile memset / memcpy do not count as side-effect | |
// for forward progress. | |
if (isa<MemIntrinsic>(I) && !I->isVolatile()) | |
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but PR description needs an update.
#include "llvm/Support/ErrorHandling.h" | ||
#include "llvm/Support/InterleavedRange.h" | ||
#include "llvm/Support/KnownBits.h" | ||
#include "llvm/Support/ModRef.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
For the attached test:
Before the loop-idiom pass, we have a store into the inner loop which is considered simple and one that does not have any side effects on the loop. Post loop-idiom pass, we get a memset into the outer loop that is considered to introduce side effects on the loop. This changes the backedge taken count before and after the pass and hence, the crash with verify-scev.
We try to consider non-volatile memory intrinsics as not having side-effect for forward progress to fix the issue.
Fixes #149377