Skip to content

Conversation

sushgokh
Copy link
Contributor

@sushgokh sushgokh commented Jul 28, 2025

For the attached test:
Before the loop-idiom pass, we have a store into the inner loop which is considered simple and one that does not have any side effects on the loop. Post loop-idiom pass, we get a memset into the outer loop that is considered to introduce side effects on the loop. This changes the backedge taken count before and after the pass and hence, the crash with verify-scev.

We try to consider non-volatile memory intrinsics as not having side-effect for forward progress to fix the issue.

Fixes #149377

… in the outer loop

Before loop-idiom pass, we have a store into the inner loop which is
considered simple and one that does not have any side effects on the
loop. Post loop-idiom pass, we get a memset into the outer loop that is
considered to introduce side effects on the loop. This changes the
backedge taken count before and after the pass and hence, the crash with
verify-scev.
Ideally, I would expect store to be considered as an instruction that
introduces side effects like the memset but this generates lot of test
failures.

Fixes llvm#149377
@sushgokh
Copy link
Contributor Author

I have by mistake pulled the test in AArch64. Will move it out

@llvmbot
Copy link
Member

llvmbot commented Jul 28, 2025

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Sushant Gokhale (sushgokh)

Changes

… in the outer loop

Before loop-idiom pass, we have a store into the inner loop which is considered simple and one that does not have any side effects on the loop. Post loop-idiom pass, we get a memset into the outer loop that is considered to introduce side effects on the loop. This changes the backedge taken count before and after the pass and hence, the crash with verify-scev.
Ideally, I would expect store to be considered as an instruction that introduces side effects like the memset but this generates lot of test failures.

Fixes #149377


Full diff: https://github.com/llvm/llvm-project/pull/150916.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp (+5)
  • (added) llvm/test/Transforms/LoopIdiom/AArch64/introduce-memset-in-outerloop.ll (+48)
diff --git a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
index 03b92d3338a98..7cee8a7525c42 100644
--- a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
@@ -1154,6 +1154,11 @@ bool LoopIdiomRecognize::processLoopStridedStore(
 
   NewCall->setDebugLoc(TheStore->getDebugLoc());
 
+  auto *MemsetLoop = LI->getLoopFor(NewCall->getParent());
+  // FIXME: We should invalidate scev only if the loop already does not have any
+  // side effects.
+  SE->forgetLoop(MemsetLoop);
+
   if (MSSAU) {
     MemoryAccess *NewMemAcc = MSSAU->createMemoryAccessInBB(
         NewCall, nullptr, NewCall->getParent(), MemorySSA::BeforeTerminator);
diff --git a/llvm/test/Transforms/LoopIdiom/AArch64/introduce-memset-in-outerloop.ll b/llvm/test/Transforms/LoopIdiom/AArch64/introduce-memset-in-outerloop.ll
new file mode 100644
index 0000000000000..924db26cd573e
--- /dev/null
+++ b/llvm/test/Transforms/LoopIdiom/AArch64/introduce-memset-in-outerloop.ll
@@ -0,0 +1,48 @@
+; REQUIRES: asserts
+; RUN: opt <%s -p "loop(loop-idiom)" -verify-scev -o /dev/null
+
+
+; IR corresponds to the following C test:
+; extern char a[];
+; void foo() {
+;   for (long c = 0; c < 6; c += -2078836808675943215)
+;     for (long d; d < 6; d++)
+;       a[c + d] = 0;
+; }
+
+; Make sure we do not crash when verifying scev.
+
+@a = external global [0 x i8]
+
+define void @foo()  {
+entry:
+  br label %outerL
+
+outerL:                              ; preds = %entry, %outerLatch
+  %e = phi i64 [ undef, %entry ], [ %lcssa, %outerLatch ]
+  %c = phi i64 [ 0, %entry ], [ %c.next, %outerLatch ]
+  %e.cmp = icmp slt i64 %e, 6
+  br i1 %e.cmp, label %innerL, label %outerLatch
+
+innerL:                                          ; preds = %outerL, %innerL
+  %d = phi i64 [ %d.next, %innerL ], [ %e, %outerL ]
+  %add = add nsw i64 %d, %c
+  %arrayidx = getelementptr inbounds [0 x i8], ptr @a, i64 0, i64 %add
+  store i8 0, ptr %arrayidx
+  %d.next = add nsw i64 %d, 1
+  %d.cmp = icmp slt i64 %d, 5
+  br i1 %d.cmp, label %innerL, label %outerLatch, !llvm.loop !0
+
+outerLatch:                                         ; preds = %innerL, %outerL
+  %lcssa = phi i64 [ %e, %outerL ], [ %d.next, %innerL ]
+  %c.next = add nsw i64 %c, -2078836808675943215
+  %c.cmp = icmp slt i64 %c, 2078836808675943221
+  br i1 %c.cmp, label %outerL, label %exit, !llvm.loop !1
+
+exit:                                         ; preds = %outerLatch
+  ret void
+}
+
+!0 = distinct !{!0, !2}
+!1 = distinct !{!1, !2}
+!2 = !{!"llvm.loop.mustprogress"}

@@ -0,0 +1,48 @@
; REQUIRES: asserts
; RUN: opt <%s -p "loop(loop-idiom)" -verify-scev -o /dev/null
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by mistake, pulled this test in AArch64. Will move this out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might as well check the IR output and remove REQUIRES: asserts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, is checking with asserts expensive and one should go with IR output if its crash test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately, if I remove the assert, the crash wont be produced and secondly, there is no difference in IR pre/post application of the patch.
So, I am not sure I understood correctly and how removing assert will help here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REQUIRES: asserts only means that the test will only be run with builds with asserts enabled. There is no reason to not run it unconditionally and check the output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, so I was testing the crash here(which was only reproducible with the asserts build). But as you say, maybe I can check the backedge taken count pre/post loop-idiom transform. Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might as well check the IR output and remove REQUIRES: asserts

done

Copy link

github-actions bot commented Jul 28, 2025

✅ With the latest revision this PR passed the undef deprecator.

@@ -0,0 +1,48 @@
; REQUIRES: asserts
; RUN: opt <%s -p "loop(loop-idiom)" -verify-scev -o /dev/null
Copy link
Contributor

@madhur13490 madhur13490 Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asserts is not needed. Can you also use UTC?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have used partial output for checking since only thats relevant

NewCall->setDebugLoc(TheStore->getDebugLoc());

auto *MemsetLoop = LI->getLoopFor(NewCall->getParent());
// FIXME: We should invalidate scev only if the loop already does not have any
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/scev/SCEV/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed code.

%lcssa = phi i64 [ %e, %outerL ], [ %d.next, %innerL ]
%c.next = add nsw i64 %c, -2078836808675943215
%c.cmp = icmp slt i64 %c, 2078836808675943221
br i1 %c.cmp, label %outerL, label %exit, !llvm.loop !1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need loop metadata for the patch to work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for the patch to work, but for reproducing the crash

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make more sense to adjust the loop properties calculation to not consider a non-volatile memset to be a side effect (in the sense of the forward progress guarantee).

The way to generalize that would probably to say that the call can't write inaccessible memory, assuming we always model synchronization as accessing inaccessible memory. But conservatively just special casing memory intrinsics would be ok.

@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Jul 29, 2025
@sushgokh
Copy link
Contributor Author

I think it would make more sense to adjust the loop properties calculation to not consider a non-volatile memset to be a side effect (in the sense of the forward progress guarantee).

The way to generalize that would probably to say that the call can't write inaccessible memory, assuming we always model synchronization as accessing inaccessible memory. But conservatively just special casing memory intrinsics would be ok.

Are the side effects limited to writing to inaccessible memory?

return false;
}

return I->mayThrow() || I->mayWriteToMemory();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mayThrow() needs to be checked first, otherwise the early exit may bypass it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. Thanks

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks OK to me, but I'd like to have a second opinion on only counting inaccessible memory as a side effect (for forward progress) here.

@sushgokh
Copy link
Contributor Author

This looks OK to me, but I'd like to have a second opinion on only counting inaccessible memory as a side effect (for forward progress) here.

is there any criteria that what should be considered as side effect for a specific instruction? e.g. memset here can modify globals and hence, I am curious to know if even this should be considered as side effect

@nikic
Copy link
Contributor

nikic commented Jul 29, 2025

Hm, it looks like FunctionAttrs currently does not infer inaccessible memory for atomic accesses: https://llvm.godbolt.org/z/cG6jn34ds That's not what I expected.

Given that, I think it would be better if you did something like if (isa<MemIntrinsic>(I) && !I->isVolatile()) instead. That should be safe and sufficient for what you need here.

is there any criteria that what should be considered as side effect for a specific instruction? e.g. memset here can modify globals and hence, I am curious to know if even this should be considered as side effect

In this context, "side effect" has a different meaning from what this term usually means in LLVM. See this part of the mustprogress spec:

This attribute indicates that the function is required to return, unwind, or interact with the environment in an observable way e.g. via a volatile memory access, I/O, or other synchronization.

Side effect here is exclusively about volatile accesses, synchronization and I/O, but not, for example, writing to plain memory.

@sushgokh
Copy link
Contributor Author

This attribute indicates that the function is required to return, unwind, or interact with the environment in an observable way e.g. via a volatile memory access, I/O, or other synchronization.

The clause "interact with the environment in an observable way" is quite subjective I think.

why not consider a simple store or memset doing some observable interaction with the environment? For memset, the last sentence of "mustprogress" says:

Note that willreturn implies mustprogress.

which means memset must be considered as having side effect.

In fact, what I dont understand is how volatile accesses interact in observable way ?

Now, coming back, as per lang ref, if instruction is not volatile, thats not causing side effect. But this may be very narrow interpretation of the mustprogress clause.

@sushgokh
Copy link
Contributor Author

sushgokh commented Aug 5, 2025

ping

Comment on lines 7428 to 7433
// Check if the function accesses inaccessible memory.
if (auto *CI = dyn_cast<CallInst>(I)) {
auto ME = CI->getMemoryEffects();
if (!isModSet(ME.getModRef(IRMemLocation::InaccessibleMem)))
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Check if the function accesses inaccessible memory.
if (auto *CI = dyn_cast<CallInst>(I)) {
auto ME = CI->getMemoryEffects();
if (!isModSet(ME.getModRef(IRMemLocation::InaccessibleMem)))
return false;
}
// Non-volatile memset / memcpy do not count as side-effect
// for forward progress.
if (isa<MemIntrinsic>(I) && !I->isVolatile())
return false;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but PR description needs an update.

#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/InterleavedRange.h"
#include "llvm/Support/KnownBits.h"
#include "llvm/Support/ModRef.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@sushgokh sushgokh changed the title [loop-idiom] Forget outer loop scev when loop-idiom introduces memset… [SCEV] Consider non-volatile memory intrinsics as not having side-effect for forward progress Aug 11, 2025
@sushgokh sushgokh merged commit e8918c3 into llvm:main Aug 11, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[SCEV] llvm::ScalarEvolution::verify() expensive check failure
5 participants