Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SCEV] Remove EqCacheSCEV #133186

Merged
merged 2 commits into from
Apr 1, 2025
Merged

[SCEV] Remove EqCacheSCEV #133186

merged 2 commits into from
Apr 1, 2025

Conversation

aeubanks
Copy link
Contributor

@aeubanks aeubanks commented Mar 27, 2025

This was added in https://reviews.llvm.org/D26389 to help with extremely deep SCEV expressions.

However, this is wrong since we may cache sub-SCEVs to be equivalent that CompareValueComplexity returned 0 due to hitting the max comparison depth.

This also improves compile time in some compiles:
https://llvm-compile-time-tracker.com/compare.php?from=34fa037c4fd7f38faada5beedc63ad234e904247&to=e241ecf999f4dd42d4b951d4a5d4f8eabeafcff0&stat=instructions:u

Similar to #100721.

Fixes #130688.

This was added in https://reviews.llvm.org/D26389 to help with extremely deep SCEV expressions.

However, this is wrong since we may cache sub-SCEVs to be equivalent that CompareValueComplexity returned 0 due to hitting the max comparison depth.

This doesn't fully fix the issue, since we may still consider deeply
nested SCEVs that only differ in the innermost expression as equivalent
due to hitting depth limits, but this helps with part of the problem.

This also improves compile time in some compiles:
https://llvm-compile-time-tracker.com/compare.php?from=34fa037c4fd7f38faada5beedc63ad234e904247&to=e241ecf999f4dd42d4b951d4a5d4f8eabeafcff0&stat=instructions:u

Similar to llvm#100721.

Partially fixes llvm#130688 and a miscompile we're seeing.
@aeubanks aeubanks requested review from rnk and jreiffers March 27, 2025 00:38
@aeubanks aeubanks requested a review from nikic as a code owner March 27, 2025 00:38
@llvmbot
Copy link
Member

llvmbot commented Mar 27, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Arthur Eubanks (aeubanks)

Changes

This was added in https://reviews.llvm.org/D26389 to help with extremely deep SCEV expressions.

However, this is wrong since we may cache sub-SCEVs to be equivalent that CompareValueComplexity returned 0 due to hitting the max comparison depth.

This doesn't fully fix the issue, since we may still consider deeply nested SCEVs that only differ in the innermost expression as equivalent due to hitting depth limits, but this helps with part of the problem.

This also improves compile time in some compiles:
https://llvm-compile-time-tracker.com/compare.php?from=34fa037c4fd7f38faada5beedc63ad234e904247&to=e241ecf999f4dd42d4b951d4a5d4f8eabeafcff0&stat=instructions:u

Similar to #100721.

Partially fixes #130688 and a miscompile we're seeing.


Full diff: https://github.com/llvm/llvm-project/pull/133186.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+3-13)
  • (modified) llvm/unittests/Analysis/ScalarEvolutionTest.cpp (+29)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 314baa7c7aee1..82e46e8800b3a 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -653,8 +653,7 @@ static int CompareValueComplexity(const LoopInfo *const LI, Value *LV,
 // If the max analysis depth was reached, return std::nullopt, assuming we do
 // not know if they are equivalent for sure.
 static std::optional<int>
-CompareSCEVComplexity(EquivalenceClasses<const SCEV *> &EqCacheSCEV,
-                      const LoopInfo *const LI, const SCEV *LHS,
+CompareSCEVComplexity(const LoopInfo *const LI, const SCEV *LHS,
                       const SCEV *RHS, DominatorTree &DT, unsigned Depth = 0) {
   // Fast-path: SCEVs are uniqued so we can do a quick equality check.
   if (LHS == RHS)
@@ -665,9 +664,6 @@ CompareSCEVComplexity(EquivalenceClasses<const SCEV *> &EqCacheSCEV,
   if (LType != RType)
     return (int)LType - (int)RType;
 
-  if (EqCacheSCEV.isEquivalent(LHS, RHS))
-    return 0;
-
   if (Depth > MaxSCEVCompareDepth)
     return std::nullopt;
 
@@ -681,8 +677,6 @@ CompareSCEVComplexity(EquivalenceClasses<const SCEV *> &EqCacheSCEV,
 
     int X =
         CompareValueComplexity(LI, LU->getValue(), RU->getValue(), Depth + 1);
-    if (X == 0)
-      EqCacheSCEV.unionSets(LHS, RHS);
     return X;
   }
 
@@ -747,12 +741,10 @@ CompareSCEVComplexity(EquivalenceClasses<const SCEV *> &EqCacheSCEV,
       return (int)LNumOps - (int)RNumOps;
 
     for (unsigned i = 0; i != LNumOps; ++i) {
-      auto X = CompareSCEVComplexity(EqCacheSCEV, LI, LOps[i], ROps[i], DT,
-                                     Depth + 1);
+      auto X = CompareSCEVComplexity(LI, LOps[i], ROps[i], DT, Depth + 1);
       if (X != 0)
         return X;
     }
-    EqCacheSCEV.unionSets(LHS, RHS);
     return 0;
   }
 
@@ -775,11 +767,9 @@ static void GroupByComplexity(SmallVectorImpl<const SCEV *> &Ops,
                               LoopInfo *LI, DominatorTree &DT) {
   if (Ops.size() < 2) return;  // Noop
 
-  EquivalenceClasses<const SCEV *> EqCacheSCEV;
-
   // Whether LHS has provably less complexity than RHS.
   auto IsLessComplex = [&](const SCEV *LHS, const SCEV *RHS) {
-    auto Complexity = CompareSCEVComplexity(EqCacheSCEV, LI, LHS, RHS, DT);
+    auto Complexity = CompareSCEVComplexity(LI, LHS, RHS, DT);
     return Complexity && *Complexity < 0;
   };
   if (Ops.size() == 2) {
diff --git a/llvm/unittests/Analysis/ScalarEvolutionTest.cpp b/llvm/unittests/Analysis/ScalarEvolutionTest.cpp
index c72cecbba3cb8..84a958f2c95e2 100644
--- a/llvm/unittests/Analysis/ScalarEvolutionTest.cpp
+++ b/llvm/unittests/Analysis/ScalarEvolutionTest.cpp
@@ -1706,4 +1706,33 @@ TEST_F(ScalarEvolutionsTest, ComplexityComparatorIsStrictWeakOrdering) {
   });
 }
 
+TEST_F(ScalarEvolutionsTest, ComplexityComparatorIsStrictWeakOrdering2) {
+  // Regression test for a case where caching of equivalent values caused the
+  // comparator to get inconsistent.
+
+  Type *Int64Ty = Type::getInt64Ty(Context);
+  Type *PtrTy = PointerType::get(Context, 0);
+  FunctionType *FTy = FunctionType::get(Type::getVoidTy(Context),
+                                        {PtrTy, PtrTy, PtrTy, Int64Ty}, false);
+  Function *F = Function::Create(FTy, Function::ExternalLinkage, "f", M);
+  BasicBlock *BB = BasicBlock::Create(Context, "entry", F);
+  ReturnInst::Create(Context, nullptr, BB);
+
+  ScalarEvolution SE = buildSE(*F);
+
+  const SCEV *S0 = SE.getSCEV(F->getArg(0));
+  const SCEV *S1 = SE.getSCEV(F->getArg(1));
+  const SCEV *S2 = SE.getSCEV(F->getArg(2));
+
+  const SCEV *P0 = SE.getPtrToIntExpr(S0, Int64Ty);
+  const SCEV *P1 = SE.getPtrToIntExpr(S1, Int64Ty);
+  const SCEV *P2 = SE.getPtrToIntExpr(S2, Int64Ty);
+
+  const SCEV *M0 = SE.getNegativeSCEV(P0);
+  const SCEV *M2 = SE.getNegativeSCEV(P2);
+
+  SmallVector<const SCEV *, 6> Ops = {M2, P0, M0, P1, P2};
+  SE.getAddExpr(Ops);
+}
+
 }  // end namespace llvm

@rnk
Copy link
Collaborator

rnk commented Mar 27, 2025

Is this a partial fix? I guess the question is, is SCEV deduplication a correctness requirement? Is the depth cutoff sound?

This code is effectively trying to do a bucket/radix sort, like sorting strings by the first three characters:

abc
abb
aba
abd
abcf
abce
abcd

--> 3 char depth sort -->

aba
abb
abc
abcf # not strictly sorted, but ordered w.r.t others
abce # not strictly sorted
abcd # not strictly sorted
abd

edit: Bucket and radix sort are not good names for what this is doing. This is more like a partial radix sort, starting with the most-significant portion of the sorting key.

@aeubanks
Copy link
Contributor Author

I believe this is only a partial fix because if we limit ourselves to three characters we can still end up in a situation like

abc
abcd
abce
abcd

and we don't end up deduplicating the abcds. However, removing the depth limit may have compile time consequences, whereas the usefulness of the cache I think is unknown (discussed in https://reviews.llvm.org/D26389). this PR is an improvement on its own

@rnk
Copy link
Collaborator

rnk commented Mar 27, 2025

and we don't end up deduplicating the abcds.

I guess that's kind of what I'm getting at, it's not clear to me if removing duplicate, complex SCEVs is a correctness requirement.

Regardless, we can say affirmatively that this change fixes the comparator, it creates a valid partial ordering.

@aeubanks
Copy link
Contributor Author

and we don't end up deduplicating the abcds.

I guess that's kind of what I'm getting at, it's not clear to me if removing duplicate, complex SCEVs is a correctness requirement.

Regardless, we can say affirmatively that this change fixes the comparator, it creates a valid partial ordering.

I'm not sure. It does look like this is used to canonicalize SCEV expressions, unsure if that's required for correctness or not, but that seems important.

@nikic
Copy link
Contributor

nikic commented Mar 28, 2025

Would #101022 also fix your issue (the PR needs a rebase tho).

And no, I don't believe that the complexity logic is correctness relevant -- so I'm not sure this is really the root cause of the miscompile you're seeing?

@aeubanks
Copy link
Contributor Author

aeubanks commented Mar 28, 2025

Would #101022 also fix your issue (the PR needs a rebase tho).

Something along those lines would also fix this (also CompareSCEVComplexity isn't actually checking if the std::optional contains a value in that PR). But removing the cache altogether seems like a cleaner fix. Do you prefer that fix over this one?

And no, I don't believe that the complexity logic is correctness relevant -- so I'm not sure this is really the root cause of the miscompile you're seeing?

It's quite possible that this is not fixing a miscompile since it's very hard for us to reproduce the miscompile, any change seems to make it go away. But at least this fixes the comparator.

@aeubanks
Copy link
Contributor Author

I've updated the description to not claim that this is fixing a miscompile

@rnk
Copy link
Collaborator

rnk commented Mar 29, 2025

+1 to everything Arthur said. Your fix would fix the assertion as well. Returning std::optional as an addition might be a nice improvement, but IIRC we don't even use the 3-way return value, we boil it all down to a strict-less-than comparator, so having a 4-way return value (UNKNOWN, LT, EQ, GT) seems like extra complexity. UNKNOWN and EQ are equivalent for sorting purposes here, the SCEVs are equal for the purpose of prefix comparison.

@rnk rnk requested a review from xortator March 30, 2025 03:21
@rnk
Copy link
Collaborator

rnk commented Mar 30, 2025

The commit that introduced the Optional/std::optional 8a4ad88 basically fixed this bug for complex SCEVs, but it didn't touch the value cache. Regardless, I think removing it is best if it doesn't regress compile time.

@nikic
Copy link
Contributor

nikic commented Mar 30, 2025

As this cache is dealing with pathological cases, not average ones, have you verified this does not regress the motivating cases? It looks like the differential for the original change links a bunch. The MaxSCEVCompareDepth is pretty large as these depth limits go.

@aeubanks
Copy link
Contributor Author

indvars, require<scalar-evolution>, nary-reassociate are still fast on the IR in https://reviews.llvm.org/D3127, https://bugs.llvm.org/show_bug.cgi?id=18606, https://bugs.llvm.org/show_bug.cgi?id=30257, and https://bugs.llvm.org/show_bug.cgi?id=28721

@aeubanks
Copy link
Contributor Author

aeubanks commented Apr 1, 2025

indvars, require<scalar-evolution>, nary-reassociate are still fast on the IR in https://reviews.llvm.org/D3127, https://bugs.llvm.org/show_bug.cgi?id=18606, https://bugs.llvm.org/show_bug.cgi?id=30257, and https://bugs.llvm.org/show_bug.cgi?id=28721

@nikic is this sufficient enough to proceed?

@nikic
Copy link
Contributor

nikic commented Apr 1, 2025 via email

@aeubanks aeubanks merged commit e871143 into llvm:main Apr 1, 2025
11 checks passed
@aeubanks aeubanks deleted the scevcache branch April 1, 2025 16:49
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request Apr 2, 2025
This was added in https://reviews.llvm.org/D26389 to help with extremely
deep SCEV expressions.

However, this is wrong since we may cache sub-SCEVs to be equivalent
that CompareValueComplexity returned 0 due to hitting the max comparison
depth.

This also improves compile time in some compiles:

https://llvm-compile-time-tracker.com/compare.php?from=34fa037c4fd7f38faada5beedc63ad234e904247&to=e241ecf999f4dd42d4b951d4a5d4f8eabeafcff0&stat=instructions:u

Similar to llvm#100721.

Fixes llvm#130688.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Assertion failure in SCEV - "Your comparator is not a valid strict-weak ordering"
5 participants