Conversation

@madhur13490
Contributor

This patch skips GVN if !prof metadata indicates zero frequency.

@llvmbot
Member

llvmbot commented Nov 4, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Madhur Amilkanthwar (madhur13490)

Changes

This patch skips GVN if !prof metadata indicates zero frequency.


Full diff: https://github.com/llvm/llvm-project/pull/166336.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/GVN.cpp (+10)
  • (added) llvm/test/Transforms/GVN/skip-gvn-blockfreq.ll (+41)
diff --git a/llvm/lib/Transforms/Scalar/GVN.cpp b/llvm/lib/Transforms/Scalar/GVN.cpp
index 72e1131a54a86..0dcc194ae05db 100644
--- a/llvm/lib/Transforms/Scalar/GVN.cpp
+++ b/llvm/lib/Transforms/Scalar/GVN.cpp
@@ -894,6 +894,16 @@ PreservedAnalyses GVNPass::run(Function &F, FunctionAnalysisManager &AM) {
     MSSA = &AM.getResult<MemorySSAAnalysis>(F);
   }
   auto &ORE = AM.getResult<OptimizationRemarkEmitterAnalysis>(F);
+  
+  // Skip the pass if function has zero entry count in PGO.
+  // This indicates that the function is never executed according to the profile data.
+  auto EntryCount = F.getEntryCount();
+  if (EntryCount && EntryCount->getCount() == 0) {
+    LLVM_DEBUG(dbgs() << "GVN: Skipping function '" << F.getName()
+                      << "' with zero profile entry count\n");
+    return PreservedAnalyses::all();
+  }
+  
   bool Changed = runImpl(F, AC, DT, TLI, AA, MemDep, LI, &ORE,
                          MSSA ? &MSSA->getMSSA() : nullptr);
   if (!Changed)
diff --git a/llvm/test/Transforms/GVN/skip-gvn-blockfreq.ll b/llvm/test/Transforms/GVN/skip-gvn-blockfreq.ll
new file mode 100644
index 0000000000000..d7088d4b4a014
--- /dev/null
+++ b/llvm/test/Transforms/GVN/skip-gvn-blockfreq.ll
@@ -0,0 +1,41 @@
+; Test that GVN is skipped when function has zero entry count in PGO
+; RUN: opt -passes='gvn' -S < %s | FileCheck %s
+
+; Function with ZERO entry count - GVN should skip this function
+; The redundant computation should remain because GVN doesn't run
+; CHECK-LABEL: @zero_freq_function(
+; CHECK-NEXT: entry:
+; CHECK-NEXT:   %a = add i32 %x, 1
+; CHECK-NEXT:   %b = add i32 %a, 2
+; CHECK-NEXT:   %c = add i32 %a, 2
+; CHECK-NEXT:   %result = add i32 %b, %c
+; CHECK-NEXT:   ret i32 %result
+define i32 @zero_freq_function(i32 %x) !prof !0 {
+entry:
+  %a = add i32 %x, 1
+  %b = add i32 %a, 2
+  %c = add i32 %a, 2    ; Redundant - but GVN should not  optimize due to zero freq
+  %result = add i32 %b, %c
+  ret i32 %result
+}
+
+; Function with NON-ZERO entry count - GVN should run normally
+; The redundant computation should be eliminated by GVN
+; CHECK-LABEL: @nonzero_freq_function(
+; CHECK-NEXT: entry:
+; CHECK-NEXT:   %a = add i32 %x, 1
+; CHECK-NEXT:   %b = add i32 %a, 2
+; CHECK-NEXT:   %result = add i32 %b, %b
+; CHECK-NEXT:   ret i32 %result
+define i32 @nonzero_freq_function(i32 %x) !prof !1 {
+entry:
+  %a = add i32 %x, 1
+  %b = add i32 %a, 2
+  %c = add i32 %a, 2    ; Redundant - GVN optimizes this
+  %result = add i32 %b, %c
+  ret i32 %result
+}
+
+!0 = !{!"function_entry_count", i64 0}      ; Zero frequency
+!1 = !{!"function_entry_count", i64 1000}   ; Non-zero frequency
+

@github-actions

github-actions bot commented Nov 4, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions cpp -- llvm/lib/Transforms/Scalar/GVN.cpp --diff_from_common_commit

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.
diff --git a/llvm/lib/Transforms/Scalar/GVN.cpp b/llvm/lib/Transforms/Scalar/GVN.cpp
index e88fcdd26..c841b3957 100644
--- a/llvm/lib/Transforms/Scalar/GVN.cpp
+++ b/llvm/lib/Transforms/Scalar/GVN.cpp
@@ -896,7 +896,7 @@ PreservedAnalyses GVNPass::run(Function &F, FunctionAnalysisManager &AM) {
     MSSA = &AM.getResult<MemorySSAAnalysis>(F);
   }
   auto &ORE = AM.getResult<OptimizationRemarkEmitterAnalysis>(F);
-  
+
   // Skip the pass if function has zero entry count in PGO.
   // This indicates that the function is never executed according to the profile
   // data.

@nikic
Contributor

nikic commented Nov 4, 2025

Missing justification for the change. This looks very wrong to me, as partial profiles are common. The optimization for the non-profiled parts should be the same as absence of the profile.

@madhur13490
Contributor Author

Missing justification for the change. This looks very wrong to me, as partial profiles are common. The optimization for the non-profiled parts should be the same as absence of the profile.

Some of the internal workloads have GVN in the top 5 when profiled for compile-time. Is there a way to disambiguate from a partial profile?
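For illustration only, one way the partial-profile case could be disambiguated (not part of this patch) is to consult ProfileSummaryInfo and treat a zero entry count as meaningful only when the profile is not a partial sample profile. A minimal sketch; the helper name is hypothetical:

#include "llvm/Analysis/ProfileSummaryInfo.h"
#include "llvm/IR/Function.h"

using namespace llvm;

// Hypothetical helper: only treat a zero entry count as "never executed"
// when the profile is not a partial sample profile, where a missing or zero
// count can simply mean "not sampled" rather than "cold".
static bool isProvablyNeverExecuted(const Function &F,
                                    const ProfileSummaryInfo &PSI) {
  if (PSI.hasPartialSampleProfile())
    return false;
  auto EntryCount = F.getEntryCount();
  return EntryCount && EntryCount->getCount() == 0;
}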

@mtrofin
Member

mtrofin commented Nov 4, 2025

Missing justification for the change. This looks very wrong to me, as partial profiles are common. The optimization for the non-profiled parts should be the same as absence of the profile.

Some of the internal workloads have GVN in the top 5 when profiled for compile-time. Is there a way to disambiguate from a partial profile?

To resolve this, could this be behind a flag?
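For reference, a minimal sketch of what such an opt-in guard could look like in GVN.cpp; the flag name below is hypothetical and not taken from the patch:

#include "llvm/Support/CommandLine.h"

using namespace llvm;

// Hypothetical flag name; the committed patch may spell it differently.
// Default off, so behavior is unchanged unless explicitly requested.
static cl::opt<bool> SkipZeroEntryCountFunctions(
    "gvn-skip-zero-entry-count-functions", cl::init(false), cl::Hidden,
    cl::desc("Skip GVN on functions whose PGO entry count is zero"));

// Inside GVNPass::run(), the early exit would then be gated on the flag:
//   if (SkipZeroEntryCountFunctions && EntryCount &&
//       EntryCount->getCount() == 0)
//     return PreservedAnalyses::all();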

Contributor

@boomanaiden154 boomanaiden154 left a comment

If this is to fix a compile time regression/issue, could there be some more investigation into what is actually causing it?

This seems like a very band-aid fix that might just paper over any actual issues.

This patch skips GVN if !prof metadata indicates zero frequency.
@madhur13490 madhur13490 force-pushed the features/madhura/skip-gvn branch from 461fe22 to 765dff4 on November 5, 2025 04:52
@madhur13490
Contributor Author

Missing justification for the change. This looks very wrong to me, as partial profiles are common. The optimization for the non-profiled parts should be the same as absence of the profile.

Some of the internal workloads have GVN in the top 5 when profiled for compile-time. Is there a way to disambiguate from a partial profile?

To resolve this, could this be behind a flag?

Yes, this is perfectly fine with me. I added a new flag in the commit. Please have a look.

@madhur13490
Contributor Author

If this is to fix a compile time regression/issue, could there be some more investigation into what is actually causing it?

This seems like a very band-aid fix that might just paper over any actual issues.

The real culprit is the MD algorithms, which easily achieve quadratic complexity. I have been pushing for MSSA migration, but we have had little progress over the last few weeks. Once we migrate to MSSA, I expect the issue to go away.
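To make the shape of the problem concrete, an illustrative IR sketch (not the actual workload): when a block interleaves many may-alias stores and loads, each load's dependence query scans back over the preceding stores, so the total work grows roughly quadratically with block size until MemDep's scan limits cut it off.

; Illustrative only: each load must be checked against all earlier stores
; through %p (they may alias), so dependence queries repeat similar scans.
define void @memdep_heavy(ptr %p, ptr %q) {
entry:
  store i32 1, ptr %p
  %v1 = load i32, ptr %q
  store i32 %v1, ptr %p
  %v2 = load i32, ptr %q
  store i32 %v2, ptr %p
  %v3 = load i32, ptr %q
  ; ... pattern repeated many times in the problematic inputs ...
  ret void
}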

Member

@mtrofin mtrofin left a comment

LGTM because it sounds like the problem is understood, and the mitigation is interim and opt-in. I don't know if there's more background to this, though, so I'd wait for others to lgtm.

Contributor

@boomanaiden154 boomanaiden154 left a comment

The real culprit is the MD algorithms, which easily achieve quadratic complexity. I have been pushing for MSSA migration, but we have had little progress over the last few weeks. Once we migrate to MSSA, I expect the issue to go away.

Is this an MSSA migration for an upstream pass, or an internal migration?

Code formatting needs fixing before landing, but otherwise LGTM. It would be good if you can commit to removing this flag once the motivating use case for this is resolved properly. Not sure if there's anyone else who would be good for reviewing GVN patches. Nikita is out until December. I think this is probably fine to land as is.

@madhur13490
Contributor Author

The real culprit is the MD algorithms, which easily achieve quadratic complexity. I have been pushing for MSSA migration, but we have had little progress over the last few weeks. Once we migrate to MSSA, I expect the issue to go away.

Is this an MSSA migration for an upstream pass, or an internal migration?

No, migration to MSSA in GVN is planned upstream. @antoniofrighetto is doing the transition.

Code formatting needs fixing before landing, but otherwise LGTM. It would be good if you can commit to removing this flag once the motivating use case for this is resolved properly. Not sure if there's anyone else who would be good for reviewing GVN patches. Nikita is out until December. I think this is probably fine to land as is.

Thanks! I will wait a couple of days before landing. I don't mind reverting the patch once the issue is gone; I think I will wait at least until the migration to MSSA is done.

@madhur13490 madhur13490 force-pushed the features/madhura/skip-gvn branch from 1375301 to 5a4293b on November 6, 2025 07:47
@nikic
Contributor

nikic commented Nov 6, 2025

I don't think this change makes sense, even on a temporary basis. If MDA in GVN is slow for some cases, that's because a complexity cutoff is missing or set too high. MSSA doesn't fundamentally change the picture (especially as GVN needs to scan MemoryUses); it still needs to stop at some point, though the cutoffs are going to be different.
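For context, MemoryDependenceAnalysis already exposes scan limits that can be tightened to trade analysis precision for compile time; an illustrative invocation below (the value is arbitrary, and current defaults should be checked against the source):

; Illustrative only: lower MemDep's per-query block scan limit when running GVN.
; RUN: opt -passes=gvn -memdep-block-scan-limit=30 -S < %s | FileCheck %s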

@rightrotate

Hi,
My company, RRL, is a downstream user of PGO. We find this patch useful as it allows us to skip the pass. GVN has been a compile-time pain point several times, and skipping it on "dead" code helps.

I don't think this change makes sense, even on a temporary basis

I disagree. This does make sense. What is the problem with having this and keeping it under a flag? There are numerous other flags like this that can help.

@nikic
Contributor

nikic commented Nov 7, 2025

The problem is that this is papering over the issue in a way that might benefit some specific users, but still leaves it for everyone else (especially as this is behind a flag). Everyone will be better off if the problematic case can be addressed directly.

Is it possible to share any test cases that exhibit the compile time issue this PR is trying to address?

At this point it's mainly not clear to me that it's really impossible (with reasonable effort) to fix this in a way that does not rely on PGO plus an internal option.

@rightrotate

The problem is that this is papering over the issue in a way that might benefit some specific users, but still leaves it for everyone else (especially as this is behind a flag). Everyone will be better off if the problematic case can be addressed directly.

Is it possible to share any test cases that exhibit the compile time issue this PR is trying to address?

At this point it's mainly not clear to me that it's really impossible (with reasonable effort) to fix this in a way that does not rely on PGO plus an internal option.

Unfortunately, I can't share the test case as it comes from a customer, but I can describe the nature of the file. It is a very large file with 1000+ entry points called by another module, yet in any single invocation exactly one entry point is run. GVN and other passes still run on all the other entry points, burning compile time; GVN is in the top 3 in our profiling. This patch would allow us to skip the problematic pass and give us some breathing room. It may be possible to tune the existing cutoffs, but that invites other issues, such as regressions in unknown benchmarks; a mechanism scoped to just our one case is what we want.

I still think having such an option would help us, so please allow it.
