-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[GVN][PGO] Skip GVN if entry BlockFreq is 0 #166336
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@llvm/pr-subscribers-llvm-transforms Author: Madhur Amilkanthwar (madhur13490) ChangesThis patch skips GVN is !prof metadata indicates zero frequency. Full diff: https://github.com/llvm/llvm-project/pull/166336.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/GVN.cpp b/llvm/lib/Transforms/Scalar/GVN.cpp
index 72e1131a54a86..0dcc194ae05db 100644
--- a/llvm/lib/Transforms/Scalar/GVN.cpp
+++ b/llvm/lib/Transforms/Scalar/GVN.cpp
@@ -894,6 +894,16 @@ PreservedAnalyses GVNPass::run(Function &F, FunctionAnalysisManager &AM) {
MSSA = &AM.getResult<MemorySSAAnalysis>(F);
}
auto &ORE = AM.getResult<OptimizationRemarkEmitterAnalysis>(F);
+
+ // Skip the pass if function has zero entry count in PGO.
+ // This indicates that the function is never executed according to the profile data.
+ auto EntryCount = F.getEntryCount();
+ if (EntryCount && EntryCount->getCount() == 0) {
+ LLVM_DEBUG(dbgs() << "GVN: Skipping function '" << F.getName()
+ << "' with zero profile entry count\n");
+ return PreservedAnalyses::all();
+ }
+
bool Changed = runImpl(F, AC, DT, TLI, AA, MemDep, LI, &ORE,
MSSA ? &MSSA->getMSSA() : nullptr);
if (!Changed)
diff --git a/llvm/test/Transforms/GVN/skip-gvn-blockfreq.ll b/llvm/test/Transforms/GVN/skip-gvn-blockfreq.ll
new file mode 100644
index 0000000000000..d7088d4b4a014
--- /dev/null
+++ b/llvm/test/Transforms/GVN/skip-gvn-blockfreq.ll
@@ -0,0 +1,41 @@
+; Test that GVN is skipped when function has zero entry count in PGO
+; RUN: opt -passes='gvn' -S < %s | FileCheck %s
+
+; Function with ZERO entry count - GVN should skip this function
+; The redundant computation should remain because GVN doesn't run
+; CHECK-LABEL: @zero_freq_function(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: %a = add i32 %x, 1
+; CHECK-NEXT: %b = add i32 %a, 2
+; CHECK-NEXT: %c = add i32 %a, 2
+; CHECK-NEXT: %result = add i32 %b, %c
+; CHECK-NEXT: ret i32 %result
+define i32 @zero_freq_function(i32 %x) !prof !0 {
+entry:
+ %a = add i32 %x, 1
+ %b = add i32 %a, 2
+ %c = add i32 %a, 2 ; Redundant - but GVN should not optimize due to zero freq
+ %result = add i32 %b, %c
+ ret i32 %result
+}
+
+; Function with NON-ZERO entry count - GVN should run normally
+; The redundant computation should be eliminated by GVN
+; CHECK-LABEL: @nonzero_freq_function(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: %a = add i32 %x, 1
+; CHECK-NEXT: %b = add i32 %a, 2
+; CHECK-NEXT: %result = add i32 %b, %b
+; CHECK-NEXT: ret i32 %result
+define i32 @nonzero_freq_function(i32 %x) !prof !1 {
+entry:
+ %a = add i32 %x, 1
+ %b = add i32 %a, 2
+ %c = add i32 %a, 2 ; Redundant - GVN optimizes this
+ %result = add i32 %b, %c
+ ret i32 %result
+}
+
+!0 = !{!"function_entry_count", i64 0} ; Zero frequency
+!1 = !{!"function_entry_count", i64 1000} ; Non-zero frequency
+
|
You can test this locally with the following command:git-clang-format --diff origin/main HEAD --extensions cpp -- llvm/lib/Transforms/Scalar/GVN.cpp --diff_from_common_commit
View the diff from clang-format here.diff --git a/llvm/lib/Transforms/Scalar/GVN.cpp b/llvm/lib/Transforms/Scalar/GVN.cpp
index e88fcdd26..c841b3957 100644
--- a/llvm/lib/Transforms/Scalar/GVN.cpp
+++ b/llvm/lib/Transforms/Scalar/GVN.cpp
@@ -896,7 +896,7 @@ PreservedAnalyses GVNPass::run(Function &F, FunctionAnalysisManager &AM) {
MSSA = &AM.getResult<MemorySSAAnalysis>(F);
}
auto &ORE = AM.getResult<OptimizationRemarkEmitterAnalysis>(F);
-
+
// Skip the pass if function has zero entry count in PGO.
// This indicates that the function is never executed according to the profile
// data.
|
|
Missing justification for the change. This looks very wrong to me, as partial profiles are common. The optimization for the non-profiled parts should be the same as absence of the profile. |
Some of the internal workloads have GVN in the top 5 when profiled for compile-time. Is there a way to disambiguate from a partial profile? |
To resolve this, could this be behind a flag? |
boomanaiden154
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is to fix a compile time regression/issue, could there be some more investigation into what is actually causing it?
This seems like a very band-aid fix that might just paper over any actual issues.
This patch skips GVN is !prof metadata indicates zero frequency.
461fe22 to
765dff4
Compare
Yes, this is perfectly fine with me. I added a new flag in the commit. Please have a look. |
The real culprit is the MD algorithms, which easily achieve quadratic complexity. I have been pushing for MSSA migration, but we have had little progress over the last few weeks. Once we migrate to MSSA, I expect the issue to go away. |
mtrofin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM because it sounds like the problem is understood, and the mitigation is interim and opt-in. I don't know if there's more background to this, though, so I'd wait for others to lgtm.
boomanaiden154
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The real culprit is the MD algorithms, which easily achieve quadratic complexity. I have been pushing for MSSA migration, but we have had little progress over the last few weeks. Once we migrate to MSSA, I expect the issue to go away.
This is a MSSA migration for an upstream pass or for an internal migration?
Code formatting needs fixing before landing, but otherwise LGTM. It would be good if you can commit to removing this flag once the motivating use case for this is resolved properly. Not sure if there's anyone else who would be good for reviewing GVN patches. Nikita is out until December. I think this is probably fine to land as is.
No, migration to MSSA in GVN is planned upstream. @antoniofrighetto is doing the transition.
Thanks! I will wait for a couple of days to land. I don't mind reverting the patch if the issue is gone. I think I will wait at least the migration to MSSA is done. |
Run clang-format
1375301 to
5a4293b
Compare
|
I don't think this change makes sense, even on a temporary basis. If MDA in GVN is slow for some cases, that's because a complexity cutoff is missing or has a too high value. MSSA doesn't really fundamentally change the picture (especially as GVN needs to scan MemoryUses). It still needs to stop at some point, though the cutoffs are going to be different. |
|
Hi,
I disagree. This does make sense. What is the problem of having this and keeping it under a flag? There are numerous other flags that can help. |
|
The problem is that this is papering over the issue in a way that might benefit some specific users, but still leaves it for everyone else (especially as this is behind a flag). Everyone will be better off if the problematic case can be addressed directly. Is it possible to share any test cases that exhibit the compile time issue this PR is trying to address? At this point it's mainly not clear to me that it's really impossible (with reasonable effort) to fix this in a way that does not rely on PGO plus an internal option. |
Unfortunately, I can't share the test case as it comes from a customer. However, I can share the nature of the file. It is a very large file, has 1000+ entrypoints called by another module. However, in one invocation, exactly one entry point is run. However, GVN and other passes are run on other entry points too, thus burning compile-time. We have seen that GVN is in top 3 in our profiling. This patch would allow us to skip the problematic pass and give us some breathing space. It may be possible to tune, but that will invite other issues like regression in unknown benchmarks, and thus tuning for just one case is what we want. I still think having such an option would help us, thus, please allow it. |
This patch skips GVN is !prof metadata indicates zero frequency.