Skip to content

Conversation

grigorypas
Copy link
Contributor

Branch probabilities from PGO profile data were not preserved during instruction selection at -O0 because BranchProbabilityInfo was only requested when OptLevel != None.

@llvmbot
Copy link
Member

llvmbot commented Oct 2, 2025

@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-llvm-selectiondag
@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-backend-loongarch

Author: Grigory Pastukhov (grigorypas)

Changes

Branch probabilities from PGO profile data were not preserved during instruction selection at -O0 because BranchProbabilityInfo was only requested when OptLevel != None.


Full diff: https://github.com/llvm/llvm-project/pull/161620.diff

8 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (+7-8)
  • (modified) llvm/test/CodeGen/AArch64/O0-pipeline.ll (+6)
  • (modified) llvm/test/CodeGen/AMDGPU/llc-pipeline.ll (+5)
  • (modified) llvm/test/CodeGen/LoongArch/O0-pipeline.ll (+6)
  • (modified) llvm/test/CodeGen/PowerPC/O0-pipeline.ll (+6)
  • (modified) llvm/test/CodeGen/RISCV/O0-pipeline.ll (+6)
  • (modified) llvm/test/CodeGen/X86/O0-pipeline.ll (+6)
  • (added) llvm/test/CodeGen/X86/pgo-profile-o0.ll (+49)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index e61558c59bf0d..c7728ef1d28ec 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -398,15 +398,14 @@ void SelectionDAGISelLegacy::getAnalysisUsage(AnalysisUsage &AU) const {
   AU.addRequired<TargetLibraryInfoWrapperPass>();
   AU.addRequired<TargetTransformInfoWrapperPass>();
   AU.addRequired<AssumptionCacheTracker>();
-  if (UseMBPI && OptLevel != CodeGenOptLevel::None)
-      AU.addRequired<BranchProbabilityInfoWrapperPass>();
+  if (UseMBPI)
+    AU.addRequired<BranchProbabilityInfoWrapperPass>();
   AU.addRequired<ProfileSummaryInfoWrapperPass>();
   // AssignmentTrackingAnalysis only runs if assignment tracking is enabled for
   // the module.
   AU.addRequired<AssignmentTrackingAnalysis>();
   AU.addPreserved<AssignmentTrackingAnalysis>();
-  if (OptLevel != CodeGenOptLevel::None)
-      LazyBlockFrequencyInfoPass::getLazyBFIAnalysisUsage(AU);
+  LazyBlockFrequencyInfoPass::getLazyBFIAnalysisUsage(AU);
   MachineFunctionPass::getAnalysisUsage(AU);
 }
 
@@ -469,7 +468,7 @@ void SelectionDAGISel::initializeAnalysisResults(
   auto *PSI = MAMP.getCachedResult<ProfileSummaryAnalysis>(*Fn.getParent());
   BlockFrequencyInfo *BFI = nullptr;
   FAM.getResult<BlockFrequencyAnalysis>(Fn);
-  if (PSI && PSI->hasProfileSummary() && OptLevel != CodeGenOptLevel::None)
+  if (PSI && PSI->hasProfileSummary())
     BFI = &FAM.getResult<BlockFrequencyAnalysis>(Fn);
 
   FunctionVarLocs const *FnVarLocs = nullptr;
@@ -487,7 +486,7 @@ void SelectionDAGISel::initializeAnalysisResults(
   // into account).  That's unfortunate but OK because it just means we won't
   // ask for passes that have been required anyway.
 
-  if (UseMBPI && OptLevel != CodeGenOptLevel::None)
+  if (UseMBPI && (Fn.hasProfileData() || OptLevel != CodeGenOptLevel::None))
     FuncInfo->BPI = &FAM.getResult<BranchProbabilityAnalysis>(Fn);
   else
     FuncInfo->BPI = nullptr;
@@ -523,7 +522,7 @@ void SelectionDAGISel::initializeAnalysisResults(MachineFunctionPass &MFP) {
   AC = &MFP.getAnalysis<AssumptionCacheTracker>().getAssumptionCache(Fn);
   auto *PSI = &MFP.getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI();
   BlockFrequencyInfo *BFI = nullptr;
-  if (PSI && PSI->hasProfileSummary() && OptLevel != CodeGenOptLevel::None)
+  if (PSI && PSI->hasProfileSummary())
     BFI = &MFP.getAnalysis<LazyBlockFrequencyInfoPass>().getBFI();
 
   FunctionVarLocs const *FnVarLocs = nullptr;
@@ -544,7 +543,7 @@ void SelectionDAGISel::initializeAnalysisResults(MachineFunctionPass &MFP) {
   // into account).  That's unfortunate but OK because it just means we won't
   // ask for passes that have been required anyway.
 
-  if (UseMBPI && OptLevel != CodeGenOptLevel::None)
+  if (UseMBPI && (Fn.hasProfileData() || OptLevel != CodeGenOptLevel::None))
     FuncInfo->BPI =
         &MFP.getAnalysis<BranchProbabilityInfoWrapperPass>().getBPI();
   else
diff --git a/llvm/test/CodeGen/AArch64/O0-pipeline.ll b/llvm/test/CodeGen/AArch64/O0-pipeline.ll
index abc67eec32391..6029d3d4927ca 100644
--- a/llvm/test/CodeGen/AArch64/O0-pipeline.ll
+++ b/llvm/test/CodeGen/AArch64/O0-pipeline.ll
@@ -50,7 +50,13 @@
 ; CHECK-NEXT:       Analysis for ComputingKnownBits
 ; CHECK-NEXT:       InstructionSelect
 ; CHECK-NEXT:       ResetMachineFunction
+; CHECK-NEXT:       Dominator Tree Construction
+; CHECK-NEXT:       Natural Loop Information
+; CHECK-NEXT:       Post-Dominator Tree Construction
+; CHECK-NEXT:       Branch Probability Analysis
 ; CHECK-NEXT:       Assignment Tracking Analysis
+; CHECK-NEXT:       Lazy Branch Probability Analysis
+; CHECK-NEXT:       Lazy Block Frequency Analysis
 ; CHECK-NEXT:       AArch64 Instruction Selection
 ; CHECK-NEXT:       Finalize ISel and expand pseudo-instructions
 ; CHECK-NEXT:       Local Stack Slot Allocation
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
index 65d0102a9d0dc..471c90ec73012 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -94,7 +94,12 @@
 ; GCN-O0-NEXT:        Dominator Tree Construction
 ; GCN-O0-NEXT:        Cycle Info Analysis
 ; GCN-O0-NEXT:        Uniformity Analysis
+; GCN-O0-NEXT:        Natural Loop Information
+; GCN-O0-NEXT:        Post-Dominator Tree Construction
+; GCN-O0-NEXT:        Branch Probability Analysis
 ; GCN-O0-NEXT:        Assignment Tracking Analysis
+; GCN-O0-NEXT:        Lazy Branch Probability Analysis
+; GCN-O0-NEXT:        Lazy Block Frequency Analysis
 ; GCN-O0-NEXT:        AMDGPU DAG->DAG Pattern Instruction Selection
 ; GCN-O0-NEXT:        MachineDominator Tree Construction
 ; GCN-O0-NEXT:        SI Fix SGPR copies
diff --git a/llvm/test/CodeGen/LoongArch/O0-pipeline.ll b/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
index 9006b5c8d6fe1..7f368e59133a0 100644
--- a/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
+++ b/llvm/test/CodeGen/LoongArch/O0-pipeline.ll
@@ -34,7 +34,13 @@
 ; CHECK-NEXT:       Safe Stack instrumentation pass
 ; CHECK-NEXT:       Insert stack protectors
 ; CHECK-NEXT:       Module Verifier
+; CHECK-NEXT:       Dominator Tree Construction
+; CHECK-NEXT:       Natural Loop Information
+; CHECK-NEXT:       Post-Dominator Tree Construction
+; CHECK-NEXT:       Branch Probability Analysis
 ; CHECK-NEXT:       Assignment Tracking Analysis
+; CHECK-NEXT:       Lazy Branch Probability Analysis
+; CHECK-NEXT:       Lazy Block Frequency Analysis
 ; CHECK-NEXT:       LoongArch DAG->DAG Pattern Instruction Selection
 ; CHECK-NEXT:       Finalize ISel and expand pseudo-instructions
 ; CHECK-NEXT:       Local Stack Slot Allocation
diff --git a/llvm/test/CodeGen/PowerPC/O0-pipeline.ll b/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
index 38b1074e55d22..a4a6d831d64fc 100644
--- a/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
+++ b/llvm/test/CodeGen/PowerPC/O0-pipeline.ll
@@ -33,7 +33,13 @@
 ; CHECK-NEXT:       Safe Stack instrumentation pass
 ; CHECK-NEXT:       Insert stack protectors
 ; CHECK-NEXT:       Module Verifier
+; CHECK-NEXT:       Dominator Tree Construction
+; CHECK-NEXT:       Natural Loop Information
+; CHECK-NEXT:       Post-Dominator Tree Construction
+; CHECK-NEXT:       Branch Probability Analysis
 ; CHECK-NEXT:       Assignment Tracking Analysis
+; CHECK-NEXT:       Lazy Branch Probability Analysis
+; CHECK-NEXT:       Lazy Block Frequency Analysis
 ; CHECK-NEXT:       PowerPC DAG->DAG Pattern Instruction Selection
 ; CHECK-NEXT:       PowerPC VSX Copy Legalization
 ; CHECK-NEXT:       Finalize ISel and expand pseudo-instructions
diff --git a/llvm/test/CodeGen/RISCV/O0-pipeline.ll b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
index 8714b286374a5..83379754a7507 100644
--- a/llvm/test/CodeGen/RISCV/O0-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O0-pipeline.ll
@@ -35,7 +35,13 @@
 ; CHECK-NEXT:       Safe Stack instrumentation pass
 ; CHECK-NEXT:       Insert stack protectors
 ; CHECK-NEXT:       Module Verifier
+; CHECK-NEXT:       Dominator Tree Construction
+; CHECK-NEXT:       Natural Loop Information
+; CHECK-NEXT:       Post-Dominator Tree Construction
+; CHECK-NEXT:       Branch Probability Analysis
 ; CHECK-NEXT:       Assignment Tracking Analysis
+; CHECK-NEXT:       Lazy Branch Probability Analysis
+; CHECK-NEXT:       Lazy Block Frequency Analysis
 ; CHECK-NEXT:       RISC-V DAG->DAG Pattern Instruction Selection
 ; CHECK-NEXT:       Finalize ISel and expand pseudo-instructions
 ; CHECK-NEXT:       Local Stack Slot Allocation
diff --git a/llvm/test/CodeGen/X86/O0-pipeline.ll b/llvm/test/CodeGen/X86/O0-pipeline.ll
index 0fbfb42d2a4dd..6aea51d8e87dc 100644
--- a/llvm/test/CodeGen/X86/O0-pipeline.ll
+++ b/llvm/test/CodeGen/X86/O0-pipeline.ll
@@ -35,7 +35,13 @@
 ; CHECK-NEXT:       Safe Stack instrumentation pass
 ; CHECK-NEXT:       Insert stack protectors
 ; CHECK-NEXT:       Module Verifier
+; CHECK-NEXT:       Dominator Tree Construction
+; CHECK-NEXT:       Natural Loop Information
+; CHECK-NEXT:       Post-Dominator Tree Construction
+; CHECK-NEXT:       Branch Probability Analysis
 ; CHECK-NEXT:       Assignment Tracking Analysis
+; CHECK-NEXT:       Lazy Branch Probability Analysis
+; CHECK-NEXT:       Lazy Block Frequency Analysis
 ; CHECK-NEXT:       X86 DAG->DAG Instruction Selection
 ; CHECK-NEXT:       X86 PIC Global Base Reg Initialization
 ; CHECK-NEXT:       Argument Stack Rebase
diff --git a/llvm/test/CodeGen/X86/pgo-profile-o0.ll b/llvm/test/CodeGen/X86/pgo-profile-o0.ll
new file mode 100644
index 0000000000000..009721fcbba78
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pgo-profile-o0.ll
@@ -0,0 +1,49 @@
+; RUN: llc -mtriple=x86_64-- -O0 -debug-pass=Structure %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=PASSES
+; RUN: llc -mtriple=x86_64-- -O0 -debug-only=branch-prob %s -o /dev/null 2>&1 | FileCheck %s --check-prefix=BRANCH_PROB
+; RUN: llc -mtriple=x86_64-- -O0 -stop-after=finalize-isel %s -o - | FileCheck %s --check-prefix=MIR
+
+; REQUIRES: asserts
+
+; This test verifies that PGO profile information (branch weights) is preserved
+; during instruction selection at -O0.
+
+; Test function with explicit branch weights from PGO.
+define i32 @test_pgo_preservation(i32 %x) !prof !15 {
+entry:
+  %cmp = icmp sgt i32 %x, 10
+  ; This branch has bias: 97 taken vs 3 not taken
+  br i1 %cmp, label %if.then, label %if.else, !prof !16
+
+if.then:
+  ; Hot path - should have high frequency
+  %add = add nsw i32 %x, 100
+  br label %if.end
+
+if.else:
+  ; Cold path - should have low frequency
+  %sub = sub nsw i32 %x, 50
+  br label %if.end
+
+if.end:
+  %result = phi i32 [ %add, %if.then ], [ %sub, %if.else ]
+  ret i32 %result
+}
+
+; Profile metadata with branch weights 97:3.
+!15 = !{!"function_entry_count", i64 100}
+!16 = !{!"branch_weights", i32 97, i32 3}
+
+; Verify that Branch Probability Analysis runs at O0.
+; PASSES: Branch Probability Analysis
+
+; Verify that the branch probabilities reflect the exact profile data.
+; BRANCH_PROB: ---- Branch Probability Info : test_pgo_preservation ----
+; BRANCH_PROB: set edge entry -> 0 successor probability to {{.*}} = 97.00%
+; BRANCH_PROB: set edge entry -> 1 successor probability to {{.*}} = 3.00%
+
+; Verify that machine IR preserves the branch probabilities from profile data
+; MIR: bb.0.entry:
+; MIR-NEXT: successors: %bb.{{[0-9]+}}({{0x03d70a3d|0x7c28f5c3}}), %bb.{{[0-9]+}}({{0x7c28f5c3|0x03d70a3d}})
+; The two successor probability values should be:
+; - 0x7c28f5c3: approximately 97% (high probability successor)
+; - 0x03d70a3d: approximately 3% (low probability successor)

@aemerson
Copy link
Contributor

aemerson commented Oct 2, 2025

...but why?

@grigorypas
Copy link
Contributor Author

...but why?

@aemerson This is useful in some situations such as for analyzing profile quality and calculating performance proxy. It is already enabled in LLVM IR (see #113985 and also motivation in the PR message). Also, this functionality #124334 will not work correctly in O0 without this PR.

@aemerson
Copy link
Contributor

aemerson commented Oct 2, 2025

...but why?

@aemerson This is useful in some situations such as for analyzing profile quality and calculating performance proxy. It is already enabled in LLVM IR (see #113985 and also motivation in the PR message). Also, this functionality #124334 will not work correctly in O0 without this PR.

That doesn't sound like a good justification for adding more passes to a configuration that's sensitive to compile time changes. Now I get that some passes may be effectively free if the information isn't requested, but for AArch64 this adds:

; CHECK-NEXT:       Dominator Tree Construction
; CHECK-NEXT:       Natural Loop Information
; CHECK-NEXT:       Post-Dominator Tree Construction
; CHECK-NEXT:       Branch Probability Analysis

... to the pipeline. Do these analyses not affect compile time?

@wlei-llvm wlei-llvm requested review from MatzeB, OCHyams, WenleiHe, apolloww, arsenm, sdesmalen-arm and wlei-llvm and removed request for arsenm October 2, 2025 17:00
@nikic nikic self-requested a review October 2, 2025 18:38
AU.addRequired<AssumptionCacheTracker>();
if (UseMBPI && OptLevel != CodeGenOptLevel::None)
AU.addRequired<BranchProbabilityInfoWrapperPass>();
if (UseMBPI)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can add extra check for PGO (Selector->TM.getPGOOption()) and only do the analysis when PGO is used.

@nikic
Copy link
Contributor

nikic commented Oct 2, 2025

... to the pipeline. Do these analyses not affect compile time?

They do: https://llvm-compile-time-tracker.com/compare.php?from=8a0d145d90df3eefd5e54a2b04d4f4f2515e99cc&to=3c131e564623813f71bb76011be3e812c5a65a7d&stat=instructions:u

@grigorypas
Copy link
Contributor Author

I added the check recommended by @apolloww.

I also had to add command line parameters to llc to populate PGOOptions (similar to how it is done for opt). Otherwise, it was hard to design test.

}

if (PGOOpt)
TM.setPGOOption(PGOOpt);
Copy link
Contributor

@apolloww apolloww Oct 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the changes here are just to cover the case of directly invoke llc on an IR module with profile information (like the test does), I think we can use a simple flag instead of creating a PGOOption that acts like a boolean switch. In most scenarios, we'd feed cpp source and profile to clang driver, which would set PGOOption correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on a simple flag, would it make sense to still use PGOOption, but only with a single value? something like --pgo-kind=any-pgo-pipeline

we can pass any of the PGO option.

if (PGOKindFlag == AnyPGO)
 TM.setPGOOption(PGOOptions("", "", "", "",PGOOptions::SampleUse /*Any PGO action*/,
                            PGOOptions::NoAction);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it might be useful in the future. There is no way to inject PGOOption into llc. However, there might be logic relying on those parameters in CodeGen in the future that would be easier to test. Maybe I can keep just pgo-kind cmd parameter for now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the others. Adding all of this just for a possible future use case seems unreasonable, we can always add more if necessary.

Also +1 to using --pgo-kind with some arbitrary value for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept only pgo-kind option with the only value pgo-sample-use-pipeline.


static bool shouldRegisterPGOPasses(const TargetMachine &TM,
CodeGenOptLevel OptLevel) {
bool RegisterPGOPasses = OptLevel != CodeGenOptLevel::None;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not need PGO passes if PGO is not on, even if OptLevel is not None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I make this change it breaks a lot of test cases (1024 tests). Some of the test cases use PGO information embedded in IR without setting PGOOptions flags.

Comment on lines 237 to 238
RegisterPGOPasses = Options.Action != PGOOptions::PGOAction::NoAction ||
Options.CSAction != PGOOptions::CSPGOAction::NoCSAction;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need BPI if PGO action was instrumentation or optimization?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Fixed.


switch (PGOKindFlag) {
case SampleUse:
// Use default values for other PGOOptions parameters
Copy link
Contributor

@wlei-llvm wlei-llvm Oct 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe extend the comment to explain this is for the -O0 test.

(in case people would be confused why there is no profile file path)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@wlei-llvm wlei-llvm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Copy link
Member

@WenleiHe WenleiHe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm except a nit.

@grigorypas grigorypas merged commit 2c3f0e5 into llvm:main Oct 9, 2025
9 checks passed
svkeerthy pushed a commit that referenced this pull request Oct 9, 2025
… selection at -O0 (#161620)

Branch probabilities from PGO profile data were not preserved during
instruction selection at -O0 because BranchProbabilityInfo was only
requested when OptLevel != None.
@grigorypas grigorypas deleted the branch_weight_in_mir_o0 branch October 10, 2025 17:07
DharuniRAcharya pushed a commit to DharuniRAcharya/llvm-project that referenced this pull request Oct 13, 2025
… selection at -O0 (llvm#161620)

Branch probabilities from PGO profile data were not preserved during
instruction selection at -O0 because BranchProbabilityInfo was only
requested when OptLevel != None.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants