[NewPM] Don't preserve BlockFrequencyInfo in FunctionToLoopPassAdaptor #157888

lukel97 · 2025-09-10T16:03:15Z

Function analyses in LoopStandardAnalysisResults are marked as preserved by the loop pass adaptor, because LoopAnalysisManagerFunctionProxy manually invalidates most of them.

However the proxy doesn't invalidate BFI, since it is only preserved on a "lossy" basis: see https://reviews.llvm.org/D86156 and https://reviews.llvm.org/D110438.

So any changes to the CFG will result in BFI giving incorrect results, which is fine for loop passes which deal with the lossiness.

But the loop pass adapator still marks it as preserved, which causes the lossy result to leak out into function passes.

This causes incorrect results when viewed from e.g. LoopVectorizer, where an innermost loop header may be reported to have a smaller frequency than its successors.

This fixes this by dropping the call to preserve, and adds a test with the -O1 pipeline which shows the effects whenever the CFG is changed and UseBlockFrequencyInfo is set.

I've also dropped it for BranchProbabilityAnalysis too, but I couldn't test for it since UseBranchProbabilityInfo always seems to be false? This may be dead code.

Function analyses in LoopStandardAnalysisResults are marked as preserved by the loop pass adaptor, because LoopAnalysisManagerFunctionProxy manually invalidates most of them. However the proxy doesn't invalidate BFI, since it is only preserved on a "lossy" basis: see https://reviews.llvm.org/D86156 and https://reviews.llvm.org/D110438. So any changes to the CFG will result in BFI giving incorrect results, which is fine for loop passes which deal with the lossiness. But the loop pass adapator still marks it as preserved, which causes the lossy result to leak out into function passes. This causes incorrect results when viewed from e.g. LoopVectorizer, where an innermost loop header may be reported to have a smaller frequency than its successors. This fixes this by dropping the call to preserve, and adds a test with the -O1 pipeline which shows the effects whenever the CFG is changed and UseBlockFrequencyInfo is set. I've also dropped it for BranchProbabilityAnalysis too, but I couldn't test for it since UseBranchProbabilityInfo always seems to be false? This may be dead code.

llvmbot · 2025-09-10T16:03:49Z

@llvm/pr-subscribers-llvm-transforms

Author: Luke Lau (lukel97)

Changes

Function analyses in LoopStandardAnalysisResults are marked as preserved by the loop pass adaptor, because LoopAnalysisManagerFunctionProxy manually invalidates most of them.

However the proxy doesn't invalidate BFI, since it is only preserved on a "lossy" basis: see https://reviews.llvm.org/D86156 and https://reviews.llvm.org/D110438.

So any changes to the CFG will result in BFI giving incorrect results, which is fine for loop passes which deal with the lossiness.

But the loop pass adapator still marks it as preserved, which causes the lossy result to leak out into function passes.

This causes incorrect results when viewed from e.g. LoopVectorizer, where an innermost loop header may be reported to have a smaller frequency than its successors.

This fixes this by dropping the call to preserve, and adds a test with the -O1 pipeline which shows the effects whenever the CFG is changed and UseBlockFrequencyInfo is set.

I've also dropped it for BranchProbabilityAnalysis too, but I couldn't test for it since UseBranchProbabilityInfo always seems to be false? This may be dead code.

Full diff: https://github.com/llvm/llvm-project/pull/157888.diff

2 Files Affected:

(modified) llvm/lib/Transforms/Scalar/LoopPassManager.cpp (-4)
(modified) llvm/test/Other/loop-pm-invalidation.ll (+30)

diff --git a/llvm/lib/Transforms/Scalar/LoopPassManager.cpp b/llvm/lib/Transforms/Scalar/LoopPassManager.cpp
index e390fdfb81e7b..4b26ce12a28db 100644
--- a/llvm/lib/Transforms/Scalar/LoopPassManager.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopPassManager.cpp
@@ -349,10 +349,6 @@ PreservedAnalyses FunctionToLoopPassAdaptor::run(Function &F,
   PA.preserve<DominatorTreeAnalysis>();
   PA.preserve<LoopAnalysis>();
   PA.preserve<ScalarEvolutionAnalysis>();
-  if (UseBlockFrequencyInfo && F.hasProfileData())
-    PA.preserve<BlockFrequencyAnalysis>();
-  if (UseBranchProbabilityInfo && F.hasProfileData())
-    PA.preserve<BranchProbabilityAnalysis>();
   if (UseMemorySSA)
     PA.preserve<MemorySSAAnalysis>();
   return PA;
diff --git a/llvm/test/Other/loop-pm-invalidation.ll b/llvm/test/Other/loop-pm-invalidation.ll
index 25552f7f139fd..f680aa970d0c4 100644
--- a/llvm/test/Other/loop-pm-invalidation.ll
+++ b/llvm/test/Other/loop-pm-invalidation.ll
@@ -16,6 +16,11 @@
 ; RUN: opt -disable-output -disable-verify -verify-analysis-invalidation=0  -debug-pass-manager %s -aa-pipeline= 2>&1 \
 ; RUN:     -passes='loop(no-op-loop,loop-deletion),invalidate<scalar-evolution>,loop(no-op-loop)' \
 ; RUN:     | FileCheck %s --check-prefix=CHECK-SCEV-INV-AFTER-DELETE
+;
+; Test that BFI is invalidated after the loop adapter if any of the loop passes
+; invalidated it.
+; RUN: opt -disable-output -disable-verify -verify-analysis-invalidation=0  -debug-pass-manager %s -aa-pipeline= 2>&1 \
+; RUN:     -O1 | FileCheck %s --check-prefix=CHECK-BFI-INV
 
 define void @no_loops() {
 ; CHECK-LOOP-INV: Running pass: LoopSimplifyPass
@@ -242,3 +247,28 @@ l0.header:
 exit:
   ret void
 }
+
+; CHECK-BFI-INV-LABEL: Running analysis: OuterAnalysisManagerProxy<FunctionAnalysisManager, Loop, LoopStandardAnalysisResults &> on loop %l0.header in function simplifiable_loop
+; CHECK-BFI-INV-NEXT: Running pass: LoopInstSimplifyPass on loop %l0.header in function simplifiable_loop
+; CHECK-BFI-INV-NEXT: Running pass: LoopSimplifyCFGPass on loop %l0.header in function simplifiable_loop
+; CHECK-BFI-INV-NEXT: Running pass: LICMPass on loop %l0.header in function simplifiable_loop
+; CHECK-BFI-INV-NEXT: Running pass: LoopRotatePass on loop %l0.header in function simplifiable_loop
+; CHECK-BFI-INV-NEXT: Running pass: LICMPass on loop %l0.header in function simplifiable_loop
+; CHECK-BFI-INV-NEXT: Running pass: SimpleLoopUnswitchPass on loop %l0.header in function simplifiable_loop
+; CHECK-BFI-INV-NEXT: Invalidating analysis: PostDominatorTreeAnalysis on simplifiable_loop
+; CHECK-BFI-INV-NEXT: Invalidating analysis: BranchProbabilityAnalysis on simplifiable_loop
+; CHECK-BFI-INV-NEXT: Invalidating analysis: BlockFrequencyAnalysis on simplifiable_loop
+; CHECK-BFI-INV-NEXT: Running pass: SimplifyCFGPass on simplifiable_loop (5 instructions)
+
+define void @simplifiable_loop(i1 %c) !prof !0 {
+entry:
+  br label %l0.header
+
+l0.header:
+  br label %l0.latch
+
+l0.latch:
+  br i1 %c, label %l0.header, label %l0.latch
+}
+
+!0 = !{!"function_entry_count", i64 1}

Apparently the pass ID is based off the demangled type name, which I guess differs on MSVC?

alinas · 2025-09-11T18:14:07Z

Could you add a testcase showing the incorrect results when followed by the LoopVectorizer?

aeubanks · 2025-09-11T18:17:42Z

or at least a test that runs print<block-freq> that changes with this commit

lukel97 · 2025-09-15T09:31:54Z

Could you add a testcase showing the incorrect results when followed by the LoopVectorizer?

This mostly causes incorrect results in an upcoming patch where I'm trying to use BFI more in the LoopVectorizer cost model. The smallest reproducer I have so far is with opt -p 'lto<O3>':

target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "riscv64-unknown-linux-gnu"

define void @_Z22SetupPrecalculatedDatav(ptr readonly captures(none) %KingSafetyMask, ptr writeonly captures(none) initializes((0, 8)) %Mask, i1 %0) local_unnamed_addr #0 !prof !0 {
  br label %2

2:                                                ; preds = %5, %1
  %.pre = load i64, ptr %KingSafetyMask, align 8
  br label %3

3:                                                ; preds = %3, %2
  %4 = phi i1 [ false, %2 ], [ true, %3 ]
  store i64 %.pre, ptr %Mask, align 8
  br i1 %4, label %5, label %3

5:                                                ; preds = %3
  br i1 %0, label %6, label %2

6:                                                ; preds = %5
  ret void
}

attributes #0 = { "target-features"="+v" }

!0 = !{!"function_entry_count", i64 1}

This doesn't crash today but if you add this assert that the BFI is accurate for the loops that the loop vectorizer processes, it fails:

    // Stick this somewhere in LoopVectorizePass::runImpl
    if (!L->isInvalid() && L->isInnermost() && !L->hasNoExitBlocks())
      for (BasicBlock *BB : L->blocks())
        assert(BFI->getBlockFreq(L->getHeader()) >= BFI->getBlockFreq(BB))

I'll try and see if I can get a test case with print<block-freq>. So far I've had no luck since I can't seem to trigger the assertion when manually specifying the pipeline with the output of print-pipeline-passes, i.e. it seems to need -p 'lto<O3>'

lukel97 · 2025-09-15T17:39:18Z

@alinas @aeubanks I gave this a try, but unfortunately I don't think it's (easily) possible to add a test which shows any difference in LoopVectorizer behaviour today. LoopVectorizer currently only uses BFI to check if it's optimizing for size, but it's very difficult to recreate the conditions to trigger this in combination with the loop simplifications needed to trigger the invalid BFI state.

Also to recreate this UseBlockFrequencyInfo needs to be set to true in createFunctionToLoopPassAdaptor, but we only do this with the LTO pipelines in e.g. buildLTODefaultPipeline, so AFAICT we can't do 'print'.

I've instead posted the motivating LoopVectorizer patch which includes a test that crashes without this fix: https://github.com/llvm/llvm-project/pull/158690/files#diff-a388af039904a990257f3235bf23eb5271c54ea5a31f916b760b36143d937d81

Specifically, it crashes because the BFI results are inaccurate: The loop header will have a lower frequency than one of the blocks it dominates, which causes a divide by zero when calculating the probability of the block. I hope this illustrates the problem a bit better.

aeubanks · 2025-09-17T21:52:48Z

paging back in new pass manager into my brain... I'm remembering the compile-time reason loop passes must preserve function analyses (search "quadratic" in https://llvm.org/docs/NewPassManager.html#using-analyses). but this is doing something different, it's not preserving after all the loop passes run in the function -> loop adaptor. I think that should be fine compile-time-wise, it's equivalent to a function pass using and not preserving BFI

I think this makes sense to me, does anyone else have any objections?

lukel97 · 2025-09-18T05:06:44Z

but this is doing something different, it's not preserving after all the loop passes run in the function -> loop adaptor. I think that should be fine compile-time-wise, it's equivalent to a function pass using and not preserving BFI

Thanks, that matches my understanding too.

FWIW it looks like after LoopPredication was changed to not use BFI in https://reviews.llvm.org/D111668, SimpleLoopUnswitch is the last user of BFI in LoopStandardAnalysisResults. I wonder if it's possible to do something similar to LoopPredication to remove the dependency?

BranchProbabilityInfo also seems to be unused, I'll try and open a PR to remove it.

lukel97 · 2025-09-18T05:31:11Z

FWIW it looks like after LoopPredication was changed to not use BFI in https://reviews.llvm.org/D111668, SimpleLoopUnswitch is the last user of BFI in LoopStandardAnalysisResults.

Whoops sorry it removed the use of BPI not BFI. It looks like at the time that BFI was made added to LoopStandardAnalysisResults in https://reviews.llvm.org/D86156#change-bAzAQih0v9Mp LICM was the main user? But LICM removed the use in 1a25d0b 04f3c20

lukel97 added 2 commits September 10, 2025 23:33

Precommit tests

d62b028

lukel97 requested review from alinas, aeubanks, annamthomas, paperchalice and vikramRH September 10, 2025 16:03

llvmbot added the llvm:transforms label Sep 10, 2025

Try fix windows test failure

0c21c2e

Apparently the pass ID is based off the demangled type name, which I guess differs on MSVC?

lukel97 mentioned this pull request Sep 15, 2025

[VPlan] Use BlockFrequencyInfo in getPredBlockCostDivisor #158690

Open

aeubanks requested a review from mtrofin September 17, 2025 21:52

aeubanks approved these changes Sep 17, 2025

View reviewed changes

mtrofin approved these changes Sep 17, 2025

View reviewed changes

Merge branch 'main' into loop-pm-invalidate-bfi

329acd5

lukel97 enabled auto-merge (squash) September 18, 2025 05:07

lukel97 merged commit 4663d25 into llvm:main Sep 18, 2025
8 checks passed

lukel97 mentioned this pull request Sep 18, 2025

[SimpleLoopUnswitch] Don't use BlockFrequencyInfo to skip cold loops #159522

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[NewPM] Don't preserve BlockFrequencyInfo in FunctionToLoopPassAdaptor #157888

[NewPM] Don't preserve BlockFrequencyInfo in FunctionToLoopPassAdaptor #157888

Uh oh!

lukel97 commented Sep 10, 2025

Uh oh!

llvmbot commented Sep 10, 2025

Uh oh!

alinas commented Sep 11, 2025

Uh oh!

aeubanks commented Sep 11, 2025

Uh oh!

lukel97 commented Sep 15, 2025 •

edited

Loading

Uh oh!

lukel97 commented Sep 15, 2025 •

edited

Loading

Uh oh!

aeubanks commented Sep 17, 2025

Uh oh!

lukel97 commented Sep 18, 2025

Uh oh!

lukel97 commented Sep 18, 2025

Uh oh!

Uh oh!

Uh oh!

[NewPM] Don't preserve BlockFrequencyInfo in FunctionToLoopPassAdaptor #157888

[NewPM] Don't preserve BlockFrequencyInfo in FunctionToLoopPassAdaptor #157888

Uh oh!

Conversation

lukel97 commented Sep 10, 2025

Uh oh!

llvmbot commented Sep 10, 2025

Uh oh!

alinas commented Sep 11, 2025

Uh oh!

aeubanks commented Sep 11, 2025

Uh oh!

lukel97 commented Sep 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lukel97 commented Sep 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

aeubanks commented Sep 17, 2025

Uh oh!

lukel97 commented Sep 18, 2025

Uh oh!

lukel97 commented Sep 18, 2025

Uh oh!

Uh oh!

Uh oh!

lukel97 commented Sep 15, 2025 •

edited

Loading

lukel97 commented Sep 15, 2025 •

edited

Loading