[DFAJumpThreading] Constraint the number of cloned instructions #161632

XChy · 2025-10-02T06:18:05Z

Duplicating blocks of threaded paths may cause a significant regression in IR size and slow down compile-time in later optimizations. This patch adds a coarse constraint on the number of duplicated instructions.

New compile-time number: https://llvm-compile-time-tracker.com/compare.php?from=6935a3362628e823451baf86d15694d57f9664ed&to=455bd99ac151dd0445737f6e02210c2a4ec2dac5&stat=instructions%3Au

llvmbot · 2025-10-02T06:18:42Z

@llvm/pr-subscribers-llvm-transforms

Author: Hongyu Chen (XChy)

Changes

Duplicating blocks of threaded paths may cause a significant regression in IR size and slow down compile-time in later optimizations. This patch adds a coarse constraint on the number of duplicated instructions.

New compile-time number: https://llvm-compile-time-tracker.com/compare.php?from=6935a3362628e823451baf86d15694d57f9664ed&to=455bd99ac151dd0445737f6e02210c2a4ec2dac5&stat=instructions%3Au

Full diff: https://github.com/llvm/llvm-project/pull/161632.diff

1 Files Affected:

(modified) llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp (+25-1)

diff --git a/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp b/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp
index e9a3e983bc1e2..3bd7f8e2fcf93 100644
--- a/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp
+++ b/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp
@@ -120,6 +120,12 @@ static cl::opt<unsigned>
                   cl::desc("Maximum cost accepted for the transformation"),
                   cl::Hidden, cl::init(50));
 
+static cl::opt<double> MaxClonedRate(
+    "dfa-max-cloned-rate",
+    cl::desc(
+        "Maximum cloned instructions rate accepted for the transformation"),
+    cl::Hidden, cl::init(7.5));
+
 namespace {
 
 class SelectInstToUnfold {
@@ -828,6 +834,7 @@ struct TransformDFA {
   /// also returns false if it is illegal to clone some required block.
   bool isLegalAndProfitableToTransform() {
     CodeMetrics Metrics;
+    uint64_t NumClonedInst = 0;
     SwitchInst *Switch = SwitchPaths->getSwitchInst();
 
     // Don't thread switch without multiple successors.
@@ -837,7 +844,6 @@ struct TransformDFA {
     // Note that DuplicateBlockMap is not being used as intended here. It is
     // just being used to ensure (BB, State) pairs are only counted once.
     DuplicateBlockMap DuplicateMap;
-
     for (ThreadingPath &TPath : SwitchPaths->getThreadingPaths()) {
       PathType PathBBs = TPath.getPath();
       APInt NextState = TPath.getExitValue();
@@ -848,6 +854,7 @@ struct TransformDFA {
       BasicBlock *VisitedBB = getClonedBB(BB, NextState, DuplicateMap);
       if (!VisitedBB) {
         Metrics.analyzeBasicBlock(BB, *TTI, EphValues);
+        NumClonedInst += range_size(*BB);
         DuplicateMap[BB].push_back({BB, NextState});
       }
 
@@ -865,6 +872,7 @@ struct TransformDFA {
         if (VisitedBB)
           continue;
         Metrics.analyzeBasicBlock(BB, *TTI, EphValues);
+        NumClonedInst += range_size(*BB);
         DuplicateMap[BB].push_back({BB, NextState});
       }
 
@@ -901,6 +909,22 @@ struct TransformDFA {
       }
     }
 
+    // Too much cloned instructions slow down later optimizations, especially
+    // SLPVectorizer.
+    // TODO: Thread the switch partially before reaching the threshold.
+    uint64_t NumOrigInst = 0;
+    for (auto &[BB, _] : DuplicateMap)
+      NumOrigInst += range_size(*BB);
+    if (double(NumClonedInst) / double(NumOrigInst) > MaxClonedRate) {
+      LLVM_DEBUG(dbgs() << "DFA Jump Threading: Not jump threading, too much "
+                           "instructions wll be cloned\n");
+      ORE->emit([&]() {
+        return OptimizationRemarkMissed(DEBUG_TYPE, "NotProfitable", Switch)
+               << "Too much instructions will be cloned.";
+      });
+      return false;
+    }
+
     InstructionCost DuplicationCost = 0;
 
     unsigned JumpTableSize = 0;

UsmanNadeem · 2025-10-03T16:11:42Z

How many transformations are restricted by this (would it be possible to check stats on the test suite)? I fear that restricting possibly beneficial transformations to avoid second-order effects might not be the best way forward.

XChy · 2025-10-03T16:53:01Z

How many transformations are restricted by this (would it be possible to check stats on the test suite)?

I have checked the statistics on the test suite. Only reduce one transformation in the outlier libclamav_nsis_LZMADecode.c.

I fear that restricting possibly beneficial transformations to avoid second-order effects might not be the best way forward.

Passes like inlining and loop unrolling would prevent massive expansion; otherwise, the compile time will blow up in later optimizations, although there are actually benefits after expansion. That's an acceptable trade-off for me.

The outlier libclamav_nsis_LZMADecode.c grows from about 1K lines to over 4K lines after DFAJumpThreading, that's unacceptable to me.

llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp

UsmanNadeem

LGTM.

[DFAJumpThreading] Constraint the number of cloned instructions

1010fc0

XChy requested review from nikic and UsmanNadeem October 2, 2025 06:18

llvmbot added the llvm:transforms label Oct 2, 2025

XChy mentioned this pull request Oct 3, 2025

[DFAJumpThreading] Enable DFAJumpThread by default. #157646

Open

use keys

3bb61a8

efriedma-quic reviewed Oct 3, 2025

View reviewed changes

llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp Outdated Show resolved Hide resolved

use sizeWithoutDebug

2bad642

UsmanNadeem approved these changes Oct 6, 2025

View reviewed changes

XChy enabled auto-merge (squash) October 7, 2025 04:05

XChy disabled auto-merge October 7, 2025 04:30

XChy merged commit cf85ec5 into llvm:main Oct 7, 2025
8 of 9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[DFAJumpThreading] Constraint the number of cloned instructions #161632

[DFAJumpThreading] Constraint the number of cloned instructions #161632

XChy commented Oct 2, 2025

Uh oh!

llvmbot commented Oct 2, 2025

Uh oh!

UsmanNadeem commented Oct 3, 2025

Uh oh!

XChy commented Oct 3, 2025

Uh oh!

Uh oh!

UsmanNadeem left a comment

Uh oh!

Uh oh!

Uh oh!

[DFAJumpThreading] Constraint the number of cloned instructions #161632

[DFAJumpThreading] Constraint the number of cloned instructions #161632

Conversation

XChy commented Oct 2, 2025

Uh oh!

llvmbot commented Oct 2, 2025

Uh oh!

UsmanNadeem commented Oct 3, 2025

Uh oh!

XChy commented Oct 3, 2025

Uh oh!

Uh oh!

UsmanNadeem left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!