@XChy XChy commented Oct 15, 2025

We previously considered the code size cost of most control flow instructions to be TCC_Basic. But for the switch instruction, that might be far from the actual code size. This patch improves the code size estimation for switch instructions.

To obey the X86 cost model specification:

> TCK_CodeSize should match the instruction count (e.g. divss = 1), NOT the actual encoding size of the instruction.

This patch doesn't consider the jump table size, as the jump table itself doesn't lie in the code segment.
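
For illustration, the estimate described above can be modelled as a small standalone C++ sketch. It does not use LLVM: `switchCodeSizeCost`, `log2Ceil`, and `kBasic` are hypothetical names, with `kBasic` standing in for `TTI::TCC_Basic`, and the jump-table size is passed in directly rather than obtained from `getEstimatedNumberOfCaseClusters`.

```cpp
#include <cassert>
#include <cstdint>

// Standalone model of the heuristic in this patch; it does not call into LLVM.
// Costs are expressed in units of one instruction (TCC_Basic in LLVM).
constexpr unsigned kBasic = 1;

// Smallest k such that 2^k >= n, for n >= 1.
unsigned log2Ceil(uint32_t n) {
  assert(n >= 1 && "a switch always has at least the default successor");
  unsigned k = 0;
  while ((uint64_t{1} << k) < n)
    ++k;
  return k;
}

// numSuccs counts the default destination as well as the cases.
// jumpTableSize == 0 means no jump table is expected to be generated.
unsigned switchCodeSizeCost(unsigned numSuccs, unsigned jumpTableSize) {
  // A switch with only a default destination is an unconditional branch.
  if (numSuccs == 1)
    return kBasic;
  // No jump table: assume a binary search, one compare and one jump per level.
  if (jumpTableSize == 0)
    return log2Ceil(numSuccs) * 2 * kBasic;
  // Jump table: indirect branch + default compare + default jump.
  return 3 * kBasic;
}
```

With six successors and no jump table (as in `sparse_switch`) this gives 3 * 2 = 6, and with twelve successors (as in `sparse_big_switch`) it gives 4 * 2 = 8, which agrees with the CODESIZE check lines in the new test.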

@llvmbot llvmbot added the backend:X86 and llvm:analysis labels Oct 15, 2025
llvmbot commented Oct 15, 2025

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-backend-x86

Author: Hongyu Chen (XChy)

Changes


Patch is 20.38 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/163569.diff

2 Files Affected:

  • (modified) llvm/lib/Target/X86/X86TargetTransformInfo.cpp (+17)
  • (added) llvm/test/Analysis/CostModel/X86/switch.ll (+411)
diff --git a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
index 3d8d0a236a3c1..22a646f10507d 100644
--- a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
+++ b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -6155,6 +6155,23 @@ X86TTIImpl::getIntImmCostIntrin(Intrinsic::ID IID, unsigned Idx,
 InstructionCost X86TTIImpl::getCFInstrCost(unsigned Opcode,
                                            TTI::TargetCostKind CostKind,
                                            const Instruction *I) const {
+  if (Opcode == Instruction::Switch && CostKind == TTI::TCK_CodeSize) {
+    unsigned JumpTableSize, NumSuccs = I->getNumSuccessors();
+    getEstimatedNumberOfCaseClusters(*cast<SwitchInst>(I), JumpTableSize,
+                                     nullptr, nullptr);
+    // A trivial unconditional branch.
+    if (NumSuccs == 1)
+      return TTI::TCC_Basic;
+
+    // Assume that lowering the switch block is implemented by binary search if
+    // no jump table is generated.
+    if (JumpTableSize == 0)
+      return llvm::Log2_32_Ceil(NumSuccs) * 2 * TTI::TCC_Basic;
+
+    // Indirect branch + default compare + default jump
+    return 3 * TTI::TCC_Basic;
+  }
+
   if (CostKind != TTI::TCK_RecipThroughput)
     return Opcode == Instruction::PHI ? TTI::TCC_Free : TTI::TCC_Basic;
   // Branches are assumed to be predicted.
diff --git a/llvm/test/Analysis/CostModel/X86/switch.ll b/llvm/test/Analysis/CostModel/X86/switch.ll
new file mode 100644
index 0000000000000..e668e365e899d
--- /dev/null
+++ b/llvm/test/Analysis/CostModel/X86/switch.ll
@@ -0,0 +1,411 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 6
+; RUN: opt < %s -passes="print<cost-model>" -cost-kind=throughput 2>&1 -disable-output -mtriple=x86_64-unknown-linux-gnu | FileCheck %s -check-prefixes=CHECK,THROUGHPUT
+; RUN: opt < %s -passes="print<cost-model>" -cost-kind=latency 2>&1 -disable-output -mtriple=x86_64-unknown-linux-gnu | FileCheck %s -check-prefixes=CHECK,LATENCY
+; RUN: opt < %s -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output -mtriple=x86_64-unknown-linux-gnu | FileCheck %s -check-prefixes=CHECK,CODESIZE
+
+target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
+
+define i32 @single_succ_switch(i32 %x) {
+; THROUGHPUT-LABEL: 'single_succ_switch'
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: switch i32 %x, label %default [
+; THROUGHPUT-NEXT:    ]
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 1
+;
+; LATENCY-LABEL: 'single_succ_switch'
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: switch i32 %x, label %default [
+; LATENCY-NEXT:    ]
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+;
+; CODESIZE-LABEL: 'single_succ_switch'
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: switch i32 %x, label %default [
+; CODESIZE-NEXT:    ]
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+;
+entry:
+  switch i32 %x, label %default [
+  ]
+default:
+  ret i32 1
+}
+
+define i32 @dense_switch(i32 %x) {
+; THROUGHPUT-LABEL: 'dense_switch'
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: switch i32 %x, label %default [
+; THROUGHPUT-NEXT:      i32 0, label %bb0
+; THROUGHPUT-NEXT:      i32 1, label %bb1
+; THROUGHPUT-NEXT:      i32 2, label %bb2
+; THROUGHPUT-NEXT:      i32 3, label %bb3
+; THROUGHPUT-NEXT:      i32 4, label %bb4
+; THROUGHPUT-NEXT:    ]
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 0
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 1
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 2
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 3
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 4
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; LATENCY-LABEL: 'dense_switch'
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: switch i32 %x, label %default [
+; LATENCY-NEXT:      i32 0, label %bb0
+; LATENCY-NEXT:      i32 1, label %bb1
+; LATENCY-NEXT:      i32 2, label %bb2
+; LATENCY-NEXT:      i32 3, label %bb3
+; LATENCY-NEXT:      i32 4, label %bb4
+; LATENCY-NEXT:    ]
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; CODESIZE-LABEL: 'dense_switch'
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: switch i32 %x, label %default [
+; CODESIZE-NEXT:      i32 0, label %bb0
+; CODESIZE-NEXT:      i32 1, label %bb1
+; CODESIZE-NEXT:      i32 2, label %bb2
+; CODESIZE-NEXT:      i32 3, label %bb3
+; CODESIZE-NEXT:      i32 4, label %bb4
+; CODESIZE-NEXT:    ]
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+entry:
+  switch i32 %x, label %default [
+    i32 0, label %bb0
+    i32 1, label %bb1
+    i32 2, label %bb2
+    i32 3, label %bb3
+    i32 4, label %bb4
+  ]
+bb0:
+  ret i32 0
+bb1:
+  ret i32 1
+bb2:
+  ret i32 2
+bb3:
+  ret i32 3
+bb4:
+  ret i32 4
+default:
+  unreachable
+}
+
+define i32 @sparse_switch(i32 %x) {
+; THROUGHPUT-LABEL: 'sparse_switch'
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: switch i32 %x, label %default [
+; THROUGHPUT-NEXT:      i32 0, label %bb0
+; THROUGHPUT-NEXT:      i32 100, label %bb1
+; THROUGHPUT-NEXT:      i32 200, label %bb2
+; THROUGHPUT-NEXT:      i32 300, label %bb3
+; THROUGHPUT-NEXT:      i32 400, label %bb4
+; THROUGHPUT-NEXT:    ]
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 0
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 1
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 2
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 3
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 4
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; LATENCY-LABEL: 'sparse_switch'
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: switch i32 %x, label %default [
+; LATENCY-NEXT:      i32 0, label %bb0
+; LATENCY-NEXT:      i32 100, label %bb1
+; LATENCY-NEXT:      i32 200, label %bb2
+; LATENCY-NEXT:      i32 300, label %bb3
+; LATENCY-NEXT:      i32 400, label %bb4
+; LATENCY-NEXT:    ]
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; CODESIZE-LABEL: 'sparse_switch'
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: switch i32 %x, label %default [
+; CODESIZE-NEXT:      i32 0, label %bb0
+; CODESIZE-NEXT:      i32 100, label %bb1
+; CODESIZE-NEXT:      i32 200, label %bb2
+; CODESIZE-NEXT:      i32 300, label %bb3
+; CODESIZE-NEXT:      i32 400, label %bb4
+; CODESIZE-NEXT:    ]
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+entry:
+  switch i32 %x, label %default [
+    i32 0, label %bb0
+    i32 100, label %bb1
+    i32 200, label %bb2
+    i32 300, label %bb3
+    i32 400, label %bb4
+  ]
+bb0:
+  ret i32 0
+bb1:
+  ret i32 1
+bb2:
+  ret i32 2
+bb3:
+  ret i32 3
+bb4:
+  ret i32 4
+default:
+  unreachable
+}
+
+define i32 @dense_big_switch(i32 %x) {
+; THROUGHPUT-LABEL: 'dense_big_switch'
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: switch i32 %x, label %default [
+; THROUGHPUT-NEXT:      i32 0, label %bb0
+; THROUGHPUT-NEXT:      i32 1, label %bb1
+; THROUGHPUT-NEXT:      i32 2, label %bb2
+; THROUGHPUT-NEXT:      i32 3, label %bb3
+; THROUGHPUT-NEXT:      i32 4, label %bb4
+; THROUGHPUT-NEXT:      i32 5, label %bb5
+; THROUGHPUT-NEXT:      i32 6, label %bb6
+; THROUGHPUT-NEXT:      i32 7, label %bb7
+; THROUGHPUT-NEXT:      i32 8, label %bb8
+; THROUGHPUT-NEXT:      i32 9, label %bb9
+; THROUGHPUT-NEXT:      i32 10, label %bb10
+; THROUGHPUT-NEXT:    ]
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 0
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 1
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 2
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 3
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 4
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 5
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 6
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 7
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 8
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 9
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 10
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; LATENCY-LABEL: 'dense_big_switch'
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: switch i32 %x, label %default [
+; LATENCY-NEXT:      i32 0, label %bb0
+; LATENCY-NEXT:      i32 1, label %bb1
+; LATENCY-NEXT:      i32 2, label %bb2
+; LATENCY-NEXT:      i32 3, label %bb3
+; LATENCY-NEXT:      i32 4, label %bb4
+; LATENCY-NEXT:      i32 5, label %bb5
+; LATENCY-NEXT:      i32 6, label %bb6
+; LATENCY-NEXT:      i32 7, label %bb7
+; LATENCY-NEXT:      i32 8, label %bb8
+; LATENCY-NEXT:      i32 9, label %bb9
+; LATENCY-NEXT:      i32 10, label %bb10
+; LATENCY-NEXT:    ]
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 5
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 6
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 7
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 8
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 9
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 10
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; CODESIZE-LABEL: 'dense_big_switch'
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: switch i32 %x, label %default [
+; CODESIZE-NEXT:      i32 0, label %bb0
+; CODESIZE-NEXT:      i32 1, label %bb1
+; CODESIZE-NEXT:      i32 2, label %bb2
+; CODESIZE-NEXT:      i32 3, label %bb3
+; CODESIZE-NEXT:      i32 4, label %bb4
+; CODESIZE-NEXT:      i32 5, label %bb5
+; CODESIZE-NEXT:      i32 6, label %bb6
+; CODESIZE-NEXT:      i32 7, label %bb7
+; CODESIZE-NEXT:      i32 8, label %bb8
+; CODESIZE-NEXT:      i32 9, label %bb9
+; CODESIZE-NEXT:      i32 10, label %bb10
+; CODESIZE-NEXT:    ]
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 5
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 6
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 7
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 8
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 9
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 10
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+entry:
+  switch i32 %x, label %default [
+    i32 0, label %bb0
+    i32 1, label %bb1
+    i32 2, label %bb2
+    i32 3, label %bb3
+    i32 4, label %bb4
+    i32 5, label %bb5
+    i32 6, label %bb6
+    i32 7, label %bb7
+    i32 8, label %bb8
+    i32 9, label %bb9
+    i32 10, label %bb10
+  ]
+bb0:
+  ret i32 0
+bb1:
+  ret i32 1
+bb2:
+  ret i32 2
+bb3:
+  ret i32 3
+bb4:
+  ret i32 4
+bb5:
+  ret i32 5
+bb6:
+  ret i32 6
+bb7:
+  ret i32 7
+bb8:
+  ret i32 8
+bb9:
+  ret i32 9
+bb10:
+  ret i32 10
+default:
+  unreachable
+}
+
+define i32 @sparse_big_switch(i32 %x) {
+; THROUGHPUT-LABEL: 'sparse_big_switch'
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: switch i32 %x, label %default [
+; THROUGHPUT-NEXT:      i32 0, label %bb0
+; THROUGHPUT-NEXT:      i32 100, label %bb1
+; THROUGHPUT-NEXT:      i32 200, label %bb2
+; THROUGHPUT-NEXT:      i32 300, label %bb3
+; THROUGHPUT-NEXT:      i32 400, label %bb4
+; THROUGHPUT-NEXT:      i32 500, label %bb5
+; THROUGHPUT-NEXT:      i32 600, label %bb6
+; THROUGHPUT-NEXT:      i32 700, label %bb7
+; THROUGHPUT-NEXT:      i32 800, label %bb8
+; THROUGHPUT-NEXT:      i32 900, label %bb9
+; THROUGHPUT-NEXT:      i32 1000, label %bb10
+; THROUGHPUT-NEXT:    ]
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 0
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 1
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 2
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 3
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 4
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 5
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 6
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 7
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 8
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 9
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 10
+; THROUGHPUT-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; LATENCY-LABEL: 'sparse_big_switch'
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: switch i32 %x, label %default [
+; LATENCY-NEXT:      i32 0, label %bb0
+; LATENCY-NEXT:      i32 100, label %bb1
+; LATENCY-NEXT:      i32 200, label %bb2
+; LATENCY-NEXT:      i32 300, label %bb3
+; LATENCY-NEXT:      i32 400, label %bb4
+; LATENCY-NEXT:      i32 500, label %bb5
+; LATENCY-NEXT:      i32 600, label %bb6
+; LATENCY-NEXT:      i32 700, label %bb7
+; LATENCY-NEXT:      i32 800, label %bb8
+; LATENCY-NEXT:      i32 900, label %bb9
+; LATENCY-NEXT:      i32 1000, label %bb10
+; LATENCY-NEXT:    ]
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 5
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 6
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 7
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 8
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 9
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 10
+; LATENCY-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; CODESIZE-LABEL: 'sparse_big_switch'
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: switch i32 %x, label %default [
+; CODESIZE-NEXT:      i32 0, label %bb0
+; CODESIZE-NEXT:      i32 100, label %bb1
+; CODESIZE-NEXT:      i32 200, label %bb2
+; CODESIZE-NEXT:      i32 300, label %bb3
+; CODESIZE-NEXT:      i32 400, label %bb4
+; CODESIZE-NEXT:      i32 500, label %bb5
+; CODESIZE-NEXT:      i32 600, label %bb6
+; CODESIZE-NEXT:      i32 700, label %bb7
+; CODESIZE-NEXT:      i32 800, label %bb8
+; CODESIZE-NEXT:      i32 900, label %bb9
+; CODESIZE-NEXT:      i32 1000, label %bb10
+; CODESIZE-NEXT:    ]
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 5
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 6
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 7
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 8
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 9
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: ret i32 10
+; CODESIZE-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+entry:
+  switch i32 %x, label %default [
+    i32 0, label %bb0
+    i32 100, label %bb1
+    i32 200, label %bb2
+    i32 300, label %bb3
+    i32 400, label %bb4
+    i32 500, label %bb5
+    i32 600, label %bb6
+    i32 700, label %bb7
+    i32 800, label %bb8
+    i32 900, label %b...
[truncated]

@nikic nikic left a comment

Would it make sense to include this in TargetTransformInfoImpl::getCFInstrCost() instead? The logic isn't really X86 specific.

It looks like a lot of targets just unnecessarily implement this method, with an implementation that is equivalent to the default one.

@@ -6155,6 +6155,23 @@ X86TTIImpl::getIntImmCostIntrin(Intrinsic::ID IID, unsigned Idx,
 InstructionCost X86TTIImpl::getCFInstrCost(unsigned Opcode,
                                            TTI::TargetCostKind CostKind,
                                            const Instruction *I) const {
+  if (Opcode == Instruction::Switch && CostKind == TTI::TCK_CodeSize) {
+    unsigned JumpTableSize, NumSuccs = I->getNumSuccessors();
Contributor

Do we need to take into account whether the default is unreachable? onFinalizeSwitch() in InlineCost has some more complex logic for this.
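
To make the question concrete, here is a hypothetical refinement of the jump-table case, sketched as standalone C++. It is not part of the patch and does not reproduce the logic of `onFinalizeSwitch()`; it only assumes that the default compare and default jump could be dropped when the default destination is unreachable. `jumpTableSwitchCost` and `kBasic` are made-up names.

```cpp
// Hypothetical variant of the jump-table estimate; an assumption for
// discussion, not the patch's code and not InlineCost's logic.
constexpr unsigned kBasic = 1; // stands in for TTI::TCC_Basic

unsigned jumpTableSwitchCost(bool defaultUnreachable) {
  // The indirect branch through the table is always needed.
  const unsigned indirectBranch = kBasic;
  // The range check guarding the table is one compare plus one conditional
  // jump; assume it disappears when the default destination is unreachable.
  const unsigned defaultCheck = defaultUnreachable ? 0 : 2 * kBasic;
  return indirectBranch + defaultCheck;
}
```

Under this assumption, `dense_switch` and `dense_big_switch` in the test (both with an `unreachable` default) would cost 1 rather than 3.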

Collaborator

This doesn't seem very x86-specific - can we move it to the base impl instead?

Member Author

Moved it to the base impl now.

Member Author

Oh, maybe I misunderstood you. @RKSimon, by "the base impl", did you mean TargetTransformInfoImpl::getCFInstrCost()?

Member Author

Gentle ping. @RKSimon, could you confirm that? Many thanks.

XChy commented Oct 15, 2025

> It looks like a lot of targets just unnecessarily implement this method, with an implementation that is equivalent to the default one.

The implementations of X86/AArch64/RISC-V are equivalent, but they differ slightly from the default implementation: the default implementation assumes a throughput cost of 1, rather than 0.
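
A minimal standalone sketch of the difference being described, with hypothetical names throughout. `targetStyleCost` follows the shape visible in the X86 hunk above (PHI is free and everything else costs one outside throughput; for throughput, branches are assumed to be predicted, which the THROUGHPUT check lines report as 0 for `switch`). `defaultStyleCost` encodes the behaviour stated in this comment, a throughput cost of 1; its PHI handling is an assumption made so that the two functions agree outside throughput.

```cpp
// Standalone model; these enums and functions are illustrative only and are
// not LLVM's types.
enum class CostKind { RecipThroughput, Latency, CodeSize };
enum class Opcode { PHI, Br, Switch, Ret };

// Shape of the X86/AArch64/RISC-V style implementation.
unsigned targetStyleCost(Opcode op, CostKind kind) {
  if (kind != CostKind::RecipThroughput)
    return op == Opcode::PHI ? 0 : 1;
  // Branches are assumed to be predicted.
  return 0;
}

// Shape of the default implementation as described above.
unsigned defaultStyleCost(Opcode op, CostKind kind) {
  // Assumption: a PHI is free except when costing throughput.
  if (op == Opcode::PHI && kind != CostKind::RecipThroughput)
    return 0;
  return 1;
}
```

The two agree for latency and code size and differ only for `CostKind::RecipThroughput`.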

@XChy XChy changed the title from "[X86][CostModel] Estimate the codesize cost of switch" to "[CostModel] Estimate the codesize cost of switch" Oct 15, 2025
XChy commented Oct 16, 2025

To keep the change small, I have only modified the X86 TTI for now. Once the generic implementation is approved, I will update the other targets as well.

@XChy XChy force-pushed the cost-model-x86-switch-codesize branch from 1cd7a6a to 7030613 Compare October 27, 2025 09:54
nikic commented Oct 27, 2025

Why did you change this from the implementation that queries the load etc costs to one that doesn't?

XChy commented Oct 27, 2025

> Why did you change this from the implementation that queries the load etc costs to one that doesn't?

Because it's more complicated to query these costs in TTIImpl, and I expected the code size cost of most targets to be consistent, I tried to simplify this code. Will change it back to the general form with queries if that's better.

@nikic nikic left a comment

This looks reasonable to me, but why does this impact vectorization tests? Do those use the CodeSize model even when not optimizing for size?

@nikic nikic requested a review from fhahn October 27, 2025 16:15
nikic commented Oct 27, 2025

> > Why did you change this from the implementation that queries the load etc costs to one that doesn't?
>
> Because it's more complicated to query these costs in TTIImpl, and I expected the code size cost of most targets to be consistent, I tried to simplify this code. Will change it back to the general form with queries if that's better.

I think you're right that for codesize it probably does not make sense to query (unless maybe there are differences across backends, like some measuring code size in bytes and others in instructions?)

XChy commented Oct 27, 2025

> This looks reasonable to me, but why does this impact vectorization tests? Do those use the CodeSize model even when not optimizing for size?

Hmm, I am not familiar with the vectorizer code, just updated these tests according to CI. Maybe @fhahn knows some details?
