[CostModel] Estimate the codesize cost of switch #163569
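@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-backend-x86

Author: Hongyu Chen (XChy)

Changes

We previously considered the code size cost of most control flow instructions to be TCC_Basic. But for the switch instruction, that might be far from the actual code size. This patch improves the code size estimation for switch instructions.

To obey the X86 cost model specification:

This patch doesn't consider the jump table size, as the jump table itself doesn't lie in the code segment.

Patch is 20.38 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/163569.diff

2 Files Affected:
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
llvm/test/Analysis/CostModel/X86/switch.ll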
diff --git a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
index 3d8d0a236a3c1..22a646f10507d 100644
--- a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
+++ b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -6155,6 +6155,23 @@ X86TTIImpl::getIntImmCostIntrin(Intrinsic::ID IID, unsigned Idx,
 InstructionCost X86TTIImpl::getCFInstrCost(unsigned Opcode,
                                            TTI::TargetCostKind CostKind,
                                            const Instruction *I) const {
+  if (Opcode == Instruction::Switch && CostKind == TTI::TCK_CodeSize) {
+    unsigned JumpTableSize, NumSuccs = I->getNumSuccessors();
+    getEstimatedNumberOfCaseClusters(*cast<SwitchInst>(I), JumpTableSize,
+                                     nullptr, nullptr);
+    // A trivial unconditional branch.
+    if (NumSuccs == 1)
+      return TTI::TCC_Basic;
+
+    // Assume that lowering the switch block is implemented by binary search if
+    // no jump table is generated.
+    if (JumpTableSize == 0)
+      return llvm::Log2_32_Ceil(NumSuccs) * 2 * TTI::TCC_Basic;
+
+    // Indirect branch + default compare + default jump
+    return 3 * TTI::TCC_Basic;
+  }
+
   if (CostKind != TTI::TCK_RecipThroughput)
     return Opcode == Instruction::PHI ? TTI::TCC_Free : TTI::TCC_Basic;
   // Branches are assumed to be predicted.
diff --git a/llvm/test/Analysis/CostModel/X86/switch.ll b/llvm/test/Analysis/CostModel/X86/switch.ll
new file mode 100644
index 0000000000000..e668e365e899d
--- /dev/null
+++ b/llvm/test/Analysis/CostModel/X86/switch.ll
@@ -0,0 +1,411 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 6
+; RUN: opt < %s -passes="print<cost-model>" -cost-kind=throughput 2>&1 -disable-output -mtriple=x86_64-unknown-linux-gnu | FileCheck %s -check-prefixes=CHECK,THROUGHPUT
+; RUN: opt < %s -passes="print<cost-model>" -cost-kind=latency 2>&1 -disable-output -mtriple=x86_64-unknown-linux-gnu | FileCheck %s -check-prefixes=CHECK,LATENCY
+; RUN: opt < %s -passes="print<cost-model>" -cost-kind=code-size 2>&1 -disable-output -mtriple=x86_64-unknown-linux-gnu | FileCheck %s -check-prefixes=CHECK,CODESIZE
+
+target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
+
+define i32 @single_succ_switch(i32 %x) {
+; THROUGHPUT-LABEL: 'single_succ_switch'
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: switch i32 %x, label %default [
+; THROUGHPUT-NEXT: ]
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 1
+;
+; LATENCY-LABEL: 'single_succ_switch'
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: switch i32 %x, label %default [
+; LATENCY-NEXT: ]
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+;
+; CODESIZE-LABEL: 'single_succ_switch'
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: switch i32 %x, label %default [
+; CODESIZE-NEXT: ]
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+;
+entry:
+ switch i32 %x, label %default [
+ ]
+default:
+ ret i32 1
+}
+
+define i32 @dense_switch(i32 %x) {
+; THROUGHPUT-LABEL: 'dense_switch'
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: switch i32 %x, label %default [
+; THROUGHPUT-NEXT: i32 0, label %bb0
+; THROUGHPUT-NEXT: i32 1, label %bb1
+; THROUGHPUT-NEXT: i32 2, label %bb2
+; THROUGHPUT-NEXT: i32 3, label %bb3
+; THROUGHPUT-NEXT: i32 4, label %bb4
+; THROUGHPUT-NEXT: ]
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 0
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 1
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 2
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 3
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 4
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; LATENCY-LABEL: 'dense_switch'
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: switch i32 %x, label %default [
+; LATENCY-NEXT: i32 0, label %bb0
+; LATENCY-NEXT: i32 1, label %bb1
+; LATENCY-NEXT: i32 2, label %bb2
+; LATENCY-NEXT: i32 3, label %bb3
+; LATENCY-NEXT: i32 4, label %bb4
+; LATENCY-NEXT: ]
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; CODESIZE-LABEL: 'dense_switch'
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: switch i32 %x, label %default [
+; CODESIZE-NEXT: i32 0, label %bb0
+; CODESIZE-NEXT: i32 1, label %bb1
+; CODESIZE-NEXT: i32 2, label %bb2
+; CODESIZE-NEXT: i32 3, label %bb3
+; CODESIZE-NEXT: i32 4, label %bb4
+; CODESIZE-NEXT: ]
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+entry:
+ switch i32 %x, label %default [
+ i32 0, label %bb0
+ i32 1, label %bb1
+ i32 2, label %bb2
+ i32 3, label %bb3
+ i32 4, label %bb4
+ ]
+bb0:
+ ret i32 0
+bb1:
+ ret i32 1
+bb2:
+ ret i32 2
+bb3:
+ ret i32 3
+bb4:
+ ret i32 4
+default:
+ unreachable
+}
+
+define i32 @sparse_switch(i32 %x) {
+; THROUGHPUT-LABEL: 'sparse_switch'
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: switch i32 %x, label %default [
+; THROUGHPUT-NEXT: i32 0, label %bb0
+; THROUGHPUT-NEXT: i32 100, label %bb1
+; THROUGHPUT-NEXT: i32 200, label %bb2
+; THROUGHPUT-NEXT: i32 300, label %bb3
+; THROUGHPUT-NEXT: i32 400, label %bb4
+; THROUGHPUT-NEXT: ]
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 0
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 1
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 2
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 3
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 4
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; LATENCY-LABEL: 'sparse_switch'
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: switch i32 %x, label %default [
+; LATENCY-NEXT: i32 0, label %bb0
+; LATENCY-NEXT: i32 100, label %bb1
+; LATENCY-NEXT: i32 200, label %bb2
+; LATENCY-NEXT: i32 300, label %bb3
+; LATENCY-NEXT: i32 400, label %bb4
+; LATENCY-NEXT: ]
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; CODESIZE-LABEL: 'sparse_switch'
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 6 for instruction: switch i32 %x, label %default [
+; CODESIZE-NEXT: i32 0, label %bb0
+; CODESIZE-NEXT: i32 100, label %bb1
+; CODESIZE-NEXT: i32 200, label %bb2
+; CODESIZE-NEXT: i32 300, label %bb3
+; CODESIZE-NEXT: i32 400, label %bb4
+; CODESIZE-NEXT: ]
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+entry:
+ switch i32 %x, label %default [
+ i32 0, label %bb0
+ i32 100, label %bb1
+ i32 200, label %bb2
+ i32 300, label %bb3
+ i32 400, label %bb4
+ ]
+bb0:
+ ret i32 0
+bb1:
+ ret i32 1
+bb2:
+ ret i32 2
+bb3:
+ ret i32 3
+bb4:
+ ret i32 4
+default:
+ unreachable
+}
+
+define i32 @dense_big_switch(i32 %x) {
+; THROUGHPUT-LABEL: 'dense_big_switch'
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: switch i32 %x, label %default [
+; THROUGHPUT-NEXT: i32 0, label %bb0
+; THROUGHPUT-NEXT: i32 1, label %bb1
+; THROUGHPUT-NEXT: i32 2, label %bb2
+; THROUGHPUT-NEXT: i32 3, label %bb3
+; THROUGHPUT-NEXT: i32 4, label %bb4
+; THROUGHPUT-NEXT: i32 5, label %bb5
+; THROUGHPUT-NEXT: i32 6, label %bb6
+; THROUGHPUT-NEXT: i32 7, label %bb7
+; THROUGHPUT-NEXT: i32 8, label %bb8
+; THROUGHPUT-NEXT: i32 9, label %bb9
+; THROUGHPUT-NEXT: i32 10, label %bb10
+; THROUGHPUT-NEXT: ]
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 0
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 1
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 2
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 3
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 4
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 5
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 6
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 7
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 8
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 9
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 10
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; LATENCY-LABEL: 'dense_big_switch'
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: switch i32 %x, label %default [
+; LATENCY-NEXT: i32 0, label %bb0
+; LATENCY-NEXT: i32 1, label %bb1
+; LATENCY-NEXT: i32 2, label %bb2
+; LATENCY-NEXT: i32 3, label %bb3
+; LATENCY-NEXT: i32 4, label %bb4
+; LATENCY-NEXT: i32 5, label %bb5
+; LATENCY-NEXT: i32 6, label %bb6
+; LATENCY-NEXT: i32 7, label %bb7
+; LATENCY-NEXT: i32 8, label %bb8
+; LATENCY-NEXT: i32 9, label %bb9
+; LATENCY-NEXT: i32 10, label %bb10
+; LATENCY-NEXT: ]
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 5
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 6
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 7
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 8
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 9
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 10
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; CODESIZE-LABEL: 'dense_big_switch'
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 3 for instruction: switch i32 %x, label %default [
+; CODESIZE-NEXT: i32 0, label %bb0
+; CODESIZE-NEXT: i32 1, label %bb1
+; CODESIZE-NEXT: i32 2, label %bb2
+; CODESIZE-NEXT: i32 3, label %bb3
+; CODESIZE-NEXT: i32 4, label %bb4
+; CODESIZE-NEXT: i32 5, label %bb5
+; CODESIZE-NEXT: i32 6, label %bb6
+; CODESIZE-NEXT: i32 7, label %bb7
+; CODESIZE-NEXT: i32 8, label %bb8
+; CODESIZE-NEXT: i32 9, label %bb9
+; CODESIZE-NEXT: i32 10, label %bb10
+; CODESIZE-NEXT: ]
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 5
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 6
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 7
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 8
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 9
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 10
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+entry:
+ switch i32 %x, label %default [
+ i32 0, label %bb0
+ i32 1, label %bb1
+ i32 2, label %bb2
+ i32 3, label %bb3
+ i32 4, label %bb4
+ i32 5, label %bb5
+ i32 6, label %bb6
+ i32 7, label %bb7
+ i32 8, label %bb8
+ i32 9, label %bb9
+ i32 10, label %bb10
+ ]
+bb0:
+ ret i32 0
+bb1:
+ ret i32 1
+bb2:
+ ret i32 2
+bb3:
+ ret i32 3
+bb4:
+ ret i32 4
+bb5:
+ ret i32 5
+bb6:
+ ret i32 6
+bb7:
+ ret i32 7
+bb8:
+ ret i32 8
+bb9:
+ ret i32 9
+bb10:
+ ret i32 10
+default:
+ unreachable
+}
+
+define i32 @sparse_big_switch(i32 %x) {
+; THROUGHPUT-LABEL: 'sparse_big_switch'
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: switch i32 %x, label %default [
+; THROUGHPUT-NEXT: i32 0, label %bb0
+; THROUGHPUT-NEXT: i32 100, label %bb1
+; THROUGHPUT-NEXT: i32 200, label %bb2
+; THROUGHPUT-NEXT: i32 300, label %bb3
+; THROUGHPUT-NEXT: i32 400, label %bb4
+; THROUGHPUT-NEXT: i32 500, label %bb5
+; THROUGHPUT-NEXT: i32 600, label %bb6
+; THROUGHPUT-NEXT: i32 700, label %bb7
+; THROUGHPUT-NEXT: i32 800, label %bb8
+; THROUGHPUT-NEXT: i32 900, label %bb9
+; THROUGHPUT-NEXT: i32 1000, label %bb10
+; THROUGHPUT-NEXT: ]
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 0
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 1
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 2
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 3
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 4
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 5
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 6
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 7
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 8
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 9
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret i32 10
+; THROUGHPUT-NEXT: Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; LATENCY-LABEL: 'sparse_big_switch'
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: switch i32 %x, label %default [
+; LATENCY-NEXT: i32 0, label %bb0
+; LATENCY-NEXT: i32 100, label %bb1
+; LATENCY-NEXT: i32 200, label %bb2
+; LATENCY-NEXT: i32 300, label %bb3
+; LATENCY-NEXT: i32 400, label %bb4
+; LATENCY-NEXT: i32 500, label %bb5
+; LATENCY-NEXT: i32 600, label %bb6
+; LATENCY-NEXT: i32 700, label %bb7
+; LATENCY-NEXT: i32 800, label %bb8
+; LATENCY-NEXT: i32 900, label %bb9
+; LATENCY-NEXT: i32 1000, label %bb10
+; LATENCY-NEXT: ]
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 5
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 6
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 7
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 8
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 9
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 10
+; LATENCY-NEXT: Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+; CODESIZE-LABEL: 'sparse_big_switch'
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 8 for instruction: switch i32 %x, label %default [
+; CODESIZE-NEXT: i32 0, label %bb0
+; CODESIZE-NEXT: i32 100, label %bb1
+; CODESIZE-NEXT: i32 200, label %bb2
+; CODESIZE-NEXT: i32 300, label %bb3
+; CODESIZE-NEXT: i32 400, label %bb4
+; CODESIZE-NEXT: i32 500, label %bb5
+; CODESIZE-NEXT: i32 600, label %bb6
+; CODESIZE-NEXT: i32 700, label %bb7
+; CODESIZE-NEXT: i32 800, label %bb8
+; CODESIZE-NEXT: i32 900, label %bb9
+; CODESIZE-NEXT: i32 1000, label %bb10
+; CODESIZE-NEXT: ]
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 0
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 1
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 2
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 3
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 4
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 5
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 6
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 7
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 8
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 9
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret i32 10
+; CODESIZE-NEXT: Cost Model: Found an estimated cost of 1 for instruction: unreachable
+;
+entry:
+ switch i32 %x, label %default [
+ i32 0, label %bb0
+ i32 100, label %bb1
+ i32 200, label %bb2
+ i32 300, label %bb3
+ i32 400, label %bb4
+ i32 500, label %bb5
+ i32 600, label %bb6
+ i32 700, label %bb7
+ i32 800, label %bb8
+ i32 900, label %b...
[truncated]
Would it make sense to include this in TargetTransformInfoImpl::getCFInstrCost() instead? The logic isn't really X86 specific.
It looks like a lot of targets just unnecessarily implement this method, with an implementation that is equivalent to the default one.
@@ -6155,6 +6155,23 @@ X86TTIImpl::getIntImmCostIntrin(Intrinsic::ID IID, unsigned Idx,
 InstructionCost X86TTIImpl::getCFInstrCost(unsigned Opcode,
                                            TTI::TargetCostKind CostKind,
                                            const Instruction *I) const {
+  if (Opcode == Instruction::Switch && CostKind == TTI::TCK_CodeSize) {
+    unsigned JumpTableSize, NumSuccs = I->getNumSuccessors();
Do we need to take into account whether the default is unreachable? onFinalizeSwitch() in InlineCost has some more complex logic for this.
This doesn't seem very x86-specific - can we move it to the base impl instead?
Moved it to the base impl now.
Oh, maybe I misunderstood you. @RKSimon, by "the base impl", did you mean moving it to TargetTransformInfoImpl::getCFInstrCost()?
Gentle ping. @RKSimon, could you confirm that? Many thanks.
The implementations of X86/AArch64/RISC-V are equivalent, but they differ slightly from the default implementation: the default implementation assumes a throughput cost of 1, rather than 0.
To keep the change small, I only modified the X86 TTI for now. Once the generic implementation is approved, I will update the other targets as well.
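The difference described in the comment above can be pictured with a toy model. This is only a sketch, not LLVM code: the enum and both function names are hypothetical, the target-style rule is modeled on the context lines of the X86 hunk ("Branches are assumed to be predicted"), and the default-style rule simply encodes the comment's statement that the default assumes a throughput cost of 1. PHI handling is deliberately left out.

```cpp
#include <cassert>

// Hypothetical stand-in for TTI::TargetCostKind (only the kinds used in switch.ll).
enum class CostKind { RecipThroughput, Latency, CodeSize };

// Toy model of the target-style rule for a non-PHI control flow instruction:
// free under the throughput model, one basic unit otherwise.
static unsigned targetStyleBranchCost(CostKind Kind) {
  if (Kind != CostKind::RecipThroughput)
    return 1; // stands in for TTI::TCC_Basic
  return 0;   // stands in for TTI::TCC_Free: branches assumed predicted
}

// Toy model of the default-style rule as described in the comment:
// a non-PHI control flow instruction costs 1 under every cost kind.
static unsigned defaultStyleBranchCost(CostKind) {
  return 1;
}
```

In this model the two rules agree for latency and code size and differ only for throughput, which is the gap the comment points out.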
1cd7a6a to 7030613
Why did you change this from the implementation that queries the load etc. costs to one that doesn't?
Querying these costs in TTIImpl is more complicated, and I expected the code size cost to be consistent across most targets, so I simplified this code. I will change it back to the general form with queries if that's better.
This looks reasonable to me, but why does this impact vectorization tests? Do those use the CodeSize model even when not optimizing for size?
I think you're right that for codesize it probably does not make sense to query (unless maybe there are differences across backends, e.g. some measure code size in bytes while others measure it in instructions?)
Hmm, I am not familiar with the vectorizer code; I just updated these tests according to CI. Maybe @fhahn knows some details?
We previously considered the code size cost of most control flow instructions to be TCC_Basic. But for the switch instruction, that might be far from the actual code size. This patch improves the code size estimation for switch instructions.
To obey the X86 cost model specification:
This patch doesn't consider the jump table size, as the jump table itself doesn't lie in the code segment.
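
As a rough illustration of the heuristic in the X86TargetTransformInfo.cpp hunk, the following standalone C++ sketch mirrors its three cases outside of LLVM. The names switchCodeSizeCost and log2Ceil are hypothetical stand-ins (log2Ceil for llvm::Log2_32_Ceil, Basic for TTI::TCC_Basic), and the jump table size is passed in by hand rather than obtained from getEstimatedNumberOfCaseClusters.

```cpp
#include <cassert>

// Smallest K such that 2^K >= N, for N >= 1.
// Hypothetical stand-in for llvm::Log2_32_Ceil.
static unsigned log2Ceil(unsigned N) {
  unsigned K = 0;
  while ((1ull << K) < N)
    ++K;
  return K;
}

// Hypothetical standalone model of the patch's code-size heuristic.
// NumSuccs counts all successors, including the default destination.
// JumpTableSize is 0 when no jump table is expected to be generated.
static unsigned switchCodeSizeCost(unsigned NumSuccs, unsigned JumpTableSize) {
  const unsigned Basic = 1; // stands in for TTI::TCC_Basic

  // A switch with a single successor is a trivial unconditional branch.
  if (NumSuccs == 1)
    return Basic;

  // Without a jump table, assume a binary search over the successors,
  // charging two units per level.
  if (JumpTableSize == 0)
    return log2Ceil(NumSuccs) * 2 * Basic;

  // With a jump table: indirect branch + default compare + default jump.
  return 3 * Basic;
}
```

Under these assumptions, a switch with 6 successors and no jump table costs 6, and one with 12 successors costs 8, which agrees with the CODESIZE expectations for sparse_switch and sparse_big_switch in switch.ll. Any nonzero jump table size gives 3, as in dense_switch and dense_big_switch.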