OutOfCache
Contributor

In some cases, we can replace a switch with simpler instructions or a lookup table.
For instance, if every case results in the same value, we can simply replace the switch
with that single value.
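
As a rough illustration of the single-value case, here is a standalone C++ sketch (not LLVM code; all function names are hypothetical). It uses the values from the `single_value_withdefault` test in this PR: every case yields 2 and the default yields 3, so the switch collapses into one unsigned range check, comparable to the `icmp ult` + `select` pair in the expected output.

```cpp
#include <cassert>
#include <cstdint>

// Original form: every listed case yields 2, the default yields 3.
int singleValueSwitch(uint32_t X) {
  switch (X) {
  case 0:
  case 1:
  case 2:
  case 3:
  case 4:
    return 2;
  default:
    return 3;
  }
}

// Replacement: a single unsigned range check instead of a switch.
int singleValueSelect(uint32_t X) { return X < 5 ? 2 : 3; }

// Compare both forms on a range that covers all cases and the default.
void checkSingleValue() {
  for (uint32_t X = 0; X < 16; ++X)
    assert(singleValueSwitch(X) == singleValueSelect(X));
}
```

Neither form needs a table in memory, which is why this replacement is still meaningful when lookup tables are unavailable.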

However, lookup tables are not always supported.
Targets, function attributes and compiler options can deactivate lookup table creation.
Currently, even simpler switch replacements like the single value optimization do not
get applied, because we only enable these transformations if lookup tables are enabled.

This PR enables the other kinds of replacements, even if lookup tables are not supported.
First, it checks whether any of the potential replacements is a lookup table.
If so, it checks whether lookup tables are supported and bails out if they are not.
Otherwise, it applies the other transformations.
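
The decision described above can be modeled in a few lines of standalone C++. This is only a sketch under assumptions: `Kind`, `Env`, and `mayReplaceSwitch` are hypothetical names, and the three flags merely stand in for the compiler option, the target hook, and the function attribute mentioned in this description.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model of the kinds of replacement a switch can get.
enum class Kind { SingleValue, LinearMap, LookupTable };

// Hypothetical stand-ins for the three things that can disable lookup tables.
struct Env {
  bool ConvertSwitchToLookupTable; // models the pass option
  bool TargetBuildsLookupTables;   // models TTI.shouldBuildLookupTables()
  bool NoJumpTables;               // models the "no-jump-tables" attribute
};

// Returns true if the switch may be replaced. The lookup-table restrictions
// are consulted only when at least one replacement is an actual lookup table.
bool mayReplaceSwitch(const std::vector<Kind> &Replacements, const Env &E) {
  bool AnyLookupTables =
      std::any_of(Replacements.begin(), Replacements.end(),
                  [](Kind K) { return K == Kind::LookupTable; });
  if (AnyLookupTables &&
      (!E.ConvertSwitchToLookupTable || !E.TargetBuildsLookupTables ||
       E.NoJumpTables))
    return false;
  return true;
}
```

With this ordering, a switch whose replacements are all simple arithmetic is never rejected by a restriction that only concerns lookup tables.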

Originally, lookup table creation was delayed until late stages of the compilation pipeline, because
it can result in difficult-to-analyze code and prevent other optimizations.
As a side effect of this change, we can also enable the simpler optimizations much earlier in the
compilation process.

Currently, some switch optimizations are only possible if lookup table
support is enabled. However, some of these optimizations do not create
lookup tables at all, but a short sequence of simple instructions such as
mul, add, or shift.
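
To make the mul/add case concrete, the following standalone C++ sketch (hypothetical function names, not LLVM code) uses the values from the `linear_transform_with_default` test: cases 0 to 3 yield 1, 4, 7, 10 and the default yields 13. Because the case results grow linearly with the case value, the switch can be expressed as a multiply and an add behind a range check.

```cpp
#include <cassert>
#include <cstdint>

// Original form: results are 3 * X + 1 for X in [0, 3], otherwise 13.
int32_t linearSwitch(uint32_t X) {
  switch (X) {
  case 0:
    return 1;
  case 1:
    return 4;
  case 2:
    return 7;
  case 3:
    return 10;
  default:
    return 13;
  }
}

// Replacement: one multiply and one add, guarded by a range check.
int32_t linearArithmetic(uint32_t X) {
  return X < 4 ? static_cast<int32_t>(X) * 3 + 1 : 13;
}

// Compare both forms on a range that covers all cases and the default.
void checkLinear() {
  for (uint32_t X = 0; X < 16; ++X)
    assert(linearSwitch(X) == linearArithmetic(X));
}
```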

These tests will show how a future change will enable these other kinds
of optimization, even if lookup tables are not supported.
Check if the SwitchReplacement is a lookup table or not.
The ConvertSwitchToLookupTable option prevents any switch replacement
from happening earlier in the compilation pipeline, even though it
should only apply to lookup tables.

Delay the check until we are sure that we would create a lookup table.
If not, continue with the replacement.
Only check the target support for LUTs if we would create one.
Otherwise, proceed with the optimization.
Allow the replacement of switches with anything other than a lookup
table, even if the "no-jump-tables" attribute is set.
OutOfCache force-pushed the simplifycfg-switch-nolut branch from fe0691a to eb89aab on September 2, 2025 17:51
// pipeline, because it would otherwise result in some
// difficult-to-analyze code and make pruning branches much harder.
// This is a problem if the switch expression itself can be restricted
// by inlining or CVP.
Contributor

Based on the llvm-opt-benchmark results, we should also guard BitMapKind behind ConvertSwitchToLookupTable. Similar to lookup tables, this representation may make further optimization harder.

(I'd also be okay with dropping the ConvertSwitchToLookupTable change from this PR to not block the rest of the changes on flushing out all the phase ordering issues.)

Contributor Author

Thanks for looking into the benchmark and reviewing. For now, I decided to guard the bitmap creation with ConvertSwitchToLookupTable instead of removing the ConvertSwitchToLookupTable change completely. Hopefully the new benchmark results will show no other regressions, so that the other kinds of transformations can still be applied earlier without issues. Otherwise, I will remove it for now.

Early bitmap creation can, similar to lookup tables,
cause missed optimizations because bitmaps are more difficult to analyze.

Delay the creation until later in the pipeline, when the option
`switch-to-lookup` is set in SimplifyCFG.
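
For readers unfamiliar with the bitmap replacement, here is a standalone C++ sketch of the idea (hypothetical function names, not LLVM code). It uses the values from the `bitmap_with_default` test: the 4-bit results 0, 2, 4, 8 for cases 0 to 3 are packed into the 16-bit constant `0x8420` (printed as `i16 -31712` in the IR), and the result is extracted with a shift. The opaque packed constant is one reason such code can be harder for later passes to reason about than the original switch.

```cpp
#include <cassert>
#include <cstdint>

// Original form: 4-bit results per case, 15 as the default.
uint8_t bitmapSwitch(uint32_t X) {
  switch (X) {
  case 0:
    return 0;
  case 1:
    return 2;
  case 2:
    return 4;
  case 3:
    return 8;
  default:
    return 15;
  }
}

// Replacement: all four results live in one 16-bit constant, 4 bits each.
uint8_t bitmapShift(uint32_t X) {
  if (X >= 4)
    return 15; // default value, taken when the range check fails
  const uint32_t Packed = 0x8420; // nibbles from high to low: 8, 4, 2, 0
  return static_cast<uint8_t>((Packed >> (X * 4)) & 0xF);
}

// Compare both forms on a range that covers all cases and the default.
void checkBitmap() {
  for (uint32_t X = 0; X < 16; ++X)
    assert(bitmapSwitch(X) == bitmapShift(X));
}
```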
Member

llvmbot commented Sep 5, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Jessica Del (OutOfCache)

Patch is 20.21 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/156477.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/SimplifyCFG.cpp (+42-13)
  • (modified) llvm/test/CodeGen/Thumb2/bti-indirect-branches.ll (+3-2)
  • (added) llvm/test/Transforms/SimplifyCFG/switch-transformations-no-lut.ll (+467)
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 02d6393dd5815..7485230127723 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -6466,6 +6466,12 @@ class SwitchReplacement {
   /// Return the default value of the switch.
   Constant *getDefaultValue();
 
+  /// Return true if the replacement is a lookup table.
+  bool isLookupTable();
+
+  /// Return true if the replacement is a bitmap.
+  bool isBitMap();
+
 private:
   // Depending on the switch, there are different alternatives.
   enum {
@@ -6753,6 +6759,10 @@ static bool isTypeLegalForLookupTable(Type *Ty, const TargetTransformInfo &TTI,
 
 Constant *SwitchReplacement::getDefaultValue() { return DefaultValue; }
 
+bool SwitchReplacement::isLookupTable() { return Kind == LookupTableKind; }
+
+bool SwitchReplacement::isBitMap() { return Kind == BitMapKind; }
+
 static bool isSwitchDense(uint64_t NumCases, uint64_t CaseRange) {
   // 40% is the default density for building a jump table in optsize/minsize
   // mode. See also TargetLoweringBase::isSuitableForJumpTable(), which this
@@ -6918,16 +6928,12 @@ static void reuseTableCompare(
 /// lookup tables.
 static bool simplifySwitchLookup(SwitchInst *SI, IRBuilder<> &Builder,
                                  DomTreeUpdater *DTU, const DataLayout &DL,
-                                 const TargetTransformInfo &TTI) {
+                                 const TargetTransformInfo &TTI,
+                                 bool ConvertSwitchToLookupTable) {
   assert(SI->getNumCases() > 1 && "Degenerate switch?");
 
   BasicBlock *BB = SI->getParent();
   Function *Fn = BB->getParent();
-  // Only build lookup table when we have a target that supports it or the
-  // attribute is not set.
-  if (!TTI.shouldBuildLookupTables() ||
-      (Fn->getFnAttribute("no-jump-tables").getValueAsBool()))
-    return false;
 
   // FIXME: If the switch is too sparse for a lookup table, perhaps we could
   // split off a dense part and build a lookup table for that.
@@ -7086,6 +7092,34 @@ static bool simplifySwitchLookup(SwitchInst *SI, IRBuilder<> &Builder,
     PhiToReplacementMap.insert({PHI, Replacement});
   }
 
+  bool AnyLookupTables = any_of(
+      PhiToReplacementMap, [](auto &KV) { return KV.second.isLookupTable(); });
+
+  // A few conditions prevent the generation of lookup tables:
+  //     1. Not setting the ConvertSwitchToLookupTable option
+  //        This option prevents the LUT creation until a later stage in the
+  //        pipeline, because it would otherwise result in some
+  //        difficult-to-analyze code and make pruning branches much harder.
+  //        This is a problem if the switch expression itself can be restricted
+  //        by inlining or CVP.
+  //     2. The target does not support lookup tables.
+  //     3. The "no-jump-tables" function attribute is set.
+  // However, these objections do not apply to other switch replacements, like
+  // the bitmap, so we only stop here if any of these conditions are met and we
+  // want to create a LUT. Otherwise, continue with the switch replacement.
+  if (AnyLookupTables &&
+      (!ConvertSwitchToLookupTable || !TTI.shouldBuildLookupTables() ||
+       Fn->getFnAttribute("no-jump-tables").getValueAsBool()))
+    return false;
+
+  bool AnyBitMaps = any_of(PhiToReplacementMap,
+                           [](auto &KV) { return KV.second.isBitMap(); });
+
+  // Bitmaps can also cause missed optimizations due to difficult-to-analyze
+  // code. Delay the creation of bitmaps until later in the pipeline.
+  if (AnyBitMaps && !ConvertSwitchToLookupTable)
+    return false;
+
   Builder.SetInsertPoint(SI);
   // TableIndex is the switch condition - TableIndexOffset if we don't
   // use the condition directly
@@ -7727,13 +7761,8 @@ bool SimplifyCFGOpt::simplifySwitch(SwitchInst *SI, IRBuilder<> &Builder) {
   if (Options.ForwardSwitchCondToPhi && forwardSwitchConditionToPHI(SI))
     return requestResimplify();
 
-  // The conversion from switch to lookup tables results in difficult-to-analyze
-  // code and makes pruning branches much harder. This is a problem if the
-  // switch expression itself can still be restricted as a result of inlining or
-  // CVP. Therefore, only apply this transformation during late stages of the
-  // optimisation pipeline.
-  if (Options.ConvertSwitchToLookupTable &&
-      simplifySwitchLookup(SI, Builder, DTU, DL, TTI))
+  if (simplifySwitchLookup(SI, Builder, DTU, DL, TTI,
+                           Options.ConvertSwitchToLookupTable))
     return requestResimplify();
 
   if (simplifySwitchOfPowersOfTwo(SI, Builder, DL, TTI))
diff --git a/llvm/test/CodeGen/Thumb2/bti-indirect-branches.ll b/llvm/test/CodeGen/Thumb2/bti-indirect-branches.ll
index c6ffb92d60d8d..bb1545deebe16 100644
--- a/llvm/test/CodeGen/Thumb2/bti-indirect-branches.ll
+++ b/llvm/test/CodeGen/Thumb2/bti-indirect-branches.ll
@@ -9,6 +9,7 @@ define internal i32 @table_switch(i32 %x) "branch-target-enforcement" {
 ; CHECK-NEXT:    cmp r1, #3
 ; CHECK-NEXT:    bhi .LBB0_6
 ; CHECK-NEXT:  @ %bb.1: @ %entry
+; CHECK-NEXT:    movs r0, #3
 ; CHECK-NEXT:  .LCPI0_0:
 ; CHECK-NEXT:    tbb [pc, r1]
 ; CHECK-NEXT:  @ %bb.2:
@@ -22,7 +23,7 @@ define internal i32 @table_switch(i32 %x) "branch-target-enforcement" {
 ; CHECK-NEXT:    movs r0, #2
 ; CHECK-NEXT:    bx lr
 ; CHECK-NEXT:  .LBB0_4: @ %bb3
-; CHECK-NEXT:    movs r0, #3
+; CHECK-NEXT:    movs r0, #1
 ; CHECK-NEXT:    bx lr
 ; CHECK-NEXT:  .LBB0_5: @ %bb4
 ; CHECK-NEXT:    movs r0, #4
@@ -51,7 +52,7 @@ sw.epilog:
   br label %return
 
 return:
-  %ret = phi i32 [ 0, %sw.epilog ], [ 1, %bb1 ], [ 2, %bb2 ], [ 3, %bb3 ], [ 4, %bb4 ]
+  %ret = phi i32 [ 0, %sw.epilog ], [ 3, %bb1 ], [ 2, %bb2 ], [ 1, %bb3 ], [ 4, %bb4 ]
   ret i32 %ret
 }
 
diff --git a/llvm/test/Transforms/SimplifyCFG/switch-transformations-no-lut.ll b/llvm/test/Transforms/SimplifyCFG/switch-transformations-no-lut.ll
new file mode 100644
index 0000000000000..01200136b6d40
--- /dev/null
+++ b/llvm/test/Transforms/SimplifyCFG/switch-transformations-no-lut.ll
@@ -0,0 +1,467 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes='simplifycfg' < %s | FileCheck %s --check-prefix=OPTNOLUT
+; RUN: %if amdgpu-registered-target %{ opt -mtriple=amdgcn--amdpal -S -passes='simplifycfg<switch-to-lookup>' < %s | FileCheck %s --check-prefix=TTINOLUT %}
+;
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+
+define i32 @linear_transform_with_default(i32 %x) {
+; OPTNOLUT-LABEL: define i32 @linear_transform_with_default(
+; OPTNOLUT-SAME: i32 [[X:%.*]]) {
+; OPTNOLUT-NEXT:  [[ENTRY:.*:]]
+; OPTNOLUT-NEXT:    [[TMP0:%.*]] = icmp ult i32 [[X]], 4
+; OPTNOLUT-NEXT:    [[SWITCH_IDX_MULT:%.*]] = mul nsw i32 [[X]], 3
+; OPTNOLUT-NEXT:    [[SWITCH_OFFSET:%.*]] = add nsw i32 [[SWITCH_IDX_MULT]], 1
+; OPTNOLUT-NEXT:    [[IDX:%.*]] = select i1 [[TMP0]], i32 [[SWITCH_OFFSET]], i32 13
+; OPTNOLUT-NEXT:    ret i32 [[IDX]]
+;
+; TTINOLUT-LABEL: define i32 @linear_transform_with_default(
+; TTINOLUT-SAME: i32 [[X:%.*]]) {
+; TTINOLUT-NEXT:  [[ENTRY:.*]]:
+; TTINOLUT-NEXT:    [[TMP0:%.*]] = icmp ult i32 [[X]], 4
+; TTINOLUT-NEXT:    br i1 [[TMP0]], label %[[SWITCH_LOOKUP:.*]], label %[[END:.*]]
+; TTINOLUT:       [[SWITCH_LOOKUP]]:
+; TTINOLUT-NEXT:    [[SWITCH_IDX_MULT:%.*]] = mul nsw i32 [[X]], 3
+; TTINOLUT-NEXT:    [[SWITCH_OFFSET:%.*]] = add nsw i32 [[SWITCH_IDX_MULT]], 1
+; TTINOLUT-NEXT:    br label %[[END]]
+; TTINOLUT:       [[END]]:
+; TTINOLUT-NEXT:    [[IDX:%.*]] = phi i32 [ 13, %[[ENTRY]] ], [ [[SWITCH_OFFSET]], %[[SWITCH_LOOKUP]] ]
+; TTINOLUT-NEXT:    ret i32 [[IDX]]
+;
+entry:
+  switch i32 %x, label %end [
+  i32 0, label %case0
+  i32 1, label %case1
+  i32 2, label %case2
+  i32 3, label %case3
+  ]
+
+case0:
+  br label %end
+case1:
+  br label %end
+case2:
+  br label %end
+case3:
+  br label %end
+
+end:
+  %idx = phi i32 [ 1, %case0 ], [ 4, %case1 ], [ 7, %case2 ], [ 10, %case3 ], [ 13, %entry ]
+  ret i32 %idx
+}
+
+define i32 @linear_transform_with_outlier(i32 %x) {
+; OPTNOLUT-LABEL: define i32 @linear_transform_with_outlier(
+; OPTNOLUT-SAME: i32 [[X:%.*]]) {
+; OPTNOLUT-NEXT:  [[ENTRY:.*]]:
+; OPTNOLUT-NEXT:    switch i32 [[X]], label %[[END:.*]] [
+; OPTNOLUT-NEXT:      i32 0, label %[[CASE0:.*]]
+; OPTNOLUT-NEXT:      i32 1, label %[[CASE1:.*]]
+; OPTNOLUT-NEXT:      i32 2, label %[[CASE2:.*]]
+; OPTNOLUT-NEXT:      i32 3, label %[[CASE3:.*]]
+; OPTNOLUT-NEXT:      i32 4, label %[[CASE4:.*]]
+; OPTNOLUT-NEXT:    ]
+; OPTNOLUT:       [[CASE0]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[CASE1]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[CASE2]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[CASE3]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[CASE4]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[END]]:
+; OPTNOLUT-NEXT:    [[IDX:%.*]] = phi i32 [ 0, %[[CASE0]] ], [ 3, %[[CASE1]] ], [ 6, %[[CASE2]] ], [ 9, %[[CASE3]] ], [ 13, %[[CASE4]] ], [ 12, %[[ENTRY]] ]
+; OPTNOLUT-NEXT:    ret i32 [[IDX]]
+;
+; TTINOLUT-LABEL: define i32 @linear_transform_with_outlier(
+; TTINOLUT-SAME: i32 [[X:%.*]]) {
+; TTINOLUT-NEXT:  [[ENTRY:.*]]:
+; TTINOLUT-NEXT:    switch i32 [[X]], label %[[END:.*]] [
+; TTINOLUT-NEXT:      i32 0, label %[[CASE0:.*]]
+; TTINOLUT-NEXT:      i32 1, label %[[CASE1:.*]]
+; TTINOLUT-NEXT:      i32 2, label %[[CASE2:.*]]
+; TTINOLUT-NEXT:      i32 3, label %[[CASE3:.*]]
+; TTINOLUT-NEXT:      i32 4, label %[[CASE4:.*]]
+; TTINOLUT-NEXT:    ]
+; TTINOLUT:       [[CASE0]]:
+; TTINOLUT-NEXT:    br label %[[END]]
+; TTINOLUT:       [[CASE1]]:
+; TTINOLUT-NEXT:    br label %[[END]]
+; TTINOLUT:       [[CASE2]]:
+; TTINOLUT-NEXT:    br label %[[END]]
+; TTINOLUT:       [[CASE3]]:
+; TTINOLUT-NEXT:    br label %[[END]]
+; TTINOLUT:       [[CASE4]]:
+; TTINOLUT-NEXT:    br label %[[END]]
+; TTINOLUT:       [[END]]:
+; TTINOLUT-NEXT:    [[IDX:%.*]] = phi i32 [ 0, %[[CASE0]] ], [ 3, %[[CASE1]] ], [ 6, %[[CASE2]] ], [ 9, %[[CASE3]] ], [ 13, %[[CASE4]] ], [ 12, %[[ENTRY]] ]
+; TTINOLUT-NEXT:    ret i32 [[IDX]]
+;
+entry:
+  switch i32 %x, label %end [
+  i32 0, label %case0
+  i32 1, label %case1
+  i32 2, label %case2
+  i32 3, label %case3
+  i32 4, label %case4
+  ]
+
+case0:
+  br label %end
+case1:
+  br label %end
+case2:
+  br label %end
+case3:
+  br label %end
+case4:
+  br label %end
+
+end:
+  %idx = phi i32 [ 0, %case0 ], [ 3, %case1 ], [ 6, %case2 ], [ 9, %case3 ], [ 13, %case4 ], [ 12, %entry ]
+  ret i32 %idx
+}
+
+define i32 @linear_transform_no_default(i32 %x) {
+; OPTNOLUT-LABEL: define i32 @linear_transform_no_default(
+; OPTNOLUT-SAME: i32 [[X:%.*]]) {
+; OPTNOLUT-NEXT:  [[ENTRY:.*:]]
+; OPTNOLUT-NEXT:    [[SWITCH_IDX_MULT:%.*]] = mul nsw i32 [[X]], 3
+; OPTNOLUT-NEXT:    ret i32 [[SWITCH_IDX_MULT]]
+;
+; TTINOLUT-LABEL: define i32 @linear_transform_no_default(
+; TTINOLUT-SAME: i32 [[X:%.*]]) {
+; TTINOLUT-NEXT:  [[ENTRY:.*:]]
+; TTINOLUT-NEXT:    [[IDX:%.*]] = mul nsw i32 [[X]], 3
+; TTINOLUT-NEXT:    ret i32 [[IDX]]
+;
+entry:
+  switch i32 %x, label %default [
+  i32 0, label %case0
+  i32 1, label %case1
+  i32 2, label %case2
+  i32 3, label %case3
+  i32 4, label %case4
+  ]
+
+case0:
+  br label %end
+case1:
+  br label %end
+case2:
+  br label %end
+case3:
+  br label %end
+case4:
+  br label %end
+default:
+  unreachable
+
+end:
+  %idx = phi i32 [ 0, %case0 ], [ 3, %case1 ], [ 6, %case2 ], [ 9, %case3 ], [ 12, %case4 ]
+  ret i32 %idx
+}
+
+define i4 @bitmap_no_default(i32 %x) {
+; OPTNOLUT-LABEL: define i4 @bitmap_no_default(
+; OPTNOLUT-SAME: i32 [[X:%.*]]) {
+; OPTNOLUT-NEXT:  [[ENTRY:.*]]:
+; OPTNOLUT-NEXT:    switch i32 [[X]], label %[[DEFAULT:.*]] [
+; OPTNOLUT-NEXT:      i32 0, label %[[END:.*]]
+; OPTNOLUT-NEXT:      i32 1, label %[[CASE1:.*]]
+; OPTNOLUT-NEXT:      i32 2, label %[[CASE2:.*]]
+; OPTNOLUT-NEXT:      i32 3, label %[[CASE3:.*]]
+; OPTNOLUT-NEXT:    ]
+; OPTNOLUT:       [[CASE1]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[CASE2]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[CASE3]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[DEFAULT]]:
+; OPTNOLUT-NEXT:    unreachable
+; OPTNOLUT:       [[END]]:
+; OPTNOLUT-NEXT:    [[SWITCH_MASKED:%.*]] = phi i4 [ 2, %[[CASE1]] ], [ 4, %[[CASE2]] ], [ -8, %[[CASE3]] ], [ 0, %[[ENTRY]] ]
+; OPTNOLUT-NEXT:    ret i4 [[SWITCH_MASKED]]
+;
+; TTINOLUT-LABEL: define i4 @bitmap_no_default(
+; TTINOLUT-SAME: i32 [[X:%.*]]) {
+; TTINOLUT-NEXT:  [[ENTRY:.*:]]
+; TTINOLUT-NEXT:    [[SWITCH_CAST:%.*]] = trunc i32 [[X]] to i16
+; TTINOLUT-NEXT:    [[SWITCH_SHIFTAMT:%.*]] = mul nuw nsw i16 [[SWITCH_CAST]], 4
+; TTINOLUT-NEXT:    [[SWITCH_DOWNSHIFT:%.*]] = lshr i16 -31712, [[SWITCH_SHIFTAMT]]
+; TTINOLUT-NEXT:    [[IDX:%.*]] = trunc i16 [[SWITCH_DOWNSHIFT]] to i4
+; TTINOLUT-NEXT:    ret i4 [[IDX]]
+;
+entry:
+  switch i32 %x, label %default [
+  i32 0, label %case0
+  i32 1, label %case1
+  i32 2, label %case2
+  i32 3, label %case3
+  ]
+
+case0:
+  br label %end
+case1:
+  br label %end
+case2:
+  br label %end
+case3:
+  br label %end
+default:
+  unreachable
+
+end:
+  %idx = phi i4 [ 0, %case0 ], [ 2, %case1 ], [ 4, %case2 ], [ 8, %case3 ]
+  ret i4 %idx
+}
+
+define i4 @bitmap_with_default(i32 %x) {
+; OPTNOLUT-LABEL: define i4 @bitmap_with_default(
+; OPTNOLUT-SAME: i32 [[X:%.*]]) {
+; OPTNOLUT-NEXT:  [[ENTRY:.*]]:
+; OPTNOLUT-NEXT:    switch i32 [[X]], label %[[DEFAULT:.*]] [
+; OPTNOLUT-NEXT:      i32 0, label %[[END:.*]]
+; OPTNOLUT-NEXT:      i32 1, label %[[CASE1:.*]]
+; OPTNOLUT-NEXT:      i32 2, label %[[CASE2:.*]]
+; OPTNOLUT-NEXT:      i32 3, label %[[CASE3:.*]]
+; OPTNOLUT-NEXT:    ]
+; OPTNOLUT:       [[CASE1]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[CASE2]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[CASE3]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[DEFAULT]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[END]]:
+; OPTNOLUT-NEXT:    [[IDX:%.*]] = phi i4 [ 2, %[[CASE1]] ], [ 4, %[[CASE2]] ], [ -8, %[[CASE3]] ], [ -1, %[[DEFAULT]] ], [ 0, %[[ENTRY]] ]
+; OPTNOLUT-NEXT:    ret i4 [[IDX]]
+;
+; TTINOLUT-LABEL: define i4 @bitmap_with_default(
+; TTINOLUT-SAME: i32 [[X:%.*]]) {
+; TTINOLUT-NEXT:  [[ENTRY:.*]]:
+; TTINOLUT-NEXT:    [[TMP0:%.*]] = icmp ult i32 [[X]], 4
+; TTINOLUT-NEXT:    br i1 [[TMP0]], label %[[SWITCH_LOOKUP:.*]], label %[[END:.*]]
+; TTINOLUT:       [[SWITCH_LOOKUP]]:
+; TTINOLUT-NEXT:    [[SWITCH_CAST:%.*]] = trunc i32 [[X]] to i16
+; TTINOLUT-NEXT:    [[SWITCH_SHIFTAMT:%.*]] = mul nuw nsw i16 [[SWITCH_CAST]], 4
+; TTINOLUT-NEXT:    [[SWITCH_DOWNSHIFT:%.*]] = lshr i16 -31712, [[SWITCH_SHIFTAMT]]
+; TTINOLUT-NEXT:    [[SWITCH_MASKED:%.*]] = trunc i16 [[SWITCH_DOWNSHIFT]] to i4
+; TTINOLUT-NEXT:    br label %[[END]]
+; TTINOLUT:       [[END]]:
+; TTINOLUT-NEXT:    [[IDX:%.*]] = phi i4 [ [[SWITCH_MASKED]], %[[SWITCH_LOOKUP]] ], [ -1, %[[ENTRY]] ]
+; TTINOLUT-NEXT:    ret i4 [[IDX]]
+;
+entry:
+  switch i32 %x, label %default [
+  i32 0, label %case0
+  i32 1, label %case1
+  i32 2, label %case2
+  i32 3, label %case3
+  ]
+
+case0:
+  br label %end
+case1:
+  br label %end
+case2:
+  br label %end
+case3:
+  br label %end
+default:
+  br label %end
+
+end:
+  %idx = phi i4 [ 0, %case0 ], [ 2, %case1 ], [ 4, %case2 ], [ 8, %case3 ], [15, %default]
+  ret i4 %idx
+}
+
+define i32 @single_value_no_default(i32 %x) {
+; OPTNOLUT-LABEL: define i32 @single_value_no_default(
+; OPTNOLUT-SAME: i32 [[X:%.*]]) {
+; OPTNOLUT-NEXT:  [[ENTRY:.*:]]
+; OPTNOLUT-NEXT:    ret i32 2
+;
+; TTINOLUT-LABEL: define i32 @single_value_no_default(
+; TTINOLUT-SAME: i32 [[X:%.*]]) {
+; TTINOLUT-NEXT:  [[ENTRY:.*:]]
+; TTINOLUT-NEXT:    ret i32 2
+;
+entry:
+  switch i32 %x, label %default [
+  i32 0, label %case0
+  i32 1, label %case1
+  i32 2, label %case2
+  i32 3, label %case3
+  i32 4, label %case4
+  ]
+
+case0:
+  br label %end
+case1:
+  br label %end
+case2:
+  br label %end
+case3:
+  br label %end
+case4:
+  br label %end
+default:
+  unreachable
+
+end:
+  %idx = phi i32 [ 2, %case0 ], [ 2, %case1 ], [ 2, %case2 ], [ 2, %case3 ], [ 2, %case4 ]
+  ret i32 %idx
+}
+
+define i32 @single_value_withdefault(i32 %x) {
+; OPTNOLUT-LABEL: define i32 @single_value_withdefault(
+; OPTNOLUT-SAME: i32 [[X:%.*]]) {
+; OPTNOLUT-NEXT:  [[ENTRY:.*:]]
+; OPTNOLUT-NEXT:    [[TMP0:%.*]] = icmp ult i32 [[X]], 5
+; OPTNOLUT-NEXT:    [[DOT:%.*]] = select i1 [[TMP0]], i32 2, i32 3
+; OPTNOLUT-NEXT:    ret i32 [[DOT]]
+;
+; TTINOLUT-LABEL: define i32 @single_value_withdefault(
+; TTINOLUT-SAME: i32 [[X:%.*]]) {
+; TTINOLUT-NEXT:  [[ENTRY:.*:]]
+; TTINOLUT-NEXT:    [[TMP0:%.*]] = icmp ult i32 [[X]], 5
+; TTINOLUT-NEXT:    [[IDX:%.*]] = select i1 [[TMP0]], i32 2, i32 3
+; TTINOLUT-NEXT:    ret i32 [[IDX]]
+;
+entry:
+  switch i32 %x, label %default [
+  i32 0, label %case0
+  i32 1, label %case1
+  i32 2, label %case2
+  i32 3, label %case3
+  i32 4, label %case4
+  ]
+
+case0:
+  br label %end
+case1:
+  br label %end
+case2:
+  br label %end
+case3:
+  br label %end
+case4:
+  br label %end
+default:
+  br label %end
+
+end:
+  %idx = phi i32 [ 2, %case0 ], [ 2, %case1 ], [ 2, %case2 ], [ 2, %case3 ], [ 2, %case4 ], [ 3, %default ]
+  ret i32 %idx
+}
+
+define i32 @single_value_no_jump_tables(i32 %x) "no-jump-tables"="true" {
+; OPTNOLUT-LABEL: define i32 @single_value_no_jump_tables(
+; OPTNOLUT-SAME: i32 [[X:%.*]]) #[[ATTR0:[0-9]+]] {
+; OPTNOLUT-NEXT:  [[ENTRY:.*:]]
+; OPTNOLUT-NEXT:    [[TMP0:%.*]] = icmp ult i32 [[X]], 5
+; OPTNOLUT-NEXT:    [[IDX:%.*]] = select i1 [[TMP0]], i32 2, i32 3
+; OPTNOLUT-NEXT:    ret i32 [[IDX]]
+;
+; TTINOLUT-LABEL: define i32 @single_value_no_jump_tables(
+; TTINOLUT-SAME: i32 [[X:%.*]]) #[[ATTR0:[0-9]+]] {
+; TTINOLUT-NEXT:  [[ENTRY:.*:]]
+; TTINOLUT-NEXT:    [[TMP0:%.*]] = icmp ult i32 [[X]], 5
+; TTINOLUT-NEXT:    [[IDX:%.*]] = select i1 [[TMP0]], i32 2, i32 3
+; TTINOLUT-NEXT:    ret i32 [[IDX]]
+;
+entry:
+  switch i32 %x, label %default [
+  i32 0, label %case0
+  i32 1, label %case1
+  i32 2, label %case2
+  i32 3, label %case3
+  i32 4, label %case4
+  ]
+
+case0:
+  br label %end
+case1:
+  br label %end
+case2:
+  br label %end
+case3:
+  br label %end
+case4:
+  br label %end
+default:
+  br label %end
+
+end:
+  %idx = phi i32 [ 2, %case0 ], [ 2, %case1 ], [ 2, %case2 ], [ 2, %case3 ], [ 2, %case4 ], [ 3, %default ]
+  ret i32 %idx
+}
+
+define i32 @lookup_table(i32 %x) {
+; OPTNOLUT-LABEL: define i32 @lookup_table(
+; OPTNOLUT-SAME: i32 [[X:%.*]]) {
+; OPTNOLUT-NEXT:  [[ENTRY:.*]]:
+; OPTNOLUT-NEXT:    switch i32 [[X]], label %[[END:.*]] [
+; OPTNOLUT-NEXT:      i32 0, label %[[CASE0:.*]]
+; OPTNOLUT-NEXT:      i32 1, label %[[CASE1:.*]]
+; OPTNOLUT-NEXT:      i32 2, label %[[CASE2:.*]]
+; OPTNOLUT-NEXT:      i32 3, label %[[CASE3:.*]]
+; OPTNOLUT-NEXT:    ]
+; OPTNOLUT:       [[CASE0]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[CASE1]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[CASE2]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[CASE3]]:
+; OPTNOLUT-NEXT:    br label %[[END]]
+; OPTNOLUT:       [[END]]:
+; OPTNOLUT-NEXT:    [[IDX:%.*]] = phi i32 [ 13, %[[CASE0]] ], [ 3, %[[CASE1]] ], [ 11, %[[CASE2]] ], [ 8, %[[CASE3]] ], [ 24, %[[ENTRY]] ]
+; OPTNOLUT-NEXT:    ret i32 [[IDX]]
+;
+; TTINOLUT-LABEL: define i32 @lookup_table(
+; TTINOLUT-SAME: i32 [[X:%.*]]) {
+; TTINOLUT-NEXT:  [[ENTRY:.*]]:
+; TTINOLUT-NEXT:    switch i32 [[X]], label %[[END:.*]] [
+; TTINOLUT-NEXT:      i32 0, label %[[CASE0:.*]]
+; TTINOLUT-NEXT:      i32 1, label %[[CASE1:.*]]
+; TTINOLUT-NEXT:      i32 2, label %[[CASE2:.*]]
+; TTINOLUT-NEXT:      i32 3, label %[[CASE3:.*]]
+; TTINOLUT-NEXT:    ]
+; TTINOLUT:       [[CASE0]]:
+; TTINOLUT-NEXT:    br label %[[END]]
+; TTINOLUT:       [[CASE1]]:
+; TTINOLUT-NEXT:    br label %[[END]]
+; TTINOLUT:       [[CASE2]]:
+; TTINOLUT-NEXT:    br label %[[END]]
+; TTINOLUT:       [[CASE3]]:
+; TTINOLUT-NEXT:    br label %[[END]]
+; TTINOLUT:       [[END]]:
+; TTINOLUT-NEXT:    [[IDX:%.*]] = phi i32 [ 13, %[[CASE0]] ], [ 3, %[[CASE1]] ], [ 11, %[[CASE2]] ], [ 8, %[[CASE3]] ], [ 24, %[[ENTRY]] ]
+; TTINOLUT-NEXT:    ret i32 [[IDX]]
+;
+entry:
+  switch i32 %x, label %end [
+  i32 0, label %case0
+  i32 1, label %case1
+  i32 2, label %case2
+  i32 3, label %case3
+  ]
+
+c...
[truncated]

Contributor

nikic commented Sep 7, 2025

We should move the update of statistics like NumBitMaps from the analysis phase to the transform phase, to avoid over-reporting if we bail out of the transform.

Since the SwitchReplacement is created before we decide to replace the
switch, only increase the table counters when we actually create them.
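
The effect of moving the counter update can be sketched in standalone C++. Everything here is hypothetical (`Plan`, `analyze`, `transform`, and the counter are illustrative names, not the LLVM statistics machinery): the analysis step only describes the replacement, and the counter is bumped only once the transform actually goes ahead.

```cpp
#include <cassert>

// Hypothetical counter standing in for a statistic such as NumBitMaps.
static unsigned NumBitMapsCreated = 0;

// Hypothetical description of a planned replacement.
struct Plan {
  bool IsBitMap;
};

// Analysis phase: decide what the replacement would be, without counting.
Plan analyze(bool WantsBitMap) { return Plan{WantsBitMap}; }

// Transform phase: may still bail out; counts only what is really created.
bool transform(const Plan &P, bool AllowBitMaps) {
  if (P.IsBitMap && !AllowBitMaps)
    return false; // bail out before anything is counted
  if (P.IsBitMap)
    ++NumBitMapsCreated;
  return true;
}
```

If the increment lived in `analyze` instead, the bail-out path would still count a bitmap that was never emitted.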

Contributor

nikic left a comment

LGTM

OutOfCache merged commit 89d86b6 into llvm:main on Sep 9, 2025
9 checks passed
mtrofin added a commit that referenced this pull request Sep 10, 2025
The test exposes existing root profile propagation issues
mtrofin added a commit that referenced this pull request Sep 10, 2025
…157961)

The test exposes existing root profile propagation issues
nikic added a commit to nikic/llvm-project that referenced this pull request Sep 12, 2025
While we do not want to form actual lookup tables early, we do
want to perform some optimizations, as they may enable inlining
of the much simpler form.

Builds on llvm#156477, which
originally included this change as well.
nikic added a commit to nikic/llvm-project that referenced this pull request Sep 16, 2025
While we do not want to form actual lookup tables early, we do
want to perform some optimizations, as they may enable inlining
of the much simpler form.

Builds on llvm#156477, which
originally included this change as well.