Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstCombine] Canonicalize switch(X << C) into switch(X) #77068

Merged
merged 2 commits into from
Jan 5, 2024

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented Jan 5, 2024

This patch canonicalizes switch(X << C) to switch(X). If the shift may wrap, an and instruction will be created to mask out all of the shifted bits.
Alive2: https://alive2.llvm.org/ce/z/wSsL5y

NOTE: We can relax the one-use constraint. But I don't see any benefit in my benchmark.

Compile-time impact: http://llvm-compile-time-tracker.com/compare.php?from=a776740d6296520b8bde156aa3f8d9ecb32cddd9&to=6dd783b9f90ae5f258102d732953567d7e317c02&stat=instructions%3Au

stage1-O3 stage1-ReleaseThinLTO stage1-ReleaseLTO-g stage1-O0-g stage2-O3 stage2-O0-g stage2-clang
-0.00% +0.01% -0.02% -0.01% +0.02% -0.00% +0.01%

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 5, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch canonicalizes switch(X &lt;&lt; C) to switch(X). If the shift may wrap, an and instruction will be created to mask out all of the shifted bits.
Alive2: https://alive2.llvm.org/ce/z/wSsL5y

NOTE: We can relax the one-use constraint. But I don't see any benefit in my benchmark.


Full diff: https://github.com/llvm/llvm-project/pull/77068.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+27)
  • (added) llvm/test/Transforms/InstCombine/switch-shl.ll (+185)
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index f3181dc14792c8..e4a2c4b66c5b5a 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -3208,6 +3208,33 @@ Instruction *InstCombinerImpl::visitSwitchInst(SwitchInst &SI) {
     return replaceOperand(SI, 0, Op0);
   }
 
+  uint64_t ShiftAmt;
+  if (match(Cond, m_Shl(m_Value(Op0), m_ConstantInt(ShiftAmt))) &&
+      ShiftAmt < Op0->getType()->getScalarSizeInBits() &&
+      all_of(SI.cases(), [&](const auto &Case) {
+        return Case.getCaseValue()->getValue().countr_zero() >= ShiftAmt;
+      })) {
+    // Change 'switch (X << 2) case 4:' into 'switch (X) case 1:'.
+    OverflowingBinaryOperator *Shl = cast<OverflowingBinaryOperator>(Cond);
+    if (Shl->hasNoUnsignedWrap() || Shl->hasNoSignedWrap() ||
+        Shl->hasOneUse()) {
+      Value *NewCond = Op0;
+      if (!Shl->hasNoUnsignedWrap() && !Shl->hasNoSignedWrap()) {
+        // If the shift may wrap, we need to mask off the shifted bits.
+        unsigned BitWidth = Op0->getType()->getScalarSizeInBits();
+        NewCond = Builder.CreateAnd(
+            Op0, APInt::getLowBitsSet(BitWidth, BitWidth - ShiftAmt));
+      }
+      for (auto Case : SI.cases()) {
+        const APInt &CaseVal = Case.getCaseValue()->getValue();
+        APInt ShiftedCase = Shl->hasNoSignedWrap() ? CaseVal.ashr(ShiftAmt)
+                                                   : CaseVal.lshr(ShiftAmt);
+        Case.setValue(ConstantInt::get(SI.getContext(), ShiftedCase));
+      }
+      return replaceOperand(SI, 0, NewCond);
+    }
+  }
+
   KnownBits Known = computeKnownBits(Cond, 0, &SI);
   unsigned LeadingKnownZeros = Known.countMinLeadingZeros();
   unsigned LeadingKnownOnes = Known.countMinLeadingOnes();
diff --git a/llvm/test/Transforms/InstCombine/switch-shl.ll b/llvm/test/Transforms/InstCombine/switch-shl.ll
new file mode 100644
index 00000000000000..037ef37f97a31a
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/switch-shl.ll
@@ -0,0 +1,185 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+define i1 @test_switch_with_shl_mask(i32 %a) {
+; CHECK-LABEL: define i1 @test_switch_with_shl_mask(
+; CHECK-SAME: i32 [[A:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[TRUNC:%.*]] = trunc i32 [[A]] to i8
+; CHECK-NEXT:    switch i8 [[TRUNC]], label [[SW_DEFAULT:%.*]] [
+; CHECK-NEXT:      i8 0, label [[SW_BB:%.*]]
+; CHECK-NEXT:      i8 1, label [[SW_BB]]
+; CHECK-NEXT:      i8 -128, label [[SW_BB]]
+; CHECK-NEXT:    ]
+; CHECK:       sw.bb:
+; CHECK-NEXT:    ret i1 true
+; CHECK:       sw.default:
+; CHECK-NEXT:    ret i1 false
+;
+entry:
+  %b = shl i32 %a, 24
+  switch i32 %b, label %sw.default [
+  i32 0, label %sw.bb
+  i32 16777216, label %sw.bb
+  i32 2147483648, label %sw.bb
+  ]
+
+sw.bb:
+  ret i1 true
+sw.default:
+  ret i1 false
+}
+
+define i1 @test_switch_with_shl_nuw_multiuse(i32 %a) {
+; CHECK-LABEL: define i1 @test_switch_with_shl_nuw_multiuse(
+; CHECK-SAME: i32 [[A:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[B:%.*]] = shl nuw i32 [[A]], 24
+; CHECK-NEXT:    call void @use(i32 [[B]])
+; CHECK-NEXT:    switch i32 [[A]], label [[SW_DEFAULT:%.*]] [
+; CHECK-NEXT:      i32 0, label [[SW_BB:%.*]]
+; CHECK-NEXT:      i32 1, label [[SW_BB]]
+; CHECK-NEXT:      i32 128, label [[SW_BB]]
+; CHECK-NEXT:    ]
+; CHECK:       sw.bb:
+; CHECK-NEXT:    ret i1 true
+; CHECK:       sw.default:
+; CHECK-NEXT:    ret i1 false
+;
+entry:
+  %b = shl nuw i32 %a, 24
+  call void @use(i32 %b)
+  switch i32 %b, label %sw.default [
+  i32 0, label %sw.bb
+  i32 16777216, label %sw.bb
+  i32 2147483648, label %sw.bb
+  ]
+
+sw.bb:
+  ret i1 true
+sw.default:
+  ret i1 false
+}
+
+define i1 @test_switch_with_shl_nsw_multiuse(i32 %a) {
+; CHECK-LABEL: define i1 @test_switch_with_shl_nsw_multiuse(
+; CHECK-SAME: i32 [[A:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[B:%.*]] = shl nsw i32 [[A]], 24
+; CHECK-NEXT:    call void @use(i32 [[B]])
+; CHECK-NEXT:    switch i32 [[A]], label [[SW_DEFAULT:%.*]] [
+; CHECK-NEXT:      i32 0, label [[SW_BB:%.*]]
+; CHECK-NEXT:      i32 1, label [[SW_BB]]
+; CHECK-NEXT:      i32 -128, label [[SW_BB]]
+; CHECK-NEXT:    ]
+; CHECK:       sw.bb:
+; CHECK-NEXT:    ret i1 true
+; CHECK:       sw.default:
+; CHECK-NEXT:    ret i1 false
+;
+entry:
+  %b = shl nsw i32 %a, 24
+  call void @use(i32 %b)
+  switch i32 %b, label %sw.default [
+  i32 0, label %sw.bb
+  i32 16777216, label %sw.bb
+  i32 2147483648, label %sw.bb
+  ]
+
+sw.bb:
+  ret i1 true
+sw.default:
+  ret i1 false
+}
+
+; Negative tests
+
+define i1 @test_switch_with_shl_mask_multiuse(i32 %a) {
+; CHECK-LABEL: define i1 @test_switch_with_shl_mask_multiuse(
+; CHECK-SAME: i32 [[A:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[B:%.*]] = shl i32 [[A]], 24
+; CHECK-NEXT:    call void @use(i32 [[B]])
+; CHECK-NEXT:    switch i32 [[B]], label [[SW_DEFAULT:%.*]] [
+; CHECK-NEXT:      i32 0, label [[SW_BB:%.*]]
+; CHECK-NEXT:      i32 16777216, label [[SW_BB]]
+; CHECK-NEXT:      i32 -2147483648, label [[SW_BB]]
+; CHECK-NEXT:    ]
+; CHECK:       sw.bb:
+; CHECK-NEXT:    ret i1 true
+; CHECK:       sw.default:
+; CHECK-NEXT:    ret i1 false
+;
+entry:
+  %b = shl i32 %a, 24
+  call void @use(i32 %b)
+  switch i32 %b, label %sw.default [
+  i32 0, label %sw.bb
+  i32 16777216, label %sw.bb
+  i32 2147483648, label %sw.bb
+  ]
+
+sw.bb:
+  ret i1 true
+sw.default:
+  ret i1 false
+}
+
+define i1 @test_switch_with_shl_mask_unknown_shamt(i32 %a, i32 %shamt) {
+; CHECK-LABEL: define i1 @test_switch_with_shl_mask_unknown_shamt(
+; CHECK-SAME: i32 [[A:%.*]], i32 [[SHAMT:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[B:%.*]] = shl i32 [[A]], [[SHAMT]]
+; CHECK-NEXT:    switch i32 [[B]], label [[SW_DEFAULT:%.*]] [
+; CHECK-NEXT:      i32 0, label [[SW_BB:%.*]]
+; CHECK-NEXT:      i32 16777216, label [[SW_BB]]
+; CHECK-NEXT:      i32 -2147483648, label [[SW_BB]]
+; CHECK-NEXT:    ]
+; CHECK:       sw.bb:
+; CHECK-NEXT:    ret i1 true
+; CHECK:       sw.default:
+; CHECK-NEXT:    ret i1 false
+;
+entry:
+  %b = shl i32 %a, %shamt
+  switch i32 %b, label %sw.default [
+  i32 0, label %sw.bb
+  i32 16777216, label %sw.bb
+  i32 2147483648, label %sw.bb
+  ]
+
+sw.bb:
+  ret i1 true
+sw.default:
+  ret i1 false
+}
+
+define i1 @test_switch_with_shl_mask_poison(i32 %a) {
+; CHECK-LABEL: define i1 @test_switch_with_shl_mask_poison(
+; CHECK-SAME: i32 [[A:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    switch i32 poison, label [[SW_DEFAULT:%.*]] [
+; CHECK-NEXT:      i32 0, label [[SW_BB:%.*]]
+; CHECK-NEXT:      i32 16777216, label [[SW_BB]]
+; CHECK-NEXT:      i32 -2147483648, label [[SW_BB]]
+; CHECK-NEXT:    ]
+; CHECK:       sw.bb:
+; CHECK-NEXT:    ret i1 true
+; CHECK:       sw.default:
+; CHECK-NEXT:    ret i1 false
+;
+entry:
+  %b = shl i32 %a, 32
+  switch i32 %b, label %sw.default [
+  i32 0, label %sw.bb
+  i32 16777216, label %sw.bb
+  i32 2147483648, label %sw.bb
+  ]
+
+sw.bb:
+  ret i1 true
+sw.default:
+  ret i1 false
+}
+
+declare void @use(i32)

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Jan 5, 2024
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jsonn
Copy link
Contributor

jsonn commented Jan 5, 2024

I think it would make sense for the not-one-use case to check if the switch is dense enough for a jump table. In that case, the transformation should always be a win

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Jan 5, 2024

I think it would make sense for the not-one-use case to check if the switch is dense enough for a jump table. In that case, the transformation should always be a win

It has been handled by ReduceSwitchRange in SimplifyCFG. The motivation of this patch is to handle this pattern without the constraint SI->getNumCases() < 4 (The threshold will be target-specific in future. See also @nikic's comment https://github.com/llvm/llvm-project/pull/70977/files#r1394070233). So I think keeping it conservative makes sense.

@dtcxzyw dtcxzyw merged commit 1259c05 into llvm:main Jan 5, 2024
4 checks passed
@dtcxzyw dtcxzyw deleted the perf/fold-switch-shl branch January 5, 2024 17:43
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
)

This patch canonicalizes `switch(X << C)` to `switch(X)`. If the shift
may wrap, an and instruction will be created to mask out all of the
shifted bits.
Alive2: https://alive2.llvm.org/ce/z/wSsL5y

NOTE: We can relax the one-use constraint. But I don't see any benefit
in my benchmark.

Compile-time impact:
http://llvm-compile-time-tracker.com/compare.php?from=a776740d6296520b8bde156aa3f8d9ecb32cddd9&to=6dd783b9f90ae5f258102d732953567d7e317c02&stat=instructions%3Au

|stage1-O3|stage1-ReleaseThinLTO|stage1-ReleaseLTO-g|stage1-O0-g|stage2-O3|stage2-O0-g|stage2-clang|
|--|--|--|--|--|--|--|
|-0.00%|+0.01%|-0.02%|-0.01%|+0.02%|-0.00%|+0.01%|
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants