Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SimplifyCFG] Add optimization for switches of powers of two #70977

Merged
merged 1 commit into from
Nov 18, 2023
Merged

[SimplifyCFG] Add optimization for switches of powers of two #70977

merged 1 commit into from
Nov 18, 2023

Conversation

DKay7
Copy link
Contributor

@DKay7 DKay7 commented Nov 1, 2023

Optimization reduces range for switches which cases are positive powers of two by replacing each case with count_trailing_zero(case).

Resolves #70756

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 1, 2023

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-transforms

Author: Daniil (DKay7)

Changes

Optimization reduces range for switches which cases are positive powers of two by replacing each case with count_trailing_zero(case).

Resolves #70756


Full diff: https://github.com/llvm/llvm-project/pull/70977.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/SimplifyCFG.cpp (+73-3)
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 68b5b1a78a3460e..5e44015e1e93595 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -50,6 +50,7 @@
 #include "llvm/IR/Instruction.h"
 #include "llvm/IR/Instructions.h"
 #include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/Intrinsics.h"
 #include "llvm/IR/LLVMContext.h"
 #include "llvm/IR/MDBuilder.h"
 #include "llvm/IR/Metadata.h"
@@ -6792,9 +6793,6 @@ static bool ReduceSwitchRange(SwitchInst *SI, IRBuilder<> &Builder,
 
   // This transform can be done speculatively because it is so cheap - it
   // results in a single rotate operation being inserted.
-  // FIXME: It's possible that optimizing a switch on powers of two might also
-  // be beneficial - flag values are often powers of two and we could use a CLZ
-  // as the key function.
 
   // countTrailingZeros(0) returns 64. As Values is guaranteed to have more than
   // one element and LLVM disallows duplicate cases, Shift is guaranteed to be
@@ -6839,6 +6837,75 @@ static bool ReduceSwitchRange(SwitchInst *SI, IRBuilder<> &Builder,
   return true;
 }
 
+static bool isSwitchOfPowersOfTwo(ArrayRef<int64_t> Values) {
+  for (auto &Value : Values) {
+    if (Value <= 0 || (Value & (Value - 1)) != 0)
+      return false;
+  }
+
+  return true;
+}
+
+static bool simplifySwitchOfPowersOfTwo(SwitchInst *SI, IRBuilder<> &Builder,
+                                        const DataLayout &DL) {
+
+  auto *CondTy = cast<IntegerType>(SI->getCondition()->getType());
+
+  if (CondTy->getIntegerBitWidth() > 64 ||
+      !DL.fitsInLegalInteger(CondTy->getIntegerBitWidth()))
+    return false;
+
+  // Only bother with this optimization if there are more than 3 switch cases;
+  // SDAG will only bother creating jump tables for 4 or more cases.
+  if (SI->getNumCases() < 4)
+    return false;
+
+  SmallVector<int64_t, 4> Values;
+  for (const auto &Case : SI->cases())
+    Values.push_back(Case.getCaseValue()->getValue().getSExtValue());
+
+  if (!isSwitchOfPowersOfTwo(Values))
+    return false;
+
+  Builder.SetInsertPoint(SI);
+
+  auto *Condition = SI->getCondition();
+  auto &Context = SI->getContext();
+  Function *Cttz =
+      Intrinsic::getDeclaration(SI->getModule(), Intrinsic::cttz, {CondTy});
+
+  // FIXME we can also perform this optimization only for switches with
+  // unreachable default case.
+  // This assumtion will save us from checking if `Condition` is a power of two
+
+  // checking if switch condition is a power of two. If not, just set its number
+  // of trailing zeros to 0 since `case 0` is forbidden it will lead to jumping
+  // to default case
+  // condition & (condition - 1) result in zero if condition is a power of two
+  auto *Sub = Builder.CreateSub(Condition, ConstantInt::get(CondTy, 1));
+  auto *And = Builder.CreateAnd(Condition, Sub);
+  auto *Cmp = Builder.CreateICmpNE(And, ConstantInt::get(CondTy, 0));
+
+  // if condition is not a power of two, we add 1 to it to make set its number
+  // of trailing zeros to 0. and case 0 is forbidden in this optimization.
+  auto *Add = Builder.CreateOr(Condition, Builder.CreateZExt(Cmp, CondTy));
+
+  // FIXME maybe we should check if cttz intrinsic is cheap on the target
+  // architecture
+  auto *ResultTrailingZeros = Builder.CreateCall(
+      Cttz, {Add, ConstantInt::get(Type::getInt1Ty(Context), 0)});
+  SI->replaceUsesOfWith(Condition, ResultTrailingZeros);
+
+  // Replace each case with its trailing zeros number
+  for (auto &Case : SI->cases()) {
+    auto *OrigValue = Case.getCaseValue();
+    Case.setValue(cast<ConstantInt>(ConstantInt::get(
+        OrigValue->getType(), OrigValue->getValue().countr_zero())));
+  }
+
+  return true;
+}
+
 bool SimplifyCFGOpt::simplifySwitch(SwitchInst *SI, IRBuilder<> &Builder) {
   BasicBlock *BB = SI->getParent();
 
@@ -6886,6 +6953,9 @@ bool SimplifyCFGOpt::simplifySwitch(SwitchInst *SI, IRBuilder<> &Builder) {
       SwitchToLookupTable(SI, Builder, DTU, DL, TTI))
     return requestResimplify();
 
+  if (simplifySwitchOfPowersOfTwo(SI, Builder, DL))
+    return requestResimplify();
+
   if (ReduceSwitchRange(SI, Builder, DL, TTI))
     return requestResimplify();
 

@dtcxzyw dtcxzyw changed the title Added optimization for switches of powers of two [SimplifyCFG] Added optimization for switches of powers of two Nov 2, 2023
@dtcxzyw
Copy link
Member

dtcxzyw commented Nov 2, 2023

Please add some regression tests (including negative tests, e.g., switch with non-unreachable default case).

@dtcxzyw dtcxzyw requested review from nikic and DianQK November 2, 2023 05:38
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
Copy link
Member

@DianQK DianQK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add one positive test case first.

Failed Tests (1):
--
  | LLVM :: CodeGen/AArch64/switch-unreachable-default.ll

llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
@dtcxzyw dtcxzyw changed the title [SimplifyCFG] Added optimization for switches of powers of two [SimplifyCFG] Add optimization for switches of powers of two Nov 2, 2023
@DKay7
Copy link
Contributor Author

DKay7 commented Nov 9, 2023

I added all your suggestions and also wrote a test with both positive and negative cases. But what should I do with a failed test in the pipeline? Thanks.

@DianQK
Copy link
Member

DianQK commented Nov 10, 2023

I added all your suggestions and also wrote a test with both positive and negative cases. But what should I do with a failed test in the pipeline? Thanks.

Possibly you could change i32 64, label %bb5 to i32 128, label %bb5. But I'm not sure that's okay.

@DKay7
Copy link
Contributor Author

DKay7 commented Nov 10, 2023

I added all your suggestions and also wrote a test with both positive and negative cases. But what should I do with a failed test in the pipeline? Thanks.

Possibly you could change i32 64, label %bb5 to i32 128, label %bb5. But I'm not sure that's okay.

Well, it doesn't help for switch-unreachable-default.ll. Main problem with this test is that it was designed as described in it:

; The switch is lowered with a jump table for cases 1--32 and case 64 handled
; separately. Even though the default of the switch is unreachable, the
; fall-through for the jump table *is* reachable so the range check must be
; emitted.

And proposed optimization is able to lower the entire switch to one jump table. There's one way to fix it: I can replace i32 64, label %bb5 with something negative, for example: i32 -64, label %bb5. Then, proposed optimization won't do anything and test will be passed.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this transform be part of SwitchToLookupTable()? I'm not sure it makes sense to do this independently, if we can't form a lookup table.

@DKay7
Copy link
Contributor Author

DKay7 commented Nov 10, 2023

Shouldn't this transform be part of SwitchToLookupTable()? I'm not sure it makes sense to do this independently, if we can't form a lookup table.

I decided to create separated function to allow another optimizations such as ReduceSwitchRange to subsequently affect switch-case before jump table is created.

@DKay7 DKay7 requested a review from dtcxzyw November 12, 2023 16:52
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Show resolved Hide resolved
@DKay7 DKay7 requested a review from dtcxzyw November 12, 2023 22:32
@DKay7 DKay7 requested a review from dtcxzyw November 13, 2023 18:01
@DKay7 DKay7 requested a review from dtcxzyw November 14, 2023 12:46
@DKay7 DKay7 requested a review from dtcxzyw November 14, 2023 15:02
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved

// Only bother with this optimization if there are more than 3 switch cases;
// SDAG will only bother creating jump tables for 4 or more cases.
if (SI->getNumCases() < 4)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth noting that AArch64 started using a larger limit very recently (#71166). But I think they can add a TTI hook for this if it becomes a problem for them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will post a patch later.

@DKay7 DKay7 requested a review from dtcxzyw November 15, 2023 19:06
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks good, but I think some paths aren't covered by tests yet.

llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Show resolved Hide resolved
@DKay7 DKay7 requested a review from dtcxzyw November 17, 2023 18:09
Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you!
Please wait for additional approval from other reviewers.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as well.

llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Utils/SimplifyCFG.cpp Outdated Show resolved Hide resolved
Optimization reduces range for switches which cases are positive powers of two by replacing each case with count_trailing_zero(case).
Also, this optimization is performed only for switches with default case unreachable

Resolves #70756
Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I will merge this PR after it passes the CI.

@dtcxzyw dtcxzyw merged commit 424c424 into llvm:main Nov 18, 2023
3 checks passed
sr-tream pushed a commit to sr-tream/llvm-project that referenced this pull request Nov 20, 2023
)

Optimization reduces the range for switches whose cases are positive powers
of two by replacing each case with count_trailing_zero(case).

Resolves llvm#70756
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
)

Optimization reduces the range for switches whose cases are positive powers
of two by replacing each case with count_trailing_zero(case).

Resolves llvm#70756
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reducing switch range on powers of two
5 participants