Skip to content

Commit

Permalink
[X86] condition branches folding for three-way conditional codes
Browse files Browse the repository at this point in the history
This patch implements a pass that optimizes condition branches on x86 by
taking advantage of the three-way conditional code generated by compare
instructions.

Currently, it tries to hoisting EQ and NE conditional branch to a dominant
conditional branch condition where the same EQ/NE conditional code is
computed. An example:
bb_0:
  cmp %0, 19
  jg bb_1
  jmp bb_2
bb_1:
  cmp %0, 40
  jg bb_3
  jmp bb_4
bb_4:
  cmp %0, 20
  je bb_5
  jmp bb_6
Here we could combine the two compares in bb_0 and bb_4 and have the
following code:

bb_0:
  cmp %0, 20
  jg bb_1
  jl bb_2
  jmp bb_5
bb_1:
  cmp %0, 40
  jg bb_3
  jmp bb_6

For the case of %0 == 20 (bb_5), we eliminate two jumps, and the control height
for bb_6 is also reduced. bb_4 is gone after the optimization.

This optimization is motivated by the branch pattern generated by the switch
lowering: we always have pivot-1 compare for the inner nodes and we do a pivot
compare again the leaf (like above pattern).

This pass currently is enabled on Intel's Sandybridge and later arches. Some
reviewers pointed out that on some arches (like AMD Jaguar), this pass may
increase branch density to the point where it hurts the performance of the
branch predictor.

Differential Revision: https://reviews.llvm.org/D46662

llvm-svn: 343993
  • Loading branch information
xur-llvm committed Oct 8, 2018
1 parent 823549a commit 67b1b32
Show file tree
Hide file tree
Showing 9 changed files with 947 additions and 0 deletions.
1 change: 1 addition & 0 deletions llvm/lib/Target/X86/CMakeLists.txt
Expand Up @@ -27,6 +27,7 @@ set(sources
X86CallingConv.cpp
X86CallLowering.cpp
X86CmovConversion.cpp
X86CondBrFolding.cpp
X86DomainReassignment.cpp
X86ExpandPseudo.cpp
X86FastISel.cpp
Expand Down
3 changes: 3 additions & 0 deletions llvm/lib/Target/X86/X86.h
Expand Up @@ -75,6 +75,9 @@ FunctionPass *createX86OptimizeLEAs();
/// Return a pass that transforms setcc + movzx pairs into xor + setcc.
FunctionPass *createX86FixupSetCC();

/// Return a pass that folds conditional branch jumps.
FunctionPass *createX86CondBrFolding();

/// Return a pass that avoids creating store forward block issues in the hardware.
FunctionPass *createX86AvoidStoreForwardingBlocks();

Expand Down
7 changes: 7 additions & 0 deletions llvm/lib/Target/X86/X86.td
Expand Up @@ -404,6 +404,12 @@ def FeatureFastBEXTR : SubtargetFeature<"fast-bextr", "HasFastBEXTR", "true",
"Indicates that the BEXTR instruction is implemented as a single uop "
"with good throughput.">;

// Merge branches using three-way conditional code.
def FeatureMergeToThreeWayBranch : SubtargetFeature<"merge-to-threeway-branch",
"ThreewayBranchProfitable", "true",
"Merge branches to a three-way "
"conditional branch">;

//===----------------------------------------------------------------------===//
// Register File Description
//===----------------------------------------------------------------------===//
Expand Down Expand Up @@ -732,6 +738,7 @@ def SNBFeatures : ProcessorFeatures<[], [
FeatureFastScalarFSQRT,
FeatureFastSHLDRotate,
FeatureSlowIncDec,
FeatureMergeToThreeWayBranch,
FeatureMacroFusion
]>;

Expand Down

0 comments on commit 67b1b32

Please sign in to comment.