Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstCombine] Improve select equiv fold for plain condition #83405

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

vfdff
Copy link
Contributor

@vfdff vfdff commented Feb 29, 2024

Fold select if the user of its select user indicates the condition

Fix #83225

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 29, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Allen (vfdff)

Changes

Fold select if the user of its select user indicates the condition

Fix #83225


Full diff: https://github.com/llvm/llvm-project/pull/83405.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (+26)
  • (modified) llvm/test/Transforms/InstCombine/select.ll (+30)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index 71fa9b9ba41ebb..1485fe8674b720 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -500,6 +500,29 @@ static bool isSelect01(const APInt &C1I, const APInt &C2I) {
   return C1I.isOne() || C1I.isAllOnes() || C2I.isOne() || C2I.isAllOnes();
 }
 
+/// Try to simplify a select instruction when the user of its select user
+/// indicates the condition.
+static bool simplifySeqSelectWithSameCond(Value *Cond, Value *F) {
+  Value *FalseVal = F;
+  Value *CondNext;
+  Value *FalseNext;
+  while (match(FalseVal, m_Select(m_Value(CondNext), m_Value(),
+                                  m_OneUse(m_Value(FalseNext))))) {
+    if (CondNext == Cond) {
+      auto *SI = cast<SelectInst>(FalseVal);
+      LLVM_DEBUG(dbgs() << "IC: Fold " << SI << " with " << *FalseNext << '\n');
+
+      SI->replaceAllUsesWith(FalseNext);
+      SI->eraseFromParent();
+      return true;
+    }
+
+    FalseVal = FalseNext;
+  }
+
+  return false;
+}
+
 /// Try to fold the select into one of the operands to allow further
 /// optimization.
 Instruction *InstCombinerImpl::foldSelectIntoOp(SelectInst &SI, Value *TrueVal,
@@ -557,6 +580,9 @@ Instruction *InstCombinerImpl::foldSelectIntoOp(SelectInst &SI, Value *TrueVal,
   if (Instruction *R = TryFoldSelectIntoOp(SI, FalseVal, TrueVal, true))
     return R;
 
+  if (simplifySeqSelectWithSameCond(SI.getCondition(), FalseVal))
+    return &SI;
+
   return nullptr;
 }
 
diff --git a/llvm/test/Transforms/InstCombine/select.ll b/llvm/test/Transforms/InstCombine/select.ll
index c5f1b77c6d7404..068d168876a328 100644
--- a/llvm/test/Transforms/InstCombine/select.ll
+++ b/llvm/test/Transforms/InstCombine/select.ll
@@ -3672,3 +3672,33 @@ define i32 @src_select_xxory_eq0_xorxy_y(i32 %x, i32 %y) {
   %cond = select i1 %xor0, i32 %xor, i32 %y
   ret i32 %cond
 }
+
+define i32 @sequence_select_with_same_cond (i1 %c1, i1 %c2){
+; CHECK-LABEL: @sequence_select_with_same_cond(
+; CHECK-NEXT:    [[S1:%.*]] = select i1 [[C1:%.*]], i32 23, i32 45
+; CHECK-NEXT:    [[S2:%.*]] = select i1 [[C2:%.*]], i32 666, i32 [[S1]]
+; CHECK-NEXT:    [[S3:%.*]] = select i1 [[C1]], i32 789, i32 [[S2]]
+; CHECK-NEXT:    ret i32 [[S3]]
+;
+  %s1 = select i1 %c1, i32 23, i32 45
+  %s2 = select i1 %c2, i32 666, i32 %s1
+  %s3 = select i1 %c1, i32 789, i32 %s2
+  ret i32 %s3
+}
+
+declare void @use32(i32)
+
+define i32 @sequence_select_with_same_cond_extra_use (i1 %c1, i1 %c2){
+; CHECK-LABEL: @sequence_select_with_same_cond_extra_use(
+; CHECK-NEXT:    [[S1:%.*]] = select i1 [[C1:%.*]], i32 23, i32 45
+; CHECK-NEXT:    call void @use32(i32 [[S1]])
+; CHECK-NEXT:    [[S2:%.*]] = select i1 [[C2:%.*]], i32 666, i32 [[S1]]
+; CHECK-NEXT:    [[S3:%.*]] = select i1 [[C1]], i32 789, i32 [[S2]]
+; CHECK-NEXT:    ret i32 [[S3]]
+;
+  %s1 = select i1 %c1, i32 23, i32 45
+  call void @use32(i32 %s1)
+  %s2 = select i1 %c2, i32 666, i32 %s1
+  %s3 = select i1 %c1, i32 789, i32 %s2
+  ret i32 %s3
+}

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Feb 29, 2024
@vfdff vfdff requested a review from dtcxzyw February 29, 2024 11:19
@nikic
Copy link
Contributor

nikic commented Feb 29, 2024

I think this should be a minor generalization of some existing simplifyWithOpReplaced or foldSelectValueEquivalence style fold.

@vfdff
Copy link
Contributor Author

vfdff commented Mar 1, 2024

I think this should be a minor generalization of some existing simplifyWithOpReplaced or foldSelectValueEquivalence style fold.
Thanks, refactor with simplifyWithOpReplaced. (foldSelectValueEquivalence is used for select + cmp

@vfdff vfdff force-pushed the PR83225 branch 2 times, most recently from b246978 to 03d2c81 Compare March 9, 2024 04:49
while (match(ValOp, m_Select(m_Value(CondNext), m_Value(), m_Value()))) {
if (CondNext == CondVal)
if (simplifyWithOpReplaced(ValOp, CondVal, RepOp, SQ,
/* AllowRefinement */ true)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what the simplifyWithOpReplaced is doing anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted the redundant call simplifyWithOpReplaced , thanks

nikic added a commit to nikic/llvm-project that referenced this pull request Mar 18, 2024
The select equivalence fold takes a select like "X == Y ? A : B"
and then tries to simplify A based on the known equality.

This patch also uses it for the case were we have just "C ? A : B"
by treating the condition as either "C == 1" or "C != 1".

This is intended as an alternative to llvm#83405
for fixing llvm#83225.
@nikic
Copy link
Contributor

nikic commented Mar 18, 2024

FWIW, what I had in mind with my comment was more something along the lines of #85663, which generalizes the existing select equivalence fold. The disadvantage is that it does not handle the multi-use example out of the box (but could be generalized).

@vfdff
Copy link
Contributor Author

vfdff commented Mar 22, 2024

FWIW, what I had in mind with my comment was more something along the lines of #85663, which generalizes the existing select equivalence fold. The disadvantage is that it does not handle the multi-use example out of the box (but could be generalized).

hi @nikic, I saw #85663 is closed because the compile time regression, so would you suggest I continue with current solution?


SINext = cast<SelectInst>(ValOp);
ValOp = SINext->getOperand(OpIndex);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im not sure concerned about this, I think it may be slight better to continue looping until no more select and have a changed boolean.
Take a case like:

define i8 @foo(i1 %cond0, i1 %cond1, i8 %w, i8 %x, i8 %y, i8 %z, i8 %a) {
  %sel0 = select i1 %cond0, i8 %w, i8 %x
  %sel1 = select i1 %cond1, i8 %y, i8 %sel0
  %sel2 = select i1 %cond0, i8 %sel1, i8 %z
  %sel3 = select i1 %cond1, i8 %sel2, i8 %a
  ret i8 %sel3
}

I think what happens here is you replace %sel2 = select i1 %cond0, i8 %sel1, i8 %z -> %sel2 = select i1 %cond0, i8 %y, i8 %z. But after that you won't be able to replace %sel1 = select i1 %cond1, i8 %y, i8 %sel0 -> %sel1 = select i1 %cond1, i8 %y, i8 %x b.c you won't match any select arm in %sel2.

I'm not really sure if this ever comes up, I don't think its a must fix, just a thought.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thansk, need a seperate patch to improve the case https://alive2.llvm.org/ce/z/YGZPFa

@goldsteinn
Copy link
Contributor

LGTM.

@vfdff
Copy link
Contributor Author

vfdff commented Apr 3, 2024

ping ?

@goldsteinn
Copy link
Contributor

Still LGTM, ping @dtcxzyw @nikic

InstCombinerImpl &IC) {
Value *CondVal = SI.getCondition();
if (match(CondVal, m_ImmConstant()))
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it handled by simplifySelectInst?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it isn't.
#85663 is a more general improvement by nikic, and it has some compile time regression, so I continue on this solution.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (auto *CondC = dyn_cast<Constant>(Cond)) {
if (auto *TrueC = dyn_cast<Constant>(TrueVal))
if (auto *FalseC = dyn_cast<Constant>(FalseVal))
if (Constant *C = ConstantFoldSelectInstruction(CondC, TrueC, FalseC))
return C;

Can you provide a test to show that the early exit here is necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I delete the early exit checking.
I added it just because I thought it was an optimized expression and didn't need to be optimized.

Value *CondNext;
while (match(ValOp, m_Select(m_Value(CondNext), m_Value(), m_Value()))) {
if (CondNext == CondVal) {
IC.replaceOperand(*SINext, OpIndex,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is incorrect when the inner select is used by both arms.

Example: https://alive2.llvm.org/ce/z/sv6m69

define i8 @src(i1 %cond0, i1 %cond1, i8 %a, i8 %b) {
entry:
%sel0 = select i1 %cond0, i8 %a, i8 %b
%sel1 = select i1 %cond1, i8 %sel0, i8 %b
%sel2 = select i1 %cond1, i8 %sel1, i8 %b
%sel3 = select i1 %cond0, i8 %sel1, i8 %sel2
ret i8 %sel3
}

define i8 @tgt(i1 %cond0, i1 %cond1, i8 %a, i8 %b) {
entry:
%sel0 = select i1 %cond0, i8 %a, i8 %b
%sel1 = select i1 %cond1, i8 %a, i8 %b
%sel2 = select i1 %cond1, i8 %sel1, i8 %b
%sel3 = select i1 %cond0, i8 %sel1, i8 %sel2
ret i8 %sel3
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think you can get around that by just building a new select.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @dtcxzyw , the above case can be addressed correctly https://alive2.llvm.org/ce/z/HyqtF4, and it will replace the %sel2 because we find the first 3 selects match the combine pattern

  • note: here it doesn't update the %sel1, but updates the %sel2
define i8 @tgt(i1 %cond0, i1 %cond1, i8 %a, i8 %b) {
entry:
%sel0 = select i1 %cond0, i8 %a, i8 %b
%sel1 = select i1 %cond1, i8 %sel0, i8 %b
%sel2 = select i1 %cond1, i8 %sel0, i8 %b  ;   %sel1 --> %sel0
%sel3 = select i1 %cond0, i8 %sel1, i8 %sel2
ret i8 %sel3
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think you can get around that by just building a new select.
Thanks @goldsteinn , apply your comment to make the PR more robust.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need support the case with multi-arms because it may be not profitable to build a new select

define i32 @sequence_select_with_same_cond_extra_use(i1 %c1, i1 %c2){
; CHECK-LABEL: @sequence_select_with_same_cond_extra_use(
; CHECK-NEXT:    [[S1:%.*]] = select i1 [[C1:%.*]], i32 23, i32 45
; CHECK-NEXT:    [[S2_NEW:%.*]] = select i1 [[C2:%.*]], i32 666, i32 45
; CHECK-NEXT:    [[S2:%.*]] = select i1 [[C2]], i32 666, i32 [[S1]]
; CHECK-NEXT:    call void @use32(i32 [[S2]])
; CHECK-NEXT:    [[S3:%.*]] = select i1 [[C1]], i32 789, i32 [[S2_NEW]]
; CHECK-NEXT:    ret i32 [[S3]]
;
  %s1 = select i1 %c1, i32 23, i32 45
  %s2 = select i1 %c2, i32 666, i32 %s1
  call void @use32(i32 %s2)
  %s3 = select i1 %c1, i32 789, i32 %s2
  ret i32 %s3
}

@vfdff vfdff force-pushed the PR83225 branch 2 times, most recently from 967f70b to 6247957 Compare April 8, 2024 11:35
; CHECK-NEXT: [[SEL3:%.*]] = select i1 [[TMP1]], i8 [[SEL1_NEW]], i8 [[SEL4]]
; CHECK-NEXT: ret i8 [[SEL3]]
;
%sel0 = select i1 %cond0, i8 %a, i8 %b
Copy link
Contributor Author

@vfdff vfdff Apr 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first

define i8 @fist(i1 %cond0, i1 %cond1, i8 %a, i8 %b) {
%sel0 = select i1 %cond0, i8 %a, i8 %b
%sel1 = select i1 %cond1, i8 %sel0, i8 2
%sel2 = select i1 %cond1, i8 %sel0, i8 3 ; update directly base on first 3 nodes
%sel3 = select i1 %cond0, i8 %sel1, i8 %sel2
ret i8 %sel3
}

second

define i8 @second(i1 %cond0, i1 %cond1, i8 %a, i8 %b) {
  %sel0 = select i1 %cond0, i8 %a, i8 %b
  %sel1.new = select i1 %cond1, i8 %a, i8 2
  %sel2 = select i1 %cond1, i8 %sel0, i8 3
  %sel3 = select i1 %cond0, i8 %sel1.new, i8 %sel2; base on the 4 nodes
  ret i8 %sel3
}

@vfdff vfdff requested a review from dtcxzyw April 8, 2024 11:50
Comment on lines 503 to 504
/// Try to simplify a select instruction when the user of its select user
/// indicates the condition.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how to parse this comment "when the user of its select user". Can you rephrase this whole thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment with a example.

%sel2 = select i1 %cond1, i8 %sel1, i8 3
%sel3 = select i1 %cond0, i8 %sel1, i8 %sel2
ret i8 %sel3
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some tests that show flag preservation behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added sequence_select_with_same_cond_float_and_fmf_flag1 and sequence_select_with_same_cond_float_and_fmf_flag2, thanks

@vfdff
Copy link
Contributor Author

vfdff commented May 13, 2024

According performace considerations, I deleted the support with multi-use select node and rebase to top upstream as the code conflicts.

@vfdff vfdff requested a review from arsenm May 18, 2024 01:11
@vfdff
Copy link
Contributor Author

vfdff commented May 24, 2024

ping @dtcxzyw, @arsenm

@vfdff
Copy link
Contributor Author

vfdff commented Jun 1, 2024

close to avoid blocking other solutions

@vfdff vfdff closed this Jun 1, 2024
@arsenm
Copy link
Contributor

arsenm commented Jun 1, 2024

close to avoid blocking other solutions

What do you mean? If there's an alternative patch, it should be mentioned in the closing message. Otherwise, why close it

@vfdff
Copy link
Contributor Author

vfdff commented Jun 1, 2024

Oh,sorry, I reopen it now.
I close it because the solution is limit to single-use select node, which seems not perfect :)

@vfdff vfdff reopened this Jun 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[InstCombine] Missed optimization: fold select if the user of its select user indicates the condition
6 participants