[InstCombine] Allow folding cross-lane operations into PHIs/selects #164388
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4003,18 +4003,29 @@ Instruction *InstCombinerImpl::visitCallInst(CallInst &CI) { | |
|
|
||
| // Try to fold intrinsic into select/phi operands. This is legal if: | ||
| // * The intrinsic is speculatable. | ||
| - // * The select condition is not a vector, or the intrinsic does not | ||
| - // perform cross-lane operations. | ||
| - if (isSafeToSpeculativelyExecuteWithVariableReplaced(&CI) && | ||
| - isNotCrossLaneOperation(II)) | ||
| + // * The operand is one of the following: | ||
| + // - a phi. | ||
| + // - a select with a scalar condition. | ||
| + // - a select with a vector condition and II is not a cross lane operation. | ||
| + if (isSafeToSpeculativelyExecuteWithVariableReplaced(&CI)) { | ||
| for (Value *Op : II->args()) { | ||
| - if (auto *Sel = dyn_cast<SelectInst>(Op)) | ||
| - if (Instruction *R = FoldOpIntoSelect(*II, Sel)) | ||
| + if (auto *Sel = dyn_cast<SelectInst>(Op)) { | ||
| + bool IsVectorCond = Sel->getCondition()->getType()->isVectorTy(); | ||
| + if (IsVectorCond && !isNotCrossLaneOperation(II)) | ||
| + continue; | ||
| + // Don't replace a scalar select with a more expensive vector select if | ||
| + // we can't simplify both arms of the select. | ||
| + bool SimplifyBothArms = | ||
| + !Op->getType()->isVectorTy() && II->getType()->isVectorTy(); | ||
| + if (Instruction *R = FoldOpIntoSelect( | ||
| + *II, Sel, /*FoldWithMultiUse=*/false, SimplifyBothArms)) | ||
| return R; | ||
| + } | ||
|
Some overflow intrinsic calls (e.g., See https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2962/files.
Added some test cases, I don't think it's entirely unexpected as So now
||
| if (auto *Phi = dyn_cast<PHINode>(Op)) | ||
| if (Instruction *R = foldOpIntoPhi(*II, Phi)) | ||
| return R; | ||
| } | ||
| + } | ||
|
|
||
| if (Instruction *Shuf = foldShuffledIntrinsicOperands(II)) | ||
| return Shuf; | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -115,10 +115,10 @@ define float @test_select_frexp_no_const(float %x, float %y, i1 %cond) { | |
| define i32 @test_select_frexp_extract_exp(float %x, i1 %cond) { | ||
| ; CHECK-LABEL: define i32 @test_select_frexp_extract_exp( | ||
| ; CHECK-SAME: float [[X:%.*]], i1 [[COND:%.*]]) { | ||
| - ; CHECK-NEXT: [[SEL:%.*]] = select i1 [[COND]], float 1.000000e+00, float [[X]] | ||
| - ; CHECK-NEXT: [[FREXP:%.*]] = call { float, i32 } @llvm.frexp.f32.i32(float [[SEL]]) | ||
| + ; CHECK-NEXT: [[FREXP:%.*]] = call { float, i32 } @llvm.frexp.f32.i32(float [[X]]) | ||
| ; CHECK-NEXT: [[FREXP_1:%.*]] = extractvalue { float, i32 } [[FREXP]], 1 | ||
| - ; CHECK-NEXT: ret i32 [[FREXP_1]] | ||
| + ; CHECK-NEXT: [[FREXP_2:%.*]] = select i1 [[COND]], i32 1, i32 [[FREXP_1]] | ||
| + ; CHECK-NEXT: ret i32 [[FREXP_2]] | ||
| ; | ||
| %sel = select i1 %cond, float 1.000000e+00, float %x | ||
| %frexp = call { float, i32 } @llvm.frexp.f32.i32(float %sel) | ||
|
|
@@ -132,7 +132,7 @@ define float @test_select_frexp_fast_math_select(float %x, i1 %cond) { | |
| ; CHECK-SAME: float [[X:%.*]], i1 [[COND:%.*]]) { | ||
| ; CHECK-NEXT: [[FREXP1:%.*]] = call { float, i32 } @llvm.frexp.f32.i32(float [[X]]) | ||
| ; CHECK-NEXT: [[MANTISSA:%.*]] = extractvalue { float, i32 } [[FREXP1]], 0 | ||
| - ; CHECK-NEXT: [[SELECT_FREXP:%.*]] = select nnan ninf nsz i1 [[COND]], float 5.000000e-01, float [[MANTISSA]] | ||
| + ; CHECK-NEXT: [[SELECT_FREXP:%.*]] = select i1 [[COND]], float 5.000000e-01, float [[MANTISSA]] | ||
|
The FMFs end up getting dropped here as this was previously handled by
||
| ; CHECK-NEXT: ret float [[SELECT_FREXP]] | ||
| ; | ||
| %sel = select nnan ninf nsz i1 %cond, float 1.000000e+00, float %x | ||
|
|
||
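The updated `frexp` checks above move the select from the intrinsic's input to its result. A minimal sketch of why the two forms agree, assuming `std::frexp` mirrors `llvm.frexp` for finite inputs (the helper names are hypothetical and exist only for illustration; NaN and infinity are left out because the returned exponent is unspecified there):

```cpp
#include <cmath>

// Hypothetical helpers, not part of the patch.

// "Before" shape: select on the input, then extract the exponent.
int expSelectThenFrexp(float X, bool Cond) {
  float Sel = Cond ? 1.0f : X;
  int Exp = 0;
  std::frexp(Sel, &Exp);
  return Exp;
}

// "After" shape: frexp runs unconditionally on X and the select picks
// between the constant-folded exponent of 1.0 (which is 1, since
// 1.0 == 0.5 * 2^1) and the computed exponent.
int expFrexpThenSelect(float X, bool Cond) {
  int Exp = 0;
  std::frexp(X, &Exp);
  return Cond ? 1 : Exp;
}

// The mantissa of 1.0 is 0.5, matching the `float 5.000000e-01` arm in the
// fast-math test.
float mantissaOfOne() {
  int Exp = 0;
  return std::frexp(1.0f, &Exp);
}
```

The fold is only valid because `frexp` is speculatable: the "after" form evaluates it on `X` even when `Cond` is true.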
Can you please point out the test coverage for this special case?
It was due to regressions in `LoopVectorize/AArch64`, as a `get.active.lane.mask(%base, select %cond, %n, 0)` was replaced with a select between the mask and `zeroinitializer`, which is a more expensive operation. I'll add a negative test in `intrinsic-select.ll`.
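The per-operand rules in the `visitCallInst` hunk, including this special case, can be modeled as a small decision function. This is a simplified standalone sketch, not LLVM code: `FoldQuery`, `FoldDecision`, and `decide` are hypothetical names, and the boolean fields stand in for the type and intrinsic queries the patch performs.

```cpp
// Hypothetical stand-in for the properties the patch inspects.
struct FoldQuery {
  bool Speculatable;       // isSafeToSpeculativelyExecuteWithVariableReplaced
  bool OperandIsPhi;       // operand is a PHINode
  bool OperandIsSelect;    // operand is a SelectInst
  bool SelectCondIsVector; // select condition has vector type
  bool CrossLane;          // !isNotCrossLaneOperation(II)
  bool OperandIsVector;    // operand (select) type is a vector
  bool ResultIsVector;     // intrinsic result type is a vector
};

enum class FoldDecision {
  Skip,            // do not attempt a fold for this operand
  TryFold,         // attempt the fold
  TryFoldBothArms  // attempt only if both select arms simplify
};

FoldDecision decide(const FoldQuery &Q) {
  if (!Q.Speculatable)
    return FoldDecision::Skip;
  if (Q.OperandIsSelect) {
    // A vector condition picks per lane, so a cross-lane intrinsic
    // cannot be pushed through it.
    if (Q.SelectCondIsVector && Q.CrossLane)
      return FoldDecision::Skip;
    // Scalar select feeding a vector-producing intrinsic: folding would
    // create a vector select, so require both arms to simplify.
    if (!Q.OperandIsVector && Q.ResultIsVector)
      return FoldDecision::TryFoldBothArms;
    return FoldDecision::TryFold;
  }
  if (Q.OperandIsPhi)
    return FoldDecision::TryFold;
  return FoldDecision::Skip;
}
```

Under this model, the `get.active.lane.mask` case from the comment (scalar select operand, vector result) lands in `TryFoldBothArms`, so the fold is rejected when one arm stays non-constant.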