-
Notifications
You must be signed in to change notification settings - Fork 14.8k
[InstCombine] Skip foldFBinOpOfIntCastsFromSign for vector ops #162804
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms Author: Nikita Popov (nikic) ChangesConverting a vector float op into a vector int op may be non-profitable, especially for targets where the float op for a given type is legal, but the integer op is not. We could of course also try to address this via a reverse transform in the backend, but I don't think it's worth the bother, given that vectors were never the intended use case for this transform in the first place. Fixes #162749. Full diff: https://github.com/llvm/llvm-project/pull/162804.diff 3 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index d56a1af49ef32..b012a8fd47bec 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -1690,6 +1690,11 @@ Instruction *InstCombinerImpl::foldFBinOpOfIntCastsFromSign(
// 2) (fp_binop ({s|u}itofp x), FpC)
// -> ({s|u}itofp (int_binop x, (fpto{s|u}i FpC)))
Instruction *InstCombinerImpl::foldFBinOpOfIntCasts(BinaryOperator &BO) {
+ // Don't perform the fold on vectors, as the integer operation may be much
+ // more expensive than the float operation in that case.
+ if (BO.getType()->isVectorTy())
+ return nullptr;
+
std::array<Value *, 2> IntOps = {nullptr, nullptr};
Constant *Op1FpC = nullptr;
// Check for:
diff --git a/llvm/test/Transforms/InstCombine/add-sitofp.ll b/llvm/test/Transforms/InstCombine/add-sitofp.ll
index fae1365dfa853..e1d39fdb8b871 100644
--- a/llvm/test/Transforms/InstCombine/add-sitofp.ll
+++ b/llvm/test/Transforms/InstCombine/add-sitofp.ll
@@ -99,12 +99,15 @@ define float @test_3(i32 %a, i32 %b) {
ret float %p
}
+; Don't perform the fold on vector operations, as the integer op may be
+; much more expensive than the float op in that case.
define <4 x double> @test_4(<4 x i32> %a, <4 x i32> %b) {
; CHECK-LABEL: @test_4(
; CHECK-NEXT: [[A_AND:%.*]] = and <4 x i32> [[A:%.*]], splat (i32 1073741823)
; CHECK-NEXT: [[B_AND:%.*]] = and <4 x i32> [[B:%.*]], splat (i32 1073741823)
-; CHECK-NEXT: [[TMP1:%.*]] = add nuw nsw <4 x i32> [[A_AND]], [[B_AND]]
-; CHECK-NEXT: [[RES:%.*]] = uitofp nneg <4 x i32> [[TMP1]] to <4 x double>
+; CHECK-NEXT: [[A_AND_FP:%.*]] = uitofp nneg <4 x i32> [[A_AND]] to <4 x double>
+; CHECK-NEXT: [[B_AND_FP:%.*]] = uitofp nneg <4 x i32> [[B_AND]] to <4 x double>
+; CHECK-NEXT: [[RES:%.*]] = fadd <4 x double> [[A_AND_FP]], [[B_AND_FP]]
; CHECK-NEXT: ret <4 x double> [[RES]]
;
; Drop two highest bits to guarantee that %a + %b doesn't overflow
diff --git a/llvm/test/Transforms/InstCombine/binop-itofp.ll b/llvm/test/Transforms/InstCombine/binop-itofp.ll
index 702bbbbf7d176..57184ea54583a 100644
--- a/llvm/test/Transforms/InstCombine/binop-itofp.ll
+++ b/llvm/test/Transforms/InstCombine/binop-itofp.ll
@@ -1063,6 +1063,25 @@ define float @negzero_check_on_constant_for_si_fmul(i1 %c, i1 %.b, ptr %g_2345)
ret float %mul3.i.i
}
+; Don't perform the fold on vector operations, as the integer op may be
+; much more expensive than the float op in that case.
+define <2 x half> @test_ui_ui_i8_mul_vec(<2 x i8> noundef %x_in, <2 x i8> noundef %y_in) {
+; CHECK-LABEL: @test_ui_ui_i8_mul_vec(
+; CHECK-NEXT: [[X:%.*]] = and <2 x i8> [[X_IN:%.*]], splat (i8 15)
+; CHECK-NEXT: [[Y:%.*]] = and <2 x i8> [[Y_IN:%.*]], splat (i8 15)
+; CHECK-NEXT: [[XF:%.*]] = uitofp nneg <2 x i8> [[X]] to <2 x half>
+; CHECK-NEXT: [[YF:%.*]] = uitofp nneg <2 x i8> [[Y]] to <2 x half>
+; CHECK-NEXT: [[R:%.*]] = fmul <2 x half> [[XF]], [[YF]]
+; CHECK-NEXT: ret <2 x half> [[R]]
+;
+ %x = and <2 x i8> %x_in, splat (i8 15)
+ %y = and <2 x i8> %y_in, splat (i8 15)
+ %xf = uitofp <2 x i8> %x to <2 x half>
+ %yf = uitofp <2 x i8> %y to <2 x half>
+ %r = fmul <2 x half> %xf, %yf
+ ret <2 x half> %r
+}
+
define <2 x float> @nonzero_check_on_constant_for_si_fmul_vec_w_poison(i1 %c, i1 %.b, ptr %g_2345) {
; CHECK-LABEL: @nonzero_check_on_constant_for_si_fmul_vec_w_poison(
; CHECK-NEXT: [[SEL:%.*]] = select i1 [[C:%.*]], i32 65529, i32 53264
@@ -1091,8 +1110,9 @@ define <2 x float> @nonzero_check_on_constant_for_si_fmul_nz_vec_w_poison(i1 %c,
; CHECK-NEXT: [[CONV_I_V:%.*]] = insertelement <2 x i16> poison, i16 [[CONV_I_S]], i64 0
; CHECK-NEXT: [[CONV_I:%.*]] = shufflevector <2 x i16> [[CONV_I_V]], <2 x i16> poison, <2 x i32> zeroinitializer
; CHECK-NEXT: [[MUL3_I_I:%.*]] = sitofp <2 x i16> [[CONV_I]] to <2 x float>
+; CHECK-NEXT: [[MUL3_I_I1:%.*]] = fmul <2 x float> [[MUL3_I_I]], <float poison, float 1.000000e+00>
; CHECK-NEXT: store i32 [[SEL]], ptr [[G_2345:%.*]], align 4
-; CHECK-NEXT: ret <2 x float> [[MUL3_I_I]]
+; CHECK-NEXT: ret <2 x float> [[MUL3_I_I1]]
;
%sel = select i1 %c, i32 65529, i32 53264
%conv.i.s = trunc i32 %sel to i16
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - we could just move the combine entirely to vectorcombine (even though we would probably handle scalar and vector cases) and make it cost based - but this seems the better option until someone identifies a strong need for it.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/25629 Here is the relevant piece of the build log for the reference
|
…162804) Converting a vector float op into a vector int op may be non-profitable, especially for targets where the float op for a given type is legal, but the integer op is not. We could of course also try to address this via a reverse transform in the backend, but I don't think it's worth the bother, given that vectors were never the intended use case for this transform in the first place. Fixes llvm#162749.
…162804) Converting a vector float op into a vector int op may be non-profitable, especially for targets where the float op for a given type is legal, but the integer op is not. We could of course also try to address this via a reverse transform in the backend, but I don't think it's worth the bother, given that vectors were never the intended use case for this transform in the first place. Fixes llvm#162749.
Converting a vector float op into a vector int op may be non-profitable, especially for targets where the float op for a given type is legal, but the integer op is not.
We could of course also try to address this via a reverse transform in the backend, but I don't think it's worth the bother, given that vectors were never the intended use case for this transform in the first place.
Fixes #162749.