[LoongArch] Legalize BUILD_VECTOR into a broadcast when all non-undef elements are identical #169755
Open: zhaoqi5 wants to merge 2 commits into users/zhaoqi5/tests-buildvector-with-undef-vrepl from users/zhaoqi5/opt-buildvector-with-undef-vrepl
… elements are identical

When a BUILD_VECTOR consists of the same element (ignoring undefs), it is better to emit a broadcast than multiple insertions.

Some floating-point cases suffer performance regressions with a broadcast, so those specific cases are excluded in this commit:
- only one element is non-undef,
- only two elements are non-undef, and one of them is at index 0,
- for the v8f32 vector type, the only two non-undefs are at index (1,2)/(1,3)/(2,3).
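To make the "same element, ignoring undefs" condition concrete, here is a minimal standalone sketch. It is not LLVM code and does not reproduce how `getRepeatedSequence` works; it only models the length-1 case that this patch keys on. The function name `commonDefinedElement` and the use of `std::nullopt` to stand in for an undef element are assumptions made for illustration.

```cpp
#include <optional>
#include <vector>

// Hypothetical model: std::nullopt plays the role of an undef element.
// Returns the shared value when every defined element is identical.
// Returns std::nullopt when two defined elements differ, or when all
// elements are undef (there is nothing to broadcast in that case).
std::optional<int>
commonDefinedElement(const std::vector<std::optional<int>> &elts) {
  std::optional<int> common;
  for (const auto &e : elts) {
    if (!e)
      continue; // undef elements never block a broadcast
    if (!common)
      common = e; // first defined element becomes the candidate
    else if (*common != *e)
      return std::nullopt; // two different defined values: not a splat
  }
  return common;
}
```

In this model, a vector such as `{undef, 7, undef, 7}` is a broadcast candidate with value 7, while `{1, 2}` is not.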
@llvm/pr-subscribers-backend-loongarch
Author: ZhaoQi (zhaoqi5)
Changes: When a BUILD_VECTOR consists of the same element (ignoring undefs), … Some floating-point cases suffer performance regressions, those …
Full diff: https://github.com/llvm/llvm-project/pull/169755.diff 6 Files Affected:
diff --git a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
index 3ad5f7fa9e2a7..1471a11f22c38 100644
--- a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
@@ -3088,6 +3088,36 @@ SDValue LoongArchTargetLowering::lowerBUILD_VECTOR(SDValue Op,
}
if (!IsConstant) {
+ SmallVector<SDValue> Sequence;
+ BitVector UndefElements;
+ bool IsRepeated = Node->getRepeatedSequence(Sequence, &UndefElements);
+ unsigned SeqLen = Sequence.size();
+ unsigned NumUndefElts = UndefElements.count();
+
+ // When a BUILD_VECTOR consists of the same element (ignoring undefs),
+ // prefer emitting a broadcast instead of multiple insertions.
+ if (IsRepeated && SeqLen == 1) {
+ // Integer vectors benefit from splat unconditionally.
+ if (ResTy.isInteger())
+ return DAG.getSplatBuildVector(ResTy, DL, Sequence[0]);
+
+ // Only certain floating-point cases suffer performance regressions,
+ // exclude those specific cases.
+ bool IsSplatBetter = true;
+ if (NumUndefElts == NumElts - 1)
+ IsSplatBetter = false;
+ if (NumUndefElts == NumElts - 2 && !UndefElements[0])
+ IsSplatBetter = false;
+ if (ResTy == MVT::v8f32 && NumUndefElts == NumElts - 2 &&
+ ((!UndefElements[1] && !UndefElements[2]) ||
+ (!UndefElements[1] && !UndefElements[3]) ||
+ (!UndefElements[2] && !UndefElements[3])))
+ IsSplatBetter = false;
+
+ if (IsSplatBetter)
+ return DAG.getSplatBuildVector(ResTy, DL, Sequence[0]);
+ }
+
// If the BUILD_VECTOR has a repeated pattern, use INSERT_VECTOR_ELT to fill
// the sub-sequence of the vector and then broadcast the sub-sequence.
//
@@ -3095,10 +3125,7 @@ SDValue LoongArchTargetLowering::lowerBUILD_VECTOR(SDValue Op,
// back to use INSERT_VECTOR_ELT to materialize the vector, because it
// generates worse code in some cases. This could be further optimized
// with more consideration.
- SmallVector<SDValue> Sequence;
- BitVector UndefElements;
- if (Node->getRepeatedSequence(Sequence, &UndefElements) &&
- UndefElements.count() == 0) {
+ if (IsRepeated && NumUndefElts == 0) {
// Using LSX instructions to fill the sub-sequence of 256-bits vector,
// because the high part can be simply treated as undef.
SDValue Vector = DAG.getUNDEF(ResTy);
@@ -3110,7 +3137,6 @@ SDValue LoongArchTargetLowering::lowerBUILD_VECTOR(SDValue Op,
fillVector(Sequence, DAG, DL, Subtarget, FillVec, FillTy);
- unsigned SeqLen = Sequence.size();
unsigned SplatLen = NumElts / SeqLen;
MVT SplatEltTy = MVT::getIntegerVT(VT.getScalarSizeInBits() * SeqLen);
MVT SplatTy = MVT::getVectorVT(SplatEltTy, SplatLen);
diff --git a/llvm/test/CodeGen/LoongArch/lasx/build-vector.ll b/llvm/test/CodeGen/LoongArch/lasx/build-vector.ll
index f753ac3f6f623..d04cabb4ddf8e 100644
--- a/llvm/test/CodeGen/LoongArch/lasx/build-vector.ll
+++ b/llvm/test/CodeGen/LoongArch/lasx/build-vector.ll
@@ -93,22 +93,7 @@ entry:
define void @buildvector_v32i8_splat_with_undef(ptr %dst, i8 %a0) nounwind {
; CHECK-LABEL: buildvector_v32i8_splat_with_undef:
; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 0
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 1
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 2
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 3
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 4
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 5
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 6
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 7
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 8
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 9
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 10
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 11
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 12
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 13
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 14
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 15
+; CHECK-NEXT: xvreplgr2vr.b $xr0, $a1
; CHECK-NEXT: xvst $xr0, $a0, 0
; CHECK-NEXT: ret
entry:
@@ -135,15 +120,7 @@ entry:
define void @buildvector_v16i16_splat_with_undef(ptr %dst, i16 %a0) nounwind {
; CHECK-LABEL: buildvector_v16i16_splat_with_undef:
; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 0
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 1
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 2
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 3
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 4
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 5
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 6
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 7
-; CHECK-NEXT: xvpermi.q $xr0, $xr0, 2
+; CHECK-NEXT: xvreplgr2vr.h $xr0, $a1
; CHECK-NEXT: xvst $xr0, $a0, 0
; CHECK-NEXT: ret
entry:
@@ -162,9 +139,7 @@ entry:
define void @buildvector_v8i32_splat_with_undef(ptr %dst, i32 %a0) nounwind {
; CHECK-LABEL: buildvector_v8i32_splat_with_undef:
; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: vinsgr2vr.w $vr0, $a1, 0
-; CHECK-NEXT: vinsgr2vr.w $vr0, $a1, 2
-; CHECK-NEXT: xvpermi.q $xr0, $xr0, 2
+; CHECK-NEXT: xvreplgr2vr.w $xr0, $a1
; CHECK-NEXT: xvst $xr0, $a0, 0
; CHECK-NEXT: ret
entry:
@@ -189,8 +164,7 @@ define void @buildvector_v4i64_splat_with_undef(ptr %dst, i64 %a0) nounwind {
;
; LA64-LABEL: buildvector_v4i64_splat_with_undef:
; LA64: # %bb.0: # %entry
-; LA64-NEXT: xvinsgr2vr.d $xr0, $a1, 0
-; LA64-NEXT: xvinsgr2vr.d $xr0, $a1, 3
+; LA64-NEXT: xvreplgr2vr.d $xr0, $a1
; LA64-NEXT: xvst $xr0, $a0, 0
; LA64-NEXT: ret
entry:
@@ -232,9 +206,8 @@ define void @buildvector_v8f32_splat_with_undef_2(ptr %dst, float %a0) nounwind
; CHECK-LABEL: buildvector_v8f32_splat_with_undef_2:
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: # kill: def $f0 killed $f0 def $xr0
-; CHECK-NEXT: xvinsve0.w $xr1, $xr0, 1
-; CHECK-NEXT: xvinsve0.w $xr1, $xr0, 4
-; CHECK-NEXT: xvst $xr1, $a0, 0
+; CHECK-NEXT: xvreplve0.w $xr0, $xr0
+; CHECK-NEXT: xvst $xr0, $a0, 0
; CHECK-NEXT: ret
entry:
%ins1 = insertelement <8 x float> undef, float %a0, i32 1
@@ -246,11 +219,9 @@ entry:
define void @buildvector_v8f32_splat_with_undef_3(ptr %dst, float %a0) nounwind {
; CHECK-LABEL: buildvector_v8f32_splat_with_undef_3:
; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: # kill: def $f0 killed $f0 def $vr0
-; CHECK-NEXT: vextrins.w $vr1, $vr0, 16
-; CHECK-NEXT: vextrins.w $vr1, $vr0, 48
-; CHECK-NEXT: xvpermi.q $xr1, $xr1, 2
-; CHECK-NEXT: xvst $xr1, $a0, 0
+; CHECK-NEXT: # kill: def $f0 killed $f0 def $xr0
+; CHECK-NEXT: xvreplve0.w $xr0, $xr0
+; CHECK-NEXT: xvst $xr0, $a0, 0
; CHECK-NEXT: ret
entry:
%ins1 = insertelement <8 x float> undef, float %a0, i32 1
@@ -292,8 +263,7 @@ define void @buildvector_v4f64_splat_with_undef_2(ptr %dst, double %a0) nounwind
; CHECK-LABEL: buildvector_v4f64_splat_with_undef_2:
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: # kill: def $f0_64 killed $f0_64 def $xr0
-; CHECK-NEXT: vextrins.d $vr0, $vr0, 16
-; CHECK-NEXT: xvpermi.q $xr0, $xr0, 2
+; CHECK-NEXT: xvreplve0.d $xr0, $xr0
; CHECK-NEXT: xvst $xr0, $a0, 0
; CHECK-NEXT: ret
entry:
diff --git a/llvm/test/CodeGen/LoongArch/lasx/ir-instruction/insertelement.ll b/llvm/test/CodeGen/LoongArch/lasx/ir-instruction/insertelement.ll
index 2f1db43e68fef..ce846e9cc0e0c 100644
--- a/llvm/test/CodeGen/LoongArch/lasx/ir-instruction/insertelement.ll
+++ b/llvm/test/CodeGen/LoongArch/lasx/ir-instruction/insertelement.ll
@@ -35,7 +35,7 @@ define void @insert_32xi8_upper(ptr %src, ptr %dst, i8 %in) nounwind {
define void @insert_32xi8_undef(ptr %dst, i8 %in) nounwind {
; CHECK-LABEL: insert_32xi8_undef:
; CHECK: # %bb.0:
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 1
+; CHECK-NEXT: xvreplgr2vr.b $xr0, $a1
; CHECK-NEXT: xvst $xr0, $a0, 0
; CHECK-NEXT: ret
%v = insertelement <32 x i8> poison, i8 %in, i32 1
@@ -46,8 +46,7 @@ define void @insert_32xi8_undef(ptr %dst, i8 %in) nounwind {
define void @insert_32xi8_undef_upper(ptr %dst, i8 %in) nounwind {
; CHECK-LABEL: insert_32xi8_undef_upper:
; CHECK: # %bb.0:
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 6
-; CHECK-NEXT: xvpermi.q $xr0, $xr0, 2
+; CHECK-NEXT: xvreplgr2vr.b $xr0, $a1
; CHECK-NEXT: xvst $xr0, $a0, 0
; CHECK-NEXT: ret
%v = insertelement <32 x i8> poison, i8 %in, i32 22
@@ -88,7 +87,7 @@ define void @insert_16xi16_upper(ptr %src, ptr %dst, i16 %in) nounwind {
define void @insert_16xi16_undef(ptr %dst, i16 %in) nounwind {
; CHECK-LABEL: insert_16xi16_undef:
; CHECK: # %bb.0:
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 1
+; CHECK-NEXT: xvreplgr2vr.h $xr0, $a1
; CHECK-NEXT: xvst $xr0, $a0, 0
; CHECK-NEXT: ret
%v = insertelement <16 x i16> poison, i16 %in, i32 1
@@ -99,8 +98,7 @@ define void @insert_16xi16_undef(ptr %dst, i16 %in) nounwind {
define void @insert_16xi16_undef_upper(ptr %dst, i16 %in) nounwind {
; CHECK-LABEL: insert_16xi16_undef_upper:
; CHECK: # %bb.0:
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 2
-; CHECK-NEXT: xvpermi.q $xr0, $xr0, 2
+; CHECK-NEXT: xvreplgr2vr.h $xr0, $a1
; CHECK-NEXT: xvst $xr0, $a0, 0
; CHECK-NEXT: ret
%v = insertelement <16 x i16> poison, i16 %in, i32 10
diff --git a/llvm/test/CodeGen/LoongArch/lasx/scalar-to-vector.ll b/llvm/test/CodeGen/LoongArch/lasx/scalar-to-vector.ll
index bba269279937a..39533cff7c868 100644
--- a/llvm/test/CodeGen/LoongArch/lasx/scalar-to-vector.ll
+++ b/llvm/test/CodeGen/LoongArch/lasx/scalar-to-vector.ll
@@ -7,7 +7,7 @@
define <32 x i8> @scalar_to_32xi8(i8 %val) {
; CHECK-LABEL: scalar_to_32xi8:
; CHECK: # %bb.0:
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a0, 0
+; CHECK-NEXT: xvreplgr2vr.b $xr0, $a0
; CHECK-NEXT: ret
%ret = insertelement <32 x i8> poison, i8 %val, i32 0
ret <32 x i8> %ret
@@ -16,7 +16,7 @@ define <32 x i8> @scalar_to_32xi8(i8 %val) {
define <16 x i16> @scalar_to_16xi16(i16 %val) {
; CHECK-LABEL: scalar_to_16xi16:
; CHECK: # %bb.0:
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a0, 0
+; CHECK-NEXT: xvreplgr2vr.h $xr0, $a0
; CHECK-NEXT: ret
%ret = insertelement <16 x i16> poison, i16 %val, i32 0
ret <16 x i16> %ret
@@ -25,7 +25,7 @@ define <16 x i16> @scalar_to_16xi16(i16 %val) {
define <8 x i32> @scalar_to_8xi32(i32 %val) {
; CHECK-LABEL: scalar_to_8xi32:
; CHECK: # %bb.0:
-; CHECK-NEXT: vinsgr2vr.w $vr0, $a0, 0
+; CHECK-NEXT: xvreplgr2vr.w $xr0, $a0
; CHECK-NEXT: ret
%ret = insertelement <8 x i32> poison, i32 %val, i32 0
ret <8 x i32> %ret
@@ -40,7 +40,7 @@ define <4 x i64> @scalar_to_4xi64(i64 %val) {
;
; LA64-LABEL: scalar_to_4xi64:
; LA64: # %bb.0:
-; LA64-NEXT: vinsgr2vr.d $vr0, $a0, 0
+; LA64-NEXT: xvreplgr2vr.d $xr0, $a0
; LA64-NEXT: ret
%ret = insertelement <4 x i64> poison, i64 %val, i32 0
ret <4 x i64> %ret
diff --git a/llvm/test/CodeGen/LoongArch/lsx/build-vector.ll b/llvm/test/CodeGen/LoongArch/lsx/build-vector.ll
index 1fe289cdaaccb..eb1a843bb42d8 100644
--- a/llvm/test/CodeGen/LoongArch/lsx/build-vector.ll
+++ b/llvm/test/CodeGen/LoongArch/lsx/build-vector.ll
@@ -93,14 +93,7 @@ entry:
define void @buildvector_v16i8_splat_with_undef(ptr %dst, i8 %a0) nounwind {
; CHECK-LABEL: buildvector_v16i8_splat_with_undef:
; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 0
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 2
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 4
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 6
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 8
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 10
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 12
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a1, 14
+; CHECK-NEXT: vreplgr2vr.b $vr0, $a1
; CHECK-NEXT: vst $vr0, $a0, 0
; CHECK-NEXT: ret
entry:
@@ -119,10 +112,7 @@ entry:
define void @buildvector_v8i16_splat_with_undef(ptr %dst, i16 %a0) nounwind {
; CHECK-LABEL: buildvector_v8i16_splat_with_undef:
; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 1
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 3
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 5
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a1, 7
+; CHECK-NEXT: vreplgr2vr.h $vr0, $a1
; CHECK-NEXT: vst $vr0, $a0, 0
; CHECK-NEXT: ret
entry:
@@ -137,8 +127,7 @@ entry:
define void @buildvector_v4i32_splat_with_undef(ptr %dst, i32 %a0) nounwind {
; CHECK-LABEL: buildvector_v4i32_splat_with_undef:
; CHECK: # %bb.0: # %entry
-; CHECK-NEXT: vinsgr2vr.w $vr0, $a1, 1
-; CHECK-NEXT: vinsgr2vr.w $vr0, $a1, 2
+; CHECK-NEXT: vreplgr2vr.w $vr0, $a1
; CHECK-NEXT: vst $vr0, $a0, 0
; CHECK-NEXT: ret
entry:
@@ -158,7 +147,7 @@ define void @buildvector_v2i64_splat_with_undef(ptr %dst, i64 %a0) nounwind {
;
; LA64-LABEL: buildvector_v2i64_splat_with_undef:
; LA64: # %bb.0: # %entry
-; LA64-NEXT: vinsgr2vr.d $vr0, $a1, 0
+; LA64-NEXT: vreplgr2vr.d $vr0, $a1
; LA64-NEXT: vst $vr0, $a0, 0
; LA64-NEXT: ret
entry:
@@ -197,9 +186,8 @@ define void @buildvector_v4f32_splat_with_undef_2(ptr %dst, float %a0) nounwind
; CHECK-LABEL: buildvector_v4f32_splat_with_undef_2:
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: # kill: def $f0 killed $f0 def $vr0
-; CHECK-NEXT: vextrins.w $vr1, $vr0, 16
-; CHECK-NEXT: vextrins.w $vr1, $vr0, 32
-; CHECK-NEXT: vst $vr1, $a0, 0
+; CHECK-NEXT: vreplvei.w $vr0, $vr0, 0
+; CHECK-NEXT: vst $vr0, $a0, 0
; CHECK-NEXT: ret
entry:
%ins1 = insertelement <4 x float> undef, float %a0, i32 1
@@ -969,7 +957,7 @@ define void @buildvector_v2i64_partial(ptr %dst, i64 %a0) nounwind {
;
; LA64-LABEL: buildvector_v2i64_partial:
; LA64: # %bb.0: # %entry
-; LA64-NEXT: vinsgr2vr.d $vr0, $a1, 0
+; LA64-NEXT: vreplgr2vr.d $vr0, $a1
; LA64-NEXT: vst $vr0, $a0, 0
; LA64-NEXT: ret
entry:
diff --git a/llvm/test/CodeGen/LoongArch/lsx/scalar-to-vector.ll b/llvm/test/CodeGen/LoongArch/lsx/scalar-to-vector.ll
index d2a506dd98547..be1a8206c2418 100644
--- a/llvm/test/CodeGen/LoongArch/lsx/scalar-to-vector.ll
+++ b/llvm/test/CodeGen/LoongArch/lsx/scalar-to-vector.ll
@@ -7,7 +7,7 @@
define <16 x i8> @scalar_to_16xi8(i8 %val) {
; CHECK-LABEL: scalar_to_16xi8:
; CHECK: # %bb.0:
-; CHECK-NEXT: vinsgr2vr.b $vr0, $a0, 0
+; CHECK-NEXT: vreplgr2vr.b $vr0, $a0
; CHECK-NEXT: ret
%ret = insertelement <16 x i8> poison, i8 %val, i32 0
ret <16 x i8> %ret
@@ -16,7 +16,7 @@ define <16 x i8> @scalar_to_16xi8(i8 %val) {
define <8 x i16> @scalar_to_8xi16(i16 %val) {
; CHECK-LABEL: scalar_to_8xi16:
; CHECK: # %bb.0:
-; CHECK-NEXT: vinsgr2vr.h $vr0, $a0, 0
+; CHECK-NEXT: vreplgr2vr.h $vr0, $a0
; CHECK-NEXT: ret
%ret = insertelement <8 x i16> poison, i16 %val, i32 0
ret <8 x i16> %ret
@@ -25,7 +25,7 @@ define <8 x i16> @scalar_to_8xi16(i16 %val) {
define <4 x i32> @scalar_to_4xi32(i32 %val) {
; CHECK-LABEL: scalar_to_4xi32:
; CHECK: # %bb.0:
-; CHECK-NEXT: vinsgr2vr.w $vr0, $a0, 0
+; CHECK-NEXT: vreplgr2vr.w $vr0, $a0
; CHECK-NEXT: ret
%ret = insertelement <4 x i32> poison, i32 %val, i32 0
ret <4 x i32> %ret
@@ -40,7 +40,7 @@ define <2 x i64> @scalar_to_2xi64(i64 %val) {
;
; LA64-LABEL: scalar_to_2xi64:
; LA64: # %bb.0:
-; LA64-NEXT: vinsgr2vr.d $vr0, $a0, 0
+; LA64-NEXT: vreplgr2vr.d $vr0, $a0
; LA64-NEXT: ret
%ret = insertelement <2 x i64> poison, i64 %val, i32 0
ret <2 x i64> %ret
When a BUILD_VECTOR consists of the same element (ignoring undefs),
it is better to emit a broadcast than multiple insertions.
Some floating-point cases suffer performance regressions or
see no benefit; those specific cases are excluded in this
commit, namely when:
- only one element is non-undef,
- only two elements are non-undef, and one of them is at index 0,
- for the v8f32 vector type, the only two non-undefs are at index (1,2)/(1,3)/(2,3).
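The floating-point exclusions described in this PR can be restated as a small standalone predicate, which may be easier to check against the test changes than the in-tree condition. This is only a sketch that mirrors the `IsSplatBetter` logic in the diff: the name `isSplatBetterForFP`, the `std::vector<bool>` undef mask, and the `isV8F32` flag (standing in for the `ResTy == MVT::v8f32` comparison) are assumptions introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical standalone model of the floating-point exclusion rules.
// undef[i] is true when element i of the BUILD_VECTOR is undef.
// isV8F32 stands in for the `ResTy == MVT::v8f32` check in the patch.
bool isSplatBetterForFP(const std::vector<bool> &undef, bool isV8F32) {
  // A v8f32 vector always has eight elements.
  assert(!isV8F32 || undef.size() == 8);

  const std::size_t numElts = undef.size();
  std::size_t numUndef = 0;
  for (bool u : undef)
    numUndef += u ? 1 : 0;

  // Only one element is non-undef.
  if (numUndef + 1 == numElts)
    return false;
  // Exactly two non-undef elements, one of them at index 0.
  if (numUndef + 2 == numElts && !undef[0])
    return false;
  // v8f32: the two non-undef elements sit at (1,2), (1,3) or (2,3).
  if (isV8F32 && numUndef + 2 == numElts &&
      ((!undef[1] && !undef[2]) || (!undef[1] && !undef[3]) ||
       (!undef[2] && !undef[3])))
    return false;
  return true;
}
```

Under this model, a v8f32 vector whose only defined elements are at indices 1 and 4 is still broadcast, which is consistent with the `buildvector_v8f32_splat_with_undef_2` test now expecting `xvreplve0.w`. Integer vectors never reach this predicate, since the patch broadcasts them unconditionally.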