Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release/18.x: [llvm][LoongArch] Improve loongarch_lasx_xvpermi_q instrinsic (#82984) #83540

Closed
wants to merge 1 commit into from

Conversation

llvmbot
Copy link
Collaborator

@llvmbot llvmbot commented Mar 1, 2024

Backport d7c80bb

Requested by: @leecheechen

…2984)

For instruction xvpermi.q, only [1:0] and [5:4] bits of operands[3] are
used. The unused bits in operands[3] need to be set to 0 to avoid
causing undefined behavior.

(cherry picked from commit d7c80bb)
@llvmbot
Copy link
Collaborator Author

llvmbot commented Mar 1, 2024

@llvm/pr-subscribers-backend-loongarch

Author: None (llvmbot)

Changes

Backport d7c80bb

Requested by: @leecheechen


Full diff: https://github.com/llvm/llvm-project/pull/83540.diff

2 Files Affected:

  • (modified) llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp (+24-1)
  • (modified) llvm/test/CodeGen/LoongArch/lasx/intrinsic-permi.ll (+30)
diff --git a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
index 76c1a14fe0156c..3324dd2e8fc217 100644
--- a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
@@ -968,6 +968,28 @@ static SDValue checkIntrinsicImmArg(SDValue Op, unsigned ImmOp,
   return SDValue();
 }
 
+static SDValue checkAndModifyXVPERMI_QIntrinsicImmArg(SDValue Op,
+                                                      SelectionDAG &DAG) {
+  SDValue Op3 = Op->getOperand(3);
+  uint64_t Imm = Op3->getAsZExtVal();
+  // Check the range of ImmArg.
+  if (!isUInt<8>(Imm)) {
+    DAG.getContext()->emitError(Op->getOperationName(0) +
+                                ": argument out of range.");
+    return DAG.getNode(ISD::UNDEF, SDLoc(Op), Op.getValueType());
+  }
+
+  // For instruction xvpermi.q, only [1:0] and [5:4] bits of operands[3]
+  // are used. The unused bits in operands[3] need to be set to 0 to avoid
+  // causing undefined behavior on LA464.
+  if ((Imm & 0x33) != Imm) {
+    Op3 = DAG.getTargetConstant(Imm & 0x33, SDLoc(Op), Op3.getValueType());
+    DAG.UpdateNodeOperands(Op.getNode(), Op->getOperand(0), Op->getOperand(1),
+                           Op->getOperand(2), Op3);
+  }
+  return SDValue();
+}
+
 SDValue
 LoongArchTargetLowering::lowerINTRINSIC_WO_CHAIN(SDValue Op,
                                                  SelectionDAG &DAG) const {
@@ -1225,13 +1247,14 @@ LoongArchTargetLowering::lowerINTRINSIC_WO_CHAIN(SDValue Op,
   case Intrinsic::loongarch_lsx_vextrins_d:
   case Intrinsic::loongarch_lasx_xvshuf4i_d:
   case Intrinsic::loongarch_lasx_xvpermi_w:
-  case Intrinsic::loongarch_lasx_xvpermi_q:
   case Intrinsic::loongarch_lasx_xvbitseli_b:
   case Intrinsic::loongarch_lasx_xvextrins_b:
   case Intrinsic::loongarch_lasx_xvextrins_h:
   case Intrinsic::loongarch_lasx_xvextrins_w:
   case Intrinsic::loongarch_lasx_xvextrins_d:
     return checkIntrinsicImmArg<8>(Op, 3, DAG);
+  case Intrinsic::loongarch_lasx_xvpermi_q:
+    return checkAndModifyXVPERMI_QIntrinsicImmArg(Op, DAG);
   case Intrinsic::loongarch_lsx_vrepli_b:
   case Intrinsic::loongarch_lsx_vrepli_h:
   case Intrinsic::loongarch_lsx_vrepli_w:
diff --git a/llvm/test/CodeGen/LoongArch/lasx/intrinsic-permi.ll b/llvm/test/CodeGen/LoongArch/lasx/intrinsic-permi.ll
index 0d9f9daabc4488..92669d2e589558 100644
--- a/llvm/test/CodeGen/LoongArch/lasx/intrinsic-permi.ll
+++ b/llvm/test/CodeGen/LoongArch/lasx/intrinsic-permi.ll
@@ -36,3 +36,33 @@ entry:
   %res = call <32 x i8> @llvm.loongarch.lasx.xvpermi.q(<32 x i8> %va, <32 x i8> %vb, i32 1)
   ret <32 x i8> %res
 }
+
+define <32 x i8> @lasx_xvpermi_q_204(<32 x i8> %va, <32 x i8> %vb) nounwind {
+; CHECK-LABEL: lasx_xvpermi_q_204:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    xvpermi.q $xr0, $xr1, 0
+; CHECK-NEXT:    ret
+entry:
+  %res = call <32 x i8> @llvm.loongarch.lasx.xvpermi.q(<32 x i8> %va, <32 x i8> %vb, i32 204)
+  ret <32 x i8> %res
+}
+
+define <32 x i8> @lasx_xvpermi_q_221(<32 x i8> %va, <32 x i8> %vb) nounwind {
+; CHECK-LABEL: lasx_xvpermi_q_221:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    xvpermi.q $xr0, $xr1, 17
+; CHECK-NEXT:    ret
+entry:
+  %res = call <32 x i8> @llvm.loongarch.lasx.xvpermi.q(<32 x i8> %va, <32 x i8> %vb, i32 221)
+  ret <32 x i8> %res
+}
+
+define <32 x i8> @lasx_xvpermi_q_255(<32 x i8> %va, <32 x i8> %vb) nounwind {
+; CHECK-LABEL: lasx_xvpermi_q_255:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    xvpermi.q $xr0, $xr1, 51
+; CHECK-NEXT:    ret
+entry:
+  %res = call <32 x i8> @llvm.loongarch.lasx.xvpermi.q(<32 x i8> %va, <32 x i8> %vb, i32 255)
+  ret <32 x i8> %res
+}

@xry111
Copy link
Contributor

xry111 commented Mar 6, 2024

I have some doubt about this change.

To me if the user requests xvpermi.q via the loongarch_lasx_xvpermi_q intrinsic, we should give her/him the xvpermi.q instruction. If (s)he is passing an invalid operand then (s)he is invoking the undefined behavior herself/himself and we don't need to guarantee a thing.

So to me we should not merge this and we should revert this change for main. Or am I missing something? @xen0n @heiher @SixWeining @MaskRay

@SixWeining
Copy link
Contributor

I have some doubt about this change.

To me if the user requests xvpermi.q via the loongarch_lasx_xvpermi_q intrinsic, we should give her/him the xvpermi.q instruction. If (s)he is passing an invalid operand then (s)he is invoking the undefined behavior herself/himself and we don't need to guarantee a thing.

So to me we should not merge this and we should revert this change for main. Or am I missing something? @xen0n @heiher @SixWeining @MaskRay

Yes, it can be argued. But I know gcc has similar change.

@xen0n
Copy link
Contributor

xen0n commented Mar 8, 2024

For the record, based on the principle of "explicit is better than implicit" that generally holds, I'd favor an approach where such compile-time-verifiable out-of-range operands are given compile-time errors, or we should just pass through the value unmodified. Otherwise the intrinsic would have to carry the workaround effectively forever, and we could find ourselves trapped if later micro-architectures actually start to make use of the currently cleared bits.

@SixWeining
Copy link
Contributor

For the record, based on the principle of "explicit is better than implicit" that generally holds, I'd favor an approach where such compile-time-verifiable out-of-range operands are given compile-time errors, or we should just pass through the value unmodified. Otherwise the intrinsic would have to carry the workaround effectively forever, and we could find ourselves trapped if later micro-architectures actually start to make use of the currently cleared bits.

Makes sense. After discussion with gcc-loongarch team, we both decide to revert this change. Thanks.

@SixWeining SixWeining closed this Mar 11, 2024
SixWeining added a commit that referenced this pull request Mar 13, 2024
jpf91 pushed a commit to D-Programming-GDC/gcc that referenced this pull request Mar 15, 2024
The behavior of non-zero unused bits in xvpermi.q instruction's
third operand is undefined on LoongArch, according to our
discussion (llvm/llvm-project#83540),
we think that keeping original insn operand as unmodified
state is better solution.

This patch partially reverts 7b158e0.

gcc/ChangeLog:

	* config/loongarch/lasx.md (lasx_xvpermi_q_<LASX:mode>):
	Remove masking of operand 3.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c:
	Reposition operand 3's value into instruction's defined accept range.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

None yet

5 participants