
[RISCV] Implement CodeGen Support for XCValu Extension in CV32E40P #78138

Open
wants to merge 4 commits into base: main

Conversation

realqhc (Contributor) commented Jan 15, 2024

Implement XCValu intrinsics and CodeGen for CV32E40P according to the specification.

This commit is part of a patch-set to upstream the vendor-specific extensions of CV32E40P that need LLVM intrinsics to implement Clang builtins.

Contributors: @CharKeaney, @ChunyuLiao, @jeremybennett, @lewis-revill, @NandniJamnadas, @PaoloS02, @serkm, @simonpcook, @xingmingjie.
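
As an illustration of the intended use, here is a minimal IR sketch (the function name and immediate are illustrative; the expected instruction is inferred from the patterns added in this patch):

declare i32 @llvm.riscv.cv.alu.addn(i32, i32, i32)

define i32 @addn_example(i32 %a, i32 %b) {
  ; expected to select: cv.addn a0, a0, a1, 7
  %1 = call i32 @llvm.riscv.cv.alu.addn(i32 %a, i32 %b, i32 7)
  ret i32 %1
}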

llvmbot (Collaborator) commented Jan 15, 2024

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-risc-v

Author: None (realqhc)

Changes

… in CV32E40P

Implement XCValu intrinsics and CodeGen for CV32E40P according to the specification.

This commit is part of a patch-set to upstream the vendor-specific extensions of CV32E40P that need LLVM intrinsics to implement Clang builtins.

Contributors: @CharKeaney, @ChunyuLiao, @jeremybennett, @lewis-revill, @NandniJamnadas, @PaoloS02, @serkm, @simonpcook, @xingmingjie.


Patch is 32.23 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/78138.diff

5 Files Affected:

  • (modified) llvm/include/llvm/IR/IntrinsicsRISCVXCV.td (+28)
  • (modified) llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp (+83)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+14-2)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoXCV.td (+158-4)
  • (added) llvm/test/CodeGen/RISCV/xcvalu.ll (+583)
diff --git a/llvm/include/llvm/IR/IntrinsicsRISCVXCV.td b/llvm/include/llvm/IR/IntrinsicsRISCVXCV.td
index f1590ad66e362b..f0c6faadf6aabf 100644
--- a/llvm/include/llvm/IR/IntrinsicsRISCVXCV.td
+++ b/llvm/include/llvm/IR/IntrinsicsRISCVXCV.td
@@ -18,6 +18,18 @@ class ScalarCoreVBitManipGprIntrinsic
     : DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty],
                             [IntrNoMem, IntrSpeculatable]>;
 
+class ScalarCoreVAluGprIntrinsic
+  : DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty],
+                          [IntrNoMem, IntrSpeculatable]>;
+
+class ScalarCoreVAluGprGprIntrinsic
+  : DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty],
+                          [IntrNoMem, IntrSpeculatable]>;
+
+class ScalarCoreVAluGprGprGprIntrinsic
+  : DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty, llvm_i32_ty],
+                          [IntrNoMem, IntrSpeculatable]>;
+
 let TargetPrefix = "riscv" in {
   def int_riscv_cv_bitmanip_extract : ScalarCoreVBitManipGprGprIntrinsic;
   def int_riscv_cv_bitmanip_extractu : ScalarCoreVBitManipGprGprIntrinsic;
@@ -34,4 +46,20 @@ let TargetPrefix = "riscv" in {
     : DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty, llvm_i32_ty],
                             [IntrNoMem, IntrWillReturn, IntrSpeculatable,
                             ImmArg<ArgIndex<1>>, ImmArg<ArgIndex<2>>]>;
+
+  def int_riscv_cv_alu_exths : ScalarCoreVAluGprIntrinsic;
+  def int_riscv_cv_alu_exthz : ScalarCoreVAluGprIntrinsic;
+  def int_riscv_cv_alu_extbs : ScalarCoreVAluGprIntrinsic;
+  def int_riscv_cv_alu_extbz : ScalarCoreVAluGprIntrinsic;
+
+  def int_riscv_cv_alu_clip   : ScalarCoreVAluGprGprIntrinsic;
+  def int_riscv_cv_alu_clipu  : ScalarCoreVAluGprGprIntrinsic;
+  def int_riscv_cv_alu_addn   : ScalarCoreVAluGprGprGprIntrinsic;
+  def int_riscv_cv_alu_addun  : ScalarCoreVAluGprGprGprIntrinsic;
+  def int_riscv_cv_alu_addrn  : ScalarCoreVAluGprGprGprIntrinsic;
+  def int_riscv_cv_alu_addurn : ScalarCoreVAluGprGprGprIntrinsic;
+  def int_riscv_cv_alu_subn   : ScalarCoreVAluGprGprGprIntrinsic;
+  def int_riscv_cv_alu_subun  : ScalarCoreVAluGprGprGprIntrinsic;
+  def int_riscv_cv_alu_subrn  : ScalarCoreVAluGprGprGprIntrinsic;
+  def int_riscv_cv_alu_suburn : ScalarCoreVAluGprGprGprIntrinsic;
 } // TargetPrefix = "riscv"
diff --git a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
index ed2b1ceb7d6f0d..aaac5ce834dd4b 100644
--- a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
@@ -53,6 +53,8 @@ class RISCVExpandPseudo : public MachineFunctionPass {
                             MachineBasicBlock::iterator MBBI);
   bool expandRV32ZdinxLoad(MachineBasicBlock &MBB,
                            MachineBasicBlock::iterator MBBI);
+  bool expandCoreVClip(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI);
+  bool expandCoreVAddSub(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI);
 #ifndef NDEBUG
   unsigned getInstSizeInBytes(const MachineFunction &MF) const {
     unsigned Size = 0;
@@ -161,6 +163,16 @@ bool RISCVExpandPseudo::expandMI(MachineBasicBlock &MBB,
   case RISCV::PseudoVMSET_M_B64:
     // vmset.m vd => vmxnor.mm vd, vd, vd
     return expandVMSET_VMCLR(MBB, MBBI, RISCV::VMXNOR_MM);
+  case RISCV::CV_CLIP_PSEUDO:
+  case RISCV::CV_CLIPU_PSEUDO:return expandCoreVClip(MBB, MBBI);
+  case RISCV::CV_ADDN_PSEUDO:
+  case RISCV::CV_ADDUN_PSEUDO:
+  case RISCV::CV_ADDRN_PSEUDO:
+  case RISCV::CV_ADDURN_PSEUDO:
+  case RISCV::CV_SUBN_PSEUDO:
+  case RISCV::CV_SUBUN_PSEUDO:
+  case RISCV::CV_SUBRN_PSEUDO:
+  case RISCV::CV_SUBURN_PSEUDO:return expandCoreVAddSub(MBB, MBBI);
   }
 
   return false;
@@ -547,6 +559,77 @@ bool RISCVPreRAExpandPseudo::expandLoadTLSGDAddress(
                              RISCV::ADDI);
 }
 
+bool RISCVExpandPseudo::expandCoreVClip(llvm::MachineBasicBlock &MBB,
+                                        MachineBasicBlock::iterator MBBI) {
+  DebugLoc DL = MBBI->getDebugLoc();
+  Register DstReg = MBBI->getOperand(0).getReg();
+  Register I = MBBI->getOperand(1).getReg();
+  uint64_t J = MBBI->getOperand(2).getImm();
+
+  unsigned Opcode = MBBI->getOpcode() == RISCV::CV_CLIPU_PSEUDO ?
+                    RISCV::CV_CLIPU : RISCV::CV_CLIP;
+  const MCInstrDesc &Desc = TII->get(Opcode);
+  BuildMI(MBB, MBBI, DL, Desc, DstReg)
+      .addReg(I)
+      .addImm(Log2_32_Ceil(J + 1) + 1);
+  MBBI->eraseFromParent();
+  return true;
+}
+
+bool RISCVExpandPseudo::expandCoreVAddSub(llvm::MachineBasicBlock &MBB,
+                                          MachineBasicBlock::iterator MBBI) {
+  auto *MRI = &MBB.getParent()->getRegInfo();
+  DebugLoc DL = MBBI->getDebugLoc();
+  Register DstReg = MBBI->getOperand(0).getReg();
+  Register X = MBBI->getOperand(1).getReg();
+  Register Y = MBBI->getOperand(2).getReg();
+  uint8_t Shift = MBBI->getOperand(3).getImm();
+
+  bool IsImm = 0 <= Shift && Shift <= 31;
+  unsigned Opcode;
+  switch (MBBI->getOpcode()) {
+  case RISCV::CV_ADDN_PSEUDO:
+    Opcode = IsImm ? RISCV::CV_ADDN : RISCV::CV_ADDNR;
+    break;
+  case RISCV::CV_ADDUN_PSEUDO:
+    Opcode = IsImm ? RISCV::CV_ADDUN : RISCV::CV_ADDUNR;
+    break;
+  case RISCV::CV_ADDRN_PSEUDO:
+    Opcode = IsImm ? RISCV::CV_ADDRN : RISCV::CV_ADDRNR;
+    break;
+  case RISCV::CV_ADDURN_PSEUDO:
+    Opcode = IsImm ? RISCV::CV_ADDURN : RISCV::CV_ADDURNR;
+    break;
+  case RISCV::CV_SUBN_PSEUDO:
+    Opcode = IsImm ? RISCV::CV_SUBN : RISCV::CV_SUBNR;
+    break;
+  case RISCV::CV_SUBUN_PSEUDO:
+    Opcode = IsImm ? RISCV::CV_SUBUN : RISCV::CV_SUBUNR;
+    break;
+  case RISCV::CV_SUBRN_PSEUDO:
+    Opcode = IsImm ? RISCV::CV_SUBRN : RISCV::CV_SUBRNR;
+    break;
+  case RISCV::CV_SUBURN_PSEUDO:
+    Opcode = IsImm ? RISCV::CV_SUBURN : RISCV::CV_SUBURNR;
+    break;
+  default:llvm_unreachable("unknown instruction");
+  }
+  const MCInstrDesc &Desc = TII->get(Opcode);
+  if (IsImm) {
+    BuildMI(MBB, MBBI, DL, Desc, DstReg).
+        addReg(X).
+        addReg(Y).
+        addImm(Shift);
+  } else {
+    MRI->replaceRegWith(DstReg, X);
+    BuildMI(MBB, MBBI, DL, Desc, DstReg).
+        addReg(Y).
+        addReg(DstReg);
+  }
+  MBBI->eraseFromParent();
+  return true;
+}
+
 } // end of anonymous namespace
 
 INITIALIZE_PASS(RISCVExpandPseudo, "riscv-expand-pseudo",
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index cb9ffabc41236e..0979b40af768ed 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -250,10 +250,12 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
   if (RV64LegalI32 && Subtarget.is64Bit())
     setOperationAction(ISD::SELECT_CC, MVT::i32, Expand);
 
-  setCondCodeAction(ISD::SETLE, XLenVT, Expand);
+  if (!Subtarget.hasVendorXCValu())
+    setCondCodeAction(ISD::SETLE, XLenVT, Expand);
   setCondCodeAction(ISD::SETGT, XLenVT, Custom);
   setCondCodeAction(ISD::SETGE, XLenVT, Expand);
-  setCondCodeAction(ISD::SETULE, XLenVT, Expand);
+  if (!Subtarget.hasVendorXCValu())
+    setCondCodeAction(ISD::SETULE, XLenVT, Expand);
   setCondCodeAction(ISD::SETUGT, XLenVT, Custom);
   setCondCodeAction(ISD::SETUGE, XLenVT, Expand);
 
@@ -1366,6 +1368,16 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
     }
   }
 
+  if (Subtarget.hasVendorXCValu()) {
+    setOperationAction(ISD::ABS, XLenVT, Legal);
+    setOperationAction(ISD::SMIN, XLenVT, Legal);
+    setOperationAction(ISD::UMIN, XLenVT, Legal);
+    setOperationAction(ISD::SMAX, XLenVT, Legal);
+    setOperationAction(ISD::UMAX, XLenVT, Legal);
+    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Legal);
+    setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Legal);
+  }
+
   // Function alignments.
   const Align FunctionAlignment(Subtarget.hasStdExtCOrZca() ? 2 : 4);
   setMinFunctionAlignment(FunctionAlignment);
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoXCV.td b/llvm/lib/Target/RISCV/RISCVInstrInfoXCV.td
index 924e91e15c348f..e0aeaf8c5c5f7c 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoXCV.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoXCV.td
@@ -198,7 +198,7 @@ let DecoderNamespace = "XCValu" in {
 
 } // DecoderNamespace = "XCValu"
 
-let Predicates = [HasVendorXCValu],
+let Predicates = [HasVendorXCValu, IsRV32],
   hasSideEffects = 0, mayLoad = 0, mayStore = 0 in {
   // General ALU Operations
   def CV_ABS    : CVInstAluR<0b0101000, 0b011, "cv.abs">,
@@ -249,10 +249,10 @@ let Predicates = [HasVendorXCValu],
                   Sched<[]>;
   def CV_SUBURN : CVInstAluRRI<0b11, 0b011, "cv.suburn">,
                   Sched<[]>;
-} // Predicates = [HasVendorXCValu],
+} // Predicates = [HasVendorXCValu, IsRV32],
   //   hasSideEffects = 0, mayLoad = 0, mayStore = 0
 
-let Predicates = [HasVendorXCValu],
+let Predicates = [HasVendorXCValu, IsRV32],
   hasSideEffects = 0, mayLoad = 0, mayStore = 0,
   Constraints = "$rd = $rd_wb" in {
   def CV_ADDNR   : CVInstAluRRNR<0b1000000, 0b011, "cv.addnr">,
@@ -272,7 +272,7 @@ let Predicates = [HasVendorXCValu],
   def CV_SUBURNR : CVInstAluRRNR<0b1000111, 0b011, "cv.suburnr">,
                    Sched<[]>;
 
-} // Predicates = [HasVendorXCValu],
+} // Predicates = [HasVendorXCValu, IsRV32],
   //   hasSideEffects = 0, mayLoad = 0, mayStore = 0,
   //   Constraints = "$rd = $rd_wb"
 
@@ -662,6 +662,8 @@ let Predicates = [HasVendorXCVelw, IsRV32], hasSideEffects = 0,
 def cv_tuimm2 : TImmLeaf<XLenVT, [{return isUInt<2>(Imm);}]>;
 def cv_tuimm5 : TImmLeaf<XLenVT, [{return isUInt<5>(Imm);}]>;
 def cv_uimm10 : ImmLeaf<XLenVT, [{return isUInt<10>(Imm);}]>;
+def cv_uimm32: Operand<XLenVT>,
+               ImmLeaf<XLenVT, [{return isPowerOf2_32(Imm + 1);}]>;
 
 def CV_LO5: SDNodeXForm<imm, [{
   return CurDAG->getTargetConstant(N->getZExtValue() & 0x1f, SDLoc(N),
@@ -673,6 +675,49 @@ def CV_HI5: SDNodeXForm<imm, [{
                                    N->getValueType(0));
 }]>;
 
+def between : PatFrags<(ops node:$lowerBound, node:$upperBound, node:$value),
+                       [(smin (smax node:$value, node:$lowerBound), node:$upperBound),
+                        (smax (smin node:$value, node:$upperBound), node:$lowerBound)]>;
+
+def betweenu : PatFrags<(ops node:$upperBound, node:$value),
+                        [(smin (smax node:$value, 0), node:$upperBound),
+                         (smax (smin node:$value, node:$upperBound), 0)]>;
+def powerOf2 : ImmLeaf<XLenVT, [{ return isPowerOf2_32(Imm); }]>;
+def powerOf2Minus1 : ImmLeaf<XLenVT, [{ return isPowerOf2_32(Imm+1); }]>;
+def negativePowerOf2 : ImmLeaf<XLenVT, [{ return isPowerOf2_32(-Imm); }]>;
+def roundBit : PatFrag<(ops node:$shiftAmount),
+                       (srl (shl 1, node:$shiftAmount), (XLenVT 1))>;
+def trailing1sPlus1 : SDNodeXForm<imm, [{
+  return CurDAG->getTargetConstant(
+                          llvm::countr_one(N->getZExtValue()) + 1,
+                          SDLoc(N), N->getValueType(0));
+}]>;
+
+def shiftRound : PatFrag<(ops node:$value, node:$shiftAmount),
+                         (sra (add node:$value, powerOf2), node:$shiftAmount), [{
+  if (auto powerOf2 = dyn_cast<ConstantSDNode>(N->getOperand(0)->getOperand(1)))
+    return (powerOf2->getZExtValue() << 1) == (1U << N->getConstantOperandVal(1));
+  return false;
+}]>;
+
+def ushiftRound : PatFrag<(ops node:$value, node:$shiftAmount),
+                          (srl (add node:$value, powerOf2), node:$shiftAmount), [{
+  if (auto powerOf2 = dyn_cast<ConstantSDNode>(N->getOperand(0)->getOperand(1)))
+    return (powerOf2->getZExtValue() << 1) == (1U << N->getConstantOperandVal(1));
+  return false;
+}]>;
+
+def clip : PatFrag<(ops node:$upperBound, node:$value),
+                   (between negativePowerOf2, node:$upperBound, node:$value), [{
+  // Checking lower & upper bound for the clip instruction
+  if (auto bound1 = dyn_cast<ConstantSDNode>(N->getOperand(0)->getOperand(1))) {
+    if (auto bound2 = dyn_cast<ConstantSDNode>(N->getOperand(1))) {
+      return (bound1->getSExtValue() == ~bound2->getSExtValue());
+    }
+  }
+  return false;
+}]>;
+
 multiclass PatCoreVBitManip<Intrinsic intr> {
   def : PatGprGpr<intr, !cast<RVInst>("CV_" # NAME # "R")>;
   def : Pat<(intr GPR:$rs1, cv_uimm10:$imm),
@@ -704,3 +749,112 @@ let Predicates = [HasVendorXCVbitmanip, IsRV32] in {
             (CV_BITREV GPR:$rs1, cv_tuimm2:$radix, cv_tuimm5:$pts)>;
   def : Pat<(bitreverse (XLenVT GPR:$rs)), (CV_BITREV GPR:$rs, 0, 0)>;
 }
+
+class PatCoreVAluGpr <string intr, string asm> :
+  PatGpr<!cast<Intrinsic>("int_riscv_cv_alu_" # intr),
+            !cast<RVInst>("CV_" # asm)>;
+class PatCoreVAluGprGpr <string intr, string asm> :
+  PatGprGpr<!cast<Intrinsic>("int_riscv_cv_alu_" # intr),
+               !cast<RVInst>("CV_" # asm)>;
+
+multiclass PatCoreVAluGprImm <Intrinsic intr> {
+  def "CV_" # NAME # "_PSEUDO" :
+    Pseudo<(outs GPR:$rd), (ins GPR:$rs, cv_uimm32:$imm), []>;
+  def : PatGprGpr<intr, !cast<RVInst>("CV_" # NAME # "R")>;
+  def : PatGprImm<intr, !cast<RVInst>("CV_" # NAME # "_PSEUDO"), cv_uimm32>;
+}
+
+multiclass PatCoreVAluGprGprImm <Intrinsic intr> {
+  def "CV_" # NAME # "_PSEUDO" :
+    Pseudo<(outs GPR:$rd), (ins GPR:$rs1, GPR:$rs2, uimm5:$imm), []>;
+  def : Pat<(intr GPR:$rs1, GPR:$rs2, GPR:$rs3),
+            (!cast<RVInst>("CV_" # NAME # "R") GPR:$rs1, GPR:$rs2, GPR:$rs3)>;
+  def : Pat<(intr GPR:$rs1, GPR:$rs2, uimm5:$imm),
+            (!cast<RVInst>("CV_" # NAME # "_PSEUDO") GPR:$rs1, GPR:$rs2,
+            uimm5:$imm)>;
+}
+
+let Predicates = [HasVendorXCValu, IsRV32], AddedComplexity = 1 in {
+  def : PatGpr<abs, CV_ABS>;
+  def : PatGprGpr<setle, CV_SLET>;
+  def : PatGprGpr<setule, CV_SLETU>;
+  def : PatGprGpr<smin, CV_MIN>;
+  def : PatGprGpr<umin, CV_MINU>;
+  def : PatGprGpr<smax, CV_MAX>;
+  def : PatGprGpr<umax, CV_MAXU>;
+
+  def : Pat<(sext_inreg (XLenVT GPR:$rs1), i16), (CV_EXTHS GPR:$rs1)>;
+  def : Pat<(sext_inreg (XLenVT GPR:$rs1), i8), (CV_EXTBS GPR:$rs1)>;
+
+  def : Pat<(and (XLenVT GPR:$rs1), 0xffff), (CV_EXTHZ GPR:$rs1)>;
+  def : Pat<(and (XLenVT GPR:$rs1), 0xff),   (CV_EXTBZ GPR:$rs1)>;
+
+  def : Pat<(clip powerOf2Minus1:$upperBound, (XLenVT GPR:$rs1)),
+            (CV_CLIP GPR:$rs1, (trailing1sPlus1 imm:$upperBound))>;
+  def : Pat<(between (not GPR:$rs2), GPR:$rs2, (XLenVT GPR:$rs1)),
+            (CV_CLIPR GPR:$rs1, GPR:$rs2)>;
+  def : Pat<(betweenu powerOf2Minus1:$upperBound, (XLenVT GPR:$rs1)),
+            (CV_CLIPU GPR:$rs1, (trailing1sPlus1 imm:$upperBound))>;
+  def : Pat<(betweenu GPR:$rs2, (XLenVT GPR:$rs1)),
+            (CV_CLIPUR GPR:$rs1, GPR:$rs2)>;
+
+  def : Pat<(sra (add (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)), uimm5:$imm5),
+            (CV_ADDN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+  def : Pat<(srl (add (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)), uimm5:$imm5),
+            (CV_ADDUN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+  def : Pat<(shiftRound (add (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)),
+             uimm5:$imm5),
+            (CV_ADDRN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+  def : Pat<(ushiftRound (add (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)),
+             uimm5:$imm5),
+            (CV_ADDURN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+
+  def : Pat<(sra (sub (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)), uimm5:$imm5),
+            (CV_SUBN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+  def : Pat<(srl (sub (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)), uimm5:$imm5),
+            (CV_SUBUN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+  def : Pat<(shiftRound (sub (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)),
+             uimm5:$imm5),
+            (CV_SUBRN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+  def : Pat<(ushiftRound (sub (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)),
+             uimm5:$imm5),
+            (CV_SUBURN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+
+  def : Pat<(sra (add (XLenVT GPR:$rd), (XLenVT GPR:$rs1)), (XLenVT GPR:$rs2)),
+            (CV_ADDNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+  def : Pat<(srl (add (XLenVT GPR:$rd), (XLenVT GPR:$rs1)), (XLenVT GPR:$rs2)),
+            (CV_ADDUNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+  def : Pat<(sra (add (add (XLenVT GPR:$rd), (XLenVT GPR:$rs1)),
+            (roundBit (XLenVT GPR:$rs2))), (XLenVT GPR:$rs2)),
+            (CV_ADDRNR  GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+  def : Pat<(srl (add (add (XLenVT GPR:$rd), (XLenVT GPR:$rs1)),
+            (roundBit (XLenVT GPR:$rs2))), (XLenVT GPR:$rs2)),
+            (CV_ADDURNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+
+  def : Pat<(sra (sub (XLenVT GPR:$rd), (XLenVT GPR:$rs1)), (XLenVT GPR:$rs2)),
+            (CV_SUBNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+  def : Pat<(srl (sub (XLenVT GPR:$rd), (XLenVT GPR:$rs1)), (XLenVT GPR:$rs2)),
+            (CV_SUBUNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+  def : Pat<(sra (add (sub (XLenVT GPR:$rd), (XLenVT GPR:$rs1)),
+            (roundBit (XLenVT GPR:$rs2))), (XLenVT GPR:$rs2)),
+            (CV_SUBRNR  GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+  def : Pat<(srl (add (sub (XLenVT GPR:$rd), (XLenVT GPR:$rs1)),
+            (roundBit (XLenVT GPR:$rs2))), (XLenVT GPR:$rs2)),
+            (CV_SUBURNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+
+  def : PatCoreVAluGpr<"exths", "EXTHS">;
+  def : PatCoreVAluGpr<"exthz", "EXTHZ">;
+  def : PatCoreVAluGpr<"extbs", "EXTBS">;
+  def : PatCoreVAluGpr<"extbz", "EXTBZ">;
+
+  defm CLIP   : PatCoreVAluGprImm<int_riscv_cv_alu_clip>;
+  defm CLIPU  : PatCoreVAluGprImm<int_riscv_cv_alu_clipu>;
+  defm ADDN   : PatCoreVAluGprGprImm<int_riscv_cv_alu_addn>;
+  defm ADDUN  : PatCoreVAluGprGprImm<int_riscv_cv_alu_addun>;
+  defm ADDRN  : PatCoreVAluGprGprImm<int_riscv_cv_alu_addrn>;
+  defm ADDURN : PatCoreVAluGprGprImm<int_riscv_cv_alu_addurn>;
+  defm SUBN   : PatCoreVAluGprGprImm<int_riscv_cv_alu_subn>;
+  defm SUBUN  : PatCoreVAluGprGprImm<int_riscv_cv_alu_subun>;
+  defm SUBRN  : PatCoreVAluGprGprImm<int_riscv_cv_alu_subrn>;
+  defm SUBURN : PatCoreVAluGprGprImm<int_riscv_cv_alu_suburn>;
+} // Predicates = [HasVendorXCValu, IsRV32]
diff --git a/llvm/test/CodeGen/RISCV/xcvalu.ll b/llvm/test/CodeGen/RISCV/xcvalu.ll
new file mode 100644
index 00000000000000..3b83b32a672c09
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/xcvalu.ll
@@ -0,0 +1,583 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -O0 -mtriple=riscv32 -mattr=+m -mattr=+xcvalu -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s
+
+declare i32 @llvm.abs.i32(i32, i1)
+declare i32 @llvm.smin.i32(i32, i32)
+declare i32 @llvm.smax.i32(i32, i32)
+declare i32 @llvm.umin.i32(i32, i32)
+declare i32 @llvm.umax.i32(i32, i32)
+
+define i32 @abs(i32 %a) {
+; CHECK-LABEL: abs:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    cv.abs a0, a0
+; CHECK-NEXT:    ret
+  %1 = call i32 @llvm.abs.i32(i32 %a, i1 false)
+  ret i32 %1
+}
+
+define i1 @slet(i32 %a, i32 %b) {
+; CHECK-LABEL: slet:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    cv.slet a0, a0, a1
+; CHECK-NEXT:    ret
+  %1 = icmp sle i32 %a, %b
+  ret i1 %1
+}
+
+define i1 @sletu(i32 %a, i32 %b) {
+; CHECK-LABEL: sletu:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    cv.sletu a0, a0, a1
+; CHECK-NEXT:    ret
+  %1 = icmp ule i32 %a, %b
+  ret i1 %1
+}
+
+define i32 @smin(i32 %a, i32 %b) {
+; CHECK-LABEL: smin:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    cv.min a0, a0, a1
+; CHECK-NEXT:    ret
+  %1 = call i32 @llvm.smin.i32(i32 %a, i32 %b)
+  ret i32 %1
+}
+
+define i32 @umin(i32 %a, i32 %b) {
+; CHECK-LABEL: umin:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    cv.minu a0, a0, a1
+; CHECK-NEXT:    ret
+  %1 = call i32 @llvm.umin.i32(i32 %a, i32 %b)
+  ret i32 %1
+}
+
+define i32 @smax(i32 %a, i32 %b) {
+; CHECK-LABEL: smax:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    cv.max a0, a0, a1
+; CHECK-NEXT:    ret
+  %1 = call i32 @llvm.smax.i32(i32 %a, i32 %b)
+  ret i32 %1
+}
+
+define i32 @umax(i32 %a, i32 %b) {
+; CHECK-LABEL: umax:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    cv.maxu a0, a0, a1
+; CHECK-NEXT:    ret
+  %1 = call i32 @llvm.umax.i32(i32 %a, i32 %b)
+  ret i32 %1
+}
+
+define i32 @exths(i16 %a) {
+; CHECK-LABEL: exths:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    # kill: def $x11 killed $x10
+; CHECK-NEXT:    cv.exths a0, a0
+; CHECK-NEXT:    ret
+  %1 = sext i16 %a to i32
+  ret i32 %1
+}
+
+define i32 @exthz(i16 %a) {
+; CHECK-LABEL: exthz:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    # kill: def $x11 killed $x10
+; CHECK-NEXT:    cv.exthz a0, a0
+; CHECK-NEXT:    ret
+  %1 = zext i16 %a to i32
+  ret i32 %1
+}
+
+define i32 @extbs(i8 %a) {
+; CHECK-LABEL: ...
[truncated]


github-actions bot commented Jan 15, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.


  def : Pat<(clip powerOf2Minus1:$upperBound, (XLenVT GPR:$rs1)),
            (CV_CLIP GPR:$rs1, (trailing1sPlus1 imm:$upperBound))>;
  def : Pat<(between (not GPR:$rs2), GPR:$rs2, (XLenVT GPR:$rs1)),

Collaborator:

What if (not GPR:$rs2) is not less than GPR:$rs2? Then they aren't a lower and an upper bound.

Does the hardware implement the checks in this order?

if rs1 <= -2^(Is2-1), rD = -2^(Is2-1),
else if rs1 >= 2^(Is2-1)-1, rD = 2^(Is2-1)-1,
else rD = rs1

If so then I think we can only match the pattern where the smax is done before the smin.
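
For illustration, the two orderings being discussed look like this in IR (a sketch with placeholder names; the alive2 link in the next comment shows they are not interchangeable unless the bound is known non-negative):

declare i32 @llvm.smin.i32(i32, i32)
declare i32 @llvm.smax.i32(i32, i32)

define i32 @clip_smax_first(i32 %x, i32 %bound) {
  ; lower bound applied first (smax), then upper bound (smin)
  %lo = xor i32 %bound, -1                       ; lower bound = ~%bound
  %a  = call i32 @llvm.smax.i32(i32 %x, i32 %lo)
  %r  = call i32 @llvm.smin.i32(i32 %a, i32 %bound)
  ret i32 %r
}

define i32 @clip_smin_first(i32 %x, i32 %bound) {
  ; upper bound applied first (smin), then lower bound (smax)
  %lo = xor i32 %bound, -1
  %a  = call i32 @llvm.smin.i32(i32 %x, i32 %bound)
  %r  = call i32 @llvm.smax.i32(i32 %a, i32 %lo)
  ret i32 %r
}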

Collaborator:

Here is an alive2 proof that shows the order of smin/smax matters if you don't know that rs2 is positive: https://alive2.llvm.org/ce/z/dpHmEL

Contributor:

@topperc Good point. The hardware actually treats GPR:$rs2 as unsigned, so it makes sense to generate this instruction only if rs2 is unsigned (e.g.: if it is an ABS node). The fix should require the node there to be unsigned.

Contributor (Author):

> @topperc Good point. The hardware actually treats GPR:$rs2 as unsigned, so it makes sense to generate this instruction only if rs2 is unsigned (e.g.: if it is an ABS node). The fix should require the node there to be unsigned.

cv.abs a0, a1
cv.clipr a0, a0, a1

Is this what we should generate to address this issue? If so, I have proposed a new pattern that generates this.

Collaborator:

By "unsigned" do you mean bit 31 must be zero?

Contributor:

That is a fair point. I raised this issue with the hw group and we should soon have an update that clarifies this. I think that having bit 31 zero (so rs2 being positive) is a requirement for correctness of the operation. In that case we'd need a check on the operand value. I agree with separating this patch into two for now: intrinsics support and codegen.

; CHECK:       # %bb.0:
; CHECK-NEXT:    cv.exthz a0, a0
; CHECK-NEXT:    ret
  %1 = call i32 @llvm.riscv.cv.alu.exthz(i32 %a)

Collaborator:

Why do we need an intrinsic for exthz? Isn't this just AND with 0xffff?
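
For illustration, the zero-extension case can be written without an intrinsic, and the pattern added in this patch already matches this form to cv.exthz (a sketch; the function name is illustrative):

define i32 @exthz_plain(i32 %a) {
  ; and with 0xffff is matched to cv.exthz by the new pattern
  %1 = and i32 %a, 65535
  ret i32 %1
}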

realqhc changed the title from "[RISCV] Implement Intrinsics and CodeGen Support for XCValu Extension…" to "[RISCV] Implement CodeGen Support for XCValu Extension in CV32E40P" on Apr 12, 2024

topperc (Collaborator) commented Apr 12, 2024

Can we please split up this patch so I can approve the simple cases?

realqhc (Contributor, Author) commented Apr 13, 2024

> Can we please split up this patch so I can approve the simple cases?

I have prepared the stripped-down version in this pull request: #85603

realqhc and others added 4 commits May 24, 2024 14:55
… in CV32E40P

Implement XCValu intrinsics for CV32E40P according to the
specification.

This commit is part of a patch-set to upstream the vendor specific
extensions of CV32E40P that need LLVM intrinsics to implement Clang
builtins.

Contributors: @CharKeaney, @ChunyuLiao, @jeremybennett, @lewis-revill,
@NandniJamnadas, @PaoloS02, @serkm, @simonpcook, @xingmingjie.