[PowerPC] [Draft] Emit xxeval instruction for Ternary operation with v2i64 operand types #145574

Draft
wants to merge 4 commits into
base: main

Contributor

@tonykuttai tonykuttai commented Jun 24, 2025

The xxeval instruction can be used to implement ternary (select) patterns.

This change supports the following pattern with v2i64 operands. Because the binary operations xor and and are promoted to v4i32, the original pattern changes from:
ternary(v2i64:A, xor(v2i64:B, v2i64:C), and(v2i64:B, v2i64:C)) to ternary(v2i64:A, bitcast(v2i64, xor(v4i32:B, v4i32:C)), bitcast(v2i64, and(v4i32:B, v4i32:C))).
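
As a rough illustration of why a single xxeval could cover this pattern: the instruction takes an 8-bit immediate that acts as a truth table over the three input bits. The sketch below derives that immediate for ternary(A, xor(B, C), and(B, C)). The helper names xxevalImmediate and ternaryXorAndImmediate are hypothetical and not part of LLVM, and the bit ordering (most significant immediate bit for A=B=C=0, least significant for A=B=C=1) is an assumption that should be checked against the Power ISA description of xxeval. The sketch also treats A as a bitwise mask, so it only models a select when all bits within a lane of A are identical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): build an 8-bit truth-table
// immediate for a three-input bitwise function F(A, B, C).
// ASSUMPTION: the entry for inputs (A, B, C) lives at bit (A<<2 | B<<1 | C)
// counted from the most significant bit of the immediate.
template <typename Fn>
std::uint8_t xxevalImmediate(Fn F) {
  std::uint8_t Imm = 0;
  for (unsigned Idx = 0; Idx < 8; ++Idx) {
    bool A = (Idx >> 2) & 1;
    bool B = (Idx >> 1) & 1;
    bool C = Idx & 1;
    if (F(A, B, C))
      Imm |= static_cast<std::uint8_t>(0x80u >> Idx);
  }
  return Imm;
}

// Truth table for ternary(A, xor(B, C), and(B, C)):
// when the mask bit A is set pick B^C, otherwise pick B&C.
inline std::uint8_t ternaryXorAndImmediate() {
  return xxevalImmediate(
      [](bool A, bool B, bool C) { return A ? (B != C) : (B && C); });
}
```

Under the assumed ordering, ternaryXorAndImmediate() evaluates to 22 (0x16), and the same helper gives 1 for and(A, B, C).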

  • To lower this pattern to the xxeval instruction, ISD::VSELECT is marked Custom and is custom-lowered in PPCISelLowering.
  • If the pattern matches ternary(v2i64:A, bitcast(v2i64, xor(v4i32:B, v4i32:C)), bitcast(v2i64, and(v4i32:B, v4i32:C))), a new PPCISD::VSELECT node is emitted, as shown. All other nodes follow the default lowering.
SelectionDAG has 15 nodes:
  t0: ch,glue = EntryToken
  t4: v2i64,ch = CopyFromReg t0, Register:v2i64 %1
  t6: v2i64,ch = CopyFromReg t0, Register:v2i64 %2
        t2: v2i64,ch = CopyFromReg t0, Register:v2i64 %0
      t16: v2i64 = sign_extend_inreg t2, ValueType:ch:v2i1
      t8: v2i64 = xor t4, t6
      t9: v2i64 = and t4, t6
    t23: v2i64 = PPCISD::VSELECT t16, t8, t9
  t12: ch,glue = CopyToReg t0, Register:v2i64 $v2, t23
  t13: ch = PPCISD::RET_GLUE t12, Register:v2i64 $v2, t12:1
  • Pattern matching for PPCISD::VSELECT has been added in llvm/lib/Target/PowerPC/PPCInstrP10.td.
  • Sample test file, xxeval-vselect-x-and-v2i64.ll:
; Function to test ternary(A, xor(B, C), and(B, C)) for <2 x i64>
define <2 x i64> @ternary_A_xor_BC_and_BC_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %C) {
; CHECK-LABEL: ternary_A_xor_BC_and_BC_2x64:
; CHECK:       # %bb.0: # %entry
; CHECK-NEXT:    xxlxor v5, v5, v5
; CHECK-NEXT:    xxlxor vs0, v3, v4
; CHECK-NEXT:    xxland vs1, v3, v4
; CHECK-NEXT:    xxsplti32dx v5, 1, 63
; CHECK-NEXT:    vsld v2, v2, v5
; CHECK-NEXT:    vsrad v2, v2, v5
; CHECK-NEXT:    xxsel v2, vs1, vs0, v2
; CHECK-NEXT:    blr
entry:
  %xor = xor <2 x i64> %B, %C
  %and = and <2 x i64> %B, %C
  %res = select <2 x i1> %A, <2 x i64> %xor, <2 x i64> %and
  ret <2 x i64> %res
}
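
For reference, the per-lane semantics that the test expects can be modeled in plain C++. This is only a sketch: the type alias V2I64 and the functions laneMaskFromI1 and ternaryXorAnd are hypothetical names introduced here, and the model assumes the <2 x i1> condition arrives in bit 0 of each 64-bit lane, which is what the sign_extend_inreg node (and the vsld/vsrad pair in the CHECK lines) appears to rely on.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a <2 x i64> vector.
using V2I64 = std::array<std::uint64_t, 2>;

// Turn bit 0 of a lane into an all-ones or all-zeros mask. This has the same
// effect as shifting left by 63 and then arithmetic-shifting right by 63.
inline std::uint64_t laneMaskFromI1(std::uint64_t Lane) {
  return std::uint64_t{0} - (Lane & 1);
}

// Model of select(A, xor(B, C), and(B, C)) as a bitwise blend per lane.
inline V2I64 ternaryXorAnd(const V2I64 &A, const V2I64 &B, const V2I64 &C) {
  V2I64 Result{};
  for (std::size_t I = 0; I < Result.size(); ++I) {
    std::uint64_t Mask = laneMaskFromI1(A[I]);
    std::uint64_t Xor = B[I] ^ C[I];
    std::uint64_t And = B[I] & C[I];
    // Mask bits set: take the xor result; mask bits clear: take the and result.
    Result[I] = (Mask & Xor) | (~Mask & And);
  }
  return Result;
}
```

With A = {1, 0}, B = {0xF0, 0xF0} and C = {0x3C, 0x3C}, lane 0 takes the xor value 0xCC and lane 1 takes the and value 0x30.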

doit.sh script:

$LLVM_BUILD/bin/llc -mcpu=pwr10 -mtriple=powerpc64le-unknown-unknown -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr -debug-only=isel -debug-only=legalize-types xxeval-vselect-x-and-v2i64.ll > isel_debug_x_and_2x64.ll 2>&1
$LLVM_BUILD/bin/llc -mcpu=pwr10 -mtriple=powerpc64le-unknown-unknown -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr -debug-only=ppc-lowering xxeval-vselect-x-and-v2i64.ll > ppc-lowering_debug_x_and_2x64.ll 2>&1


  • This currently results in a crash during vector type legalization.

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- llvm/lib/Target/PowerPC/PPCISelLowering.cpp llvm/lib/Target/PowerPC/PPCISelLowering.h
View the diff from clang-format here.
diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index 91cad5cf6..76c40b63d 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -844,7 +844,7 @@ PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
       AddPromotedToType (ISD::LOAD  , VT, MVT::v4i32);
       setOperationAction(ISD::SELECT, VT, Promote);
       AddPromotedToType (ISD::SELECT, VT, MVT::v4i32);
-      setOperationAction(ISD::VSELECT, VT, Custom);      
+      setOperationAction(ISD::VSELECT, VT, Custom);
       setOperationAction(ISD::SELECT_CC, VT, Promote);
       AddPromotedToType (ISD::SELECT_CC, VT, MVT::v4i32);
       setOperationAction(ISD::STORE, VT, Promote);
@@ -1691,7 +1691,8 @@ const char *PPCTargetLowering::getTargetNodeName(unsigned Opcode) const {
     return "PPCISD::XXSPLTI_SP_TO_DP";
   case PPCISD::XXSPLTI32DX:
     return "PPCISD::XXSPLTI32DX";
-  case PPCISD::VSELECT:        return "PPCISD::VSELECT";
+  case PPCISD::VSELECT:
+    return "PPCISD::VSELECT";
   case PPCISD::VECINSERT:       return "PPCISD::VECINSERT";
   case PPCISD::XXPERMDI:        return "PPCISD::XXPERMDI";
   case PPCISD::XXPERM:
@@ -9598,7 +9599,8 @@ SDValue PPCTargetLowering::LowerVSELECT(SDValue Op, SelectionDAG &DAG) const {
   }
 
   if (VT != MVT::v2i64) {
-    LLVM_DEBUG(llvm::dbgs() << "LowerVSELECT: VT is not v2i64, default lowering\n");
+    LLVM_DEBUG(llvm::dbgs()
+               << "LowerVSELECT: VT is not v2i64, default lowering\n");
     return SDValue();
   }
 
@@ -9607,16 +9609,16 @@ SDValue PPCTargetLowering::LowerVSELECT(SDValue Op, SelectionDAG &DAG) const {
   SDValue FalseOp = Op.getOperand(2);
 
   // Helper to check for the pattern: BITCAST (XOR (BITCAST x), (BITCAST y))
-  auto isPromotedBitcastBinop = [](SDValue V, unsigned &BinOpcode, SDValue &X, SDValue &Y) -> bool {
+  auto isPromotedBitcastBinop = [](SDValue V, unsigned &BinOpcode, SDValue &X,
+                                   SDValue &Y) -> bool {
     LLVM_DEBUG(llvm::dbgs() << "isPromotedBitCastBinop: \n");
     LLVM_DEBUG(llvm::dbgs() << "Binop Op: "; V->dump());
     LLVM_DEBUG(llvm::dbgs() << "\n");
-    
+
     if (V.getOpcode() != ISD::BITCAST)
       return false;
     SDValue BinOp = V.getOperand(0);
-    if (BinOp.getOpcode() != ISD::AND &&
-        BinOp.getOpcode() != ISD::XOR &&
+    if (BinOp.getOpcode() != ISD::AND && BinOp.getOpcode() != ISD::XOR &&
         BinOp.getOpcode() != ISD::OR)
       return false;
     // Both operands must be BITCAST from v4i32
@@ -9637,22 +9639,27 @@ SDValue PPCTargetLowering::LowerVSELECT(SDValue Op, SelectionDAG &DAG) const {
   SDValue TrueX, TrueY, FalseX, FalseY;
   if (!isPromotedBitcastBinop(TrueOp, TrueBinOpcode, TrueX, TrueY) ||
       !isPromotedBitcastBinop(FalseOp, FalseBinOpcode, FalseX, FalseY)) {
-    LLVM_DEBUG(llvm::dbgs() << "LowerVSELECT: Pattern not matched, default lowering\n");
+    LLVM_DEBUG(llvm::dbgs()
+               << "LowerVSELECT: Pattern not matched, default lowering\n");
     return SDValue();
   }
 
   LLVM_DEBUG(llvm::dbgs() << "LowerVSELECT: TrueBinOpCode: " << TrueBinOpcode);
-  LLVM_DEBUG(llvm::dbgs() << "\nLowerVSELECT: FalseBinOpCode: " << FalseBinOpcode);
+  LLVM_DEBUG(llvm::dbgs() << "\nLowerVSELECT: FalseBinOpCode: "
+                          << FalseBinOpcode);
 
   // For the specific pattern: VSELECT(cond, XOR, AND)
   if (!(TrueBinOpcode == ISD::XOR && FalseBinOpcode == ISD::AND)) {
-    LLVM_DEBUG(llvm::dbgs() << "LowerVSELECT: Not AND/XOR pattern, default lowering\n");
+    LLVM_DEBUG(llvm::dbgs()
+               << "LowerVSELECT: Not AND/XOR pattern, default lowering\n");
     return SDValue();
   }
 
   // The operands to AND and XOR must be the same
   if (!(TrueX == FalseX && TrueY == FalseY)) {
-    LLVM_DEBUG(llvm::dbgs() << "LowerVSELECT: AND/XOR operands mismatch, default lowering\n");
+    LLVM_DEBUG(
+        llvm::dbgs()
+        << "LowerVSELECT: AND/XOR operands mismatch, default lowering\n");
     return SDValue();
   }
 
@@ -9660,33 +9667,38 @@ SDValue PPCTargetLowering::LowerVSELECT(SDValue Op, SelectionDAG &DAG) const {
   SDLoc DL(Op);
   SDValue XorV2i64 = DAG.getNode(ISD::XOR, DL, VT, TrueX, TrueY);
   SDValue AndV2i64 = DAG.getNode(ISD::AND, DL, VT, TrueX, TrueY);
-  LLVM_DEBUG(llvm::dbgs() << "LowerVSELECT: XOR Node : " ; XorV2i64->dump());
-  LLVM_DEBUG(llvm::dbgs() << "\nLowerVSELECT: AND Node : " ; AndV2i64->dump());
+  LLVM_DEBUG(llvm::dbgs() << "LowerVSELECT: XOR Node : "; XorV2i64->dump());
+  LLVM_DEBUG(llvm::dbgs() << "\nLowerVSELECT: AND Node : "; AndV2i64->dump());
 
   // Legalize the mask type if needed
   EVT MaskVT = Cond.getValueType();
   const TargetLowering &TLI = DAG.getTargetLoweringInfo();
-  EVT LegalMaskVT = TLI.getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), MaskVT);
+  EVT LegalMaskVT =
+      TLI.getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), MaskVT);
   if (MaskVT != LegalMaskVT)
     Cond = DAG.getZExtOrTrunc(Cond, DL, LegalMaskVT);
 
   // Emit the new PPCISD::VSELECT node (so it can match xxeval for v2i64)
-  SDValue NewVSelect = DAG.getNode(PPCISD::VSELECT, DL, VT, Cond, XorV2i64, AndV2i64);
+  SDValue NewVSelect =
+      DAG.getNode(PPCISD::VSELECT, DL, VT, Cond, XorV2i64, AndV2i64);
   DAG.ReplaceAllUsesWith(Op, NewVSelect);
 
-  LLVM_DEBUG(llvm::dbgs() << "LowerVSELECT: Emitted PPCISD::VSELECT for v2i64 AND/XOR pattern\n");
+  LLVM_DEBUG(
+      llvm::dbgs()
+      << "LowerVSELECT: Emitted PPCISD::VSELECT for v2i64 AND/XOR pattern\n");
   LLVM_DEBUG(NewVSelect.dump());
 
   LLVM_DEBUG(llvm::dbgs() << "\nLowerVSELECT NEW DAG\n"; DAG.dump());
   DAG.RemoveDeadNode(Op.getNode());
-  LLVM_DEBUG(llvm::dbgs() << "\nLowerVSELECT NEW DAG After removal\n"; DAG.dump());
+  LLVM_DEBUG(llvm::dbgs() << "\nLowerVSELECT NEW DAG After removal\n";
+             DAG.dump());
 
   return NewVSelect;
 }
 
-// Lower the vector operands of the VSELECT Node 
+// Lower the vector operands of the VSELECT Node
 // The operands of the VSELECT nodes needs to modifed back if:
-// - the operands of the VSELECT are bitcast (op (bitcast x), (bitcast y)) 
+// - the operands of the VSELECT are bitcast (op (bitcast x), (bitcast y))
 // - the outer bitcast is VT and inner bitcast is v4i32
 // - VSELECT Node type is not v4i32 and is of type v2i64
 // Then operands needs to put back to their original types.
@@ -9701,7 +9713,7 @@ SDValue PPCTargetLowering::LowerVSELECT(SDValue Op, SelectionDAG &DAG) const {
 //   }
 //   LLVM_DEBUG(llvm::dbgs() << "LowerVECTOR_SELECT: VT is not v4i32 \n");
 //   // If the VT is v2i64, we need to check the conditions:
-//   // - the operands of the VSELECT are bitcast (op (bitcast x), (bitcast y)) 
+//   // - the operands of the VSELECT are bitcast (op (bitcast x), (bitcast y))
 //   // - the outer bitcast is VT and inner bitcast is v4i32
 //   // - VSELECT Node type is not v4i32 and is of type v2i64
 //   SDValue Cond = Op.getOperand(0);
@@ -9737,7 +9749,7 @@ SDValue PPCTargetLowering::LowerVSELECT(SDValue Op, SelectionDAG &DAG) const {
 //     return true;
 //   };
 
-//   auto getOriginalNode = [&DAG](SDValue V) -> SDValue {   
+//   auto getOriginalNode = [&DAG](SDValue V) -> SDValue {
 //       SDValue InnerOp = V.getOperand(0);
 //       // Get the Bit Op node's Operands
 //       SDValue InnerBitOpOperand0 = InnerOp.getOperand(0);
@@ -9755,7 +9767,8 @@ SDValue PPCTargetLowering::LowerVSELECT(SDValue Op, SelectionDAG &DAG) const {
 //     // and return the VSELECT node with the original nodes.
 //     LLVM_DEBUG(llvm::dbgs() << "LowerVECTOR_SELECT: Valid pattern \n");
 //   } else {
-//     // If the TrueOp and FalseOp are not valid patterns, return the original Op
+//     // If the TrueOp and FalseOp are not valid patterns, return the original
+//     Op
 //     // without modification.
 //     LLVM_DEBUG(llvm::dbgs() << "LowerVECTOR_SELECT: Invalid pattern \n");
 //     return SDValue();
@@ -9764,21 +9777,24 @@ SDValue PPCTargetLowering::LowerVSELECT(SDValue Op, SelectionDAG &DAG) const {
 //   // Get the original nodes from the TrueOp and FalseOp
 //   SDValue NTrueOp = getOriginalNode(TrueOp);
 //   SDValue NFalseOp = getOriginalNode(FalseOp);
-//   // LLVM_DEBUG(llvm::dbgs() << "LowerVECTOR_SELECT: NTrueOp: " << NTrueOp.getNode()->print(dbgs()) << "\n");
-//   // LLVM_DEBUG(llvm::dbgs() << "LowerVECTOR_SELECT: NFalseOp: " << NFalseOp.getNode()->dump() << "\n");
+//   // LLVM_DEBUG(llvm::dbgs() << "LowerVECTOR_SELECT: NTrueOp: " <<
+//   NTrueOp.getNode()->print(dbgs()) << "\n");
+//   // LLVM_DEBUG(llvm::dbgs() << "LowerVECTOR_SELECT: NFalseOp: " <<
+//   NFalseOp.getNode()->dump() << "\n");
 
 //   // Ensure both NTrueOp and NFalseOp are valid before using them.
 //   if (!NTrueOp.getNode() || !NFalseOp.getNode()) {
-//     LLVM_DEBUG(llvm::dbgs() << "LowerVECTOR_SELECT: One or both original nodes are invalid, returning original Op\n");
-//     return SDValue();
+//     LLVM_DEBUG(llvm::dbgs() << "LowerVECTOR_SELECT: One or both original
+//     nodes are invalid, returning original Op\n"); return SDValue();
 //   }
 //   EVT MaskVT = Cond.getValueType();
 //   const TargetLowering &TLI = DAG.getTargetLoweringInfo();
-//   EVT LegalMaskVT = TLI.getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), MaskVT);
+//   EVT LegalMaskVT = TLI.getSetCCResultType(DAG.getDataLayout(),
+//   *DAG.getContext(), MaskVT);
 
 //   LLVM_DEBUG(llvm::dbgs() << "LowerVECTOR_SELECT: cond mask VT: " << MaskVT);
-//   LLVM_DEBUG(llvm::dbgs() << "LowerVECTOR_SELECT: cond Legal VT: " << LegalMaskVT << "\n");
-//   if (MaskVT != LegalMaskVT) {
+//   LLVM_DEBUG(llvm::dbgs() << "LowerVECTOR_SELECT: cond Legal VT: " <<
+//   LegalMaskVT << "\n"); if (MaskVT != LegalMaskVT) {
 //     LLVM_DEBUG(llvm::dbgs() << "LowerVECTOR_SELECT: cond LEGALIZATION\n");
 //     // Bitcast or extend/truncate as needed
 //     Cond = DAG.getZExtOrTrunc(Cond, SDLoc(Op), LegalMaskVT);
@@ -12750,7 +12766,8 @@ SDValue PPCTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const {
   case ISD::FSHR:               return LowerFunnelShift(Op, DAG);
 
   // Vector-related lowering.
-  case ISD::VSELECT:            return LowerVSELECT(Op, DAG);
+  case ISD::VSELECT:
+    return LowerVSELECT(Op, DAG);
   case ISD::BUILD_VECTOR:       return LowerBUILD_VECTOR(Op, DAG);
   case ISD::VECTOR_SHUFFLE:     return LowerVECTOR_SHUFFLE(Op, DAG);
   case ISD::INTRINSIC_WO_CHAIN: return LowerINTRINSIC_WO_CHAIN(Op, DAG);
