[PowerPC] Exploit xxeval instruction for ternary patterns - part 1 #141733

tonykuttai · 2025-05-28T09:19:01Z

Description

xxeval is Power 10 VSX Vector Evaluate instruction that have the form xxeval XT,XA,XB,XC,IMM.
xxeval(A, B, C, IMM) can support 256 equivalent functions based on the value of IMM. A previous patch added support for and, nan), eqv, nor, or, and xor. This change adds support for ternary operations using xxeval for v4i32, v2i64, v16i8 and v8i16 operand types.

Supporting the following patterns in this change:

ternary(A, X, and(B,C))
ternary(A, X, B)
ternary(A, X, C)
ternary(A, X, xor(B,C))
This change make use of the patterns and its corresponding imm values to group the ternary operations so that appropriate tableGen multiclass were formed. The following patterns are handled:

Ternary Operators                     Imm Values
ternary(A,  xor(B,C),   and(B,C))	22
ternary(A,  nor(B,C),   and(B,C))	24
ternary(A,  eqv(B,C),   and(B,C))	25
ternary(A,  not(C),     and(B,C))	26
ternary(A,  not(B),     and(B,C))	28

ternary(A,  and(B,C),   B)	        49
ternary(A,  nor(B,C),   B)	        56
ternary(A,  eqv(B,C),   B)	        57
ternary(A,  nand(B,C),  B)	        62

ternary(A,  and(B,C),   C)	        81
ternary(A,  nor(B,C),   C)	        88
ternary(A,  eqv(B,C),   C)	        89
ternary(A,  nand(B,C),  C)	        94

ternary(A,  and(B,C),   xor(B,C))	97
ternary(A,  B,          xor(B,C))	99
ternary(A,  C,          xor(B,C))	101
ternary(A,  or(B,C),    xor(B,C))	103
ternary(A,  nor(B,C),   xor(B,C))	104

Note:
Patterns ternary(A, not(B), C) : 92 and ternary(A, not(C), B): 58 are not supported as xxsel instruction is preferred in both cases.

llvmbot · 2025-05-28T10:46:55Z

@llvm/pr-subscribers-backend-powerpc

Author: Tony Varghese (tonykuttai)

Changes

Description

xxeval instruction can be used for supporting the ternary patterns.

Supporting the following patterns in this change:

ternary(A, X, and(B,C))
ternary(A, X, B)
ternary(A, X, C)
ternary(A, X, xor(B,C))
This change make use of the patterns and its corresponding imm values to group the ternary operations so that appropriate tableGen multiclass were formed. The following patterns are handled:

Ternary Operators                     Imm Values
ternary(A,  xor(B,C),   and(B,C))	22
ternary(A,  nor(B,C),   and(B,C))	24
ternary(A,  eqv(B,C),   and(B,C))	25
ternary(A,  not(C),     and(B,C))	26
ternary(A,  not(B),     and(B,C))	28

ternary(A,  and(B,C),   B)	        49
ternary(A,  nor(B,C),   B)	        56
ternary(A,  eqv(B,C),   B)	        57
ternary(A,  not(C),     B)	        58
ternary(A,  nand(B,C),  B)	        62

ternary(A,  and(B,C),   C)	        81
ternary(A,  nor(B,C),   C)	        88
ternary(A,  eqv(B,C),   C)	        89
ternary(A,  nand(B,C),  C)	        94

ternary(A,  and(B,C),   xor(B,C))	97
ternary(A,  B,          xor(B,C))	99
ternary(A,  C,          xor(B,C))	101
ternary(A,  or(B,C),    xor(B,C))	103
ternary(A,  nor(B,C),   xor(B,C))	104

ternary(A, not(B), C) : 92 is omitted as it was a symmetrical to ternary(A, not(C), B) : 58

Patch is 33.13 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/141733.diff

5 Files Affected:

(modified) llvm/lib/Target/PowerPC/PPCInstrP10.td (+170-32)
(added) llvm/test/CodeGen/PowerPC/xxeval-vselect-x-and.ll (+134)
(added) llvm/test/CodeGen/PowerPC/xxeval-vselect-x-b.ll (+126)
(added) llvm/test/CodeGen/PowerPC/xxeval-vselect-x-c.ll (+104)
(added) llvm/test/CodeGen/PowerPC/xxeval-vselect-x-xor.ll (+128)

diff --git a/llvm/lib/Target/PowerPC/PPCInstrP10.td b/llvm/lib/Target/PowerPC/PPCInstrP10.td
index 39a1ab0d388a7..99f77e18e43a0 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrP10.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrP10.td
@@ -2159,8 +2159,133 @@ let AddedComplexity = 400, Predicates = [IsISA3_1, HasVSX] in {
                                (COPY_TO_REGCLASS $VRB, VSRC), 2)))>;
 }
 
-class xxevalPattern <dag pattern, bits<8> imm> :
-  Pat<(v4i32 pattern), (XXEVAL $vA, $vB, $vC, imm)> {}
+class xxevalPattern <ValueType vt, dag pattern, bits<8> imm> :
+  Pat<(vt pattern), (XXEVAL $vA, $vB, $vC, imm)> {}
+
+class DagUnaryVNot<ValueType vt, string opstr>{
+  // Defines a class that returns the UnaryVNot dag for an operand string based on a value type.
+  dag res = !cond(
+          !eq(vt, v4i32) : !dag(vnot, [v4i32], [opstr]),
+          !eq(vt, v2i64) : (v2i64 (bitconvert (vnot (v4i32 !dag(bitconvert, [v2i64], [opstr])))))
+          );
+}
+
+class DagCondVNot<dag d, bit negate> {
+  // Defines a class that generates a vnot around the dag.
+  dag res = !if(!ne(negate, 0),
+               (vnot d),
+               d);
+}
+
+class XXEvalUnaryNot<ValueType vt> {
+  // Defines a wrapper class for unary NOT operations for v4i32 and v2i64 vector types.
+  // Unary NOT on operand B or C based on value type.
+  dag opB = DagUnaryVNot<vt, "vB">.res;
+  dag opC = DagUnaryVNot<vt, "vC">.res;
+}
+
+class XXEvalBinaryPattern<ValueType vt, SDPatternOperator op, bit notResult = 0> {
+  // Defines a wrapper class for binary patterns with optional NOT on result.
+  // Generate op pattern with optional NOT wrapping for result depending on "notResult".
+      dag opPat = !cond(
+                !eq(vt, v4i32) : DagCondVNot<(op v4i32:$vB, v4i32:$vC), notResult>.res,
+                !eq(vt, v2i64) : (v2i64 (bitconvert DagCondVNot<(op
+                                      (v4i32 (bitconvert v2i64:$vB)),
+                                      (v4i32 (bitconvert v2i64:$vC))), notResult>.res))
+                );
+}
+
+multiclass XXEvalVSelectWithXAnd<ValueType vt, bits<8> baseImm> {
+  // Multiclass for Ternary(A, X, and(B, C)) style patterns.
+  // Ternary(A, xor(B,C), and(B,C)) => imm: baseImm
+  def : xxevalPattern<vt, 
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, xor>.opPat, XXEvalBinaryPattern<vt, and>.opPat), 
+        baseImm>;
+  // Ternary(A, nor(B,C), and(B,C)) => imm: baseImm + 2
+  def : xxevalPattern<vt, 
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, or, 1>.opPat, XXEvalBinaryPattern<vt, and>.opPat), 
+        !add(baseImm, 2)>;
+  // Ternary(A, eqv(B,C), and(B,C)) => imm: baseImm + 3
+  def : xxevalPattern<vt, 
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, xor, 1>.opPat, XXEvalBinaryPattern<vt, and>.opPat), 
+        !add(baseImm, 3)>;
+  // Ternary(A, not(C), and(B,C)) => imm: baseImm + 4
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalUnaryNot<vt>.opC, XXEvalBinaryPattern<vt, and>.opPat), 
+        !add(baseImm, 4)>;
+  // Ternary(A, not(B), and(B,C)) => imm: baseImm + 6
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalUnaryNot<vt>.opB, XXEvalBinaryPattern<vt, and>.opPat), 
+        !add(baseImm, 6)>;
+}
+
+multiclass XXEvalVSelectWithXB<ValueType vt, bits<8> baseImm>{
+  // Multiclass for Ternary(A, X, B) style patterns
+  // Ternary(A, and(B,C), B) => imm: baseImm
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, and>.opPat, vt:$vB), 
+        baseImm>;
+  // Ternary(A, nor(B,C), B) => imm: baseImm + 7
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, or, 1>.opPat, vt:$vB), 
+        !add(baseImm, 7)>;
+  // Ternary(A, eqv(B,C), B) => imm: baseImm + 8
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, xor, 1>.opPat, vt:$vB), 
+        !add(baseImm, 8)>;
+  // Ternary(A, not(C), B) => imm: baseImm + 9
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalUnaryNot<vt>.opC, vt:$vB), 
+        !add(baseImm, 9)>;
+  // Ternary(A, nand(B,C), B) => imm: baseImm + 13
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, and, 1>.opPat, vt:$vB), 
+        !add(baseImm, 13)>;
+}
+
+multiclass XXEvalVSelectWithXC<ValueType vt, bits<8> baseImm>{
+  // Multiclass for Ternary(A, X, C) style patterns
+  // Ternary(A, and(B,C), C) => imm: baseImm
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, and>.opPat, vt:$vC), 
+        baseImm>;
+  // Ternary(A, nor(B,C), C) => imm: baseImm + 7
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, or, 1>.opPat, vt:$vC), 
+        !add(baseImm, 7)>;
+  // Ternary(A, eqv(B,C), C) => imm: baseImm + 8
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, xor, 1>.opPat, vt:$vC), 
+        !add(baseImm, 8)>;
+  // Ternary(A, nand(B,C), C) => imm: baseImm + 13
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, and, 1>.opPat, vt:$vC), 
+        !add(baseImm, 13)>;
+}
+
+multiclass XXEvalVSelectWithXXor<ValueType vt, bits<8> baseImm>{
+  // Multiclass for Ternary(A, X, xor(B,C)) style patterns
+  // Ternary(A, and(B,C), xor(B,C)) => imm: baseImm
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, and>.opPat, XXEvalBinaryPattern<vt, xor>.opPat), 
+        baseImm>;
+  // Ternary(A, B, xor(B,C)) => imm: baseImm + 2
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, vt:$vB, XXEvalBinaryPattern<vt, xor>.opPat), 
+        !add(baseImm, 2)>;
+  // Ternary(A, C, xor(B,C)) => imm: baseImm + 4
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, vt:$vC, XXEvalBinaryPattern<vt, xor>.opPat), 
+        !add(baseImm, 4)>;
+  // Ternary(A, or(B,C), xor(B,C)) => imm: baseImm + 6
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, or>.opPat, XXEvalBinaryPattern<vt, xor>.opPat), 
+        !add(baseImm, 6)>;
+  // Ternary(A, nor(B,C), xor(B,C)) => imm: baseImm + 7
+  def : xxevalPattern<vt,
+        (vselect vt:$vA, XXEvalBinaryPattern<vt, or, 1>.opPat, XXEvalBinaryPattern<vt, xor>.opPat), 
+        !add(baseImm, 7)>; 
+}
 
 let Predicates = [PrefixInstrs, HasP10Vector] in {
   let AddedComplexity = 400 in {
@@ -2192,83 +2317,96 @@ let Predicates = [PrefixInstrs, HasP10Vector] in {
     // Anonymous patterns for XXEVAL
     // AND
     // and(A, B, C)
-    def : xxevalPattern<(and v4i32:$vA, (and v4i32:$vB, v4i32:$vC)), 1>;
+    def : xxevalPattern<v4i32, (and v4i32:$vA, (and v4i32:$vB, v4i32:$vC)), 1>;
     // and(A, xor(B, C))
-    def : xxevalPattern<(and v4i32:$vA, (xor v4i32:$vB, v4i32:$vC)), 6>;
+    def : xxevalPattern<v4i32, (and v4i32:$vA, (xor v4i32:$vB, v4i32:$vC)), 6>;
     // and(A, or(B, C))
-    def : xxevalPattern<(and v4i32:$vA, (or v4i32:$vB, v4i32:$vC)), 7>;
+    def : xxevalPattern<v4i32, (and v4i32:$vA, (or v4i32:$vB, v4i32:$vC)), 7>;
     // and(A, nor(B, C))
-    def : xxevalPattern<(and v4i32:$vA, (vnot (or v4i32:$vB, v4i32:$vC))), 8>;
+    def : xxevalPattern<v4i32, (and v4i32:$vA, (vnot (or v4i32:$vB, v4i32:$vC))), 8>;
     // and(A, eqv(B, C))
-    def : xxevalPattern<(and v4i32:$vA, (vnot (xor v4i32:$vB, v4i32:$vC))), 9>;
+    def : xxevalPattern<v4i32, (and v4i32:$vA, (vnot (xor v4i32:$vB, v4i32:$vC))), 9>;
     // and(A, nand(B, C))
-    def : xxevalPattern<(and v4i32:$vA, (vnot (and v4i32:$vB, v4i32:$vC))), 14>;
+    def : xxevalPattern<v4i32, (and v4i32:$vA, (vnot (and v4i32:$vB, v4i32:$vC))), 14>;
 
     // NAND
     // nand(A, B, C)
-    def : xxevalPattern<(vnot (and v4i32:$vA, (and v4i32:$vB, v4i32:$vC))),
+    def : xxevalPattern<v4i32, (vnot (and v4i32:$vA, (and v4i32:$vB, v4i32:$vC))),
                          !sub(255, 1)>;
     // nand(A, xor(B, C))
-    def : xxevalPattern<(vnot (and v4i32:$vA, (xor v4i32:$vB, v4i32:$vC))),
+    def : xxevalPattern<v4i32, (vnot (and v4i32:$vA, (xor v4i32:$vB, v4i32:$vC))),
                          !sub(255, 6)>;
     // nand(A, or(B, C))
-    def : xxevalPattern<(vnot (and v4i32:$vA, (or v4i32:$vB, v4i32:$vC))),
+    def : xxevalPattern<v4i32, (vnot (and v4i32:$vA, (or v4i32:$vB, v4i32:$vC))),
                          !sub(255, 7)>;
     // nand(A, nor(B, C))
-    def : xxevalPattern<(or (vnot v4i32:$vA), (or v4i32:$vB, v4i32:$vC)),
+    def : xxevalPattern<v4i32, (or (vnot v4i32:$vA), (or v4i32:$vB, v4i32:$vC)),
                          !sub(255, 8)>;
     // nand(A, eqv(B, C))
-    def : xxevalPattern<(or (vnot v4i32:$vA), (xor v4i32:$vB, v4i32:$vC)),
+    def : xxevalPattern<v4i32, (or (vnot v4i32:$vA), (xor v4i32:$vB, v4i32:$vC)),
                          !sub(255, 9)>;
     // nand(A, nand(B, C))
-    def : xxevalPattern<(or (vnot v4i32:$vA), (and v4i32:$vB, v4i32:$vC)),
+    def : xxevalPattern<v4i32, (or (vnot v4i32:$vA), (and v4i32:$vB, v4i32:$vC)),
                          !sub(255, 14)>;
 
     // EQV
     // (eqv A, B, C)
-    def : xxevalPattern<(or (and v4i32:$vA, (and v4i32:$vB, v4i32:$vC)),
+    def : xxevalPattern<v4i32, (or (and v4i32:$vA, (and v4i32:$vB, v4i32:$vC)),
                             (vnot (or v4i32:$vA, (or v4i32:$vB, v4i32:$vC)))),
                          150>;
     // (eqv A, (and B, C))
-    def : xxevalPattern<(vnot (xor v4i32:$vA, (and v4i32:$vB, v4i32:$vC))), 225>;
+    def : xxevalPattern<v4i32, (vnot (xor v4i32:$vA, (and v4i32:$vB, v4i32:$vC))), 225>;
     // (eqv A, (or B, C))
-    def : xxevalPattern<(vnot (xor v4i32:$vA, (or v4i32:$vB, v4i32:$vC))), 135>;
+    def : xxevalPattern<v4i32, (vnot (xor v4i32:$vA, (or v4i32:$vB, v4i32:$vC))), 135>;
 
     // NOR
     // (nor A, B, C)
-    def : xxevalPattern<(vnot (or v4i32:$vA, (or v4i32:$vB, v4i32:$vC))), 128>;
+    def : xxevalPattern<v4i32, (vnot (or v4i32:$vA, (or v4i32:$vB, v4i32:$vC))), 128>;
     // (nor A, (and B, C))
-    def : xxevalPattern<(vnot (or v4i32:$vA, (and v4i32:$vB, v4i32:$vC))), 224>;
+    def : xxevalPattern<v4i32, (vnot (or v4i32:$vA, (and v4i32:$vB, v4i32:$vC))), 224>;
     // (nor A, (eqv B, C))
-    def : xxevalPattern<(and (vnot v4i32:$vA), (xor v4i32:$vB, v4i32:$vC)), 96>;
+    def : xxevalPattern<v4i32, (and (vnot v4i32:$vA), (xor v4i32:$vB, v4i32:$vC)), 96>;
     // (nor A, (nand B, C))
-    def : xxevalPattern<(and (vnot v4i32:$vA), (and v4i32:$vB, v4i32:$vC)), 16>;
+    def : xxevalPattern<v4i32, (and (vnot v4i32:$vA), (and v4i32:$vB, v4i32:$vC)), 16>;
     // (nor A, (nor B, C))
-    def : xxevalPattern<(and (vnot v4i32:$vA), (or v4i32:$vB, v4i32:$vC)), 112>;
+    def : xxevalPattern<v4i32, (and (vnot v4i32:$vA), (or v4i32:$vB, v4i32:$vC)), 112>;
     // (nor A, (xor B, C))
-    def : xxevalPattern<(vnot (or v4i32:$vA, (xor v4i32:$vB, v4i32:$vC))), 144>;
+    def : xxevalPattern<v4i32, (vnot (or v4i32:$vA, (xor v4i32:$vB, v4i32:$vC))), 144>;
 
     // OR
     // (or A, B, C)
-    def : xxevalPattern<(or v4i32:$vA, (or v4i32:$vB, v4i32:$vC)), 127>;
+    def : xxevalPattern<v4i32, (or v4i32:$vA, (or v4i32:$vB, v4i32:$vC)), 127>;
     // (or A, (and B, C))
-    def : xxevalPattern<(or v4i32:$vA, (and v4i32:$vB, v4i32:$vC)), 31>;
+    def : xxevalPattern<v4i32, (or v4i32:$vA, (and v4i32:$vB, v4i32:$vC)), 31>;
     // (or A, (eqv B, C))
-    def : xxevalPattern<(or v4i32:$vA, (vnot (xor v4i32:$vB, v4i32:$vC))), 159>;
+    def : xxevalPattern<v4i32, (or v4i32:$vA, (vnot (xor v4i32:$vB, v4i32:$vC))), 159>;
     // (or A, (nand B, C))
-    def : xxevalPattern<(or v4i32:$vA, (vnot (and v4i32:$vB, v4i32:$vC))), 239>;
+    def : xxevalPattern<v4i32, (or v4i32:$vA, (vnot (and v4i32:$vB, v4i32:$vC))), 239>;
     // (or A, (nor B, C))
-    def : xxevalPattern<(or v4i32:$vA, (vnot (or v4i32:$vB, v4i32:$vC))), 143>;
+    def : xxevalPattern<v4i32, (or v4i32:$vA, (vnot (or v4i32:$vB, v4i32:$vC))), 143>;
     // (or A, (xor B, C))
-    def : xxevalPattern<(or v4i32:$vA, (xor v4i32:$vB, v4i32:$vC)), 111>;
+    def : xxevalPattern<v4i32, (or v4i32:$vA, (xor v4i32:$vB, v4i32:$vC)), 111>;
 
     // XOR
     // (xor A, B, C)
-    def : xxevalPattern<(xor v4i32:$vA, (xor v4i32:$vB, v4i32:$vC)), 105>;
+    def : xxevalPattern<v4i32, (xor v4i32:$vA, (xor v4i32:$vB, v4i32:$vC)), 105>;
     // (xor A, (and B, C))
-    def : xxevalPattern<(xor v4i32:$vA, (and v4i32:$vB, v4i32:$vC)), 30>;
+    def : xxevalPattern<v4i32, (xor v4i32:$vA, (and v4i32:$vB, v4i32:$vC)), 30>;
     // (xor A, (or B, C))
-    def : xxevalPattern<(xor v4i32:$vA, (or v4i32:$vB, v4i32:$vC)), 120>;
+    def : xxevalPattern<v4i32, (xor v4i32:$vA, (or v4i32:$vB, v4i32:$vC)), 120>;
+
+    // Ternary operation support with the xxeval instruction.
+    defm : XXEvalVSelectWithXAnd<v4i32, 22>;
+    defm : XXEvalVSelectWithXAnd<v2i64, 22>;
+
+    defm : XXEvalVSelectWithXB<v4i32, 49>;
+    defm : XXEvalVSelectWithXB<v2i64, 49>;
+
+    defm : XXEvalVSelectWithXC<v4i32, 81>;
+    defm : XXEvalVSelectWithXC<v2i64, 81>;
+
+    defm : XXEvalVSelectWithXXor<v4i32, 97>;
+    defm : XXEvalVSelectWithXXor<v2i64, 97>;
 
     // Anonymous patterns to select prefixed VSX loads and stores.
     // Load / Store f128
diff --git a/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-and.ll b/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-and.ll
new file mode 100644
index 0000000000000..b30617469c901
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-and.ll
@@ -0,0 +1,134 @@
+; Test file to verify the emission of xxeval instructions when ternary operators are used.
+
+; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc64le-unknown-unknown \
+; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
+
+; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc-ibm-aix-xcoff \
+; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
+
+; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc64-ibm-aix-xcoff \
+; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
+
+; Function to test ternary(A, xor(B, C), and(B, C)) for <4 x i32>
+; CHECK-LABEL: ternary_A_xor_BC_and_BC_4x32
+; CHECK: xxeval v2, v2, v3, v4, 22
+; CHECK-NEXT: blr
+define dso_local <4 x i32> @ternary_A_xor_BC_and_BC_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %C) local_unnamed_addr #0 {
+entry:
+  %xor = xor <4 x i32> %B, %C
+  %and = and <4 x i32> %B, %C
+  %res = select <4 x i1> %A, <4 x i32> %xor, <4 x i32> %and
+  ret <4 x i32> %res
+}
+
+; Function to test ternary(A, xor(B, C), and(B, C)) for <2 x i64>
+; CHECK-LABEL: ternary_A_xor_BC_and_BC_2x64
+; CHECK: xxeval v2, v2, v3, v4, 22
+; CHECK-NEXT: blr
+define dso_local <2 x i64> @ternary_A_xor_BC_and_BC_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %C) local_unnamed_addr #0 {
+entry:
+  %xor = xor <2 x i64> %B, %C
+  %and = and <2 x i64> %B, %C
+  %res = select <2 x i1> %A, <2 x i64> %xor, <2 x i64> %and
+  ret <2 x i64> %res
+}
+
+; Function to test ternary(A, nor(B, C), and(B, C)) for <4 x i32>
+; CHECK-LABEL: ternary_A_nor_BC_and_BC_4x32
+; CHECK: xxeval v2, v2, v3, v4, 24
+; CHECK-NEXT: blr
+define dso_local <4 x i32> @ternary_A_nor_BC_and_BC_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %C) local_unnamed_addr #0 {
+entry:
+  %or = or <4 x i32> %B, %C
+  %nor = xor <4 x i32> %or, <i32 -1, i32 -1, i32 -1, i32 -1>  ; Vector NOR operation
+  %and = and <4 x i32> %B, %C
+  %res = select <4 x i1> %A, <4 x i32> %nor, <4 x i32> %and
+  ret <4 x i32> %res
+}
+
+; Function to test ternary(A, nor(B, C), and(B, C)) for <2 x i64>
+; CHECK-LABEL: ternary_A_nor_BC_and_BC_2x64
+; CHECK: xxeval v2, v2, v3, v4, 24
+; CHECK-NEXT: blr
+define dso_local <2 x i64> @ternary_A_nor_BC_and_BC_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %C) local_unnamed_addr #0 {
+entry:
+  %or = or <2 x i64> %B, %C
+  %nor = xor <2 x i64> %or, <i64 -1, i64 -1>  ; Vector NOR operation
+  %and = and <2 x i64> %B, %C
+  %res = select <2 x i1> %A, <2 x i64> %nor, <2 x i64> %and
+  ret <2 x i64> %res
+}
+
+; Function to test ternary(A, eqv(B, C), and(B, C)) for <4 x i32>
+; CHECK-LABEL: ternary_A_eqv_BC_and_BC_4x32
+; CHECK: xxeval v2, v2, v3, v4, 25
+; CHECK-NEXT: blr
+define dso_local <4 x i32> @ternary_A_eqv_BC_and_BC_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %C) local_unnamed_addr #0 {
+entry:
+  %xor = xor <4 x i32> %B, %C
+  %eqv = xor <4 x i32> %xor, <i32 -1, i32 -1, i32 -1, i32 -1>  ; Vector eqv operation
+  %and = and <4 x i32> %B, %C
+  %res = select <4 x i1> %A, <4 x i32> %eqv, <4 x i32> %and
+  ret <4 x i32> %res
+}
+
+; Function to test ternary(A, eqv(B, C), and(B, C)) for <2 x i64>
+; CHECK-LABEL: ternary_A_eqv_BC_and_BC_2x64
+; CHECK: xxeval v2, v2, v3, v4, 25
+; CHECK-NEXT: blr
+define dso_local <2 x i64> @ternary_A_eqv_BC_and_BC_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %C) local_unnamed_addr #0 {
+entry:
+  %xor = xor <2 x i64> %B, %C
+  %eqv = xor <2 x i64> %xor, <i64 -1, i64 -1>  ; Vector eqv operation
+  %and = and <2 x i64> %B, %C
+  %res = select <2 x i1> %A, <2 x i64> %eqv, <2 x i64> %and
+  ret <2 x i64> %res
+}
+
+; Function to test ternary(A, not(C), and(B, C)) for <4 x i32>
+; CHECK-LABEL: ternary_A_not_C_and_BC_4x32
+; CHECK: xxeval v2, v2, v3, v4, 26
+; CHECK-NEXT: blr
+define dso_local <4 x i32> @ternary_A_not_C_and_BC_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %C) local_unnamed_addr #0 {
+entry:
+  %not = xor <4 x i32> %C, <i32 -1, i32 -1, i32 -1, i32 -1>  ; Vector not operation
+  %and = and <4 x i32> %B, %C
+  %res = select <4 x i1> %A, <4 x i32> %not, <4 x i32> %and
+  ret <4 x i32> %res
+}
+
+; Function to test ternary(A, not(C), and(B, C)) for <2 x i64>
+; CHECK-LABEL: ternary_A_not_C_and_BC_2x64
+; CHECK: xxeval v2, v2, v3, v4, 26
+; CHECK-NEXT: blr
+define dso_local <2 x i64> @ternary_A_not_C_and_BC_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %C) local_unnamed_addr #0 {
+entry:
+  %not = xor <2 x i64> %C, <i64 -1, i64 -1>  ; Vector not operation
+  %and = and <2 x i64> %B, %C
+  %res = select <2 x i1> %A, <2 x i64> %not, <2 x i64> %and
+  ret <2 x i64> %res
+}
+
+; Function to test ternary(A, not(B), and(B, C)) for <4 x i32>
+; CHECK-LABEL: ternary_A_not_B_and_BC_4x32
+; CHECK: xxeval v2, v2, v3, v4, 28
+; CHECK-NEXT: blr
+define dso_local <4 x i32> @ternary_A_not_B_and_BC_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %C) local_unnamed_addr #0 {
+entry:
+  %not = xor <4 x i32> %B, <i32 -1, i32 -1, i32 -1, i32 -1>  ; Vector not operation
+  %and = and <4 x i32> %B, %C
+  %res = select <4 x i1> %A, <4 x i32> %not, <4 x i32> %and
+  ret <4 x i32> %res
+}
+
+; Function to test ternary(A, not(B), and(B, C)) for <2 x i64>
+; CHECK-LABEL: ternary_A_not_B_and_BC_2x64
+; CHECK: xxeval v2, v2, v3, v4, 28
+; CHECK-NEXT: blr
+define dso_local <2 x i64> @ternary_A_not_B_and_BC_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %C) local_unnamed_addr #0 {
+entry:
+  %not = xor <2 x i64> %B, <i64 -1, i64 -1>  ; Vector not operation
+  %and = and <2 x i64> %B, %C
+  %res = select <2 x i1> %A, <2 x i64> %not, <2 x i64> %and
+  ret <2 x i64> %res
+}
\ No newline at end of file
diff --git a/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-b.ll b/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-b.ll
new file mode 100644
index 0000000000000..7ab5fa7d62688
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/xxeval-vselect-x-b.ll
@@ -0,0 +1,126 @@
+; Test file to verify the emission of xxeval instructions when ternary operators are used.
+
+; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc64le-unknown-unknown \
+; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
+
+; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc-ibm-aix-xcoff \
+; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
+
+; RUN: llc -verify-machineinstrs -mcpu=pwr10 -mtriple=powerpc64-ibm-aix-xcoff \
+; RUN:   -ppc-asm-full-reg-names --ppc-vsr-nums-as-vr < %s | FileCheck %s
+
+; Function to test ternary(A, and(B, C), B) for <4 x i32>
+; CHECK-LABEL: ternary_A_and_BC_B_4x32
+; CHECK: xxeval v2, v2, v3, v4, 49
+; CHECK-NEXT: blr
+define dso_local <4 x i32> @ternary_A_and_BC_B_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i32> %C) local_unnamed_addr #0 {
+entry:
+  %and = and <4 x i32> %B, %C
+  %res = select <4 x i1> %A, <4 x i32> %and, <4 x i32> %B
+  ret <4 x i32> %res
+}
+
+; Function to test ternary(A, and(B, C), B) for <2 x i64>
+; CHECK-LABEL: ternary_A_and_BC_B_2x64
+; CHECK: xxeval v2, v2, v3, v4, 49
+; CHECK-NEXT: blr
+define dso_local <2 x i64> @ternary_A_and_BC_B_2x64(<2 x i1> %A, <2 x i64> %B, <2 x i64> %C) local_unnamed_addr #0 {
+entry:
+  %and = and <2 x i64> %B, %C
+  %res = select <2 x i1> %A, <2 x i64> %and, <2 x i64> %B
+  ret <2 x i64> %res
+}
+
+; Function to test ternary(A, nor(B, C), B) for <4 x i32>
+; CHECK-LABEL: ternary_A_...
[truncated]

tonykuttai · 2025-05-28T10:47:27Z

@RolandF77 @lei137 @amy-kwan

[NFC][PowerpC] Add testcases for locking down the xxeval instruction support for ternary operators is the precursor to this PR.

llvm/lib/Target/PowerPC/PPCInstrP10.td

tonykuttai · 2025-06-03T16:27:17Z

@lei137 @amy-kwan @RolandF77 @redstar
[NFC][PowerPC] Add testcases for locking down the xxeval instruction support for ternary operators change is merged. Rebased this branch.

lei137

Unsure about all the class def that just contain a dag def.
Maybe we can have 1 XXEvalUnaryOp class with the different dag defined with descriptive names and have all the xxevalpattern classes derive from that and use the appropriate dag from there as needed?

llvm/lib/Target/PowerPC/PPCInstrP10.td

tonykuttai · 2025-06-11T15:01:30Z

Unsure about all the class def that just contain a dag def. Maybe we can have 1 XXEvalUnaryOp class with the different dag defined with descriptive names and have all the xxevalpattern classes derive from that and use the appropriate dag from there as needed?

Two classes defined for getting the Unary and the BinaryPattern.

XXEvalUnaryPattern
XXEvalBinaryPattern

For getting a simple dag of vt:$vB or vt:$vC, it would be simpler to do it without the use of a wrapper class. and hence I have excluded them from the XXEvalUnaryPattern.

tonykuttai · 2025-06-12T06:22:17Z

Rebased the PR to the latest main branch. (NFC pre-commit patches already merged).

amy-kwan

We had a discussion of this offline - could we refactor the PR to handle different types and move some of the implementation out of TableGen?

tonykuttai · 2025-06-20T19:00:21Z

@amy-kwan @lei137 @RolandF77
Updated the PR to support only v4i32 operand types of VSELECT node when emitting xxeval instruction.

Support for other types (v2i64, v16i8, v8i16) will done as an future patch. Currently bit Op operands of AND, XOR, OR gets promoted to v4i32 during PPCISelLowering. We can add a LowerVECTOR_SELECT operation to change the operands back to their original types. This along with the support for XXEvalPatterns for v2i64, v16i8, v8i16 in llvm/lib/Target/PowerPC/PPCInstrP10.td can help us pattern match to the xxeval ternary instructions.

RolandF77 · 2025-06-24T21:27:55Z

llvm/test/CodeGen/PowerPC/xxeval-vselect-x-or.ll

@@ -120,11 +120,11 @@ define <4 x i32> @ternary_A_eqv_BC_or_BC_4x32(<4 x i1> %A, <4 x i32> %B, <4 x i3
 ; CHECK-LABEL: ternary_A_eqv_BC_or_BC_4x32:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    xxleqv v5, v5, v5
-; CHECK-NEXT:    xxleqv vs0, v3, v4
+; CHECK-NEXT:    xxlxor vs0, v3, v4


This is weird and confusing. Were the or(B,C) patterns supposed to be included in this patch? There are patterns for these, and a few more or(B,C) patterns, but as they are not implemented this test is always using the same other pattern.

This testcase was submitted as part of [NFC][PowerPC] Pre-commit test case for exploitation of xxeval for the pattern ternary(A,X,or(B,C)). The pattern getting matched here to xxeval is :

// Ternary(A, not(C), B) => imm: baseImm + 9 = 58 def : XXEvalPattern<vt, (vselect vt:$vA, XXEvalUnaryPattern<vt>.vnotC, vt:$vB), !add(baseImm, 9)>;

I agree that it is confusing but I think the pattern which is getting matched here is ternary(A, not(xor(B,C)), or(B,C) ). You can see this on the left hand side with xxsel v2, vs1, vs0, v2.

Fyi, I confirmed that this new testcase gets matched to xxeval pattern for Ternary(A, eqv(B,C), or(B,C)) => imm:121 with the upcoming changes when Ternary(A, X, or(B,C)) style patterns are supported.

.abiversion 2 .file "t.ll" .text .globl ternary_A_eqv_BC_or_BC_4x32 # -- Begin function ternary_A_eqv_BC_or_BC_4x32 .p2align 4 .type ternary_A_eqv_BC_or_BC_4x32,@function ternary_A_eqv_BC_or_BC_4x32: # @ternary_A_eqv_BC_or_BC_4x32 .Lfunc_begin0: .cfi_startproc # %bb.0: # %entry xxleqv v5, v5, v5 vslw v2, v2, v5 vsraw v2, v2, v5 xxeval v2, v2, v3, v4, 121 blr .long 0 .quad 0 .Lfunc_end0: .size ternary_A_eqv_BC_or_BC_4x32, .Lfunc_end0-.Lfunc_begin0 .cfi_endproc # -- End function .section ".note.GNU-stack","",@progbits

…(A,X, and(B,C)), ternary(A,X,B), ternary(A,X,C), ternary(A,X,xor(B,C)) forms.

tonykuttai · 2025-07-08T04:29:55Z

Latest commit supports v4i32, v2i64, v16i8 and v8i16 operand datatypes to xxeval instruction.
XXEvalVSelectWithXAnd class used for v4i32 and v2i64 types that xxeval supports.
XXEvalVSelectWithXAndCast class used for v16i8 and v8i16 types that needs casting support.
xxeval instruction supports VSRC as input operands which inturn supports only v4i32,v4f32,v2f64,v2i64 types.

tonykuttai marked this pull request as ready for review May 28, 2025 10:46

llvmbot added the backend:PowerPC label May 28, 2025

lei137 requested review from diggerlin, redstar, amy-kwan, lei137 and RolandF77 May 28, 2025 15:27

lei137 reviewed May 28, 2025

View reviewed changes

llvm/lib/Target/PowerPC/PPCInstrP10.td Outdated Show resolved Hide resolved

tonykuttai force-pushed the tvarghese/xxeval branch 5 times, most recently from fcebb02 to 604d89b Compare June 3, 2025 16:23

tonykuttai requested a review from lei137 June 3, 2025 16:25

tonykuttai force-pushed the tvarghese/xxeval branch from 604d89b to 077fb2e Compare June 3, 2025 16:55

tonykuttai mentioned this pull request Jun 6, 2025

Request Commit Access For tonykuttai #143064

Closed

lei137 requested changes Jun 6, 2025

View reviewed changes

llvm/lib/Target/PowerPC/PPCInstrP10.td Show resolved Hide resolved

llvm/lib/Target/PowerPC/PPCInstrP10.td Outdated Show resolved Hide resolved

llvm/lib/Target/PowerPC/PPCInstrP10.td Outdated Show resolved Hide resolved

tonykuttai force-pushed the tvarghese/xxeval branch from d0d1374 to f20275c Compare June 11, 2025 08:45

tonykuttai requested a review from lei137 June 11, 2025 08:56

tonykuttai mentioned this pull request Jun 11, 2025

[NFC][PowerPC] Rename xxevalPattern to adhere to naming convention. #143675

Merged

tonykuttai changed the title ~~[PowerPC][XXEVAL] Exploit xxeval instruction for cases of the ternary(A,X, and(B,C)), ternary(A,X,B), ternary(A,X,C), ternary(A,X,xor(B,C)) forms.~~ [PowerPC] Exploit xxeval instruction for ternary patterns - part 1 Jun 12, 2025

tonykuttai force-pushed the tvarghese/xxeval branch 2 times, most recently from 9921b8c to 3ebc442 Compare June 12, 2025 05:29

amy-kwan reviewed Jun 20, 2025

View reviewed changes

tonykuttai requested a review from amy-kwan June 20, 2025 19:00

tonykuttai changed the title ~~[PowerPC] Exploit xxeval instruction for ternary patterns - part 1~~ [PowerPC] Exploit xxeval instruction for ternary patterns for v4i32 type- part 1 Jun 20, 2025

tonykuttai changed the title ~~[PowerPC] Exploit xxeval instruction for ternary patterns for v4i32 type- part 1~~ [PowerPC] Exploit xxeval instruction for ternary patterns of v4i32 operands- part 1 Jun 20, 2025

tonykuttai force-pushed the tvarghese/xxeval branch 2 times, most recently from 2661e08 to 29e456e Compare June 23, 2025 16:20

RolandF77 reviewed Jun 24, 2025

View reviewed changes

tonykuttai requested a review from RolandF77 June 25, 2025 03:39

tonykuttai changed the title ~~[PowerPC] Exploit xxeval instruction for ternary patterns of v4i32 operands- part 1~~ [PowerPC] Exploit xxeval instruction for ternary patterns - part 1 Jun 28, 2025

tonykuttai force-pushed the tvarghese/xxeval branch from e3ab08c to f6658cb Compare July 4, 2025 09:02

[PowerPC][XXEVAL] Exploit xxeval instruction for cases of the ternary…

f6658cb

…(A,X, and(B,C)), ternary(A,X,B), ternary(A,X,C), ternary(A,X,xor(B,C)) forms.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[PowerPC] Exploit xxeval instruction for ternary patterns - part 1 #141733

[PowerPC] Exploit xxeval instruction for ternary patterns - part 1 #141733

Uh oh!

tonykuttai commented May 28, 2025 •

edited

Loading

Uh oh!

llvmbot commented May 28, 2025

Description

Uh oh!

tonykuttai commented May 28, 2025

Uh oh!

Uh oh!

tonykuttai commented Jun 3, 2025 •

edited

Loading

Uh oh!

lei137 left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

tonykuttai commented Jun 11, 2025

Uh oh!

tonykuttai commented Jun 12, 2025

Uh oh!

amy-kwan left a comment

Uh oh!

tonykuttai commented Jun 20, 2025 •

edited

Loading

Uh oh!

RolandF77 Jun 24, 2025

Uh oh!

tonykuttai Jun 25, 2025 •

edited

Loading

Uh oh!

tonykuttai commented Jul 8, 2025 •

edited

Loading

Uh oh!

Uh oh!

[PowerPC] Exploit xxeval instruction for ternary patterns - part 1 #141733

Are you sure you want to change the base?

[PowerPC] Exploit xxeval instruction for ternary patterns - part 1 #141733

Uh oh!

Conversation

tonykuttai commented May 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Uh oh!

llvmbot commented May 28, 2025

Description

Uh oh!

tonykuttai commented May 28, 2025

Uh oh!

Uh oh!

tonykuttai commented Jun 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lei137 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

tonykuttai commented Jun 11, 2025

Uh oh!

tonykuttai commented Jun 12, 2025

Uh oh!

amy-kwan left a comment

Choose a reason for hiding this comment

Uh oh!

tonykuttai commented Jun 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

RolandF77 Jun 24, 2025

Choose a reason for hiding this comment

Uh oh!

tonykuttai Jun 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tonykuttai commented Jul 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

tonykuttai commented May 28, 2025 •

edited

Loading

tonykuttai commented Jun 3, 2025 •

edited

Loading

tonykuttai commented Jun 20, 2025 •

edited

Loading

tonykuttai Jun 25, 2025 •

edited

Loading

tonykuttai commented Jul 8, 2025 •

edited

Loading