Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GlobalIsel][AArch64] MVP for vectorized selects. #76104

Closed
wants to merge 5 commits into from

Conversation

tschuett
Copy link
Member

Try to select Selects where the condition is a vector.

Inspired by arm64-vselect.ll and extensions to it.

Note that there are not tests for all legal types.

Try to select Selects where the condition is a vector.

Inspired by arm64-vselect.ll and extensions to it.

Note that there are not tests for all legal types.
@llvmbot
Copy link
Collaborator

llvmbot commented Dec 20, 2023

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-aarch64

Author: Thorsten Schütt (tschuett)

Changes

Try to select Selects where the condition is a vector.

Inspired by arm64-vselect.ll and extensions to it.

Note that there are not tests for all legal types.


Full diff: https://github.com/llvm/llvm-project/pull/76104.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp (+34)
  • (modified) llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp (+7-1)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/select-select.mir (+70-21)
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
index a4ace6cce46342..6fbd850395fda6 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -507,6 +507,9 @@ class AArch64InstructionSelector : public InstructionSelector {
   /// zero extended.
   bool isDef32(const MachineInstr &MI) const;
 
+  /// Select selects where the condition is a vector.
+  bool selectVectorSelect(GSelect &Sel, MachineIRBuilder &MIB) const;
+
   const AArch64TargetMachine &TM;
   const AArch64Subtarget &STI;
   const AArch64InstrInfo &TII;
@@ -1154,6 +1157,34 @@ static unsigned selectFPConvOpc(unsigned GenericOpc, LLT DstTy, LLT SrcTy) {
   return GenericOpc;
 }
 
+bool AArch64InstructionSelector::selectVectorSelect(
+    GSelect &Sel, MachineIRBuilder &MIB) const {
+  MachineRegisterInfo *MRI = MIB.getMRI();
+  Register Dst = Sel.getReg(0);
+  Register Cond = Sel.getCondReg();
+  Register True = Sel.getTrueReg();
+  Register False = Sel.getFalseReg();
+  LLT DestTy = MRI->getType(Dst);
+  LLT CondTy = MRI->getType(Cond);
+
+  // There are no scalable vectors yet.
+  if (CondTy.isScalable())
+    return false;
+
+  // We would need to sext the Cond to the Dest, i.e., sshll.
+  // It will fail for v2s64, v4s32, and v8s16.
+  if (CondTy != DestTy)
+    return false;
+
+  unsigned Opcode =
+      CondTy.getSizeInBits() == 128 ? AArch64::BSLv16i8 : AArch64::BSLv8i8;
+  auto SelMI = MIB.buildInstr(Opcode, {Dst}, {Cond, True, False});
+  constrainSelectedInstRegOperands(*SelMI, TII, TRI, RBI);
+
+  Sel.eraseFromParent();
+  return true;
+}
+
 MachineInstr *
 AArch64InstructionSelector::emitSelect(Register Dst, Register True,
                                        Register False, AArch64CC::CondCode CC,
@@ -3442,6 +3473,9 @@ bool AArch64InstructionSelector::select(MachineInstr &I) {
     const Register TReg = Sel.getTrueReg();
     const Register FReg = Sel.getFalseReg();
 
+    if (MRI.getType(CondReg).isVector())
+      return selectVectorSelect(Sel, MIB);
+
     if (tryOptSelect(Sel))
       return true;
 
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index 8b909f53c84460..0af18108c6710f 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -708,7 +708,13 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
   getActionDefinitionsBuilder(G_BRINDIRECT).legalFor({p0});
 
   getActionDefinitionsBuilder(G_SELECT)
-      .legalFor({{s32, s32}, {s64, s32}, {p0, s32}})
+      .legalFor({{s32, s32},
+                 {s64, s32},
+                 {p0, s32},
+                 {v2s32, v2s32},
+                 {v4s16, v4s16},
+                 {v8s8, v8s8},
+                 {v16s8, v16s8}})
       .widenScalarToNextPow2(0)
       .clampScalar(0, s32, s64)
       .clampScalar(1, s32, s32)
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-select.mir b/llvm/test/CodeGen/AArch64/GlobalISel/select-select.mir
index e26c1431350979..9602dd7ef0a7fd 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-select.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-select.mir
@@ -88,6 +88,7 @@ body:             |
     RET_ReallyLR implicit $w0
 ...
 ---
+# G_SELECT cc, 0, 1 -> CSINC zreg, zreg, cc
 name:            csinc_t_0_f_1
 legalized:       true
 regBankSelected: true
@@ -95,7 +96,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1
-    ; G_SELECT cc, 0, 1 -> CSINC zreg, zreg, cc
 
     ; CHECK-LABEL: name: csinc_t_0_f_1
     ; CHECK: liveins: $w0, $w1
@@ -116,6 +116,7 @@ body:             |
     RET_ReallyLR implicit $w0
 ...
 ---
+# G_SELECT cc 0, -1 -> CSINV zreg, zreg cc
 name:            csinv_t_0_f_neg_1
 legalized:       true
 regBankSelected: true
@@ -123,7 +124,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1
-    ; G_SELECT cc 0, -1 -> CSINV zreg, zreg cc
 
     ; CHECK-LABEL: name: csinv_t_0_f_neg_1
     ; CHECK: liveins: $w0, $w1
@@ -144,6 +144,7 @@ body:             |
     RET_ReallyLR implicit $w0
 ...
 ---
+# G_SELECT cc, 1, f -> CSINC f, zreg, inv_cc
 name:            csinc_t_1
 legalized:       true
 regBankSelected: true
@@ -151,7 +152,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1, $w2
-    ; G_SELECT cc, 1, f -> CSINC f, zreg, inv_cc
 
     ; CHECK-LABEL: name: csinc_t_1
     ; CHECK: liveins: $w0, $w1, $w2
@@ -180,7 +180,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1, $w2
-    ; G_SELECT cc, -1, f -> CSINV f, zreg, inv_cc
 
     ; CHECK-LABEL: name: csinv_t_neg_1
     ; CHECK: liveins: $w0, $w1, $w2
@@ -202,6 +201,7 @@ body:             |
     RET_ReallyLR implicit $w0
 ...
 ---
+# G_SELECT cc, t, 1 -> CSINC t, zreg, cc
 name:            csinc_f_1
 legalized:       true
 regBankSelected: true
@@ -209,7 +209,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1, $w2
-    ; G_SELECT cc, t, 1 -> CSINC t, zreg, cc
 
     ; CHECK-LABEL: name: csinc_f_1
     ; CHECK: liveins: $w0, $w1, $w2
@@ -231,6 +230,7 @@ body:             |
     RET_ReallyLR implicit $w0
 ...
 ---
+# G_SELECT cc, t, -1 -> CSINC t, zreg, cc
 name:            csinc_f_neg_1
 legalized:       true
 regBankSelected: true
@@ -238,7 +238,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1, $w2
-    ; G_SELECT cc, t, -1 -> CSINC t, zreg, cc
 
     ; CHECK-LABEL: name: csinc_f_neg_1
     ; CHECK: liveins: $w0, $w1, $w2
@@ -260,6 +259,7 @@ body:             |
     RET_ReallyLR implicit $w0
 ...
 ---
+# G_SELECT cc, 1, f -> CSINC f, zreg, inv_cc
 name:            csinc_t_1_no_cmp
 legalized:       true
 regBankSelected: true
@@ -267,7 +267,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1
-    ; G_SELECT cc, 1, f -> CSINC f, zreg, inv_cc
 
     ; CHECK-LABEL: name: csinc_t_1_no_cmp
     ; CHECK: liveins: $w0, $w1
@@ -287,6 +286,7 @@ body:             |
 
 ...
 ---
+# G_SELECT cc, t, 1 -> CSINC t, zreg, cc
 name:            csinc_f_1_no_cmp
 legalized:       true
 regBankSelected: true
@@ -294,7 +294,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1
-    ; G_SELECT cc, t, 1 -> CSINC t, zreg, cc
 
     ; CHECK-LABEL: name: csinc_f_1_no_cmp
     ; CHECK: liveins: $w0, $w1
@@ -314,6 +313,7 @@ body:             |
 
 ...
 ---
+# G_SELECT cc, 1, f -> CSINC f, zreg, inv_cc
 name:            csinc_t_1_no_cmp_s64
 legalized:       true
 regBankSelected: true
@@ -321,7 +321,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $x0, $x1
-    ; G_SELECT cc, 1, f -> CSINC f, zreg, inv_cc
 
     ; CHECK-LABEL: name: csinc_t_1_no_cmp_s64
     ; CHECK: liveins: $x0, $x1
@@ -343,6 +342,7 @@ body:             |
 
 ...
 ---
+# G_SELECT cc, true, (G_SUB 0, x) -> CSNEG true, x, cc
 name:            csneg_s32
 legalized:       true
 regBankSelected: true
@@ -350,7 +350,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1, $w2
-    ; G_SELECT cc, true, (G_SUB 0, x) -> CSNEG true, x, cc
 
     ; CHECK-LABEL: name: csneg_s32
     ; CHECK: liveins: $w0, $w1, $w2
@@ -373,6 +372,7 @@ body:             |
 
 ...
 ---
+# G_SELECT cc, (G_SUB 0, %x), %false -> CSNEG %x, %false, inv_cc
 name:            csneg_inverted_cc
 legalized:       true
 regBankSelected: true
@@ -380,7 +380,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1, $w2
-    ; G_SELECT cc, (G_SUB 0, %x), %false -> CSNEG %x, %false, inv_cc
 
     ; CHECK-LABEL: name: csneg_inverted_cc
     ; CHECK: liveins: $w0, $w1, $w2
@@ -403,6 +402,7 @@ body:             |
 
 ...
 ---
+# G_SELECT cc, true, (G_SUB 0, x) -> CSNEG true, x, cc
 name:            csneg_s64
 legalized:       true
 regBankSelected: true
@@ -410,7 +410,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $x0, $x1, $x2
-    ; G_SELECT cc, true, (G_SUB 0, x) -> CSNEG true, x, cc
 
     ; CHECK-LABEL: name: csneg_s64
     ; CHECK: liveins: $x0, $x1, $x2
@@ -434,6 +433,8 @@ body:             |
     RET_ReallyLR implicit $x0
 ...
 ---
+# We should prefer eliminating the G_SUB over eliminating the constant true
+#  value.
 name:            csneg_with_true_cst
 legalized:       true
 regBankSelected: true
@@ -441,8 +442,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1, $w2
-    ; We should prefer eliminating the G_SUB over eliminating the constant true
-    ; value.
 
     ; CHECK-LABEL: name: csneg_with_true_cst
     ; CHECK: liveins: $w0, $w1, $w2
@@ -465,6 +464,7 @@ body:             |
     RET_ReallyLR implicit $w0
 ...
 ---
+# G_SELECT cc, true, (G_XOR x, -1) -> CSINV true, x, cc
 name:            csinv_s32
 legalized:       true
 regBankSelected: true
@@ -472,7 +472,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1, $w2
-    ; G_SELECT cc, true, (G_XOR x, -1) -> CSINV true, x, cc
 
     ; CHECK-LABEL: name: csinv_s32
     ; CHECK: liveins: $w0, $w1, $w2
@@ -495,6 +494,7 @@ body:             |
 
 ...
 ---
+# G_SELECT cc, (G_XOR x, -1), %false -> CSINV %x, %false, inv_cc
 name:            csinv_inverted_cc
 legalized:       true
 regBankSelected: true
@@ -502,7 +502,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1, $w2
-    ; G_SELECT cc, (G_XOR x, -1), %false -> CSINV %x, %false, inv_cc
 
     ; CHECK-LABEL: name: csinv_inverted_cc
     ; CHECK: liveins: $w0, $w1, $w2
@@ -525,6 +524,7 @@ body:             |
 
 ...
 ---
+# G_SELECT cc, true, (G_XOR x, -1) -> CSINV true, x, cc
 name:            csinv_s64
 legalized:       true
 regBankSelected: true
@@ -532,7 +532,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $x0, $x1, $x2
-    ; G_SELECT cc, true, (G_XOR x, -1) -> CSINV true, x, cc
 
     ; CHECK-LABEL: name: csinv_s64
     ; CHECK: liveins: $x0, $x1, $x2
@@ -557,6 +556,7 @@ body:             |
 
 ...
 ---
+# zext(s32 -1) != s64 -1, so we can't fold it away.
 name:            xor_not_negative_one
 legalized:       true
 regBankSelected: true
@@ -564,7 +564,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $x0, $x1, $x2
-    ; zext(s32 -1) != s64 -1, so we can't fold it away.
 
     ; CHECK-LABEL: name: xor_not_negative_one
     ; CHECK: liveins: $x0, $x1, $x2
@@ -594,6 +593,7 @@ body:             |
 
 ...
 ---
+# G_SELECT cc, %true, (G_ADD %x, 1) -> CSINC %true, %x, cc
 name:            csinc_s32
 legalized:       true
 regBankSelected: true
@@ -601,7 +601,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1, $w2
-    ; G_SELECT cc, %true, (G_ADD %x, 1) -> CSINC %true, %x, cc
     ; CHECK-LABEL: name: csinc_s32
     ; CHECK: liveins: $w0, $w1, $w2
     ; CHECK-NEXT: {{  $}}
@@ -623,6 +622,7 @@ body:             |
 
 ...
 ---
+# G_SELECT cc, (G_ADD %x, 1), %false -> CSINC %x, %false, inv_cc
 name:            csinc_s32_inverted_cc
 legalized:       true
 regBankSelected: true
@@ -630,7 +630,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $w0, $w1, $w2
-    ; G_SELECT cc, (G_ADD %x, 1), %false -> CSINC %x, %false, inv_cc
     ; CHECK-LABEL: name: csinc_s32_inverted_cc
     ; CHECK: liveins: $w0, $w1, $w2
     ; CHECK-NEXT: {{  $}}
@@ -652,6 +651,7 @@ body:             |
 
 ...
 ---
+# G_SELECT cc, %true, (G_PTR_ADD %x, 1) -> CSINC %true, %x, cc
 name:            csinc_ptr_add
 legalized:       true
 regBankSelected: true
@@ -659,7 +659,6 @@ tracksRegLiveness: true
 body:             |
   bb.0:
     liveins: $x0, $x1, $x2
-    ; G_SELECT cc, %true, (G_PTR_ADD %x, 1) -> CSINC %true, %x, cc
 
     ; CHECK-LABEL: name: csinc_ptr_add
     ; CHECK: liveins: $x0, $x1, $x2
@@ -713,3 +712,53 @@ body:             |
     %select:gpr(s32) = G_SELECT %reg0, %xor, %sub
     $w0 = COPY %select(s32)
     RET_ReallyLR implicit $w0
+...
+---
+name:            select_vectorized_conditon_v2s32
+legalized:       true
+regBankSelected: true
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    liveins: $w0, $w1, $w2
+    ; CHECK-LABEL: name: select_vectorized_conditon_v2s32
+    ; CHECK: liveins: $w0, $w1, $w2
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: %reg0:gpr32all = COPY $w0
+    ; CHECK-NEXT: %reg1:gpr32 = COPY $w1
+    ; CHECK-NEXT: %reg2:gpr32all = COPY $w2
+    ; CHECK-NEXT: %reg3:gpr32 = COPY $w0
+    ; CHECK-NEXT: %reg4:gpr32all = COPY $w1
+    ; CHECK-NEXT: %reg5:gpr32 = COPY $w2
+    ; CHECK-NEXT: [[DEF:%[0-9]+]]:fpr128 = IMPLICIT_DEF
+    ; CHECK-NEXT: [[INSERT_SUBREG:%[0-9]+]]:fpr128 = INSERT_SUBREG [[DEF]], %reg0, %subreg.ssub
+    ; CHECK-NEXT: [[INSvi32gpr:%[0-9]+]]:fpr128 = INSvi32gpr [[INSERT_SUBREG]], 1, %reg1
+    ; CHECK-NEXT: %true:fpr64 = COPY [[INSvi32gpr]].dsub
+    ; CHECK-NEXT: [[DEF1:%[0-9]+]]:fpr128 = IMPLICIT_DEF
+    ; CHECK-NEXT: [[INSERT_SUBREG1:%[0-9]+]]:fpr128 = INSERT_SUBREG [[DEF1]], %reg2, %subreg.ssub
+    ; CHECK-NEXT: [[INSvi32gpr1:%[0-9]+]]:fpr128 = INSvi32gpr [[INSERT_SUBREG1]], 1, %reg3
+    ; CHECK-NEXT: %false:fpr64 = COPY [[INSvi32gpr1]].dsub
+    ; CHECK-NEXT: [[DEF2:%[0-9]+]]:fpr128 = IMPLICIT_DEF
+    ; CHECK-NEXT: [[INSERT_SUBREG2:%[0-9]+]]:fpr128 = INSERT_SUBREG [[DEF2]], %reg4, %subreg.ssub
+    ; CHECK-NEXT: [[INSvi32gpr2:%[0-9]+]]:fpr128 = INSvi32gpr [[INSERT_SUBREG2]], 1, %reg5
+    ; CHECK-NEXT: %cond:fpr64 = COPY [[INSvi32gpr2]].dsub
+    ; CHECK-NEXT: %select:fpr64 = BSLv8i8 %cond, %true, %false
+    ; CHECK-NEXT: [[DEF3:%[0-9]+]]:fpr128 = IMPLICIT_DEF
+    ; CHECK-NEXT: [[INSERT_SUBREG3:%[0-9]+]]:fpr128 = INSERT_SUBREG [[DEF3]], %select, %subreg.dsub
+    ; CHECK-NEXT: %extract:fpr32 = DUPi32 [[INSERT_SUBREG3]], 1
+    ; CHECK-NEXT: $w0 = COPY %extract
+    ; CHECK-NEXT: RET_ReallyLR implicit $w0
+    %idx:gpr(s64) = G_CONSTANT i64 1
+    %reg0:gpr(s32) = COPY $w0
+    %reg1:gpr(s32) = COPY $w1
+    %reg2:gpr(s32) = COPY $w2
+    %reg3:gpr(s32) = COPY $w0
+    %reg4:gpr(s32) = COPY $w1
+    %reg5:gpr(s32) = COPY $w2
+    %true:fpr(<2 x s32>) = G_BUILD_VECTOR %reg0(s32), %reg1(s32)
+    %false:fpr(<2 x s32>) = G_BUILD_VECTOR %reg2(s32), %reg3(s32)
+    %cond:fpr(<2 x s32>) = G_BUILD_VECTOR %reg4(s32), %reg5(s32)
+    %select:fpr(<2 x s32>) = G_SELECT %cond, %true, %false
+    %extract:fpr(s32) = G_EXTRACT_VECTOR_ELT %select:fpr(<2 x s32>), %idx:gpr(s64)
+    $w0 = COPY %extract(s32)
+    RET_ReallyLR implicit $w0

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for vectors we should be expanding the select into or+and+not during legalization, then selecting them with the existing bsp patterns. I think that's how it currently works, is it not working well in practice?

@tschuett
Copy link
Member Author

I thought that legalizing them and manually selecting them generates better code. They are kept at higher abstraction in the pass pipeline. E.g. the generic combiner sees them as selects and not as binary operations.

; CHECK-NEXT: [[AND1:%[0-9]+]]:_(<2 x s32>) = G_AND [[COPY]], [[XOR]]
; CHECK-NEXT: [[OR:%[0-9]+]]:_(<2 x s32>) = G_OR [[AND]], [[AND1]]
; CHECK-NEXT: $d0 = COPY [[OR]](<2 x s32>)
; CHECK-NEXT: [[SELECT:%[0-9]+]]:_(<2 x s32>) = G_SELECT [[SEXT_INREG]](<2 x s32>), [[COPY1]], [[COPY]]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The combiner now sees a select and not the binary operators and a build vector.

@tschuett
Copy link
Member Author

RegbankSelect has an incorrect optimization for vectorized selects that do not exist today.

// If we're taking in vectors, we have no choice but to put everything on

@tschuett
Copy link
Member Author

The select to floating point minimax fold can never hit for vectorized selects. They do not exist today.

// Match: select (fcmp cond x, y) x, y

@tschuett tschuett closed this Jan 2, 2024
@tschuett tschuett deleted the gisel-vselect branch January 2, 2024 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants