[AArch64][GlobalISel] Take abs scalar codegen closer to SDAG #84886
Conversation
@llvm/pr-subscribers-backend-aarch64 @llvm/pr-subscribers-llvm-globalisel

Author: Madhur Amilkanthwar (madhur13490)

Changes: This patch improves codegen for the scalar (< 128 bits) version of the llvm.abs intrinsic. Codegen with GISel for > 128-bit types is not very good, so those are left to the existing lowering.

Full diff: https://github.com/llvm/llvm-project/pull/84886.diff — 3 Files Affected:
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index 36adada2796531..6aa43f8b14a782 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -1021,6 +1021,12 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
.clampNumElements(0, v2s32, v4s32)
.clampNumElements(0, v2s64, v2s64)
.moreElementsToNextPow2(0)
+ // Do SMax based lowering for < 128 bits.
+ .customIf(
+ [=](const LegalityQuery &Q) {
+ LLT SrcTy = Q.Types[0];
+ return SrcTy.isScalar() && SrcTy.getSizeInBits() < 128;
+ })
.lower();
// For fadd reductions we have pairwise operations available. We treat the
@@ -1262,6 +1268,8 @@ bool AArch64LegalizerInfo::legalizeCustom(
return legalizeDynStackAlloc(MI, Helper);
case TargetOpcode::G_PREFETCH:
return legalizePrefetch(MI, Helper);
+ case TargetOpcode::G_ABS:
+ return Helper.lowerAbsToMaxNeg(MI);
}
llvm_unreachable("expected switch to return");
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-abs.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-abs.mir
index 3123e304116fe5..cd27d4498e0ef9 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-abs.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-abs.mir
@@ -8,11 +8,12 @@ body: |
bb.0:
; CHECK-LABEL: name: abs_s32
; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $w0
- ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 31
- ; CHECK-NEXT: [[ASHR:%[0-9]+]]:_(s32) = G_ASHR [[COPY]], [[C]](s64)
- ; CHECK-NEXT: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[COPY]], [[ASHR]]
- ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(s32) = G_XOR [[ADD]], [[ASHR]]
- ; CHECK-NEXT: $w0 = COPY [[XOR]](s32)
+ ; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+ ; CHECK-NEXT: [[SUB:%[0-9]+]]:_(s32) = G_SUB [[C]], [[COPY]]
+ ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(s32) = G_ICMP intpred(sgt), [[COPY]](s32), [[SUB]]
+ ; CHECK-NEXT: [[SELECT:%[0-9]+]]:_(s32) = G_SELECT [[ICMP]](s32), [[COPY]], [[SUB]]
+ ; CHECK-NEXT: $w0 = COPY [[SELECT]](s32)
+ ;
; CHECK-CSSC-LABEL: name: abs_s32
; CHECK-CSSC: [[COPY:%[0-9]+]]:_(s32) = COPY $w0
; CHECK-CSSC-NEXT: [[ABS:%[0-9]+]]:_(s32) = G_ABS [[COPY]]
@@ -28,11 +29,12 @@ body: |
bb.0:
; CHECK-LABEL: name: abs_s64
; CHECK: [[COPY:%[0-9]+]]:_(s64) = COPY $x0
- ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 63
- ; CHECK-NEXT: [[ASHR:%[0-9]+]]:_(s64) = G_ASHR [[COPY]], [[C]](s64)
- ; CHECK-NEXT: [[ADD:%[0-9]+]]:_(s64) = G_ADD [[COPY]], [[ASHR]]
- ; CHECK-NEXT: [[XOR:%[0-9]+]]:_(s64) = G_XOR [[ADD]], [[ASHR]]
- ; CHECK-NEXT: $x0 = COPY [[XOR]](s64)
+ ; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 0
+ ; CHECK-NEXT: [[SUB:%[0-9]+]]:_(s64) = G_SUB [[C]], [[COPY]]
+ ; CHECK-NEXT: [[ICMP:%[0-9]+]]:_(s32) = G_ICMP intpred(sgt), [[COPY]](s64), [[SUB]]
+ ; CHECK-NEXT: [[SELECT:%[0-9]+]]:_(s64) = G_SELECT [[ICMP]](s32), [[COPY]], [[SUB]]
+ ; CHECK-NEXT: $x0 = COPY [[SELECT]](s64)
+ ;
; CHECK-CSSC-LABEL: name: abs_s64
; CHECK-CSSC: [[COPY:%[0-9]+]]:_(s64) = COPY $x0
; CHECK-CSSC-NEXT: [[ABS:%[0-9]+]]:_(s64) = G_ABS [[COPY]]
@@ -55,6 +57,7 @@ body: |
; CHECK-NEXT: [[ABS:%[0-9]+]]:_(<4 x s16>) = G_ABS [[COPY]]
; CHECK-NEXT: $d0 = COPY [[ABS]](<4 x s16>)
; CHECK-NEXT: RET_ReallyLR implicit $d0
+ ;
; CHECK-CSSC-LABEL: name: abs_v4s16
; CHECK-CSSC: liveins: $d0
; CHECK-CSSC-NEXT: {{ $}}
@@ -82,6 +85,7 @@ body: |
; CHECK-NEXT: [[ABS:%[0-9]+]]:_(<8 x s16>) = G_ABS [[COPY]]
; CHECK-NEXT: $q0 = COPY [[ABS]](<8 x s16>)
; CHECK-NEXT: RET_ReallyLR implicit $q0
+ ;
; CHECK-CSSC-LABEL: name: abs_v8s16
; CHECK-CSSC: liveins: $q0
; CHECK-CSSC-NEXT: {{ $}}
@@ -109,6 +113,7 @@ body: |
; CHECK-NEXT: [[ABS:%[0-9]+]]:_(<2 x s32>) = G_ABS [[COPY]]
; CHECK-NEXT: $d0 = COPY [[ABS]](<2 x s32>)
; CHECK-NEXT: RET_ReallyLR implicit $d0
+ ;
; CHECK-CSSC-LABEL: name: abs_v2s32
; CHECK-CSSC: liveins: $d0
; CHECK-CSSC-NEXT: {{ $}}
@@ -136,6 +141,7 @@ body: |
; CHECK-NEXT: [[ABS:%[0-9]+]]:_(<4 x s32>) = G_ABS [[COPY]]
; CHECK-NEXT: $q0 = COPY [[ABS]](<4 x s32>)
; CHECK-NEXT: RET_ReallyLR implicit $q0
+ ;
; CHECK-CSSC-LABEL: name: abs_v4s32
; CHECK-CSSC: liveins: $q0
; CHECK-CSSC-NEXT: {{ $}}
@@ -163,6 +169,7 @@ body: |
; CHECK-NEXT: [[ABS:%[0-9]+]]:_(<8 x s8>) = G_ABS [[COPY]]
; CHECK-NEXT: $d0 = COPY [[ABS]](<8 x s8>)
; CHECK-NEXT: RET_ReallyLR implicit $d0
+ ;
; CHECK-CSSC-LABEL: name: abs_v4s8
; CHECK-CSSC: liveins: $d0
; CHECK-CSSC-NEXT: {{ $}}
@@ -190,6 +197,7 @@ body: |
; CHECK-NEXT: [[ABS:%[0-9]+]]:_(<16 x s8>) = G_ABS [[COPY]]
; CHECK-NEXT: $q0 = COPY [[ABS]](<16 x s8>)
; CHECK-NEXT: RET_ReallyLR implicit $q0
+ ;
; CHECK-CSSC-LABEL: name: abs_v16s8
; CHECK-CSSC: liveins: $q0
; CHECK-CSSC-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AArch64/abs.ll b/llvm/test/CodeGen/AArch64/abs.ll
index e00f70b94e3b42..bfedd2bce8c86c 100644
--- a/llvm/test/CodeGen/AArch64/abs.ll
+++ b/llvm/test/CodeGen/AArch64/abs.ll
@@ -15,9 +15,9 @@ define i8 @abs_i8(i8 %a){
; CHECK-GI-LABEL: abs_i8:
; CHECK-GI: // %bb.0: // %entry
; CHECK-GI-NEXT: sxtb w8, w0
-; CHECK-GI-NEXT: asr w8, w8, #7
-; CHECK-GI-NEXT: add w9, w0, w8
-; CHECK-GI-NEXT: eor w0, w9, w8
+; CHECK-GI-NEXT: neg w9, w0
+; CHECK-GI-NEXT: cmp w8, w9, sxtb
+; CHECK-GI-NEXT: cneg w0, w0, le
; CHECK-GI-NEXT: ret
entry:
%res = call i8 @llvm.abs.i8(i8 %a, i1 0)
@@ -36,9 +36,9 @@ define i16 @abs_i16(i16 %a){
; CHECK-GI-LABEL: abs_i16:
; CHECK-GI: // %bb.0: // %entry
; CHECK-GI-NEXT: sxth w8, w0
-; CHECK-GI-NEXT: asr w8, w8, #15
-; CHECK-GI-NEXT: add w9, w0, w8
-; CHECK-GI-NEXT: eor w0, w9, w8
+; CHECK-GI-NEXT: neg w9, w0
+; CHECK-GI-NEXT: cmp w8, w9, sxth
+; CHECK-GI-NEXT: cneg w0, w0, le
; CHECK-GI-NEXT: ret
entry:
%res = call i16 @llvm.abs.i16(i16 %a, i1 0)
@@ -55,9 +55,9 @@ define i32 @abs_i32(i32 %a){
;
; CHECK-GI-LABEL: abs_i32:
; CHECK-GI: // %bb.0: // %entry
-; CHECK-GI-NEXT: asr w8, w0, #31
-; CHECK-GI-NEXT: add w9, w0, w8
-; CHECK-GI-NEXT: eor w0, w9, w8
+; CHECK-GI-NEXT: neg w8, w0
+; CHECK-GI-NEXT: cmp w0, w8
+; CHECK-GI-NEXT: cneg w0, w0, le
; CHECK-GI-NEXT: ret
entry:
%res = call i32 @llvm.abs.i32(i32 %a, i1 0)
@@ -74,9 +74,9 @@ define i64 @abs_i64(i64 %a){
;
; CHECK-GI-LABEL: abs_i64:
; CHECK-GI: // %bb.0: // %entry
-; CHECK-GI-NEXT: asr x8, x0, #63
-; CHECK-GI-NEXT: add x9, x0, x8
-; CHECK-GI-NEXT: eor x0, x9, x8
+; CHECK-GI-NEXT: neg x8, x0
+; CHECK-GI-NEXT: cmp x0, x8
+; CHECK-GI-NEXT: cneg x0, x0, le
; CHECK-GI-NEXT: ret
entry:
%res = call i64 @llvm.abs.i64(i64 %a, i1 0)
@@ -248,9 +248,9 @@ define <1 x i32> @abs_v1i32(<1 x i32> %a){
; CHECK-GI-LABEL: abs_v1i32:
; CHECK-GI: // %bb.0: // %entry
; CHECK-GI-NEXT: fmov w8, s0
-; CHECK-GI-NEXT: asr w9, w8, #31
-; CHECK-GI-NEXT: add w8, w8, w9
-; CHECK-GI-NEXT: eor w8, w8, w9
+; CHECK-GI-NEXT: neg w9, w8
+; CHECK-GI-NEXT: cmp w8, w9
+; CHECK-GI-NEXT: cneg w8, w8, le
; CHECK-GI-NEXT: fmov s0, w8
; CHECK-GI-NEXT: // kill: def $d0 killed $d0 killed $q0
; CHECK-GI-NEXT: ret
✅ With the latest revision this PR passed the C/C++ code formatter.
Force-pushed from 7ec0ad3 to d1e00f3.
@@ -1021,6 +1021,11 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
      .clampNumElements(0, v2s32, v4s32)
      .clampNumElements(0, v2s64, v2s64)
      .moreElementsToNextPow2(0)
+     // Do SMax based lowering for < 128 bits.
+     .customIf([=](const LegalityQuery &Q) {
+       LLT SrcTy = Q.Types[0];
Should the default implementation of lower make a better guess based on legal operations?
The customIf action should also go up below the legal cases
Do you have an mca proof that smax is better than xor?
I understand the question, but my intention with this patch is to take the codegen closer to what SDAG is doing. However, if you look at the i8 case, it is not quite there yet, and I am also not sure the final assembly is right. Ideally it should be just cmp+cneg (which is what SDAG is doing). If we get it closer to SDAG's codegen, the benefit would be visible: GlobalISel's current lowering produces 3 instructions whereas SDAG uses just cmp+cneg.
I agree with Matt,
Given we have a csneg instruction, do we just want
Force-pushed from d1e00f3 to 62e994b.
@davemgreen Yes, I was experimenting with this idea locally after this patch but was running into a G_SELECT legalization issue. I have fixed it now, and the new commit introduces cneg-based lowering. Please have a look. @arsenm @aemerson Given that there is a different approach now, do you still think
Yes. Currently the lower implementation just calls lowerAbsToAddXor. You can conditionally call your new implementation based on isLegalOrCustom for the type with the relevant opcodes used in the expansion |
Even if SMAX and lowerAbsToAddXor are legal, shouldn't we have the means to say for FOO cores, we prefer SMAX, and for BAR cores we prefer Xor?
You can always put the decision logic in custom lowering, but I think the default lower should try to guess a reasonable default.
Force-pushed from 62e994b to 3ab319f.
I have updated. Please have a look.
-  case G_ABS:
+  case G_ABS: {
+    LLT Ty = MRI.getType(MI.getOperand(0).getReg());
+    if (LI.isLegalOrCustom({G_ABS, Ty}))
Checking the G_ABS legality doesn't make sense in the G_ABS lowering implementation. You should need to check the operations in the expansion
I think the problem might be that there is no good way with the isLegal calls to check if there is a single instruction that does "select and conditional neg" at the same time, in the way csneg would do.
Maybe I can check for G_SELECT?
For reference, the DAG has this chain of implementations:
SDValue TargetLowering::expandABS(SDNode *N, SelectionDAG &DAG,
Checking for G_SELECT and G_ICMP leads to assertion when compiled for AMDGPU in abs.ll.
const T &llvm::ArrayRef<llvm::LLT>::operator[](size_t) const [T = llvm::LLT]: Assertion Index < Length && "Invalid index!" failed.
LLT Ty = MRI.getType(MI.getOperand(0).getReg());
dbgs() << LI.isLegalOrCustom({G_ICMP, Ty});
Leads to an assert for llvm.abs.ll test. This doesn't need my patch to trigger an assert.
@arsenm Thoughts?
If this is uncovering a latent issue, then I would like to go with legalizeCustom() and follow @davemgreen's suggestion. This also aligns with what RISCV is doing.
Checking for them how?
The specific instruction AArch64 wants to use is only indirectly related
Found that the assert is triggering for the llvm.abs.i64 case in AMDGPU/GlobalISel/llvm.abs.ll. Constraining the lowering to < 8 bytes has no assert. Do you know what might be causing this?
Don't know what that has to do with abs, but this isn't a well formed query for G_ICMP legality. G_ICMP has 2 type indexes
Hmm... I was looking at how G_SSUBSAT is using G_SMIN and followed the same approach.
Anyway, updated the branch to check both type indexes.
I think considering these are fairly AArch64 specific instructions, it might be best to just make this custom, in the same way we have it for SDAG.
@@ -515,6 +515,7 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
   getActionDefinitionsBuilder(G_ICMP)
       .legalFor({{s32, s32},
                  {s32, s64},
+                 {s64, s64},
I think this, and the select below, is marking an s64 condition code as legal, which I'm not sure is what we would want. We have the clampScalar(0, s32, s32) below.
Well, the s64 case doesn't work without this. Why don't we want this to be legal?
I believe this is the flags between the cmp+select. I'm not sure what the ramifications of making them i64 would be, but it might be better if we can keep them as i32 if we can, to avoid awkward cases like select(anyext(cmp(..)), ..).
The nodes should be re-visited until they are legal. It looks like G_SELECT isn't handling narrowScalar yet, probably because it hasn't come up before:
LegalizerHelper::LegalizeResult
LegalizerHelper::narrowScalarSelect(MachineInstr &MI, unsigned TypeIdx,
LLT NarrowTy) {
if (TypeIdx != 0)
return UnableToLegalize;
Hopefully if that is handled correctly for the icmp/sel, it will work better?
{s64, s64} is legal for G_SELECT, but then the index is clamped to s32; I don't understand why that is done. Taking inspiration from this, I made {s64, s64} legal for G_ICMP.
-  case G_ABS:
+  case G_ABS: {
+    LLT Ty = MRI.getType(MI.getOperand(0).getReg());
+    if (LI.isLegalOrCustom({G_ABS, Ty}))
I think the problem might be that there is no good way with the isLegal calls to check if there is a single instruction that does "select and conditional neg" at the same time, in the way csneg would do.
Do you mean just use
Force-pushed from 0c41e1a to 49d5931.
case G_ABS: {
  LLT Ty = MRI.getType(MI.getOperand(0).getReg());
  if (Ty.isScalar() && LI.isLegalOrCustom({G_ICMP, {Ty, Ty}}))
    return lowerAbsToCNeg(MI);
You probably should check for:
isLegalOrCustom({G_ICMP, {Ty, Ty}}) &&
isLegalOrCustom({G_SELECT, {Ty, Ty}}) &&
isLegalOrCustom({G_SUB, {Ty}}) &&
isLegalOrCustom({G_CONSTANT, {Ty}})
I personally don't know if there is a good check that captures what we would want through the isLegalOrCustom interface. I think many targets would want the ashr/AddXor form even if they have legal sub+sel. I would remove this code and just keep the AArch64LegalizerInfo::legalizeCustom method.
Reusing the same type for the compare result isn't right, but we don't have a great API for this. We kind of want the equivalent of getSetCCResultType. A more awkward solution would be to query with s1 and then look through the legalize actions until you find transforms on the original. Failing that, using i1 as the basis of the query is a more reasonable choice I think
My main point is that we should do something akin to the DAG. We check for the legality of icmp and then build a select without checking whether it is legal or not. The icmp check is insufficient.
@tschuett How would that handle @arsenm's concern?
I agree with @davemgreen's approach and will update the patch accordingly. I would not like to touch lower() as of now and will handle it in legalizeCustom(). Given that we don't have the right APIs, the right approach would be to handle this separately.
These are independent issues. We have no API to query the result type from compares. But building a Select without checking whether it is legal to do so may cause trouble.
But that would still reuse the same type (as @arsenm pointed out), wouldn't it?
"Reusing the same type for the compare result isn't right"
So how would it solve the core issue?
I understand the lack of an API, and thus I would like to avoid changing lower() in this patch.
Regardless of what AArch64 wants or does, we should still try to pick a preferred lowering in the default implementation of lower.
This patch improves codegen for the scalar (< 128 bits) version of the llvm.abs intrinsic by using the existing non-XOR based lowering. This takes the generated code closer to SDAG. Codegen with GISel for > 128-bit types is not very good with this method, so it is not used there.
Force-pushed from 49d5931 to 8e7134f.
Updated the patch as discussed above. It also handles the G_SELECT issue.
This LGTM. Thanks.
This patch improves codegen for the scalar (< 128 bits) version of the llvm.abs intrinsic by using the existing non-XOR based lowering. This takes the generated code closer to SDAG. Codegen with GISel for > 128-bit types is not very good with this method, so it is not used there.