[FastISel][X86] Use getTypeForExtReturn in GetReturnInfo. #80803

Merged
2 commits merged into llvm:main from pr/getTypeForExtReturn-fastisel on Feb 6, 2024

Conversation

@topperc (Collaborator) commented Feb 6, 2024

The comment and code here seem to match getTypeForExtReturn. The history shows that at the time this code was added, similar code existed in SelectionDAGBuilder. The SelectionDAGBuilder code has since been refactored into getTypeForExtReturn.

This patch makes FastISel match SelectionDAGBuilder.

The test changes are because X86 customizes getTypeForExtReturn, so we now only extend returns to i8.

Stumbled onto this difference by accident.
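A minimal sketch of the kind of return this affects (hypothetical IR, reduced from the zeroext test patterns below; the function name is made up):

```llvm
; A zeroext i16 return like the ones in the avx512 fast-isel tests below.
; Previously FastISel widened the value all the way to i32 (movzwl %ax, %eax);
; with X86's getTypeForExtReturn the requested type is only i8 or wider, so an
; i16 value needs no extra instruction and only an implicit register "kill"
; remains in the checked output.
define zeroext i16 @ret_mask(i16 %m) {
entry:
  ret i16 %m
}
```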

@llvmbot (Collaborator) commented Feb 6, 2024

@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-backend-x86

Author: Craig Topper (topperc)

Patch is 79.59 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/80803.diff

11 Files Affected:

  • (modified) llvm/lib/CodeGen/TargetLoweringBase.cpp (+2-9)
  • (modified) llvm/lib/Target/X86/X86FastISel.cpp (+7-6)
  • (modified) llvm/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll (+12-12)
  • (modified) llvm/test/CodeGen/X86/avx512bwvl-intrinsics-fast-isel.ll (+18-18)
  • (modified) llvm/test/CodeGen/X86/avx512vl-intrinsics-fast-isel.ll (+24-24)
  • (modified) llvm/test/CodeGen/X86/fast-isel-fcmp.ll (+35-120)
  • (modified) llvm/test/CodeGen/X86/fast-isel-ret-ext.ll (+1-1)
  • (modified) llvm/test/CodeGen/X86/keylocker-intrinsics-fast-isel.ll (-16)
  • (modified) llvm/test/CodeGen/X86/xaluo.ll (+18-49)
  • (modified) llvm/test/CodeGen/X86/xmulo.ll (+23-40)
  • (modified) llvm/test/DebugInfo/X86/convert-debugloc.ll (+1-1)
diff --git a/llvm/lib/CodeGen/TargetLoweringBase.cpp b/llvm/lib/CodeGen/TargetLoweringBase.cpp
index fe7bed760572b..16cd14ba3de9b 100644
--- a/llvm/lib/CodeGen/TargetLoweringBase.cpp
+++ b/llvm/lib/CodeGen/TargetLoweringBase.cpp
@@ -1738,15 +1738,8 @@ void llvm::GetReturnInfo(CallingConv::ID CC, Type *ReturnType,
     else if (attr.hasRetAttr(Attribute::ZExt))
       ExtendKind = ISD::ZERO_EXTEND;
 
-    // FIXME: C calling convention requires the return type to be promoted to
-    // at least 32-bit. But this is not necessary for non-C calling
-    // conventions. The frontend should mark functions whose return values
-    // require promoting with signext or zeroext attributes.
-    if (ExtendKind != ISD::ANY_EXTEND && VT.isInteger()) {
-      MVT MinVT = TLI.getRegisterType(MVT::i32);
-      if (VT.bitsLT(MinVT))
-        VT = MinVT;
-    }
+    if (ExtendKind != ISD::ANY_EXTEND && VT.isInteger())
+      VT = TLI.getTypeForExtReturn(ReturnType->getContext(), VT, ExtendKind);
 
     unsigned NumParts =
         TLI.getNumRegistersForCallingConv(ReturnType->getContext(), CC, VT);
diff --git a/llvm/lib/Target/X86/X86FastISel.cpp b/llvm/lib/Target/X86/X86FastISel.cpp
index 1ce1e6f6a5635..746fa432ade7a 100644
--- a/llvm/lib/Target/X86/X86FastISel.cpp
+++ b/llvm/lib/Target/X86/X86FastISel.cpp
@@ -1250,8 +1250,6 @@ bool X86FastISel::X86SelectRet(const Instruction *I) {
       if (!Outs[0].Flags.isZExt() && !Outs[0].Flags.isSExt())
         return false;
 
-      assert(DstVT == MVT::i32 && "X86 should always ext to i32");
-
       if (SrcVT == MVT::i1) {
         if (Outs[0].Flags.isSExt())
           return false;
@@ -1259,10 +1257,13 @@ bool X86FastISel::X86SelectRet(const Instruction *I) {
         SrcReg = fastEmitZExtFromI1(MVT::i8, SrcReg);
         SrcVT = MVT::i8;
       }
-      unsigned Op = Outs[0].Flags.isZExt() ? ISD::ZERO_EXTEND :
-                                             ISD::SIGN_EXTEND;
-      // TODO
-      SrcReg = fastEmit_r(SrcVT.getSimpleVT(), DstVT.getSimpleVT(), Op, SrcReg);
+      if (SrcVT != DstVT) {
+        unsigned Op =
+            Outs[0].Flags.isZExt() ? ISD::ZERO_EXTEND : ISD::SIGN_EXTEND;
+        // TODO
+        SrcReg =
+            fastEmit_r(SrcVT.getSimpleVT(), DstVT.getSimpleVT(), Op, SrcReg);
+      }
     }
 
     // Make the copy.
diff --git a/llvm/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll b/llvm/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
index 780abc9f9dc43..1ca870add95b5 100644
--- a/llvm/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
+++ b/llvm/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
@@ -21,7 +21,7 @@ define zeroext i16 @test_mm512_kunpackb(<8 x i64> %__A, <8 x i64> %__B, <8 x i64
 ; X86-NEXT:    kunpckbw %k0, %k1, %k1
 ; X86-NEXT:    vpcmpneqd 72(%ebp), %zmm3, %k0 {%k1}
 ; X86-NEXT:    kmovw %k0, %eax
-; X86-NEXT:    movzwl %ax, %eax
+; X86-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X86-NEXT:    movl %ebp, %esp
 ; X86-NEXT:    popl %ebp
 ; X86-NEXT:    .cfi_def_cfa %esp, 4
@@ -35,7 +35,7 @@ define zeroext i16 @test_mm512_kunpackb(<8 x i64> %__A, <8 x i64> %__B, <8 x i64
 ; X64-NEXT:    kunpckbw %k0, %k1, %k1
 ; X64-NEXT:    vpcmpneqd %zmm5, %zmm4, %k0 {%k1}
 ; X64-NEXT:    kmovw %k0, %eax
-; X64-NEXT:    movzwl %ax, %eax
+; X64-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X64-NEXT:    vzeroupper
 ; X64-NEXT:    retq
 entry:
@@ -367,7 +367,7 @@ define zeroext i16 @test_mm512_testn_epi32_mask(<8 x i64> %__A, <8 x i64> %__B)
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestnmd %zmm0, %zmm1, %k0
 ; CHECK-NEXT:    kmovw %k0, %eax
-; CHECK-NEXT:    movzwl %ax, %eax
+; CHECK-NEXT:    # kill: def $ax killed $ax killed $eax
 ; CHECK-NEXT:    vzeroupper
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
@@ -385,7 +385,7 @@ define zeroext i16 @test_mm512_mask_testn_epi32_mask(i16 zeroext %__U, <8 x i64>
 ; X86-NEXT:    kmovw %eax, %k1
 ; X86-NEXT:    vptestnmd %zmm0, %zmm1, %k0 {%k1}
 ; X86-NEXT:    kmovw %k0, %eax
-; X86-NEXT:    movzwl %ax, %eax
+; X86-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X86-NEXT:    vzeroupper
 ; X86-NEXT:    retl
 ;
@@ -394,7 +394,7 @@ define zeroext i16 @test_mm512_mask_testn_epi32_mask(i16 zeroext %__U, <8 x i64>
 ; X64-NEXT:    kmovw %edi, %k1
 ; X64-NEXT:    vptestnmd %zmm0, %zmm1, %k0 {%k1}
 ; X64-NEXT:    kmovw %k0, %eax
-; X64-NEXT:    movzwl %ax, %eax
+; X64-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X64-NEXT:    vzeroupper
 ; X64-NEXT:    retq
 entry:
@@ -412,7 +412,7 @@ define zeroext i8 @test_mm512_testn_epi64_mask(<8 x i64> %__A, <8 x i64> %__B) {
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestnmq %zmm0, %zmm1, %k0
 ; CHECK-NEXT:    kmovw %k0, %eax
-; CHECK-NEXT:    movzbl %al, %eax
+; CHECK-NEXT:    # kill: def $al killed $al killed $eax
 ; CHECK-NEXT:    vzeroupper
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
@@ -429,7 +429,7 @@ define zeroext i8 @test_mm512_mask_testn_epi64_mask(i8 zeroext %__U, <8 x i64> %
 ; X86-NEXT:    kmovw %eax, %k1
 ; X86-NEXT:    vptestnmq %zmm0, %zmm1, %k0 {%k1}
 ; X86-NEXT:    kmovw %k0, %eax
-; X86-NEXT:    movzbl %al, %eax
+; X86-NEXT:    # kill: def $al killed $al killed $eax
 ; X86-NEXT:    vzeroupper
 ; X86-NEXT:    retl
 ;
@@ -438,7 +438,7 @@ define zeroext i8 @test_mm512_mask_testn_epi64_mask(i8 zeroext %__U, <8 x i64> %
 ; X64-NEXT:    kmovw %edi, %k1
 ; X64-NEXT:    vptestnmq %zmm0, %zmm1, %k0 {%k1}
 ; X64-NEXT:    kmovw %k0, %eax
-; X64-NEXT:    movzbl %al, %eax
+; X64-NEXT:    # kill: def $al killed $al killed $eax
 ; X64-NEXT:    vzeroupper
 ; X64-NEXT:    retq
 entry:
@@ -457,7 +457,7 @@ define zeroext i16 @test_mm512_mask_test_epi32_mask(i16 zeroext %__U, <8 x i64>
 ; X86-NEXT:    kmovw %eax, %k1
 ; X86-NEXT:    vptestmd %zmm0, %zmm1, %k0 {%k1}
 ; X86-NEXT:    kmovw %k0, %eax
-; X86-NEXT:    movzwl %ax, %eax
+; X86-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X86-NEXT:    vzeroupper
 ; X86-NEXT:    retl
 ;
@@ -466,7 +466,7 @@ define zeroext i16 @test_mm512_mask_test_epi32_mask(i16 zeroext %__U, <8 x i64>
 ; X64-NEXT:    kmovw %edi, %k1
 ; X64-NEXT:    vptestmd %zmm0, %zmm1, %k0 {%k1}
 ; X64-NEXT:    kmovw %k0, %eax
-; X64-NEXT:    movzwl %ax, %eax
+; X64-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X64-NEXT:    vzeroupper
 ; X64-NEXT:    retq
 entry:
@@ -486,7 +486,7 @@ define zeroext i8 @test_mm512_mask_test_epi64_mask(i8 zeroext %__U, <8 x i64> %_
 ; X86-NEXT:    kmovw %eax, %k1
 ; X86-NEXT:    vptestmq %zmm0, %zmm1, %k0 {%k1}
 ; X86-NEXT:    kmovw %k0, %eax
-; X86-NEXT:    movzbl %al, %eax
+; X86-NEXT:    # kill: def $al killed $al killed $eax
 ; X86-NEXT:    vzeroupper
 ; X86-NEXT:    retl
 ;
@@ -495,7 +495,7 @@ define zeroext i8 @test_mm512_mask_test_epi64_mask(i8 zeroext %__U, <8 x i64> %_
 ; X64-NEXT:    kmovw %edi, %k1
 ; X64-NEXT:    vptestmq %zmm0, %zmm1, %k0 {%k1}
 ; X64-NEXT:    kmovw %k0, %eax
-; X64-NEXT:    movzbl %al, %eax
+; X64-NEXT:    # kill: def $al killed $al killed $eax
 ; X64-NEXT:    vzeroupper
 ; X64-NEXT:    retq
 entry:
diff --git a/llvm/test/CodeGen/X86/avx512bwvl-intrinsics-fast-isel.ll b/llvm/test/CodeGen/X86/avx512bwvl-intrinsics-fast-isel.ll
index a32b84986e895..00729262473da 100644
--- a/llvm/test/CodeGen/X86/avx512bwvl-intrinsics-fast-isel.ll
+++ b/llvm/test/CodeGen/X86/avx512bwvl-intrinsics-fast-isel.ll
@@ -9,7 +9,7 @@ define zeroext i16 @test_mm_test_epi8_mask(<2 x i64> %__A, <2 x i64> %__B) {
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestmb %xmm0, %xmm1, %k0
 ; CHECK-NEXT:    kmovd %k0, %eax
-; CHECK-NEXT:    movzwl %ax, %eax
+; CHECK-NEXT:    # kill: def $ax killed $ax killed $eax
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -25,7 +25,7 @@ define zeroext i16 @test_mm_mask_test_epi8_mask(i16 zeroext %__U, <2 x i64> %__A
 ; X86-NEXT:    kmovw {{[0-9]+}}(%esp), %k1
 ; X86-NEXT:    vptestmb %xmm0, %xmm1, %k0 {%k1}
 ; X86-NEXT:    kmovd %k0, %eax
-; X86-NEXT:    movzwl %ax, %eax
+; X86-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X86-NEXT:    retl
 ;
 ; X64-LABEL: test_mm_mask_test_epi8_mask:
@@ -33,7 +33,7 @@ define zeroext i16 @test_mm_mask_test_epi8_mask(i16 zeroext %__U, <2 x i64> %__A
 ; X64-NEXT:    kmovd %edi, %k1
 ; X64-NEXT:    vptestmb %xmm0, %xmm1, %k0 {%k1}
 ; X64-NEXT:    kmovd %k0, %eax
-; X64-NEXT:    movzwl %ax, %eax
+; X64-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X64-NEXT:    retq
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -91,7 +91,7 @@ define zeroext i8 @test_mm_test_epi16_mask(<2 x i64> %__A, <2 x i64> %__B) {
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestmw %xmm0, %xmm1, %k0
 ; CHECK-NEXT:    kmovd %k0, %eax
-; CHECK-NEXT:    movzbl %al, %eax
+; CHECK-NEXT:    # kill: def $al killed $al killed $eax
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -108,7 +108,7 @@ define zeroext i8 @test_mm_mask_test_epi16_mask(i8 zeroext %__U, <2 x i64> %__A,
 ; X86-NEXT:    kmovd %eax, %k1
 ; X86-NEXT:    vptestmw %xmm0, %xmm1, %k0 {%k1}
 ; X86-NEXT:    kmovd %k0, %eax
-; X86-NEXT:    movzbl %al, %eax
+; X86-NEXT:    # kill: def $al killed $al killed $eax
 ; X86-NEXT:    retl
 ;
 ; X64-LABEL: test_mm_mask_test_epi16_mask:
@@ -116,7 +116,7 @@ define zeroext i8 @test_mm_mask_test_epi16_mask(i8 zeroext %__U, <2 x i64> %__A,
 ; X64-NEXT:    kmovd %edi, %k1
 ; X64-NEXT:    vptestmw %xmm0, %xmm1, %k0 {%k1}
 ; X64-NEXT:    kmovd %k0, %eax
-; X64-NEXT:    movzbl %al, %eax
+; X64-NEXT:    # kill: def $al killed $al killed $eax
 ; X64-NEXT:    retq
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -133,7 +133,7 @@ define zeroext i16 @test_mm256_test_epi16_mask(<4 x i64> %__A, <4 x i64> %__B) {
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestmw %ymm0, %ymm1, %k0
 ; CHECK-NEXT:    kmovd %k0, %eax
-; CHECK-NEXT:    movzwl %ax, %eax
+; CHECK-NEXT:    # kill: def $ax killed $ax killed $eax
 ; CHECK-NEXT:    vzeroupper
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
@@ -150,7 +150,7 @@ define zeroext i16 @test_mm256_mask_test_epi16_mask(i16 zeroext %__U, <4 x i64>
 ; X86-NEXT:    kmovw {{[0-9]+}}(%esp), %k1
 ; X86-NEXT:    vptestmw %ymm0, %ymm1, %k0 {%k1}
 ; X86-NEXT:    kmovd %k0, %eax
-; X86-NEXT:    movzwl %ax, %eax
+; X86-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X86-NEXT:    vzeroupper
 ; X86-NEXT:    retl
 ;
@@ -159,7 +159,7 @@ define zeroext i16 @test_mm256_mask_test_epi16_mask(i16 zeroext %__U, <4 x i64>
 ; X64-NEXT:    kmovd %edi, %k1
 ; X64-NEXT:    vptestmw %ymm0, %ymm1, %k0 {%k1}
 ; X64-NEXT:    kmovd %k0, %eax
-; X64-NEXT:    movzwl %ax, %eax
+; X64-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X64-NEXT:    vzeroupper
 ; X64-NEXT:    retq
 entry:
@@ -177,7 +177,7 @@ define zeroext i16 @test_mm_testn_epi8_mask(<2 x i64> %__A, <2 x i64> %__B) {
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestnmb %xmm0, %xmm1, %k0
 ; CHECK-NEXT:    kmovd %k0, %eax
-; CHECK-NEXT:    movzwl %ax, %eax
+; CHECK-NEXT:    # kill: def $ax killed $ax killed $eax
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -193,7 +193,7 @@ define zeroext i16 @test_mm_mask_testn_epi8_mask(i16 zeroext %__U, <2 x i64> %__
 ; X86-NEXT:    kmovw {{[0-9]+}}(%esp), %k1
 ; X86-NEXT:    vptestnmb %xmm0, %xmm1, %k0 {%k1}
 ; X86-NEXT:    kmovd %k0, %eax
-; X86-NEXT:    movzwl %ax, %eax
+; X86-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X86-NEXT:    retl
 ;
 ; X64-LABEL: test_mm_mask_testn_epi8_mask:
@@ -201,7 +201,7 @@ define zeroext i16 @test_mm_mask_testn_epi8_mask(i16 zeroext %__U, <2 x i64> %__
 ; X64-NEXT:    kmovd %edi, %k1
 ; X64-NEXT:    vptestnmb %xmm0, %xmm1, %k0 {%k1}
 ; X64-NEXT:    kmovd %k0, %eax
-; X64-NEXT:    movzwl %ax, %eax
+; X64-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X64-NEXT:    retq
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -259,7 +259,7 @@ define zeroext i8 @test_mm_testn_epi16_mask(<2 x i64> %__A, <2 x i64> %__B) {
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestnmw %xmm0, %xmm1, %k0
 ; CHECK-NEXT:    kmovd %k0, %eax
-; CHECK-NEXT:    movzbl %al, %eax
+; CHECK-NEXT:    # kill: def $al killed $al killed $eax
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -276,7 +276,7 @@ define zeroext i8 @test_mm_mask_testn_epi16_mask(i8 zeroext %__U, <2 x i64> %__A
 ; X86-NEXT:    kmovd %eax, %k1
 ; X86-NEXT:    vptestnmw %xmm0, %xmm1, %k0 {%k1}
 ; X86-NEXT:    kmovd %k0, %eax
-; X86-NEXT:    movzbl %al, %eax
+; X86-NEXT:    # kill: def $al killed $al killed $eax
 ; X86-NEXT:    retl
 ;
 ; X64-LABEL: test_mm_mask_testn_epi16_mask:
@@ -284,7 +284,7 @@ define zeroext i8 @test_mm_mask_testn_epi16_mask(i8 zeroext %__U, <2 x i64> %__A
 ; X64-NEXT:    kmovd %edi, %k1
 ; X64-NEXT:    vptestnmw %xmm0, %xmm1, %k0 {%k1}
 ; X64-NEXT:    kmovd %k0, %eax
-; X64-NEXT:    movzbl %al, %eax
+; X64-NEXT:    # kill: def $al killed $al killed $eax
 ; X64-NEXT:    retq
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -301,7 +301,7 @@ define zeroext i16 @test_mm256_testn_epi16_mask(<4 x i64> %__A, <4 x i64> %__B)
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestnmw %ymm0, %ymm1, %k0
 ; CHECK-NEXT:    kmovd %k0, %eax
-; CHECK-NEXT:    movzwl %ax, %eax
+; CHECK-NEXT:    # kill: def $ax killed $ax killed $eax
 ; CHECK-NEXT:    vzeroupper
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
@@ -318,7 +318,7 @@ define zeroext i16 @test_mm256_mask_testn_epi16_mask(i16 zeroext %__U, <4 x i64>
 ; X86-NEXT:    kmovw {{[0-9]+}}(%esp), %k1
 ; X86-NEXT:    vptestnmw %ymm0, %ymm1, %k0 {%k1}
 ; X86-NEXT:    kmovd %k0, %eax
-; X86-NEXT:    movzwl %ax, %eax
+; X86-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X86-NEXT:    vzeroupper
 ; X86-NEXT:    retl
 ;
@@ -327,7 +327,7 @@ define zeroext i16 @test_mm256_mask_testn_epi16_mask(i16 zeroext %__U, <4 x i64>
 ; X64-NEXT:    kmovd %edi, %k1
 ; X64-NEXT:    vptestnmw %ymm0, %ymm1, %k0 {%k1}
 ; X64-NEXT:    kmovd %k0, %eax
-; X64-NEXT:    movzwl %ax, %eax
+; X64-NEXT:    # kill: def $ax killed $ax killed $eax
 ; X64-NEXT:    vzeroupper
 ; X64-NEXT:    retq
 entry:
diff --git a/llvm/test/CodeGen/X86/avx512vl-intrinsics-fast-isel.ll b/llvm/test/CodeGen/X86/avx512vl-intrinsics-fast-isel.ll
index 173e2bad8aceb..06e7096e430bb 100644
--- a/llvm/test/CodeGen/X86/avx512vl-intrinsics-fast-isel.ll
+++ b/llvm/test/CodeGen/X86/avx512vl-intrinsics-fast-isel.ll
@@ -1547,7 +1547,7 @@ define zeroext i8 @test_mm_test_epi32_mask(<2 x i64> %__A, <2 x i64> %__B) {
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestmd %xmm0, %xmm1, %k0
 ; CHECK-NEXT:    kmovw %k0, %eax
-; CHECK-NEXT:    movzbl %al, %eax
+; CHECK-NEXT:    # kill: def $al killed $al killed $eax
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -1565,7 +1565,7 @@ define zeroext i8 @test_mm_mask_test_epi32_mask(i8 zeroext %__U, <2 x i64> %__A,
 ; X86-NEXT:    kmovw %eax, %k1
 ; X86-NEXT:    vptestmd %xmm0, %xmm1, %k0 {%k1}
 ; X86-NEXT:    kmovw %k0, %eax
-; X86-NEXT:    movzbl %al, %eax
+; X86-NEXT:    # kill: def $al killed $al killed $eax
 ; X86-NEXT:    retl
 ;
 ; X64-LABEL: test_mm_mask_test_epi32_mask:
@@ -1573,7 +1573,7 @@ define zeroext i8 @test_mm_mask_test_epi32_mask(i8 zeroext %__U, <2 x i64> %__A,
 ; X64-NEXT:    kmovw %edi, %k1
 ; X64-NEXT:    vptestmd %xmm0, %xmm1, %k0 {%k1}
 ; X64-NEXT:    kmovw %k0, %eax
-; X64-NEXT:    movzbl %al, %eax
+; X64-NEXT:    # kill: def $al killed $al killed $eax
 ; X64-NEXT:    retq
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -1592,7 +1592,7 @@ define zeroext i8 @test_mm256_test_epi32_mask(<4 x i64> %__A, <4 x i64> %__B) {
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestmd %ymm0, %ymm1, %k0
 ; CHECK-NEXT:    kmovw %k0, %eax
-; CHECK-NEXT:    movzbl %al, %eax
+; CHECK-NEXT:    # kill: def $al killed $al killed $eax
 ; CHECK-NEXT:    vzeroupper
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
@@ -1610,7 +1610,7 @@ define zeroext i8 @test_mm256_mask_test_epi32_mask(i8 zeroext %__U, <4 x i64> %_
 ; X86-NEXT:    kmovw %eax, %k1
 ; X86-NEXT:    vptestmd %ymm0, %ymm1, %k0 {%k1}
 ; X86-NEXT:    kmovw %k0, %eax
-; X86-NEXT:    movzbl %al, %eax
+; X86-NEXT:    # kill: def $al killed $al killed $eax
 ; X86-NEXT:    vzeroupper
 ; X86-NEXT:    retl
 ;
@@ -1619,7 +1619,7 @@ define zeroext i8 @test_mm256_mask_test_epi32_mask(i8 zeroext %__U, <4 x i64> %_
 ; X64-NEXT:    kmovw %edi, %k1
 ; X64-NEXT:    vptestmd %ymm0, %ymm1, %k0 {%k1}
 ; X64-NEXT:    kmovw %k0, %eax
-; X64-NEXT:    movzbl %al, %eax
+; X64-NEXT:    # kill: def $al killed $al killed $eax
 ; X64-NEXT:    vzeroupper
 ; X64-NEXT:    retq
 entry:
@@ -1637,7 +1637,7 @@ define zeroext i8 @test_mm_test_epi64_mask(<2 x i64> %__A, <2 x i64> %__B) {
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestmq %xmm0, %xmm1, %k0
 ; CHECK-NEXT:    kmovw %k0, %eax
-; CHECK-NEXT:    movzbl %al, %eax
+; CHECK-NEXT:    # kill: def $al killed $al killed $eax
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -1654,7 +1654,7 @@ define zeroext i8 @test_mm_mask_test_epi64_mask(i8 zeroext %__U, <2 x i64> %__A,
 ; X86-NEXT:    kmovw %eax, %k1
 ; X86-NEXT:    vptestmq %xmm0, %xmm1, %k0 {%k1}
 ; X86-NEXT:    kmovw %k0, %eax
-; X86-NEXT:    movzbl %al, %eax
+; X86-NEXT:    # kill: def $al killed $al killed $eax
 ; X86-NEXT:    retl
 ;
 ; X64-LABEL: test_mm_mask_test_epi64_mask:
@@ -1662,7 +1662,7 @@ define zeroext i8 @test_mm_mask_test_epi64_mask(i8 zeroext %__U, <2 x i64> %__A,
 ; X64-NEXT:    kmovw %edi, %k1
 ; X64-NEXT:    vptestmq %xmm0, %xmm1, %k0 {%k1}
 ; X64-NEXT:    kmovw %k0, %eax
-; X64-NEXT:    movzbl %al, %eax
+; X64-NEXT:    # kill: def $al killed $al killed $eax
 ; X64-NEXT:    retq
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -1680,7 +1680,7 @@ define zeroext i8 @test_mm256_test_epi64_mask(<4 x i64> %__A, <4 x i64> %__B) {
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestmq %ymm0, %ymm1, %k0
 ; CHECK-NEXT:    kmovw %k0, %eax
-; CHECK-NEXT:    movzbl %al, %eax
+; CHECK-NEXT:    # kill: def $al killed $al killed $eax
 ; CHECK-NEXT:    vzeroupper
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
@@ -1698,7 +1698,7 @@ define zeroext i8 @test_mm256_mask_test_epi64_mask(i8 zeroext %__U, <4 x i64> %_
 ; X86-NEXT:    kmovw %eax, %k1
 ; X86-NEXT:    vptestmq %ymm0, %ymm1, %k0 {%k1}
 ; X86-NEXT:    kmovw %k0, %eax
-; X86-NEXT:    movzbl %al, %eax
+; X86-NEXT:    # kill: def $al killed $al killed $eax
 ; X86-NEXT:    vzeroupper
 ; X86-NEXT:    retl
 ;
@@ -1707,7 +1707,7 @@ define zeroext i8 @test_mm256_mask_test_epi64_mask(i8 zeroext %__U, <4 x i64> %_
 ; X64-NEXT:    kmovw %edi, %k1
 ; X64-NEXT:    vptestmq %ymm0, %ymm1, %k0 {%k1}
 ; X64-NEXT:    kmovw %k0, %eax
-; X64-NEXT:    movzbl %al, %eax
+; X64-NEXT:    # kill: def $al killed $al killed $eax
 ; X64-NEXT:    vzeroupper
 ; X64-NEXT:    retq
 entry:
@@ -1726,7 +1726,7 @@ define zeroext i8 @test_mm_testn_epi32_mask(<2 x i64> %__A, <2 x i64> %__B) {
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestnmd %xmm0, %xmm1, %k0
 ; CHECK-NEXT:    kmovw %k0, %eax
-; CHECK-NEXT:    movzbl %al, %eax
+; CHECK-NEXT:    # kill: def $al killed $al killed $eax
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -1744,7 +1744,7 @@ define zeroext i8 @test_mm_mask_testn_epi32_mask(i8 zeroext %__U, <2 x i64> %__A
 ; X86-NEXT:    kmovw %eax, %k1
 ; X86-NEXT:    vptestnmd %xmm0, %xmm1, %k0 {%k1}
 ; X86-NEXT:    kmovw %k0, %eax
-; X86-NEXT:    movzbl %al, %eax
+; X86-NEXT:    # kill: def $al killed $al killed $eax
 ; X86-NEXT:    retl
 ;
 ; X64-LABEL: test_mm_mask_testn_epi32_mask:
@@ -1752,7 +1752,7 @@ define zeroext i8 @test_mm_mask_testn_epi32_mask(i8 zeroext %__U, <2 x i64> %__A
 ; X64-NEXT:    kmovw %edi, %k1
 ; X64-NEXT:    vptestnmd %xmm0, %xmm1, %k0 {%k1}
 ; X64-NEXT:    kmovw %k0, %eax
-; X64-NEXT:    movzbl %al, %eax
+; X64-NEXT:    # kill: def $al killed $al killed $eax
 ; X64-NEXT:    retq
 entry:
   %and.i.i = and <2 x i64> %__B, %__A
@@ -1771,7 +1771,7 @@ define zeroext i8 @test_mm256_testn_epi32_mask(<4 x i64> %__A, <4 x i64> %__B) {
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    vptestnmd %ymm0, %ymm1, %k0
 ; CHECK-NEXT:    kmovw %k0, %eax
-; CHECK-NEXT:    movzbl %al, %eax
+; CHECK-NEXT:    # kill: def $al killed $al killed $eax
 ; CHECK-NEXT:    vzeroupper
 ; CHECK-NEXT:    ret{{[l|q]}}
 entry:
@@ -1789,7 +1789,7 @@ define zeroext i8 @test_mm256_mask_testn_epi32_mask(i8 zeroext %__U, <4 x i64> %
 ; X86-NEXT:    kmovw %eax, %k1
 ; X86-NEXT:    vptestnmd %ymm0, %ymm1, %k0 {%k1}
 ; X86-NEXT:    kmovw %k0, %eax
-; X86-NEXT:    movzbl %al, %eax
+; X86-NEXT:    # kill: def $al killed $al kil...
[truncated]

      if (SrcVT != DstVT) {
        unsigned Op =
            Outs[0].Flags.isZExt() ? ISD::ZERO_EXTEND : ISD::SIGN_EXTEND;
        // TODO
Contributor commented:
TODO what?

@topperc (Collaborator, Author) replied on Feb 6, 2024:

Not sure. It came from here: e36a41b

@topperc (Collaborator, Author) added:

I guess it was originally saying that the Kill flags could be set in some cases.

@@ -27,7 +27,7 @@
; RUN: | FileCheck %s --check-prefix=VERBOSE --check-prefix=CONV "--implicit-check-not={{DW_TAG|NULL}}"


-; SPLITCONV: Compile Unit:{{.*}} DWO_id = 0x24191746f389535f
+; SPLITCONV: Compile Unit:{{.*}} DWO_id = 0x06580df9fdd5b54b
Contributor commented:

Why did it change? It doesn't test fast-isel.

@topperc (Collaborator, Author) replied:

It tests -O0, which uses FastISel.

Contributor replied:

Oh, you are correct!

@@ -21,7 +21,7 @@ define zeroext i16 @test_mm512_kunpackb(<8 x i64> %__A, <8 x i64> %__B, <8 x i64
; X86-NEXT: kunpckbw %k0, %k1, %k1
; X86-NEXT: vpcmpneqd 72(%ebp), %zmm3, %k0 {%k1}
; X86-NEXT: kmovw %k0, %eax
-; X86-NEXT:    movzwl %ax, %eax
+; X86-NEXT:    # kill: def $ax killed $ax killed $eax
Contributor commented:

I don't see a problem in the test case, because the high bits are zero anyway. I'm not sure it's correct in general, though. The zeroext attribute requires the callee to zero-extend the return value. I think we still need an explicit zero extend for i8/i16 types if the high bits are not already zero.

See LangRef about zeroext:

This indicates to the code generator that the parameter or return value should be zero-extended to the extent required by the target’s ABI by the caller (for a parameter) or the callee (for a return value).
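A hypothetical example of the case in question (not taken from the patch's tests; the name @trunc_ret is made up): here the returned i16 comes from a trunc, so the high bits of the source register are not known to be zero.

```llvm
; Hypothetical callee for the concern above: the i16 being returned is produced
; by a trunc, so nothing in the function guarantees the upper bits of the
; register are already zero. Whether the callee or the caller is responsible
; for the zero extension is the ABI question being discussed.
define zeroext i16 @trunc_ret(i32 %x) {
entry:
  %lo = trunc i32 %x to i16
  ret i16 %lo
}
```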

@phoebewang (Contributor) left a comment:

Given that DAGISel doesn't do the zeroext on the callee side and both do the zeroext in the caller (https://godbolt.org/z/e4qf3Mdrn), I think we are safe to go with this patch.

@arsenm (Contributor) commented on Feb 6, 2024:

> Given that DAGISel doesn't do the zeroext on the callee side and both do the zeroext in the caller (https://godbolt.org/z/e4qf3Mdrn), I think we are safe to go with this patch.

This sounds broken, though.

@phoebewang (Contributor) replied:

> > Given that DAGISel doesn't do the zeroext on the callee side and both do the zeroext in the caller (https://godbolt.org/z/e4qf3Mdrn), I think we are safe to go with this patch.
>
> This sounds broken, though.

The LangRef says it's required by the target's ABI. I checked the 64-bit psABI and didn't see such a requirement, so we don't break anything here. 32-bit may be a different story: https://godbolt.org/z/be9Pvzecn. Besides, GCC does the same thing here: https://godbolt.org/z/9rY4GeGd8
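A minimal sketch of the caller-side handling being described (hypothetical IR, not taken from the godbolt links): the caller widens the zeroext i16 result itself before using it as an i32.

```llvm
; Hypothetical caller: the i16 result is explicitly zero-extended before use,
; so it does not depend on the callee having widened the value past 16 bits.
; Whether the backend keeps this extension or folds it based on the zeroext
; attribute is what the godbolt comparisons above look at.
declare zeroext i16 @get_mask()

define i32 @use_mask() {
entry:
  %m = call zeroext i16 @get_mask()
  %w = zext i16 %m to i32
  ret i32 %w
}
```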

@topperc merged commit cca4966 into llvm:main on Feb 6, 2024. 4 checks passed.
@topperc deleted the pr/getTypeForExtReturn-fastisel branch on February 6, 2024 at 17:38.