[x86][Costmodel] AVX512VL: add missing costs for v8 i1<->i32 casts
This would come up as a regression in the follow-up Replication-of-i1 patch.

https://godbolt.org/z/fxr9Mzssr
LebedevRI committed Dec 13, 2022
1 parent 3187976 commit ff5fcda
Showing 4 changed files with 62 additions and 58 deletions.
4 changes: 4 additions & 0 deletions llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -2448,6 +2448,7 @@ InstructionCost X86TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
{ ISD::TRUNCATE, MVT::v2i1, MVT::v2i32, 2 }, // vpslld+vptestmd
{ ISD::TRUNCATE, MVT::v4i1, MVT::v4i32, 2 }, // vpslld+vptestmd
{ ISD::TRUNCATE, MVT::v8i1, MVT::v8i32, 2 }, // vpslld+vptestmd
+{ ISD::TRUNCATE, MVT::v16i1, MVT::v8i32, 2 }, // vpslld+vptestmd
{ ISD::TRUNCATE, MVT::v2i1, MVT::v2i64, 2 }, // vpsllq+vptestmq
{ ISD::TRUNCATE, MVT::v4i1, MVT::v4i64, 2 }, // vpsllq+vptestmq
{ ISD::TRUNCATE, MVT::v4i32, MVT::v4i64, 1 }, // vpmovqd
@@ -2483,6 +2484,9 @@ InstructionCost X86TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
{ ISD::ZERO_EXTEND, MVT::v4i32, MVT::v4i1, 2 }, // vpternlogd+psrld
{ ISD::SIGN_EXTEND, MVT::v8i32, MVT::v8i1, 1 }, // vpternlogd
{ ISD::ZERO_EXTEND, MVT::v8i32, MVT::v8i1, 2 }, // vpternlogd+psrld
+{ ISD::SIGN_EXTEND, MVT::v8i32, MVT::v16i1, 1 }, // vpternlogd
+{ ISD::ZERO_EXTEND, MVT::v8i32, MVT::v16i1, 2 }, // vpternlogd+psrld
+
{ ISD::SIGN_EXTEND, MVT::v2i64, MVT::v2i1, 1 }, // vpternlogq
{ ISD::ZERO_EXTEND, MVT::v2i64, MVT::v2i1, 2 }, // vpternlogq+psrlq
{ ISD::SIGN_EXTEND, MVT::v4i64, MVT::v4i1, 1 }, // vpternlogq
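For readers unfamiliar with this kind of table, here is a minimal sketch of how an exact-match conversion cost lookup could work. It is not the LLVM implementation: the `Op` and `VT` enums, `ConvCostEntry`, `lookupConvCost` and `checkAddedEntries` are hypothetical stand-ins invented for illustration, and only the three entries added in the hunks above are modeled. The costs and instruction comments are copied from the diff.

```cpp
#include <cassert>

// Hypothetical stand-ins for the ISD opcodes and MVT types named in the diff.
enum class Op { Truncate, SignExtend, ZeroExtend };
enum class VT { v16i1, v8i32 };

// One row of the table: (opcode, destination type, source type) -> cost.
struct ConvCostEntry {
  Op Opcode;
  VT Dst;
  VT Src;
  unsigned Cost;
};

// The three entries this commit adds; costs and comments copied from the hunks.
static const ConvCostEntry AddedEntries[] = {
    {Op::Truncate, VT::v16i1, VT::v8i32, 2},   // vpslld+vptestmd
    {Op::SignExtend, VT::v8i32, VT::v16i1, 1}, // vpternlogd
    {Op::ZeroExtend, VT::v8i32, VT::v16i1, 2}, // vpternlogd+psrld
};

// Linear scan for an exact (opcode, dst, src) match.
// Returns nullptr when the table has no entry, so the caller can fall back
// to some other (typically more pessimistic) estimate.
inline const ConvCostEntry *lookupConvCost(Op Opcode, VT Dst, VT Src) {
  for (const ConvCostEntry &E : AddedEntries)
    if (E.Opcode == Opcode && E.Dst == Dst && E.Src == Src)
      return &E;
  return nullptr;
}

// Usage: the added entries are found, and a type pair in the wrong
// direction is not.
inline void checkAddedEntries() {
  assert(lookupConvCost(Op::Truncate, VT::v16i1, VT::v8i32)->Cost == 2);
  assert(lookupConvCost(Op::SignExtend, VT::v8i32, VT::v16i1)->Cost == 1);
  assert(lookupConvCost(Op::ZeroExtend, VT::v8i32, VT::v16i1)->Cost == 2);
  assert(lookupConvCost(Op::Truncate, VT::v8i32, VT::v16i1) == nullptr);
}
```

The point of the sketch is that a missing row makes the lookup fail and the estimate fall through to whatever the caller does next, which is consistent with the test expectations in this commit changing once the rows exist.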
32 changes: 16 additions & 16 deletions llvm/test/Analysis/CostModel/X86/extend.ll
@@ -919,10 +919,10 @@ define i32 @zext_vXi1() "min-legal-vector-width"="256" {
; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2i32 = zext <2 x i1> undef to <2 x i32>
; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4i32 = zext <4 x i1> undef to <4 x i32>
; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8i32 = zext <8 x i1> undef to <8 x i32>
-; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V16i32 = zext <16 x i1> undef to <16 x i32>
-; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %V32i32 = zext <32 x i1> undef to <32 x i32>
-; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %V64i32 = zext <64 x i1> undef to <64 x i32>
-; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 40 for instruction: %V128i32 = zext <128 x i1> undef to <128 x i32>
+; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16i32 = zext <16 x i1> undef to <16 x i32>
+; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V32i32 = zext <32 x i1> undef to <32 x i32>
+; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V64i32 = zext <64 x i1> undef to <64 x i32>
+; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %V128i32 = zext <128 x i1> undef to <128 x i32>
; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %I16 = zext i1 undef to i16
; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V2i16 = zext <2 x i1> undef to <2 x i16>
; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V4i16 = zext <4 x i1> undef to <4 x i16>
@@ -1059,10 +1059,10 @@ define i32 @zext_vXi1() "min-legal-vector-width"="256" {
; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2i32 = zext <2 x i1> undef to <2 x i32>
; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4i32 = zext <4 x i1> undef to <4 x i32>
; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8i32 = zext <8 x i1> undef to <8 x i32>
-; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V16i32 = zext <16 x i1> undef to <16 x i32>
-; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 11 for instruction: %V32i32 = zext <32 x i1> undef to <32 x i32>
-; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 23 for instruction: %V64i32 = zext <64 x i1> undef to <64 x i32>
-; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 46 for instruction: %V128i32 = zext <128 x i1> undef to <128 x i32>
+; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16i32 = zext <16 x i1> undef to <16 x i32>
+; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %V32i32 = zext <32 x i1> undef to <32 x i32>
+; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 19 for instruction: %V64i32 = zext <64 x i1> undef to <64 x i32>
+; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 38 for instruction: %V128i32 = zext <128 x i1> undef to <128 x i32>
; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %I16 = zext i1 undef to i16
; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2i16 = zext <2 x i1> undef to <2 x i16>
; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4i16 = zext <4 x i1> undef to <4 x i16>
@@ -2077,10 +2077,10 @@ define i32 @sext_vXi1() "min-legal-vector-width"="256" {
; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V2i32 = sext <2 x i1> undef to <2 x i32>
; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V4i32 = sext <4 x i1> undef to <4 x i32>
; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V8i32 = sext <8 x i1> undef to <8 x i32>
-; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16i32 = sext <16 x i1> undef to <16 x i32>
-; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %V32i32 = sext <32 x i1> undef to <32 x i32>
-; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 12 for instruction: %V64i32 = sext <64 x i1> undef to <64 x i32>
-; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 24 for instruction: %V128i32 = sext <128 x i1> undef to <128 x i32>
+; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V16i32 = sext <16 x i1> undef to <16 x i32>
+; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V32i32 = sext <32 x i1> undef to <32 x i32>
+; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V64i32 = sext <64 x i1> undef to <64 x i32>
+; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V128i32 = sext <128 x i1> undef to <128 x i32>
; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %I16 = sext i1 undef to i16
; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V2i16 = sext <2 x i1> undef to <2 x i16>
; AVX512FVEC256-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4i16 = sext <4 x i1> undef to <4 x i16>
@@ -2217,10 +2217,10 @@ define i32 @sext_vXi1() "min-legal-vector-width"="256" {
; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V2i32 = sext <2 x i1> undef to <2 x i32>
; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V4i32 = sext <4 x i1> undef to <4 x i32>
; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V8i32 = sext <8 x i1> undef to <8 x i32>
-; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16i32 = sext <16 x i1> undef to <16 x i32>
-; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %V32i32 = sext <32 x i1> undef to <32 x i32>
-; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 15 for instruction: %V64i32 = sext <64 x i1> undef to <64 x i32>
-; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 30 for instruction: %V128i32 = sext <128 x i1> undef to <128 x i32>
+; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V16i32 = sext <16 x i1> undef to <16 x i32>
+; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V32i32 = sext <32 x i1> undef to <32 x i32>
+; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 11 for instruction: %V64i32 = sext <64 x i1> undef to <64 x i32>
+; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 22 for instruction: %V128i32 = sext <128 x i1> undef to <128 x i32>
; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %I16 = sext i1 undef to i16
; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V2i16 = sext <2 x i1> undef to <2 x i16>
; AVX512BWVEC256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V4i16 = sext <4 x i1> undef to <4 x i16>
10 changes: 5 additions & 5 deletions llvm/test/Analysis/CostModel/X86/min-legal-vector-width.ll
@@ -279,7 +279,7 @@ define i32 @zext256_vXi1() "min-legal-vector-width"="256" {
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2i32 = zext <2 x i1> undef to <2 x i32>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4i32 = zext <4 x i1> undef to <4 x i32>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8i32 = zext <8 x i1> undef to <8 x i32>
-; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V16i32 = zext <16 x i1> undef to <16 x i32>
+; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16i32 = zext <16 x i1> undef to <16 x i32>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V2i16 = zext <2 x i1> undef to <2 x i16>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V4i16 = zext <4 x i1> undef to <4 x i16>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V8i16 = zext <8 x i1> undef to <8 x i16>
@@ -416,7 +416,7 @@ define i32 @sext256_vXi1() "min-legal-vector-width"="256" {
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V2i32 = sext <2 x i1> undef to <2 x i32>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V4i32 = sext <4 x i1> undef to <4 x i32>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %V8i32 = sext <8 x i1> undef to <8 x i32>
-; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V16i32 = sext <16 x i1> undef to <16 x i32>
+; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V16i32 = sext <16 x i1> undef to <16 x i32>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %I16 = sext i1 undef to i16
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V2i16 = sext <2 x i1> undef to <2 x i16>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4i16 = sext <4 x i1> undef to <4 x i16>
@@ -574,9 +574,9 @@ define i32 @trunc_vXi1() "min-legal-vector-width"="256" {
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2i32 = trunc <2 x i32> undef to <2 x i1>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V4i32 = trunc <4 x i32> undef to <4 x i1>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8i32 = trunc <8 x i32> undef to <8 x i1>
-; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %V16i32 = trunc <16 x i32> undef to <16 x i1>
-; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %V32i32 = trunc <32 x i32> undef to <32 x i1>
-; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 20 for instruction: %V64i32 = trunc <64 x i32> undef to <64 x i1>
+; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V16i32 = trunc <16 x i32> undef to <16 x i1>
+; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V32i32 = trunc <32 x i32> undef to <32 x i1>
+; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V64i32 = trunc <64 x i32> undef to <64 x i1>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V2i16 = trunc <2 x i16> undef to <2 x i1>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V4i16 = trunc <4 x i16> undef to <4 x i1>
; AVX512VL256-NEXT: Cost Model: Found an estimated cost of 3 for instruction: %V8i16 = trunc <8 x i16> undef to <8 x i1>
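As a reading aid for the updated expectations, the new AVX512FVEC256 and AVX512VL256 numbers are consistent with a simple model: with "min-legal-vector-width"="256", an N-lane i32 cast is charged as N/8 parts of 8 lanes, each at the per-part table cost (2 for zext and trunc, 1 for sext). This is an observation about the numbers in the diff, not a description of how the cost model computes them. `splitCastCost` and `kLanesPerPart` are hypothetical names, and the AVX512BWVEC256 rows above 16 lanes (9, 19, 38 and 5, 11, 22) are higher than this model predicts, so it does not cover them.

```cpp
// Assumption: a 256-bit legal vector width holds 8 lanes of i32.
constexpr unsigned kLanesPerPart = 256 / 32;

// Cost of a cast over `lanes` i32 elements if it is charged as
// lanes/8 parts, each costing `perPartCost`.
constexpr unsigned splitCastCost(unsigned lanes, unsigned perPartCost) {
  return (lanes / kLanesPerPart) * perPartCost;
}

// zext <N x i1> to <N x i32>, AVX512FVEC256 rows: 4, 8, 16, 32.
static_assert(splitCastCost(16, 2) == 4, "zext v16");
static_assert(splitCastCost(32, 2) == 8, "zext v32");
static_assert(splitCastCost(64, 2) == 16, "zext v64");
static_assert(splitCastCost(128, 2) == 32, "zext v128");

// sext <N x i1> to <N x i32>, AVX512FVEC256 rows: 2, 4, 8, 16.
static_assert(splitCastCost(16, 1) == 2, "sext v16");
static_assert(splitCastCost(128, 1) == 16, "sext v128");
```

The trunc rows for AVX512VL256 (4, 8, 16) follow the same pattern with a per-part cost of 2.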