Look more aggressively for vpgatherdd opportunities #163023

@Validark

(Zig Version)
LLVM version:

define dso_local <16 x i32> @foo(ptr nonnull align 4 %0, <16 x i8> %1, <16 x i32> %2) local_unnamed_addr {
Entry:
  %3 = ptrtoint ptr %0 to i64
  %4 = insertelement <1 x i64> poison, i64 %3, i64 0
  %5 = shufflevector <1 x i64> %4, <1 x i64> poison, <16 x i32> zeroinitializer
  %6 = zext <16 x i32> %2 to <16 x i64>
  %7 = add nuw <16 x i64> %5, %6
  %8 = inttoptr <16 x i64> %7 to <16 x ptr>
  %9 = tail call fastcc <16 x i32> @llvm.masked.gather.v16i32.v16p0(<16 x ptr> %8, i32 4, <16 x i1> splat (i1 true), <16 x i32> poison)
  ret <16 x i32> %9
}

declare fastcc <16 x i32> @llvm.masked.gather.v16i32.v16p0(<16 x ptr>, i32 immarg, <16 x i1>, <16 x i32>)

This currently emits:

foo:
        vpmovzxdq       zmm0, ymm1
        vextracti64x4   ymm1, zmm1, 1
        kxnorw  k1, k0, k0
        kxnorw  k2, k0, k0
        vpxor   xmm2, xmm2, xmm2
        vpxor   xmm3, xmm3, xmm3
        vpmovzxdq       zmm1, ymm1
        vpgatherqd      ymm2 {k1}, dword ptr [rdi + zmm0]
        vpgatherqd      ymm3 {k2}, dword ptr [rdi + zmm1]
        vinserti64x4    zmm0, zmm2, ymm3, 1
        ret

Should be:

bar:
        vpxor   xmm0, xmm0, xmm0
        kxnorw  k1, k0, k0
        vpgatherdd      zmm0 {k1}, dword ptr [rdi + zmm1]
        ret
