[X86] Worse code generation from patch 24780e13e5be1501e34330148137a10fa9965166 #61923

xortator · 2023-04-04T05:56:05Z

We see worse code generation for a memcmp-like function, caused by:

commit 24780e13e5be1501e34330148137a10fa9965166 (HEAD -> main)
Author: Simon Pilgrim <llvm-dev@redking.me.uk>
Date:   Sat Apr 1 15:38:38 2023 +0100

    [X86] MatchVectorAllEqualTest - add support for icmp(reduce_and(X),-1) allof reduction patterns

    Also, improve codegen in LowerVectorAllEqual for X == -1 cases to reduce over sized vector using a AND reduction

Asm on last release: https://godbolt.org/z/Ebcd7P3jP
Current asm: https://godbolt.org/z/nxv1vc16e

Loop block changed from

.LBB0_2:                                # %memcmp.loop
        vmovdqu (%rsi,%rcx), %xmm0
        vmovdqu 16(%rsi,%rcx), %xmm1
        vpsubb  16(%rdi,%rcx), %xmm1, %xmm1
        vpsubb  (%rdi,%rcx), %xmm0, %xmm0
        vpor    %xmm1, %xmm0, %xmm0
        vptest  %xmm0, %xmm0
        jne     .LBB0_4
        addq    $32, %rcx
        cmpq    %rax, %rcx
        jb      .LBB0_2

to

.LBB0_2:                                # %memcmp.loop
        vmovdqu (%rsi,%rcx), %xmm0
        vmovdqu 16(%rsi,%rcx), %xmm1
        vpcmpeqb        (%rdi,%rcx), %xmm0, %xmm0
        vpmovmskb       %xmm0, %edx
        vpcmpeqb        16(%rdi,%rcx), %xmm1, %xmm0
        vpmovmskb       %xmm0, %r8d
        shll    $16, %r8d
        orl     %edx, %r8d
        cmpl    $-1, %r8d
        jne     .LBB0_4
        addq    $32, %rcx
        cmpq    %rax, %rcx
        jb      .LBB0_2

I am not sure how large a performance regression this translates into (build and benchmark runs are underway), but the loop block definitely looks worse: it grew from 10 instructions to 13.
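For readers comparing the two loop bodies, here is a minimal scalar sketch in C of what each sequence computes for one 32-byte block. It is only a model of the semantics, not LLVM's lowering code, and the function names `block32_equal_or_reduce` and `block32_equal_mask` are hypothetical, introduced here for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the older sequence: per-byte subtract (vpsubb), OR the
 * differences together (vpor), then test the result for zero (vptest).
 * Hypothetical helper name; scalar stand-in for the vector code. */
static bool block32_equal_or_reduce(const uint8_t *a, const uint8_t *b) {
    uint8_t acc = 0;
    for (size_t i = 0; i < 32; ++i)
        acc |= (uint8_t)(a[i] - b[i]); /* non-zero iff the bytes differ */
    return acc == 0;
}

/* Model of the newer sequence: per-byte equality (vpcmpeqb) collected
 * into one bit per byte (vpmovmskb), the two 16-bit halves merged
 * (shll/orl), and the 32-bit mask compared against all-ones (cmpl $-1).
 * Hypothetical helper name; scalar stand-in for the vector code. */
static bool block32_equal_mask(const uint8_t *a, const uint8_t *b) {
    uint32_t mask = 0;
    for (size_t i = 0; i < 32; ++i)
        mask |= (uint32_t)(a[i] == b[i]) << i; /* bit i set iff byte i matches */
    return mask == UINT32_MAX;
}
```

Both helpers return the same answer for any pair of blocks. The difference in the assembly above is only in how many instructions it takes to get there.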


xortator · 2023-04-04T05:56:36Z

@RKSimon can you please take a look?

xortator · 2023-04-04T07:29:45Z

Reduced the test (links updated).

RKSimon · 2023-04-04T10:08:57Z

This will be fixed by https://reviews.llvm.org/D147452

llvmbot · 2023-04-04T10:26:00Z

@llvm/issue-subscribers-backend-x86

… bitcast(<X x i1> V)) canonicalization

This already exists in InstCombine but was missing from the late stage ExpandReductions pass

Fixes llvm#53419
Fixes llvm#61923

Differential Revision: https://reviews.llvm.org/D147452

github-actions bot added the new issue label Apr 4, 2023

xortator added a commit that referenced this issue Apr 4, 2023

[Test] Commit test for PR61923

ad2a48e

We see increased number of assembly instructions after patch 24780e1 for this test. See details at #61923.

RKSimon self-assigned this Apr 4, 2023

RKSimon closed this as completed in 00e3ae4 Apr 4, 2023

RKSimon added backend:X86 llvm:codegen and removed new issue labels Apr 4, 2023

EugeneZelenko removed the backend:X86 label Apr 4, 2023

gysit pushed a commit to nextsilicon/llvm-project that referenced this issue Apr 27, 2023

[Test] Commit test for PR61923

886ddf5

We see increased number of assembly instructions after patch 24780e1 for this test. See details at llvm#61923.

DianQK pushed a commit to DianQK/llvm-project that referenced this issue Oct 10, 2023

[Test] Commit test for PR61923

6da8569

We see increased number of assembly instructions after patch 24780e1 for this test. See details at llvm#61923.