Performance degradation of memcmp(size=24) after commit 308322 #33261

@ZviRackover

Bugzilla Link 33914
Resolution FIXED
Resolved on Aug 02, 2017 05:23
Version trunk
OS Windows NT
Blocks #33196
CC @topperc,@zmodem,@RKSimon,@rnk,@rotateright

Extended Description

We observed a 13% degradation in an internal benchmark after commit 308322.

The minimal reproducer:

define i32 @foo(i8* %A, i8* %B) {
  %res = call i32 @memcmp(i8* %A, i8* %B, i64 24)
  ret i32 %res
}
declare i32 @memcmp(i8* nocapture, i8* nocapture, i64) local_unnamed_addr #5

Before the commit, the call to memcmp was lowered to a call to glibc's memcmp, which dispatched to __memcmp_sse4_1. The hot path in __memcmp_sse4_1 was one XMM load pair + pxor + ptest + jcc, followed by one 8-byte load pair + cmp + jcc:

...
│ movdqu -0x18(%rdi),%xmm2
│ movdqu -0x18(%rsi),%xmm1
11.11 │ pxor %xmm1,%xmm2
7.41 │ ptest %xmm2,%xmm0
14.81 │ ↓ jae 15e8
14.81 │ mov -0x8(%rsi),%rcx
│ mov -0x8(%rdi),%rax
│ cmp %rax,%rcx
│ ↓ jne 1603
│ xor %eax,%eax
11.11 │ ← retq
...

After the commit, the memcmp is expanded inline into three 8-byte load pairs, each followed by cmp + jcc:

...

# BB#0: # %loadbb
    movbeq  (%rdi), %rcx
    movbeq  (%rsi), %rdx
    cmpq    %rdx, %rcx
    jne     .LBB0_1
# BB#2: # %loadbb1
    movbeq  8(%rdi), %rcx
    movbeq  8(%rsi), %rdx
    cmpq    %rdx, %rcx
    jne     .LBB0_1
# BB#3: # %loadbb2
    movbeq  16(%rdi), %rcx
    movbeq  16(%rsi), %rdx
    xorl    %eax, %eax
    cmpq    %rdx, %rcx
    jne     .LBB0_1
# BB#4: # %endblock
    retq
.LBB0_1: # %res_block
    cmpq    %rdx, %rcx
    movl    $-1, %ecx
    movl    $1, %eax
    cmovbl  %ecx, %eax
    retq
...
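For readers less familiar with the assembly, here is a rough C model of what the inline expansion above computes. It is only a sketch: the names `load_be64`, `memcmp24_expanded` and `check_expanded_matches_memcmp` are made up for illustration, and the big-endian value is assembled byte by byte so the code does not depend on host endianness (on x86 the compiler output uses `movbeq` for this step).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 8 bytes as a big-endian integer, so that an
 * unsigned integer compare orders values the same way memcmp orders bytes.
 * This plays the role of movbeq in the listing. */
static uint64_t load_be64(const unsigned char *p) {
    uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v = (v << 8) | p[i];
    return v;
}

/* Model of the expanded memcmp(A, B, 24): three 8-byte load pairs, each
 * followed by a compare and a branch to the result block. The compiler
 * output is fully unrolled; a loop is used here only for brevity. */
int memcmp24_expanded(const void *pa, const void *pb) {
    const unsigned char *a = pa, *b = pb;
    for (int off = 0; off < 24; off += 8) {
        uint64_t x = load_be64(a + off);
        uint64_t y = load_be64(b + off);
        if (x != y)
            return x < y ? -1 : 1;   /* %res_block: cmovb selects -1 or 1 */
    }
    return 0;                        /* %endblock */
}

/* Hypothetical self-check: the model must agree in sign with memcmp. */
void check_expanded_matches_memcmp(const void *a, const void *b) {
    int expect = memcmp(a, b, 24);
    int got = memcmp24_expanded(a, b);
    assert((expect < 0) == (got < 0) && (expect > 0) == (got > 0));
}
```

When the buffers are equal, which is the case the profile above suggests is hot, all three compare-and-branch pairs execute before the function returns.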

Options for fixing:

  1. Improve the inline expansion to generate a sequence similar to glibc's: one 16-byte load pair + ptest + jcc, then one 8-byte load pair + cmp + jcc.

  2. Call libc's memcmp.
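To make option 1 concrete, here is a rough C sketch of the shape such an expansion could take for size 24. It is not the glibc implementation and not a proposed patch. The names `first16_equal`, `memcmp24_sketch` and `check_sketch_matches_memcmp` are hypothetical. The vector path uses the SSE4.1 intrinsic `_mm_testz_si128` and is only compiled when `__SSE4_1__` is defined (for example with `-msse4.1`); otherwise a plain fallback is used. As a simplification, the mismatch paths just defer to memcmp on the chunk that differs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE4_1__)
#include <smmintrin.h>  /* _mm_testz_si128 (SSE4.1) */
#endif

/* Returns nonzero when the first 16 bytes of a and b are identical. */
static int first16_equal(const void *a, const void *b) {
#if defined(__SSE4_1__)
    __m128i va = _mm_loadu_si128((const __m128i *)a);  /* movdqu */
    __m128i vb = _mm_loadu_si128((const __m128i *)b);  /* movdqu */
    __m128i diff = _mm_xor_si128(va, vb);              /* pxor   */
    return _mm_testz_si128(diff, diff);                /* ptest: 1 if diff == 0 */
#else
    return memcmp(a, b, 16) == 0;  /* fallback, not the vector path */
#endif
}

/* Sketch of memcmp(A, B, 24): one 16-byte equality test plus one 8-byte
 * equality test on the hot (equal) path, with the ordering computed only
 * on the cold (mismatch) paths. */
int memcmp24_sketch(const void *pa, const void *pb) {
    const unsigned char *a = pa, *b = pb;
    uint64_t x, y;

    if (!first16_equal(a, b))
        return memcmp(a, b, 16);           /* cold: difference is in bytes 0..15 */

    memcpy(&x, a + 16, sizeof x);          /* 8-byte load pair */
    memcpy(&y, b + 16, sizeof y);
    if (x != y)
        return memcmp(a + 16, b + 16, 8);  /* cold: difference is in bytes 16..23 */
    return 0;
}

/* Hypothetical self-check: the sketch must agree in sign with memcmp. */
void check_sketch_matches_memcmp(const void *a, const void *b) {
    int expect = memcmp(a, b, 24);
    int got = memcmp24_sketch(a, b);
    assert((expect < 0) == (got < 0) && (expect > 0) == (got > 0));
}
```

With this shape the equal case costs two compare-and-branch steps instead of three, and no byte swap is needed unless a difference is found.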

I would like to request this commit be reverted until we get this issue fixed. Thanks.
