Incorrect folding of two __builtin_memcpy_inline into memmove #61791

Closed
miyuki opened this issue Mar 29, 2023 · 1 comment

miyuki commented Mar 29, 2023

Consider the following code implementing a 16-byte memmove:

void test(void *src, void *dest) {
  char temp[16];
  __builtin_memcpy_inline(temp, src, 16);
  __builtin_memcpy_inline(dest, temp, 16);
}
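
To illustrate why copying through the temporary gives memmove semantics, here is a minimal sketch that runs the same two-step copy on overlapping ranges and compares the result with the standard memmove. The names move16, COPY16, copy16_fallback and move16_matches_memmove are hypothetical and not part of the report; the byte-loop fallback is an assumption added only so the sketch also builds with compilers that lack __builtin_memcpy_inline.

```c
#include <assert.h>
#include <string.h>

/* Prefer the Clang builtin; fall back to a byte loop elsewhere
   (the fallback is an assumption made only for portability). */
#ifdef __has_builtin
#  if __has_builtin(__builtin_memcpy_inline)
#    define COPY16(d, s) __builtin_memcpy_inline((d), (s), 16)
#  endif
#endif
#ifndef COPY16
static void copy16_fallback(void *d, const void *s) {
  unsigned char *dp = d;
  const unsigned char *sp = s;
  for (int i = 0; i < 16; ++i) dp[i] = sp[i];
}
#  define COPY16(d, s) copy16_fallback((d), (s))
#endif

/* Same shape as test() in the report: copy through a 16-byte temporary. */
void move16(void *src, void *dest) {
  char temp[16];
  COPY16(temp, src);
  COPY16(dest, temp);
}

/* Returns 1 when move16 gives the same bytes as memmove for a 16-byte
   copy between the given offsets of one 32-byte buffer. */
int move16_matches_memmove(size_t src_off, size_t dest_off) {
  char buf[32], expected[32];
  assert(src_off + 16 <= sizeof buf && dest_off + 16 <= sizeof buf);
  for (size_t i = 0; i < sizeof buf; ++i) buf[i] = expected[i] = (char)i;
  memmove(expected + dest_off, expected + src_off, 16);
  move16(buf + src_off, buf + dest_off);
  return memcmp(buf, expected, sizeof buf) == 0;
}
```

Because all 16 source bytes are read into temp before any byte of dest is written, the result does not depend on whether the ranges overlap or in which direction. That is what makes a memmove a semantically valid replacement, even though the library call is unwanted here.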

The point of the __builtin_memcpy_inline builtin is to avoid calls to library functions, so that it can be used to implement library functions such as memcpy and memmove. However, when targeting Arm M-profile, Clang compiles this code into a call to __aeabi_memmove:

$ clang -target arm-arm-none-eabi -mcpu=cortex-m7 -O2 -S test.c

produces

test:
        push    {r7, lr}
        mov     r7, sp
        mov     r3, r0
        movs    r2, #16
        mov     r0, r1
        mov     r1, r3
        bl      __aeabi_memmove
        pop     {r7, pc}

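The folding happens at the LLVM IR level, in MemCpyOptPass, which replaces the second llvm.memcpy.inline (from the temporary to dest) with an llvm.memmove directly from src to dest: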

*** IR Dump After ADCEPass on test ***
; Function Attrs: nounwind
define dso_local void @test(ptr noundef %0, ptr noundef %1) local_unnamed_addr #0 {
  %3 = alloca [16 x i8], align 1
  call void @llvm.lifetime.start.p0(i64 16, ptr nonnull %3)
  call void @llvm.memcpy.inline.p0.p0.i64(ptr nonnull align 1 %3, ptr align 1 %0, i64 16, i1 false)
  call void @llvm.memcpy.inline.p0.p0.i64(ptr align 1 %1, ptr nonnull align 1 %3, i64 16, i1 false)
  call void @llvm.lifetime.end.p0(i64 16, ptr nonnull %3)
  ret void
}
*** IR Dump After MemCpyOptPass on test ***
; Function Attrs: nounwind
define dso_local void @test(ptr noundef %0, ptr noundef %1) local_unnamed_addr #0 {
  %3 = alloca [16 x i8], align 1
  call void @llvm.lifetime.start.p0(i64 16, ptr nonnull %3)
  call void @llvm.memcpy.inline.p0.p0.i64(ptr nonnull align 1 %3, ptr align 1 %0, i64 16, i1 false)
  call void @llvm.memmove.p0.p0.i64(ptr align 1 %1, ptr align 1 %0, i64 16, i1 false)
  call void @llvm.lifetime.end.p0(i64 16, ptr nonnull %3)
  ret void
}

miyuki self-assigned this Mar 29, 2023

miyuki commented Mar 29, 2023

miyuki closed this as completed in ab8150a Mar 30, 2023