[AArch64] Optimize the offset of memory access #71917

Open
vfdff opened this issue Nov 10, 2023 · 3 comments · Fixed by #75343 · May be fixed by #79951

Comments

@vfdff
Contributor

vfdff commented Nov 10, 2023

int testOffset(char a[]) {
  return a[0xfde78];
}
  • gcc:
testOffset(char*):
        add     x0, x0, 1036288
        ldrb    w0, [x0, 3704]
        ret
  • llvm:
testOffset(char*):                       // @testOffset(char*)
        mov     w8, #56952                      // =0xde78
        movk    w8, #15, lsl #16
        ldrb    w0, [x0, x8]
        ret
  • GCC uses base-plus-immediate addressing and needs one instruction fewer than LLVM, which materializes the whole offset in a register and uses register-offset (index) addressing.
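
To make the GCC sequence concrete, here is a minimal sketch of how the offset 0xfde78 splits into the two immediates seen above. It assumes the ADD immediate is a 12-bit value shifted left by 12 and the LDRB immediate is an unsigned 12-bit byte offset; `offset_split` and `split_byte_offset` are hypothetical names for illustration, not LLVM or GCC code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not compiler code): the two immediates for
   add xN, xN, #add_imm  followed by  ldrb wM, [xN, #ldr_imm]. */
typedef struct {
    uint32_t add_imm; /* multiple of 4096, at most 0xfff000 */
    uint32_t ldr_imm; /* 0..4095, byte offset for LDRB */
} offset_split;

/* Returns false when the offset does not fit in 24 bits, in which case
   one ADD plus one LDRB immediate cannot cover it under these assumptions. */
static bool split_byte_offset(uint64_t offset, offset_split *out) {
    assert(out);
    if (offset >= (1ull << 24))
        return false;
    out->add_imm = (uint32_t)(offset & 0xfff000u); /* high 12 bits */
    out->ldr_imm = (uint32_t)(offset & 0xfffu);    /* low 12 bits  */
    return true;
}
```

For 0xfde78 this gives `add_imm = 0xfd000 = 1036288` and `ldr_imm = 0xe78 = 3704`, matching the GCC output.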
@llvmbot
Collaborator

llvmbot commented Nov 10, 2023

@llvm/issue-subscribers-backend-aarch64

Author: Allen (vfdff)

  • test: https://gcc.godbolt.org/z/nhYcWq1WE
```
int testOffset(char a[]) {
  return a[0xfde78];
}
```
  • gcc:
```
testOffset(char*):
        add     x0, x0, 1036288
        ldrb    w0, [x0, 3704]
        ret
```
  • llvm:
```
testOffset(char*):                       // @testOffset(char*)
        mov     w8, #56952                      // =0xde78
        movk    w8, #15, lsl #16
        ldrb    w0, [x0, x8]
        ret
```
  • GCC uses base-plus-immediate addressing and needs one instruction fewer than LLVM, which materializes the whole offset in a register and uses register-offset (index) addressing.

@vfdff
Contributor Author

vfdff commented Nov 13, 2023

  • The related IR:
define i32 @testOffset(ptr nocapture noundef readonly %a)  {
entry:
  %arrayidx = getelementptr inbounds i8, ptr %a, i64 1039992
  %0 = load i8, ptr %arrayidx, align 1
  %conv = zext i8 %0 to i32
  ret i32 %conv
}
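
As a quick cross-check, the constant in the IR (1039992) is the same value that the LLVM `mov`/`movk` pair builds in `w8`. The sketch below models the two instructions in C; the function names are hypothetical and only mirror the instruction semantics (`mov` with a 16-bit immediate zeroes the other bits, `movk` replaces one 16-bit field and keeps the rest).

```c
#include <assert.h>
#include <stdint.h>

/* mov wD, #imm16: register becomes the 16-bit immediate, upper bits zero. */
static uint32_t mov_w(uint16_t imm16) {
    return imm16;
}

/* movk wD, #imm16, lsl #shift: overwrite one 16-bit field, keep the others. */
static uint32_t movk_w(uint32_t reg, uint16_t imm16, unsigned shift) {
    assert(shift == 0 || shift == 16);
    uint32_t mask = (uint32_t)0xffffu << shift;
    return (reg & ~mask) | ((uint32_t)imm16 << shift);
}

/* The index register value from the LLVM output:
   mov w8, #56952 ; movk w8, #15, lsl #16 */
static uint32_t llvm_index_register(void) {
    return movk_w(mov_w(56952), 15, 16);
}
```

`llvm_index_register()` evaluates to `0xf0000 | 0xde78 = 0xfde78 = 1039992`, the GEP offset in the IR.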

vfdff added a commit to vfdff/llvm-project that referenced this issue Dec 5, 2023
A case for this transformation, https://gcc.godbolt.org/z/nhYcWq1WE
```
Fold
  mov     w8, #56952
  movk    w8, #15, lsl #16
  ldrb    w0, [x0, x8]
into
  add     x0, x0, 1036288
  ldrb    w0, [x0, 3704]
```
Only a single-use base is supported here; multi-use cases are handled by PR74046.
Fix llvm#71917

TODO: support multiple uses by reusing a common base offset.
https://gcc.godbolt.org/z/Mr7srTjnz
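
One possible reading of the TODO, sketched under stated assumptions: two byte loads from the same pointer could share a single `add` when their offsets agree in the high 12 bits and differ only in the low 12 bits. `can_share_add` is a hypothetical helper for illustration; it is not taken from the patch and ignores access sizes other than one byte.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check (assumption, not the patch's logic): both offsets fit
   in 24 bits and have the same ADD part, so one
   add xN, xBase, #(offset & 0xfff000)
   could serve both ldrb instructions. */
static bool can_share_add(uint64_t a, uint64_t b) {
    if (a >= (1ull << 24) || b >= (1ull << 24))
        return false;
    return (a & 0xfff000u) == (b & 0xfff000u);
}
```

With illustrative offsets, 0xfde78 and 0xfde79 share the base 0xfd000, while 0xfde78 and 0xfee78 do not.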
@vfdff vfdff closed this as completed in 32878c2 Dec 21, 2023
@vfdff vfdff reopened this Dec 22, 2023
@vfdff
Contributor Author

vfdff commented Dec 22, 2023

Reverted; the resulting issues are reported in #76202.

vfdff added a commit to vfdff/llvm-project that referenced this issue Jan 30, 2024
A case for this transformation, https://gcc.godbolt.org/z/nhYcWq1WE
Fold
  mov     w8, #56952
  movk    w8, #15, lsl #16
  ldrb    w0, [x0, x8]
into
  add     x0, x0, 1036288
  ldrb    w0, [x0, 3704]

Only LDRBBroX is supported in this first version.
Fix llvm#71917
qihangkong pushed a commit to rvgpu/llvm that referenced this issue Apr 18, 2024
qihangkong pushed a commit to rvgpu/rvgpu-llvm that referenced this issue Apr 23, 2024
A case for this transformation, https://gcc.godbolt.org/z/nhYcWq1WE
Fold
  mov     w8, #56952
  movk    w8, #15, lsl #16
  ldrb    w0, [x0, x8]
into
  add     x0, x0, 1036288
  ldrb    w0, [x0, 3704]

Only LDRBBroX is supported in this first version.
Fix llvm/llvm-project#71917