
[AArch64] Fix invalid address-mode folding #142167


Merged: 2 commits into llvm:main, Jun 3, 2025

Conversation

@Dudeldu (Contributor) commented May 30, 2025

In some cases, we are too aggressive when folding an add-lsl into an ldr/str due to an accidental truncation of the 64-bit scale to 32 bits. In cases where we shift by more than 31 bits (which is valid for 64-bit registers), we just drop the shift...
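To make the failure mode concrete, here is a minimal standalone sketch (plain C++ for illustration, not the LLVM code path itself): once the shift amount reaches 32, the 64-bit scale 1 << Shift has no set bit in its low 32-bit half, so a 32-bit parameter receives a scale of 0 and the shift is silently dropped.

    #include <cstdint>
    #include <cstdio>

    int main() {
      // Shift amounts exercised by the MIR test (3, 5, and 32 in the initial
      // patch; 35 and 63 added during review).
      const unsigned Shifts[] = {3, 5, 32, 35, 63};
      for (unsigned Shift : Shifts) {
        uint64_t Scale = uint64_t(1) << Shift;  // the real 64-bit scale
        unsigned Narrowed = (unsigned)Scale;    // what a 32-bit parameter receives
        // Mirrors the guard added by this PR: bail out whenever the scale does
        // not survive the round trip through 32 bits. Scales that do survive
        // still have to pass the existing isLegalAddressingMode check.
        bool PassesGuard = ((uint64_t)Narrowed == Scale);
        std::printf("shift %2u: scale 0x%016llx -> narrowed 0x%08x, passes guard: %d\n",
                    Shift, (unsigned long long)Scale, Narrowed, PassesGuard);
      }
      return 0;
    }

In the MIR test added by this patch, only the shift-by-3 case (scale 8, matching the 8-byte load) is actually folded into an LDRXroX; the shift-by-5 and shift-by-32 additions stay as ADDXrs + LDRXui, as the CHECK lines show.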

@llvmbot (Member) commented May 30, 2025

@llvm/pr-subscribers-backend-aarch64

Author: None (Dudeldu)

Changes



Full diff: https://github.com/llvm/llvm-project/pull/142167.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+2)
  • (added) llvm/test/CodeGen/AArch64/fuse-addr-mode.mir (+39)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index d1e0d37e33e4e..9ac3727aad1f1 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -3229,6 +3229,8 @@ bool AArch64InstrInfo::canFoldIntoAddrMode(const MachineInstr &MemI,
           ExtAddrMode::Formula Form = ExtAddrMode::Formula::Basic) -> bool {
     if (MemI.getOperand(2).getImm() != 0)
       return false;
+    if ((unsigned)Scale != Scale)
+      return false;
     if (!isLegalAddressingMode(NumBytes, /* Offset */ 0, Scale))
       return false;
     AM.BaseReg = AddrI.getOperand(1).getReg();
diff --git a/llvm/test/CodeGen/AArch64/fuse-addr-mode.mir b/llvm/test/CodeGen/AArch64/fuse-addr-mode.mir
new file mode 100644
index 0000000000000..69bdbc9809ed2
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/fuse-addr-mode.mir
@@ -0,0 +1,39 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=aarch64 -o - %s -run-pass machine-sink | FileCheck %s
+
+# We want to fuse an add-with-lsl into an ldr, but we have to be careful with
+# the shift amount: the addressing mode can only represent specific shift
+# amounts (e.g. 3 for an 8-byte load), not large ones like 32.
+
+--- |
+  define dso_local i64 @fuse_shift_add_into_addr_mode()  {
+  entry:
+    ret i64 0
+  }
+
+---
+name:            fuse_shift_add_into_addr_mode
+body:             |
+  bb.1.entry:
+    liveins: $x0, $x1
+
+    ; CHECK-LABEL: name: fuse_shift_add_into_addr_mode
+    ; CHECK: liveins: $x0, $x1
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64common = COPY $x0
+    ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64 = COPY $x1
+    ; CHECK-NEXT: [[LDRXroX:%[0-9]+]]:gpr64 = LDRXroX [[COPY]], [[COPY1]], 0, 1 :: (load (s64))
+    ; CHECK-NEXT: [[ADDXrs:%[0-9]+]]:gpr64common = ADDXrs [[COPY]], [[COPY1]], 5
+    ; CHECK-NEXT: [[LDRXui:%[0-9]+]]:gpr64 = LDRXui [[ADDXrs]], 0 :: (load (s64))
+    ; CHECK-NEXT: [[ADDXrs1:%[0-9]+]]:gpr64common = ADDXrs [[COPY]], [[COPY1]], 32
+    ; CHECK-NEXT: [[LDRXui1:%[0-9]+]]:gpr64 = LDRXui [[ADDXrs1]], 0 :: (load (s64))
+    ; CHECK-NEXT: RET_ReallyLR implicit $x0
+    %0:gpr64 = COPY $x0
+    %1:gpr64 = COPY $x1
+    %2:gpr64common = ADDXrs %0, %1, 3
+    %3:gpr64 = LDRXui %2, 0 :: (load (s64))
+    %4:gpr64common = ADDXrs %0, %1, 5
+    %5:gpr64 = LDRXui %4, 0 :: (load (s64))
+    %6:gpr64common = ADDXrs %0, %1, 32
+    %7:gpr64 = LDRXui %6, 0 :: (load (s64))
+    RET_ReallyLR implicit $x0

@davemgreen (Collaborator) left a comment

LGTM, thanks. I wasn't able to get this to come up without turning off sinking in CGP, but I can imagine it could in some more complex situations.

Comment on lines +37 to +38
%6:gpr64common = ADDXrs %0, %1, 32
%7:gpr64 = LDRXui %6, 0 :: (load (s64))

Can we add a test with 35 too? Just in case it gets truncated differently.
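One way to read that concern (a hypothetical sketch of why 35 and 63 add coverage, not a claim about where the truncation actually happens in LLVM): truncating the scale turns every shift of 32 or more into a scale of 0, but if the shift amount itself were narrowed and masked to five bits, as 32-bit shifters do, 35 would masquerade as 3, i.e. a scale of 8 that is perfectly legal for 64-bit loads, so the bad fold could succeed silently with a wrong address instead of merely dropping the shift.

    #include <cstdint>
    #include <cstdio>

    int main() {
      const unsigned Shifts[] = {32, 35, 63};
      for (unsigned Shift : Shifts) {
        // Truncating the 64-bit scale itself: every shift >= 32 collapses to 0.
        unsigned TruncatedScale = (unsigned)(uint64_t(1) << Shift);
        // Hypothetical alternative: masking the shift amount to 5 bits, as 32-bit
        // shifters do. 35 would masquerade as 3, i.e. the perfectly legal scale 8.
        unsigned MaskedShift = Shift & 31u;
        uint64_t ScaleFromMaskedShift = uint64_t(1) << MaskedShift;
        std::printf("shift %2u: truncated scale %u, masked shift %2u -> scale %llu\n",
                    Shift, TruncatedScale, MaskedShift,
                    (unsigned long long)ScaleFromMaskedShift);
      }
      return 0;
    }

Testing 32, 35, and 63 together covers the smallest out-of-range shift as well as values whose low five bits alias other scales.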

@Dudeldu (Contributor, Author) commented Jun 2, 2025

I added additional test cases for 35 and 63. @davemgreen Could you please merge the PR for me? (I do not have write access.)

@davemgreen (Collaborator): Thanks

@davemgreen merged commit c6c2b81 into llvm:main on Jun 3, 2025
11 checks passed
sallto pushed a commit to sallto/llvm-project that referenced this pull request Jun 3, 2025
rorth pushed a commit to rorth/llvm-project that referenced this pull request Jun 11, 2025
DhruvSrivastavaX pushed a commit to DhruvSrivastavaX/lldb-for-aix that referenced this pull request Jun 12, 2025