Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MachineSink] Clear kill flags of sunk addressing mode registers #75072

Merged
merged 3 commits into from
Dec 13, 2023

Conversation

momchil-velikov
Copy link
Collaborator

When doing sink-and-fold, the MachineSink clears the "killed" flags of the operands of the sunk (and deleted) instruction. However, this is not always sufficient. In some cases we can create the new load/store instruction with operands other than the ones present in the deleted instruction. One such example is folding a zero word extend into a memory load on AArch64. The zero-extend is represented by a pair of instructions - MOV (i.e. ORRwrs) followed by a SUBREG_TO_REG. The SUBREG_TO_REG is deleted (it is the sunk instruction), but the new load instruction mentions operands "killed" in the MOV, which is no longer correct.

To fix this, clear the "killed" flags of the registers paritipating in the adressing mode.

When doing sink-and-fold, the MachineSink clears the "killed" flags
of the operands of the sunk (and deleted) instruction. However, this is
not always sufficient. In some cases we can create the new load/store
instruction with operands other than the ones present in the deleted
instruction. One such example is folding a zero word extend into
a memory load on AArch64. The zero-extend is represented by a pair
of instructions - `MOV` (i.e. `ORRwrs`) followed by a `SUBREG_TO_REG`.
The `SUBREG_TO_REG` is deleted (it is the sunk instruction), but the new
load instruction mentions operands "killed" in the `MOV`, which is no
longer correct.

To fix this, clear the "killed" flags of the registers paritipating in
the adressing mode.
@llvmbot
Copy link
Collaborator

llvmbot commented Dec 11, 2023

@llvm/pr-subscribers-backend-aarch64

Author: Momchil Velikov (momchil-velikov)

Changes

When doing sink-and-fold, the MachineSink clears the "killed" flags of the operands of the sunk (and deleted) instruction. However, this is not always sufficient. In some cases we can create the new load/store instruction with operands other than the ones present in the deleted instruction. One such example is folding a zero word extend into a memory load on AArch64. The zero-extend is represented by a pair of instructions - MOV (i.e. ORRwrs) followed by a SUBREG_TO_REG. The SUBREG_TO_REG is deleted (it is the sunk instruction), but the new load instruction mentions operands "killed" in the MOV, which is no longer correct.

To fix this, clear the "killed" flags of the registers paritipating in the adressing mode.


Full diff: https://github.com/llvm/llvm-project/pull/75072.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/MachineSink.cpp (+8-5)
  • (added) llvm/test/CodeGen/AArch64/sink-and-fold-clear-kill-flags.mir (+134)
diff --git a/llvm/lib/CodeGen/MachineSink.cpp b/llvm/lib/CodeGen/MachineSink.cpp
index 83d775055dfd73..6441b456cc39d1 100644
--- a/llvm/lib/CodeGen/MachineSink.cpp
+++ b/llvm/lib/CodeGen/MachineSink.cpp
@@ -500,11 +500,6 @@ bool MachineSinking::PerformSinkAndFold(MachineInstr &MI,
     return false;
 
   // Now we know we can fold the instruction in all its users.
-  if (UsedRegA)
-    MRI->clearKillFlags(UsedRegA);
-  if (UsedRegB)
-    MRI->clearKillFlags(UsedRegB);
-
   for (auto &[SinkDst, MaybeAM] : SinkInto) {
     MachineInstr *New = nullptr;
     LLVM_DEBUG(dbgs() << "Sinking copy of"; MI.dump(); dbgs() << "into";
@@ -530,6 +525,14 @@ bool MachineSinking::PerformSinkAndFold(MachineInstr &MI,
     } else {
       // Fold instruction into the addressing mode of a memory instruction.
       New = TII->emitLdStWithAddr(*SinkDst, MaybeAM);
+
+      // The registers of the addressing mode may have their live range extended
+      // and their kill flags may no longer be correct. Conservatively clear the
+      // kill flags.
+      if (Register R = MaybeAM.BaseReg; R.isValid() && R.isVirtual())
+        MRI->clearKillFlags(R);
+      if (Register R = MaybeAM.ScaledReg; R.isValid() && R.isVirtual())
+        MRI->clearKillFlags(R);
     }
     LLVM_DEBUG(dbgs() << "yielding"; New->dump());
     // Clear the StoreInstrCache, since we may invalidate it by erasing.
diff --git a/llvm/test/CodeGen/AArch64/sink-and-fold-clear-kill-flags.mir b/llvm/test/CodeGen/AArch64/sink-and-fold-clear-kill-flags.mir
new file mode 100644
index 00000000000000..effc346848b757
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/sink-and-fold-clear-kill-flags.mir
@@ -0,0 +1,134 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+# RUN: llc --run-pass=machine-sink %s -o - | FileCheck %s
+
+# Test that the "killed" flags is not present in the ORRWrs instruction below
+--- |
+  source_filename = "crash.ll"
+  target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+  target triple = "aarch64-linux"
+
+  define i32 @f(ptr %image, i32 %i) {
+  entry:
+    %add = add i32 %i, 1
+    %idx = zext i32 %add to i64
+    br label %A
+
+  A:                                                ; preds = %B, %A, %entry
+    %sunkaddr = getelementptr i8, ptr %image, i64 %idx
+    %0 = load i8, ptr %sunkaddr, align 1
+    %cmp153 = icmp eq i8 %0, 0
+    br i1 %cmp153, label %B, label %A
+
+  B:                                                ; preds = %A
+    store i32 0, ptr %image, align 1
+    br label %A
+  }
+
+...
+---
+name:            f
+alignment:       4
+exposesReturnsTwice: false
+legalized:       false
+regBankSelected: false
+selected:        false
+failedISel:      false
+tracksRegLiveness: true
+hasWinCFI:       false
+callsEHReturn:   false
+callsUnwindInit: false
+hasEHCatchret:   false
+hasEHScopes:     false
+hasEHFunclets:   false
+isOutlined:      false
+debugInstrRef:   false
+failsVerification: false
+tracksDebugUserValues: false
+registers:
+  - { id: 0, class: gpr64, preferred-register: '' }
+  - { id: 1, class: gpr64common, preferred-register: '' }
+  - { id: 2, class: gpr32common, preferred-register: '' }
+  - { id: 3, class: gpr32common, preferred-register: '' }
+  - { id: 4, class: gpr32, preferred-register: '' }
+  - { id: 5, class: gpr32, preferred-register: '' }
+  - { id: 6, class: gpr32, preferred-register: '' }
+liveins:
+  - { reg: '$x0', virtual-reg: '%1' }
+  - { reg: '$w1', virtual-reg: '%2' }
+frameInfo:
+  isFrameAddressTaken: false
+  isReturnAddressTaken: false
+  hasStackMap:     false
+  hasPatchPoint:   false
+  stackSize:       0
+  offsetAdjustment: 0
+  maxAlignment:    1
+  adjustsStack:    false
+  hasCalls:        false
+  stackProtector:  ''
+  functionContext: ''
+  maxCallFrameSize: 0
+  cvBytesOfCalleeSavedRegisters: 0
+  hasOpaqueSPAdjustment: false
+  hasVAStart:      false
+  hasMustTailInVarArgFunc: false
+  hasTailCall:     false
+  localFrameSize:  0
+  savePoint:       ''
+  restorePoint:    ''
+fixedStack:      []
+stack:           []
+entry_values:    []
+callSites:       []
+debugValueSubstitutions: []
+constants:       []
+machineFunctionInfo: {}
+body:             |
+  ; CHECK-LABEL: name: f
+  ; CHECK: bb.0.entry:
+  ; CHECK-NEXT:   successors: %bb.1(0x80000000)
+  ; CHECK-NEXT:   liveins: $x0, $w1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gpr32common = COPY $w1
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gpr64common = COPY $x0
+  ; CHECK-NEXT:   [[ADDWri:%[0-9]+]]:gpr32common = ADDWri [[COPY]], 1, 0
+  ; CHECK-NEXT:   [[ORRWrs:%[0-9]+]]:gpr32 = ORRWrs $wzr, [[ADDWri]], 0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1.A:
+  ; CHECK-NEXT:   successors: %bb.2(0x30000000), %bb.1(0x50000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[LDRBBroW:%[0-9]+]]:gpr32 = LDRBBroW [[COPY1]], [[ADDWri]], 0, 0 :: (load (s8) from %ir.sunkaddr)
+  ; CHECK-NEXT:   CBNZW killed [[LDRBBroW]], %bb.1
+  ; CHECK-NEXT:   B %bb.2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2.B:
+  ; CHECK-NEXT:   successors: %bb.1(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gpr32 = COPY $wzr
+  ; CHECK-NEXT:   STRWui [[COPY2]], [[COPY1]], 0 :: (store (s32) into %ir.image, align 1)
+  ; CHECK-NEXT:   B %bb.1
+  bb.0.entry:
+    successors: %bb.1(0x80000000)
+    liveins: $x0, $w1
+
+    %2:gpr32common = COPY $w1
+    %1:gpr64common = COPY $x0
+    %3:gpr32common = ADDWri %2, 1, 0
+    %4:gpr32 = ORRWrs $wzr, killed %3, 0
+    %0:gpr64 = SUBREG_TO_REG 0, killed %4, %subreg.sub_32
+
+  bb.1.A:
+    successors: %bb.2(0x30000000), %bb.1(0x50000000)
+
+    %5:gpr32 = LDRBBroX %1, %0, 0, 0 :: (load (s8) from %ir.sunkaddr)
+    CBNZW killed %5, %bb.1
+    B %bb.2
+
+  bb.2.B:
+    successors: %bb.1(0x80000000)
+
+    %6:gpr32 = COPY $wzr
+    STRWui %6, %1, 0 :: (store (s32) into %ir.image, align 1)
+    B %bb.1
+
+...

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@momchil-velikov momchil-velikov merged commit d7ee99a into llvm:main Dec 13, 2023
4 checks passed
@momchil-velikov momchil-velikov deleted the sink-and-fold-zext-hotfix branch January 30, 2024 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants