Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MachineSink] Drop debug info for instructions deleted by sink-and-fold #71443

Merged
merged 2 commits into from
Nov 11, 2023

Conversation

momchil-velikov
Copy link
Collaborator

After performing sink-and-fold over a COPY, the original instruction is
replaced with one that produces its output in the destination of the
copy. Its value is still available (in a hard register), so if there are
debug instructions which refer to the (now deleted) virtual register
they could be updated to refer to the hard register, in principle.
However, it's not clear how to do that, moreover in some cases the debug
instructions may need to be replicated proportionally to the number of
the COPY instructions replaced and in some extreme cases we can end up
with quadratic increase in the number of debug instructions, e.g:

    int f(int);

    void g(int x) {
      int y = x + 1;

      int t0 = y;
      f(t0);

      int t1 = y;
      f(t1);
    }

After performing sink-and-fold over a COPY, the original instruction is
replaced with one that produces its output in the destination of the
copy. Its value is still available (in a hard register), so if there are
debug instructions which refer to the (now deleted) virtual register
they could be updated to refer to the hard register, in principle.
However, it's not clear how to do that, moreover in some cases the debug
instructions may need to be replicated proportionally to the number of
the COPY instructions replaced and in some extreme cases we can end up
with quadratic increase in the number of debug instructions, e.g:

    int f(int);

    void g(int x) {
      int y = x + 1;

      int t0 = y;
      f(t0);

      int t1 = y;
      f(t1);
    }
@momchil-velikov
Copy link
Collaborator Author

This is intended to fix the issue in #67432 (comment)

@bgra8

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 6, 2023

@llvm/pr-subscribers-debuginfo

Author: Momchil Velikov (momchil-velikov)

Changes

After performing sink-and-fold over a COPY, the original instruction is
replaced with one that produces its output in the destination of the
copy. Its value is still available (in a hard register), so if there are
debug instructions which refer to the (now deleted) virtual register
they could be updated to refer to the hard register, in principle.
However, it's not clear how to do that, moreover in some cases the debug
instructions may need to be replicated proportionally to the number of
the COPY instructions replaced and in some extreme cases we can end up
with quadratic increase in the number of debug instructions, e.g:

    int f(int);

    void g(int x) {
      int y = x + 1;

      int t0 = y;
      f(t0);

      int t1 = y;
      f(t1);
    }

Patch is 21.28 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/71443.diff

4 Files Affected:

  • (modified) llvm/lib/CodeGen/MachineSink.cpp (+12-19)
  • (removed) llvm/test/CodeGen/AArch64/sink-and-fold-dbg-value-crash.mir (-304)
  • (added) llvm/test/CodeGen/AArch64/sink-and-fold-drop-dbg.mir (+144)
  • (modified) llvm/test/DebugInfo/Generic/no-empty-child-vars.ll (+6-3)
diff --git a/llvm/lib/CodeGen/MachineSink.cpp b/llvm/lib/CodeGen/MachineSink.cpp
index d0e119cb7470632..9ed6b22eae972cc 100644
--- a/llvm/lib/CodeGen/MachineSink.cpp
+++ b/llvm/lib/CodeGen/MachineSink.cpp
@@ -510,30 +510,23 @@ bool MachineSinking::PerformSinkAndFold(MachineInstr &MI,
     LLVM_DEBUG(dbgs() << "Sinking copy of"; MI.dump(); dbgs() << "into";
                SinkDst->dump());
     if (SinkDst->isCopy()) {
+      // TODO: After performing the sink-and-fold, the original instruction is
+      // deleted. Its value is still available (in a hard register), so if there
+      // are debug instructions which refer to the (now deleted) virtual
+      // register they could be updated to refer to the hard register, in
+      // principle. However, it's not clear how to do that, moreover in some
+      // cases the debug instructions may need to be replicated proportionally
+      // to the number of the COPY instructions replaced and in some extreme
+      // cases we can end up with quadratic increase in the number of debug
+      // instructions.
+
       // Sink a copy of the instruction, replacing a COPY instruction.
       MachineBasicBlock::iterator InsertPt = SinkDst->getIterator();
       Register DstReg = SinkDst->getOperand(0).getReg();
       TII->reMaterialize(*SinkDst->getParent(), InsertPt, DstReg, 0, MI, *TRI);
-      // If the original instruction did not have source location, reuse a one
-      // from the COPY.
+      // Reuse the source location from the COPY.
       New = &*std::prev(InsertPt);
-      if (const DebugLoc &NewLoc = New->getDebugLoc(); !NewLoc)
-        New->setDebugLoc(SinkDst->getDebugLoc());
-      // Sink DBG_VALUEs, which refer to the original instruction's destination
-      // (DefReg).
-      MachineBasicBlock &SinkMBB = *SinkDst->getParent();
-      auto &DbgUsers = SeenDbgUsers[DefReg];
-      for (auto &U : DbgUsers) {
-        MachineInstr *DbgMI = U.getPointer();
-        if (U.getInt())
-          continue;
-        MachineInstr *NewDbgMI = SinkDst->getMF()->CloneMachineInstr(DbgMI);
-        SinkMBB.insertAfter(InsertPt, NewDbgMI);
-        for (auto &SrcMO : DbgMI->getDebugOperandsForReg(DefReg)) {
-          auto &DstMO = NewDbgMI->getOperand(SrcMO.getOperandNo());
-          DstMO.setReg(DstReg);
-        }
-      }
+      New->setDebugLoc(SinkDst->getDebugLoc());
     } else {
       // Fold instruction into the addressing mode of a memory instruction.
       New = TII->emitLdStWithAddr(*SinkDst, MaybeAM);
diff --git a/llvm/test/CodeGen/AArch64/sink-and-fold-dbg-value-crash.mir b/llvm/test/CodeGen/AArch64/sink-and-fold-dbg-value-crash.mir
deleted file mode 100644
index 82dae8a86ea2b77..000000000000000
--- a/llvm/test/CodeGen/AArch64/sink-and-fold-dbg-value-crash.mir
+++ /dev/null
@@ -1,304 +0,0 @@
-# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
-# RUN: llc --aarch64-enable-sink-fold=true --run-pass=machine-sink %s -o - | FileCheck %s
---- |
-  source_filename = "x.ll"
-  target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
-  target triple = "aarch64-linux"
-
-  %S = type <{ ptr, %T, i64, i64, i32, [4 x i8] }>
-  %T = type { ptr, ptr, ptr, ptr, i64 }
-
-  define void @f(ptr %p, i1 %c0, i1 %c1) {
-  entry:
-    %a = getelementptr %S, ptr %p, i64 0, i32 1
-    call void @llvm.dbg.value(metadata ptr %a, metadata !4, metadata !DIExpression()), !dbg !10
-    br i1 %c0, label %if.then, label %if.end
-
-  if.then:                                          ; preds = %entry
-    %v0 = tail call ptr @g(ptr %a)
-    br i1 %c1, label %exit, label %if.end
-
-  if.end:                                           ; preds = %if.then, %entry
-    %v1 = load i64, ptr %a, align 8
-    br label %exit
-
-  exit:                                             ; preds = %if.end, %if.then
-    ret void
-  }
-
-  define ptr @g(ptr) {
-  entry:
-    br label %if.then
-  if.then:
-    br label %if.end
-  if.end:
-    br label %exit
-  exit:
-    ret ptr null
-  }
-
-  declare void @llvm.dbg.value(metadata, metadata, metadata) #0
-
-  attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
-
-  !llvm.dbg.cu = !{!0}
-  !llvm.module.flags = !{!2, !3}
-
-  !0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang ", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
-  !1 = !DIFile(filename: "f.c", directory: "/usr/rms")
-  !2 = !{i32 7, !"Dwarf Version", i32 4}
-  !3 = !{i32 2, !"Debug Info Version", i32 3}
-  !4 = !DILocalVariable(name: "x", arg: 1, scope: !5, file: !1, line: 2, type: !8)
-  !5 = distinct !DISubprogram(name: "f", scope: !1, file: !1, line: 2, type: !6, scopeLine: 2, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !9)
-  !6 = !DISubroutineType(types: !7)
-  !7 = !{!8, !8}
-  !8 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
-  !9 = !{}
-  !10 = !DILocation(line: 2, column: 11, scope: !5)
-
-...
----
-name:            f
-alignment:       4
-exposesReturnsTwice: false
-legalized:       false
-regBankSelected: false
-selected:        false
-failedISel:      false
-tracksRegLiveness: true
-hasWinCFI:       false
-callsEHReturn:   false
-callsUnwindInit: false
-hasEHCatchret:   false
-hasEHScopes:     false
-hasEHFunclets:   false
-isOutlined:      false
-debugInstrRef:   false
-failsVerification: false
-tracksDebugUserValues: false
-registers:
-  - { id: 0, class: gpr64all, preferred-register: '' }
-  - { id: 1, class: gpr64common, preferred-register: '' }
-  - { id: 2, class: gpr32, preferred-register: '' }
-  - { id: 3, class: gpr32, preferred-register: '' }
-  - { id: 4, class: gpr32, preferred-register: '' }
-  - { id: 5, class: gpr64sp, preferred-register: '' }
-  - { id: 6, class: gpr64all, preferred-register: '' }
-liveins:
-  - { reg: '$x0', virtual-reg: '%1' }
-  - { reg: '$w1', virtual-reg: '%2' }
-  - { reg: '$w2', virtual-reg: '%3' }
-frameInfo:
-  isFrameAddressTaken: false
-  isReturnAddressTaken: false
-  hasStackMap:     false
-  hasPatchPoint:   false
-  stackSize:       0
-  offsetAdjustment: 0
-  maxAlignment:    1
-  adjustsStack:    true
-  hasCalls:        true
-  stackProtector:  ''
-  functionContext: ''
-  maxCallFrameSize: 0
-  cvBytesOfCalleeSavedRegisters: 0
-  hasOpaqueSPAdjustment: false
-  hasVAStart:      false
-  hasMustTailInVarArgFunc: false
-  hasTailCall:     false
-  localFrameSize:  0
-  savePoint:       ''
-  restorePoint:    ''
-fixedStack:      []
-stack:           []
-entry_values:    []
-callSites:       []
-debugValueSubstitutions: []
-constants:       []
-machineFunctionInfo: {}
-body:             |
-  ; CHECK-LABEL: name: f
-  ; CHECK: bb.0.entry:
-  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
-  ; CHECK-NEXT:   liveins: $x0, $w1, $w2
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gpr32 = COPY $w2
-  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gpr32 = COPY $w1
-  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gpr64common = COPY $x0
-  ; CHECK-NEXT:   DBG_VALUE $noreg, $noreg, !4, !DIExpression(), debug-location !10
-  ; CHECK-NEXT:   TBZW [[COPY1]], 0, %bb.2
-  ; CHECK-NEXT:   B %bb.1
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.1.if.then:
-  ; CHECK-NEXT:   successors: %bb.3(0x40000000), %bb.2(0x40000000)
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:gpr32 = COPY [[COPY]]
-  ; CHECK-NEXT:   ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
-  ; CHECK-NEXT:   $x0 = ADDXri [[COPY2]], 8, 0
-  ; CHECK-NEXT:   DBG_VALUE $x0, $noreg, !4, !DIExpression(), debug-location !10
-  ; CHECK-NEXT:   BL @g, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
-  ; CHECK-NEXT:   ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
-  ; CHECK-NEXT:   TBNZW [[COPY3]], 0, %bb.3
-  ; CHECK-NEXT:   B %bb.2
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.2.if.end:
-  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.3.exit:
-  ; CHECK-NEXT:   RET_ReallyLR
-  bb.0.entry:
-    successors: %bb.1(0x40000000), %bb.2(0x40000000)
-    liveins: $x0, $w1, $w2
-
-    %3:gpr32 = COPY $w2
-    %2:gpr32 = COPY $w1
-    %1:gpr64common = COPY $x0
-    %4:gpr32 = COPY %3
-    %5:gpr64sp = ADDXri %1, 8, 0
-    DBG_VALUE %5, $noreg, !4, !DIExpression(), debug-location !10
-    %0:gpr64all = COPY %5
-    TBZW %2, 0, %bb.2
-    B %bb.1
-
-  bb.1.if.then:
-    successors: %bb.3(0x40000000), %bb.2(0x40000000)
-
-    ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
-    $x0 = COPY %0
-    BL @g, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
-    ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
-    TBNZW %4, 0, %bb.3
-    B %bb.2
-
-  bb.2.if.end:
-    successors: %bb.3(0x80000000)
-
-
-  bb.3.exit:
-    RET_ReallyLR
-
-...
----
-name:            g
-alignment:       4
-exposesReturnsTwice: false
-legalized:       false
-regBankSelected: false
-selected:        false
-failedISel:      false
-tracksRegLiveness: true
-hasWinCFI:       false
-callsEHReturn:   false
-callsUnwindInit: false
-hasEHCatchret:   false
-hasEHScopes:     false
-hasEHFunclets:   false
-isOutlined:      false
-debugInstrRef:   false
-failsVerification: false
-tracksDebugUserValues: false
-registers:
-  - { id: 0, class: gpr64all, preferred-register: '' }
-  - { id: 1, class: gpr64common, preferred-register: '' }
-  - { id: 2, class: gpr32, preferred-register: '' }
-  - { id: 3, class: gpr32, preferred-register: '' }
-  - { id: 4, class: gpr32, preferred-register: '' }
-  - { id: 5, class: gpr64sp, preferred-register: '' }
-  - { id: 6, class: gpr64all, preferred-register: '' }
-liveins:
-  - { reg: '$x0', virtual-reg: '%1' }
-  - { reg: '$w1', virtual-reg: '%2' }
-  - { reg: '$w2', virtual-reg: '%3' }
-frameInfo:
-  isFrameAddressTaken: false
-  isReturnAddressTaken: false
-  hasStackMap:     false
-  hasPatchPoint:   false
-  stackSize:       0
-  offsetAdjustment: 0
-  maxAlignment:    1
-  adjustsStack:    true
-  hasCalls:        true
-  stackProtector:  ''
-  functionContext: ''
-  maxCallFrameSize: 0
-  cvBytesOfCalleeSavedRegisters: 0
-  hasOpaqueSPAdjustment: false
-  hasVAStart:      false
-  hasMustTailInVarArgFunc: false
-  hasTailCall:     false
-  localFrameSize:  0
-  savePoint:       ''
-  restorePoint:    ''
-fixedStack:      []
-stack:           []
-entry_values:    []
-callSites:       []
-debugValueSubstitutions: []
-constants:       []
-machineFunctionInfo: {}
-body:             |
-  ; CHECK-LABEL: name: g
-  ; CHECK: bb.0.entry:
-  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
-  ; CHECK-NEXT:   liveins: $x0, $w1, $w2
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gpr32 = COPY $w2
-  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gpr32 = COPY $w1
-  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gpr64common = COPY $x0
-  ; CHECK-NEXT:   DBG_VALUE_LIST !4, !DIExpression(), [[COPY]], $noreg, debug-location !10
-  ; CHECK-NEXT:   TBZW [[COPY1]], 0, %bb.2
-  ; CHECK-NEXT:   B %bb.1
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.1.if.then:
-  ; CHECK-NEXT:   successors: %bb.3(0x40000000), %bb.2(0x40000000)
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:gpr32 = COPY [[COPY]]
-  ; CHECK-NEXT:   DBG_VALUE_LIST !4, !DIExpression(), [[COPY3]], $noreg, debug-location !10
-  ; CHECK-NEXT:   ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
-  ; CHECK-NEXT:   $x0 = ADDXri [[COPY2]], 8, 0
-  ; CHECK-NEXT:   DBG_VALUE_LIST !4, !DIExpression(), [[COPY3]], $x0, debug-location !10
-  ; CHECK-NEXT:   BL @g, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
-  ; CHECK-NEXT:   ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
-  ; CHECK-NEXT:   TBNZW [[COPY3]], 0, %bb.3
-  ; CHECK-NEXT:   B %bb.2
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.2.if.end:
-  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.3.exit:
-  ; CHECK-NEXT:   RET_ReallyLR
-  bb.0.entry:
-    successors: %bb.1(0x40000000), %bb.2(0x40000000)
-    liveins: $x0, $w1, $w2
-
-    %3:gpr32 = COPY $w2
-    %2:gpr32 = COPY $w1
-    %1:gpr64common = COPY $x0
-    %4:gpr32 = COPY %3
-    %5:gpr64sp = ADDXri %1, 8, 0
-    DBG_VALUE_LIST !4, !DIExpression(), %4:gpr32, %5:gpr64sp, debug-location !10
-    %0:gpr64all = COPY %5
-    TBZW %2, 0, %bb.2
-    B %bb.1
-
-  bb.1.if.then:
-    successors: %bb.3(0x40000000), %bb.2(0x40000000)
-
-    ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
-    $x0 = COPY %0
-    BL @g, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
-    ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
-    TBNZW %4, 0, %bb.3
-    B %bb.2
-
-  bb.2.if.end:
-    successors: %bb.3(0x80000000)
-
-
-  bb.3.exit:
-    RET_ReallyLR
-
-...
diff --git a/llvm/test/CodeGen/AArch64/sink-and-fold-drop-dbg.mir b/llvm/test/CodeGen/AArch64/sink-and-fold-drop-dbg.mir
new file mode 100644
index 000000000000000..3cf490474bdc480
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/sink-and-fold-drop-dbg.mir
@@ -0,0 +1,144 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
+# RUN: llc --aarch64-enable-sink-fold=true --run-pass=machine-sink %s -o - | FileCheck %s
+--- |
+  source_filename = "dbg.c"
+  target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+  target triple = "aarch64-unknown-linux"
+
+  define dso_local void @g(i32 noundef %x) local_unnamed_addr !dbg !11 {
+  entry:
+    call void @llvm.dbg.value(metadata i32 %x, metadata !16, metadata !DIExpression()), !dbg !20
+    %add = add nsw i32 %x, 1, !dbg !21
+    call void @llvm.dbg.value(metadata i32 %add, metadata !17, metadata !DIExpression()), !dbg !20
+    call void @llvm.dbg.value(metadata i32 %add, metadata !18, metadata !DIExpression()), !dbg !20
+    %call = tail call i32 @f(i32 noundef %add), !dbg !22
+    call void @llvm.dbg.value(metadata i32 %add, metadata !19, metadata !DIExpression()), !dbg !20
+    %call1 = tail call i32 @f(i32 noundef %add), !dbg !23
+    ret void, !dbg !24
+  }
+
+  declare !dbg !25 i32 @f(i32 noundef)
+
+  declare void @llvm.dbg.value(metadata, metadata, metadata)
+
+  !llvm.dbg.cu = !{!0}
+  !llvm.module.flags = !{!2, !3, !4, !5, !6, !7, !8, !9}
+  !llvm.ident = !{!10}
+
+  !0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang version 18.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+  !1 = !DIFile(filename: "dbg.c", directory: "/home/sweet", checksumkind: CSK_MD5, checksum: "b6bd7ad2140c696af9201a17ab73c918")
+  !2 = !{i32 7, !"Dwarf Version", i32 5}
+  !3 = !{i32 2, !"Debug Info Version", i32 3}
+  !4 = !{i32 1, !"wchar_size", i32 4}
+  !5 = !{i32 8, !"PIC Level", i32 2}
+  !6 = !{i32 7, !"PIE Level", i32 2}
+  !7 = !{i32 7, !"uwtable", i32 2}
+  !8 = !{i32 7, !"frame-pointer", i32 1}
+  !9 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
+  !10 = !{!"clang version 18.0.0"}
+  !11 = distinct !DISubprogram(name: "g", scope: !1, file: !1, line: 3, type: !12, scopeLine: 3, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !15)
+  !12 = !DISubroutineType(types: !13)
+  !13 = !{null, !14}
+  !14 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+  !15 = !{!16, !17, !18, !19}
+  !16 = !DILocalVariable(name: "x", arg: 1, scope: !11, file: !1, line: 3, type: !14)
+  !17 = !DILocalVariable(name: "y", scope: !11, file: !1, line: 4, type: !14)
+  !18 = !DILocalVariable(name: "t0", scope: !11, file: !1, line: 6, type: !14)
+  !19 = !DILocalVariable(name: "t1", scope: !11, file: !1, line: 9, type: !14)
+  !20 = !DILocation(line: 0, scope: !11)
+  !21 = !DILocation(line: 4, column: 13, scope: !11)
+  !22 = !DILocation(line: 7, column: 3, scope: !11)
+  !23 = !DILocation(line: 10, column: 3, scope: !11)
+  !24 = !DILocation(line: 11, column: 1, scope: !11)
+  !25 = !DISubprogram(name: "f", scope: !1, file: !1, line: 1, type: !26, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized)
+  !26 = !DISubroutineType(types: !27)
+  !27 = !{!14, !14}
+
+...
+---
+name:            g
+alignment:       4
+exposesReturnsTwice: false
+legalized:       false
+regBankSelected: false
+selected:        false
+failedISel:      false
+tracksRegLiveness: true
+hasWinCFI:       false
+callsEHReturn:   false
+callsUnwindInit: false
+hasEHCatchret:   false
+hasEHScopes:     false
+hasEHFunclets:   false
+isOutlined:      false
+debugInstrRef:   false
+failsVerification: false
+tracksDebugUserValues: false
+registers:
+  - { id: 0, class: gpr32common, preferred-register: '' }
+  - { id: 1, class: gpr32sp, preferred-register: '' }
+  - { id: 2, class: gpr32all, preferred-register: '' }
+liveins:
+  - { reg: '$w0', virtual-reg: '%0' }
+frameInfo:
+  isFrameAddressTaken: false
+  isReturnAddressTaken: false
+  hasStackMap:     false
+  hasPatchPoint:   false
+  stackSize:       0
+  offsetAdjustment: 0
+  maxAlignment:    1
+  adjustsStack:    true
+  hasCalls:        true
+  stackProtector:  ''
+  functionContext: ''
+  maxCallFrameSize: 0
+  cvBytesOfCalleeSavedRegisters: 0
+  hasOpaqueSPAdjustment: false
+  hasVAStart:      false
+  hasMustTailInVarArgFunc: false
+  hasTailCall:     true
+  localFrameSize:  0
+  savePoint:       ''
+  restorePoint:    ''
+fixedStack:      []
+stack:           []
+entry_values:    []
+callSites:       []
+debugValueSubstitutions: []
+constants:       []
+machineFunctionInfo: {}
+body:             |
+  bb.0.entry:
+    liveins: $w0
+
+    ; CHECK-LABEL: name: g
+    ; CHECK: liveins: $w0
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: DBG_VALUE $w0, $noreg, !16, !DIExpression(), debug-location !20
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr32common = COPY $w0
+    ; CHECK-NEXT: DBG_VALUE [[COPY]], $noreg, !16, !DIExpression(), debug-location !20
+    ; CHECK-NEXT: DBG_VALUE $noreg, $noreg, !17, !DIExpression(), debug-location !20
+    ; CHECK-NEXT: DBG_VALUE $noreg, $noreg, !18, !DIExpression(), debug-location !20
+    ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp, debug-location !22
+    ; CHECK-NEXT: $w0 = nsw ADDWri [[COPY]], 1, 0, debug-location !22
+    ; CHECK-NEXT: BL @f, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0, debug-location !22
+    ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp, debug-location !22
+    ; CHECK-NEXT: DBG_VALUE $noreg, $noreg, !19, !DIExpression(), debug-location !20
+    ; CHECK-NEXT: $w0 = nsw ADDWri [[COPY]], 1, 0, debug-location !23
+    ; CHECK-NEXT: TCRETURNdi @f, 0, csr_aarch64_aapcs, implicit $sp, implicit $w0, debug-location !23
+    DBG_VALUE $w0, $noreg, !16, !DIExpression(), debug-location !20
+    %0:gpr32common = COPY $w0
+    DBG_VALUE %0, $noreg, !16, !DIExpression(), debug-location !20
+    %1:gpr32sp = nsw ADDWri %0, 1, 0, debug-location !21
+    DBG_VALUE %1, $noreg, !17, !DIExpression(), debug-location !20
+    DBG_VALUE %1, $noreg, !18, !DIExpression(), debug-location !20
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp, debug-location !22
+    $w0 = COPY %1, debug-location !22
+    BL @f, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0, debug-location !22
+    ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp, debug-location !22
+    DBG_VALUE %1, $noreg, !19, !DIExpression(), debug-location !20
+    $w0 = COPY %1, debug-location !23
+    TCRETURNdi @f, 0, csr_aarch64_aapcs, implicit $sp, implicit $w0, debug-location !23
+
+...
diff --git a/llvm/test/DebugInfo/Generic/no-empty-child-vars.ll b/llvm/test/DebugInfo/Generic/no-empty-child-vars.ll
index 1399b68b9b7f184..2b533968e330021 100644
--- a/llvm/test/DebugInfo/Generic/no-empty-child-vars.ll
+++ b/llvm/test/DebugInfo/Generic/no-empty-child-vars.ll
@@ -68,11 +68,14 @@
 ; CHECK:       DW_TAG_inlined_subroutine
 ; CHECK:     NULL
 
+; NOTE: Instructions below changed from `add` to `mul` to make them more expensive
+; and unlikely to be sunk to replace COPYs into a return value register.
+
 ; Function Attrs: norecurs...
[truncated]

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 6, 2023

@llvm/pr-subscribers-backend-aarch64

Author: Momchil Velikov (momchil-velikov)

Changes

After performing sink-and-fold over a COPY, the original instruction is
replaced with one that produces its output in the destination of the
copy. Its value is still available (in a hard register), so if there are
debug instructions which refer to the (now deleted) virtual register
they could be updated to refer to the hard register, in principle.
However, it's not clear how to do that, moreover in some cases the debug
instructions may need to be replicated proportionally to the number of
the COPY instructions replaced and in some extreme cases we can end up
with quadratic increase in the number of debug instructions, e.g:

    int f(int);

    void g(int x) {
      int y = x + 1;

      int t0 = y;
      f(t0);

      int t1 = y;
      f(t1);
    }

Patch is 21.28 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/71443.diff

4 Files Affected:

  • (modified) llvm/lib/CodeGen/MachineSink.cpp (+12-19)
  • (removed) llvm/test/CodeGen/AArch64/sink-and-fold-dbg-value-crash.mir (-304)
  • (added) llvm/test/CodeGen/AArch64/sink-and-fold-drop-dbg.mir (+144)
  • (modified) llvm/test/DebugInfo/Generic/no-empty-child-vars.ll (+6-3)
diff --git a/llvm/lib/CodeGen/MachineSink.cpp b/llvm/lib/CodeGen/MachineSink.cpp
index d0e119cb7470632..9ed6b22eae972cc 100644
--- a/llvm/lib/CodeGen/MachineSink.cpp
+++ b/llvm/lib/CodeGen/MachineSink.cpp
@@ -510,30 +510,23 @@ bool MachineSinking::PerformSinkAndFold(MachineInstr &MI,
     LLVM_DEBUG(dbgs() << "Sinking copy of"; MI.dump(); dbgs() << "into";
                SinkDst->dump());
     if (SinkDst->isCopy()) {
+      // TODO: After performing the sink-and-fold, the original instruction is
+      // deleted. Its value is still available (in a hard register), so if there
+      // are debug instructions which refer to the (now deleted) virtual
+      // register they could be updated to refer to the hard register, in
+      // principle. However, it's not clear how to do that, moreover in some
+      // cases the debug instructions may need to be replicated proportionally
+      // to the number of the COPY instructions replaced and in some extreme
+      // cases we can end up with quadratic increase in the number of debug
+      // instructions.
+
       // Sink a copy of the instruction, replacing a COPY instruction.
       MachineBasicBlock::iterator InsertPt = SinkDst->getIterator();
       Register DstReg = SinkDst->getOperand(0).getReg();
       TII->reMaterialize(*SinkDst->getParent(), InsertPt, DstReg, 0, MI, *TRI);
-      // If the original instruction did not have source location, reuse a one
-      // from the COPY.
+      // Reuse the source location from the COPY.
       New = &*std::prev(InsertPt);
-      if (const DebugLoc &NewLoc = New->getDebugLoc(); !NewLoc)
-        New->setDebugLoc(SinkDst->getDebugLoc());
-      // Sink DBG_VALUEs, which refer to the original instruction's destination
-      // (DefReg).
-      MachineBasicBlock &SinkMBB = *SinkDst->getParent();
-      auto &DbgUsers = SeenDbgUsers[DefReg];
-      for (auto &U : DbgUsers) {
-        MachineInstr *DbgMI = U.getPointer();
-        if (U.getInt())
-          continue;
-        MachineInstr *NewDbgMI = SinkDst->getMF()->CloneMachineInstr(DbgMI);
-        SinkMBB.insertAfter(InsertPt, NewDbgMI);
-        for (auto &SrcMO : DbgMI->getDebugOperandsForReg(DefReg)) {
-          auto &DstMO = NewDbgMI->getOperand(SrcMO.getOperandNo());
-          DstMO.setReg(DstReg);
-        }
-      }
+      New->setDebugLoc(SinkDst->getDebugLoc());
     } else {
       // Fold instruction into the addressing mode of a memory instruction.
       New = TII->emitLdStWithAddr(*SinkDst, MaybeAM);
diff --git a/llvm/test/CodeGen/AArch64/sink-and-fold-dbg-value-crash.mir b/llvm/test/CodeGen/AArch64/sink-and-fold-dbg-value-crash.mir
deleted file mode 100644
index 82dae8a86ea2b77..000000000000000
--- a/llvm/test/CodeGen/AArch64/sink-and-fold-dbg-value-crash.mir
+++ /dev/null
@@ -1,304 +0,0 @@
-# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
-# RUN: llc --aarch64-enable-sink-fold=true --run-pass=machine-sink %s -o - | FileCheck %s
---- |
-  source_filename = "x.ll"
-  target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
-  target triple = "aarch64-linux"
-
-  %S = type <{ ptr, %T, i64, i64, i32, [4 x i8] }>
-  %T = type { ptr, ptr, ptr, ptr, i64 }
-
-  define void @f(ptr %p, i1 %c0, i1 %c1) {
-  entry:
-    %a = getelementptr %S, ptr %p, i64 0, i32 1
-    call void @llvm.dbg.value(metadata ptr %a, metadata !4, metadata !DIExpression()), !dbg !10
-    br i1 %c0, label %if.then, label %if.end
-
-  if.then:                                          ; preds = %entry
-    %v0 = tail call ptr @g(ptr %a)
-    br i1 %c1, label %exit, label %if.end
-
-  if.end:                                           ; preds = %if.then, %entry
-    %v1 = load i64, ptr %a, align 8
-    br label %exit
-
-  exit:                                             ; preds = %if.end, %if.then
-    ret void
-  }
-
-  define ptr @g(ptr) {
-  entry:
-    br label %if.then
-  if.then:
-    br label %if.end
-  if.end:
-    br label %exit
-  exit:
-    ret ptr null
-  }
-
-  declare void @llvm.dbg.value(metadata, metadata, metadata) #0
-
-  attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
-
-  !llvm.dbg.cu = !{!0}
-  !llvm.module.flags = !{!2, !3}
-
-  !0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang ", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
-  !1 = !DIFile(filename: "f.c", directory: "/usr/rms")
-  !2 = !{i32 7, !"Dwarf Version", i32 4}
-  !3 = !{i32 2, !"Debug Info Version", i32 3}
-  !4 = !DILocalVariable(name: "x", arg: 1, scope: !5, file: !1, line: 2, type: !8)
-  !5 = distinct !DISubprogram(name: "f", scope: !1, file: !1, line: 2, type: !6, scopeLine: 2, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !9)
-  !6 = !DISubroutineType(types: !7)
-  !7 = !{!8, !8}
-  !8 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
-  !9 = !{}
-  !10 = !DILocation(line: 2, column: 11, scope: !5)
-
-...
----
-name:            f
-alignment:       4
-exposesReturnsTwice: false
-legalized:       false
-regBankSelected: false
-selected:        false
-failedISel:      false
-tracksRegLiveness: true
-hasWinCFI:       false
-callsEHReturn:   false
-callsUnwindInit: false
-hasEHCatchret:   false
-hasEHScopes:     false
-hasEHFunclets:   false
-isOutlined:      false
-debugInstrRef:   false
-failsVerification: false
-tracksDebugUserValues: false
-registers:
-  - { id: 0, class: gpr64all, preferred-register: '' }
-  - { id: 1, class: gpr64common, preferred-register: '' }
-  - { id: 2, class: gpr32, preferred-register: '' }
-  - { id: 3, class: gpr32, preferred-register: '' }
-  - { id: 4, class: gpr32, preferred-register: '' }
-  - { id: 5, class: gpr64sp, preferred-register: '' }
-  - { id: 6, class: gpr64all, preferred-register: '' }
-liveins:
-  - { reg: '$x0', virtual-reg: '%1' }
-  - { reg: '$w1', virtual-reg: '%2' }
-  - { reg: '$w2', virtual-reg: '%3' }
-frameInfo:
-  isFrameAddressTaken: false
-  isReturnAddressTaken: false
-  hasStackMap:     false
-  hasPatchPoint:   false
-  stackSize:       0
-  offsetAdjustment: 0
-  maxAlignment:    1
-  adjustsStack:    true
-  hasCalls:        true
-  stackProtector:  ''
-  functionContext: ''
-  maxCallFrameSize: 0
-  cvBytesOfCalleeSavedRegisters: 0
-  hasOpaqueSPAdjustment: false
-  hasVAStart:      false
-  hasMustTailInVarArgFunc: false
-  hasTailCall:     false
-  localFrameSize:  0
-  savePoint:       ''
-  restorePoint:    ''
-fixedStack:      []
-stack:           []
-entry_values:    []
-callSites:       []
-debugValueSubstitutions: []
-constants:       []
-machineFunctionInfo: {}
-body:             |
-  ; CHECK-LABEL: name: f
-  ; CHECK: bb.0.entry:
-  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
-  ; CHECK-NEXT:   liveins: $x0, $w1, $w2
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gpr32 = COPY $w2
-  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gpr32 = COPY $w1
-  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gpr64common = COPY $x0
-  ; CHECK-NEXT:   DBG_VALUE $noreg, $noreg, !4, !DIExpression(), debug-location !10
-  ; CHECK-NEXT:   TBZW [[COPY1]], 0, %bb.2
-  ; CHECK-NEXT:   B %bb.1
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.1.if.then:
-  ; CHECK-NEXT:   successors: %bb.3(0x40000000), %bb.2(0x40000000)
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:gpr32 = COPY [[COPY]]
-  ; CHECK-NEXT:   ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
-  ; CHECK-NEXT:   $x0 = ADDXri [[COPY2]], 8, 0
-  ; CHECK-NEXT:   DBG_VALUE $x0, $noreg, !4, !DIExpression(), debug-location !10
-  ; CHECK-NEXT:   BL @g, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
-  ; CHECK-NEXT:   ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
-  ; CHECK-NEXT:   TBNZW [[COPY3]], 0, %bb.3
-  ; CHECK-NEXT:   B %bb.2
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.2.if.end:
-  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.3.exit:
-  ; CHECK-NEXT:   RET_ReallyLR
-  bb.0.entry:
-    successors: %bb.1(0x40000000), %bb.2(0x40000000)
-    liveins: $x0, $w1, $w2
-
-    %3:gpr32 = COPY $w2
-    %2:gpr32 = COPY $w1
-    %1:gpr64common = COPY $x0
-    %4:gpr32 = COPY %3
-    %5:gpr64sp = ADDXri %1, 8, 0
-    DBG_VALUE %5, $noreg, !4, !DIExpression(), debug-location !10
-    %0:gpr64all = COPY %5
-    TBZW %2, 0, %bb.2
-    B %bb.1
-
-  bb.1.if.then:
-    successors: %bb.3(0x40000000), %bb.2(0x40000000)
-
-    ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
-    $x0 = COPY %0
-    BL @g, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
-    ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
-    TBNZW %4, 0, %bb.3
-    B %bb.2
-
-  bb.2.if.end:
-    successors: %bb.3(0x80000000)
-
-
-  bb.3.exit:
-    RET_ReallyLR
-
-...
----
-name:            g
-alignment:       4
-exposesReturnsTwice: false
-legalized:       false
-regBankSelected: false
-selected:        false
-failedISel:      false
-tracksRegLiveness: true
-hasWinCFI:       false
-callsEHReturn:   false
-callsUnwindInit: false
-hasEHCatchret:   false
-hasEHScopes:     false
-hasEHFunclets:   false
-isOutlined:      false
-debugInstrRef:   false
-failsVerification: false
-tracksDebugUserValues: false
-registers:
-  - { id: 0, class: gpr64all, preferred-register: '' }
-  - { id: 1, class: gpr64common, preferred-register: '' }
-  - { id: 2, class: gpr32, preferred-register: '' }
-  - { id: 3, class: gpr32, preferred-register: '' }
-  - { id: 4, class: gpr32, preferred-register: '' }
-  - { id: 5, class: gpr64sp, preferred-register: '' }
-  - { id: 6, class: gpr64all, preferred-register: '' }
-liveins:
-  - { reg: '$x0', virtual-reg: '%1' }
-  - { reg: '$w1', virtual-reg: '%2' }
-  - { reg: '$w2', virtual-reg: '%3' }
-frameInfo:
-  isFrameAddressTaken: false
-  isReturnAddressTaken: false
-  hasStackMap:     false
-  hasPatchPoint:   false
-  stackSize:       0
-  offsetAdjustment: 0
-  maxAlignment:    1
-  adjustsStack:    true
-  hasCalls:        true
-  stackProtector:  ''
-  functionContext: ''
-  maxCallFrameSize: 0
-  cvBytesOfCalleeSavedRegisters: 0
-  hasOpaqueSPAdjustment: false
-  hasVAStart:      false
-  hasMustTailInVarArgFunc: false
-  hasTailCall:     false
-  localFrameSize:  0
-  savePoint:       ''
-  restorePoint:    ''
-fixedStack:      []
-stack:           []
-entry_values:    []
-callSites:       []
-debugValueSubstitutions: []
-constants:       []
-machineFunctionInfo: {}
-body:             |
-  ; CHECK-LABEL: name: g
-  ; CHECK: bb.0.entry:
-  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
-  ; CHECK-NEXT:   liveins: $x0, $w1, $w2
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gpr32 = COPY $w2
-  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gpr32 = COPY $w1
-  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gpr64common = COPY $x0
-  ; CHECK-NEXT:   DBG_VALUE_LIST !4, !DIExpression(), [[COPY]], $noreg, debug-location !10
-  ; CHECK-NEXT:   TBZW [[COPY1]], 0, %bb.2
-  ; CHECK-NEXT:   B %bb.1
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.1.if.then:
-  ; CHECK-NEXT:   successors: %bb.3(0x40000000), %bb.2(0x40000000)
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:gpr32 = COPY [[COPY]]
-  ; CHECK-NEXT:   DBG_VALUE_LIST !4, !DIExpression(), [[COPY3]], $noreg, debug-location !10
-  ; CHECK-NEXT:   ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
-  ; CHECK-NEXT:   $x0 = ADDXri [[COPY2]], 8, 0
-  ; CHECK-NEXT:   DBG_VALUE_LIST !4, !DIExpression(), [[COPY3]], $x0, debug-location !10
-  ; CHECK-NEXT:   BL @g, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
-  ; CHECK-NEXT:   ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
-  ; CHECK-NEXT:   TBNZW [[COPY3]], 0, %bb.3
-  ; CHECK-NEXT:   B %bb.2
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.2.if.end:
-  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT: bb.3.exit:
-  ; CHECK-NEXT:   RET_ReallyLR
-  bb.0.entry:
-    successors: %bb.1(0x40000000), %bb.2(0x40000000)
-    liveins: $x0, $w1, $w2
-
-    %3:gpr32 = COPY $w2
-    %2:gpr32 = COPY $w1
-    %1:gpr64common = COPY $x0
-    %4:gpr32 = COPY %3
-    %5:gpr64sp = ADDXri %1, 8, 0
-    DBG_VALUE_LIST !4, !DIExpression(), %4:gpr32, %5:gpr64sp, debug-location !10
-    %0:gpr64all = COPY %5
-    TBZW %2, 0, %bb.2
-    B %bb.1
-
-  bb.1.if.then:
-    successors: %bb.3(0x40000000), %bb.2(0x40000000)
-
-    ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
-    $x0 = COPY %0
-    BL @g, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $x0
-    ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
-    TBNZW %4, 0, %bb.3
-    B %bb.2
-
-  bb.2.if.end:
-    successors: %bb.3(0x80000000)
-
-
-  bb.3.exit:
-    RET_ReallyLR
-
-...
diff --git a/llvm/test/CodeGen/AArch64/sink-and-fold-drop-dbg.mir b/llvm/test/CodeGen/AArch64/sink-and-fold-drop-dbg.mir
new file mode 100644
index 000000000000000..3cf490474bdc480
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/sink-and-fold-drop-dbg.mir
@@ -0,0 +1,144 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
+# RUN: llc --aarch64-enable-sink-fold=true --run-pass=machine-sink %s -o - | FileCheck %s
+--- |
+  source_filename = "dbg.c"
+  target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+  target triple = "aarch64-unknown-linux"
+
+  define dso_local void @g(i32 noundef %x) local_unnamed_addr !dbg !11 {
+  entry:
+    call void @llvm.dbg.value(metadata i32 %x, metadata !16, metadata !DIExpression()), !dbg !20
+    %add = add nsw i32 %x, 1, !dbg !21
+    call void @llvm.dbg.value(metadata i32 %add, metadata !17, metadata !DIExpression()), !dbg !20
+    call void @llvm.dbg.value(metadata i32 %add, metadata !18, metadata !DIExpression()), !dbg !20
+    %call = tail call i32 @f(i32 noundef %add), !dbg !22
+    call void @llvm.dbg.value(metadata i32 %add, metadata !19, metadata !DIExpression()), !dbg !20
+    %call1 = tail call i32 @f(i32 noundef %add), !dbg !23
+    ret void, !dbg !24
+  }
+
+  declare !dbg !25 i32 @f(i32 noundef)
+
+  declare void @llvm.dbg.value(metadata, metadata, metadata)
+
+  !llvm.dbg.cu = !{!0}
+  !llvm.module.flags = !{!2, !3, !4, !5, !6, !7, !8, !9}
+  !llvm.ident = !{!10}
+
+  !0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang version 18.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+  !1 = !DIFile(filename: "dbg.c", directory: "/home/sweet", checksumkind: CSK_MD5, checksum: "b6bd7ad2140c696af9201a17ab73c918")
+  !2 = !{i32 7, !"Dwarf Version", i32 5}
+  !3 = !{i32 2, !"Debug Info Version", i32 3}
+  !4 = !{i32 1, !"wchar_size", i32 4}
+  !5 = !{i32 8, !"PIC Level", i32 2}
+  !6 = !{i32 7, !"PIE Level", i32 2}
+  !7 = !{i32 7, !"uwtable", i32 2}
+  !8 = !{i32 7, !"frame-pointer", i32 1}
+  !9 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
+  !10 = !{!"clang version 18.0.0"}
+  !11 = distinct !DISubprogram(name: "g", scope: !1, file: !1, line: 3, type: !12, scopeLine: 3, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !15)
+  !12 = !DISubroutineType(types: !13)
+  !13 = !{null, !14}
+  !14 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+  !15 = !{!16, !17, !18, !19}
+  !16 = !DILocalVariable(name: "x", arg: 1, scope: !11, file: !1, line: 3, type: !14)
+  !17 = !DILocalVariable(name: "y", scope: !11, file: !1, line: 4, type: !14)
+  !18 = !DILocalVariable(name: "t0", scope: !11, file: !1, line: 6, type: !14)
+  !19 = !DILocalVariable(name: "t1", scope: !11, file: !1, line: 9, type: !14)
+  !20 = !DILocation(line: 0, scope: !11)
+  !21 = !DILocation(line: 4, column: 13, scope: !11)
+  !22 = !DILocation(line: 7, column: 3, scope: !11)
+  !23 = !DILocation(line: 10, column: 3, scope: !11)
+  !24 = !DILocation(line: 11, column: 1, scope: !11)
+  !25 = !DISubprogram(name: "f", scope: !1, file: !1, line: 1, type: !26, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized)
+  !26 = !DISubroutineType(types: !27)
+  !27 = !{!14, !14}
+
+...
+---
+name:            g
+alignment:       4
+exposesReturnsTwice: false
+legalized:       false
+regBankSelected: false
+selected:        false
+failedISel:      false
+tracksRegLiveness: true
+hasWinCFI:       false
+callsEHReturn:   false
+callsUnwindInit: false
+hasEHCatchret:   false
+hasEHScopes:     false
+hasEHFunclets:   false
+isOutlined:      false
+debugInstrRef:   false
+failsVerification: false
+tracksDebugUserValues: false
+registers:
+  - { id: 0, class: gpr32common, preferred-register: '' }
+  - { id: 1, class: gpr32sp, preferred-register: '' }
+  - { id: 2, class: gpr32all, preferred-register: '' }
+liveins:
+  - { reg: '$w0', virtual-reg: '%0' }
+frameInfo:
+  isFrameAddressTaken: false
+  isReturnAddressTaken: false
+  hasStackMap:     false
+  hasPatchPoint:   false
+  stackSize:       0
+  offsetAdjustment: 0
+  maxAlignment:    1
+  adjustsStack:    true
+  hasCalls:        true
+  stackProtector:  ''
+  functionContext: ''
+  maxCallFrameSize: 0
+  cvBytesOfCalleeSavedRegisters: 0
+  hasOpaqueSPAdjustment: false
+  hasVAStart:      false
+  hasMustTailInVarArgFunc: false
+  hasTailCall:     true
+  localFrameSize:  0
+  savePoint:       ''
+  restorePoint:    ''
+fixedStack:      []
+stack:           []
+entry_values:    []
+callSites:       []
+debugValueSubstitutions: []
+constants:       []
+machineFunctionInfo: {}
+body:             |
+  bb.0.entry:
+    liveins: $w0
+
+    ; CHECK-LABEL: name: g
+    ; CHECK: liveins: $w0
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: DBG_VALUE $w0, $noreg, !16, !DIExpression(), debug-location !20
+    ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr32common = COPY $w0
+    ; CHECK-NEXT: DBG_VALUE [[COPY]], $noreg, !16, !DIExpression(), debug-location !20
+    ; CHECK-NEXT: DBG_VALUE $noreg, $noreg, !17, !DIExpression(), debug-location !20
+    ; CHECK-NEXT: DBG_VALUE $noreg, $noreg, !18, !DIExpression(), debug-location !20
+    ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp, debug-location !22
+    ; CHECK-NEXT: $w0 = nsw ADDWri [[COPY]], 1, 0, debug-location !22
+    ; CHECK-NEXT: BL @f, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0, debug-location !22
+    ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp, debug-location !22
+    ; CHECK-NEXT: DBG_VALUE $noreg, $noreg, !19, !DIExpression(), debug-location !20
+    ; CHECK-NEXT: $w0 = nsw ADDWri [[COPY]], 1, 0, debug-location !23
+    ; CHECK-NEXT: TCRETURNdi @f, 0, csr_aarch64_aapcs, implicit $sp, implicit $w0, debug-location !23
+    DBG_VALUE $w0, $noreg, !16, !DIExpression(), debug-location !20
+    %0:gpr32common = COPY $w0
+    DBG_VALUE %0, $noreg, !16, !DIExpression(), debug-location !20
+    %1:gpr32sp = nsw ADDWri %0, 1, 0, debug-location !21
+    DBG_VALUE %1, $noreg, !17, !DIExpression(), debug-location !20
+    DBG_VALUE %1, $noreg, !18, !DIExpression(), debug-location !20
+    ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp, debug-location !22
+    $w0 = COPY %1, debug-location !22
+    BL @f, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0, debug-location !22
+    ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp, debug-location !22
+    DBG_VALUE %1, $noreg, !19, !DIExpression(), debug-location !20
+    $w0 = COPY %1, debug-location !23
+    TCRETURNdi @f, 0, csr_aarch64_aapcs, implicit $sp, implicit $w0, debug-location !23
+
+...
diff --git a/llvm/test/DebugInfo/Generic/no-empty-child-vars.ll b/llvm/test/DebugInfo/Generic/no-empty-child-vars.ll
index 1399b68b9b7f184..2b533968e330021 100644
--- a/llvm/test/DebugInfo/Generic/no-empty-child-vars.ll
+++ b/llvm/test/DebugInfo/Generic/no-empty-child-vars.ll
@@ -68,11 +68,14 @@
 ; CHECK:       DW_TAG_inlined_subroutine
 ; CHECK:     NULL
 
+; NOTE: Instructions below changed from `add` to `mul` to make them more expensive
+; and unlikely to be sunk to replace COPYs into a return value register.
+
 ; Function Attrs: norecurs...
[truncated]

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank, I agree this LGTM

@dwblaikie
Copy link
Collaborator

@jmorse maybe take a look at this? Perhaps there's a way to keep it.

@jmorse
Copy link
Member

jmorse commented Nov 8, 2023

Hmmm, if I understand the optimisation correctly, the new SinkAndFold optimisation is both re-engineering how values are computed and moving around where those values are defined? By doing so, it exposes all the hard problems in debug-info at the same time unfortunately.

The biggest problem is that one dominating assignment to a vreg is being replaced with lots of localised computations. Inserting a new DBG_VALUE for each new computation works in the simple case, but it'll also re-order the appearance assignments in the complicated case, and as you're discovering it generates a lot of DBG_VALUEs. A perfect solution for this would have to do a dominance query for each existing DBG_VALUE to work out what the most-immediately dominating computation is and then update the DBG_VALUE operand to reference that, which would take up a lot of compile time. And as the outputs are all physregs, it would need a check to ensure it's not clobbered too.

Shameless plug: this is exactly why the instruction-referencing flavour of debug-info exists (https://www.youtube.com/watch?v=yxuwfNnp064 for me illustrating what it does), as it defers all of these problems until the end of compilation and then addresses them all at once. It doesn't immediately support this scenario but it'd be a very simple extension to support it. Alas it doesn't support ARM/aarch64 today, CC @felipepiovezan who was interested in that.

In the meantime, disabling this update to fix a compile-time regression seems fine. Presumably the instruction defining the vreg is deleted, so all the DBG_VALUEs will refer to a not-defined vreg, and will be undef'd at regalloc time in LiveDebugVariables? It might be clearer (and easier to test) to explicitly set the seen DBG_VALUEs to be $noreg as part of this patch.

An extension that would preserve /some/ variable locations is to examine the DBG_VALUEs that are in the same block as the sinking instruction, and examine whether sinking + updating those to the location of the new computations would re-order with other DBG_VALUEs for the same variable. Typically this would cover a good number of the variables that refer to that value. The rest of MachineSink attempts this to some extent, although it's imperfect (this was what convinced me instruction-referencing is the way to go)

@momchil-velikov
Copy link
Collaborator Author

Thanks for looking at it!

In the meantime, disabling this update to fix a compile-time regression seems fine. Presumably the instruction defining the vreg is deleted, so all the DBG_VALUEs will refer to a not-defined vreg, and will be undef'd at regalloc time in LiveDebugVariables? It might be clearer (and easier to test) to explicitly set the seen DBG_VALUEs to be $noreg as part of this patch.

That was added in the initial patch, because some instructions get folded into the addressing mode in which case we in fact do not have a named location that hold the value.

// Delete the dead COPYs and clear operands in debug instructions

@momchil-velikov momchil-velikov merged commit e8209b2 into llvm:main Nov 11, 2023
5 checks passed
@momchil-velikov momchil-velikov deleted the sink-and-fold-no-copy-dbg branch November 13, 2023 16:02
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
…ld (llvm#71443)

After performing sink-and-fold over a COPY, the original instruction is
replaced with one that produces its output in the destination of the
copy. Its value is still available (in a hard register), so if there are
debug instructions which refer to the (now deleted) virtual register
they could be updated to refer to the hard register, in principle.
However, it's not clear how to do that, moreover in some cases the debug
instructions may need to be replicated proportionally to the number of
the COPY instructions replaced and in some extreme cases we can end up
with quadratic increase in the number of debug instructions, e.g:

        int f(int);
    
        void g(int x) {
          int y = x + 1;
    
          int t0 = y;
          f(t0);
    
          int t1 = y;
          f(t1);
        }
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants