Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CodeGen] Guard copy propagation in machine CSE against undefs #97413

Merged
merged 4 commits into from
Jul 12, 2024

Conversation

vikramRH
Copy link
Contributor

@vikramRH vikramRH commented Jul 2, 2024

No description provided.

Copy link
Contributor Author

vikramRH commented Jul 2, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @vikramRH and the rest of your teammates on Graphite Graphite

@vikramRH vikramRH requested a review from arsenm July 2, 2024 13:03
@vikramRH vikramRH marked this pull request as ready for review July 2, 2024 13:03
@jayfoad
Copy link
Contributor

jayfoad commented Jul 2, 2024

[CodeGen][NFC] Guard copy propogation in machine CSE against undefs

Typo "propagation"

@arsenm
Copy link
Contributor

arsenm commented Jul 2, 2024

Definitely needs a test if we're going ahead with this hack

@vikramRH vikramRH changed the title [CodeGen][NFC] Guard copy propogation in machine CSE against undefs [CodeGen][NFC] Guard copy propagation in machine CSE against undefs Jul 2, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 4, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Vikram Hegde (vikramRH)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/97413.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/MachineCSE.cpp (+1-1)
  • (added) llvm/test/CodeGen/AMDGPU/machine-cse-copyprop.mir (+21)
diff --git a/llvm/lib/CodeGen/MachineCSE.cpp b/llvm/lib/CodeGen/MachineCSE.cpp
index 4e6101f875589..e39aae56bf116 100644
--- a/llvm/lib/CodeGen/MachineCSE.cpp
+++ b/llvm/lib/CodeGen/MachineCSE.cpp
@@ -184,7 +184,7 @@ bool MachineCSE::PerformTrivialCopyPropagation(MachineInstr *MI,
       continue;
     bool OnlyOneUse = MRI->hasOneNonDBGUse(Reg);
     MachineInstr *DefMI = MRI->getVRegDef(Reg);
-    if (!DefMI->isCopy())
+    if (!DefMI || !DefMI->isCopy())
       continue;
     Register SrcReg = DefMI->getOperand(1).getReg();
     if (!SrcReg.isVirtual())
diff --git a/llvm/test/CodeGen/AMDGPU/machine-cse-copyprop.mir b/llvm/test/CodeGen/AMDGPU/machine-cse-copyprop.mir
new file mode 100644
index 0000000000000..ab759657d5d7a
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/machine-cse-copyprop.mir
@@ -0,0 +1,21 @@
+# RUN: llc -mtriple=amdgcn -run-pass=machine-cse -verify-machineinstrs -o - %s | FileCheck %s
+
+# Test to ensure that this does not crash on undefs
+# CHECK-LABEL: name: machine-cse-copyprop
+# CHECK: IMPLICIT_DEF
+# CHECK-NOT: COPY
+# CHECK: S_ADD_I32
+---
+name: machine-cse-copyprop
+tracksRegLiveness: true
+body:               |
+  bb.0:
+    %0:sreg_32 = IMPLICIT_DEF
+    %1:sreg_32 = IMPLICIT_DEF
+    %2:sreg_32 = COPY %0
+    %3:sreg_32 = COPY %1
+    %4:sreg_64 = REG_SEQUENCE undef %10:sreg_32, %subreg.sub0, %2:sreg_32, %subreg.sub1
+    %5:sreg_64 = REG_SEQUENCE undef %11:sreg_32, %subreg.sub0, %3:sreg_32, %subreg.sub1
+    %6:sreg_32 = S_ADD_I32 %4.sub1:sreg_64, %5.sub1:sreg_64, implicit-def $scc
+
+...

llvm/test/CodeGen/AMDGPU/machine-cse-copyprop.mir Outdated Show resolved Hide resolved
llvm/test/CodeGen/AMDGPU/machine-cse-copyprop.mir Outdated Show resolved Hide resolved
%2:sreg_32 = COPY %0
%3:sreg_32 = COPY %1
%4:sreg_64 = REG_SEQUENCE undef %10:sreg_32, %subreg.sub0, %2:sreg_32, %subreg.sub1
%5:sreg_64 = REG_SEQUENCE undef %11:sreg_32, %subreg.sub0, %3:sreg_32, %subreg.sub1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did these get produced in the first place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here is a usecase (which is also the behaviour seen with new atomic optimizer changes). DPP combine translates following code

  %20:vreg_64 = V_SET_INACTIVE_B64 killed %106:vreg_64, 0, implicit-def dead $scc, implicit $exec
  %187:vgpr_32 = V_MOV_B32_dpp %19.sub0:vreg_64(tied-def 0), %20.sub0:vreg_64, 273, 15, 15, 0, implicit $exec
  %188:vgpr_32 = V_MOV_B32_dpp %19.sub1:vreg_64(tied-def 0), %20.sub1:vreg_64, 273, 15, 15, 0, implicit $exec
  %21:vreg_64 = REG_SEQUENCE %187:vgpr_32, %subreg.sub0, %188:vgpr_32, %subreg.sub1
  %111:vgpr_32, %113:sreg_64_xexec = V_ADD_CO_U32_e64 %20.sub0:vreg_64, %21.sub0:vreg_64, 0, implicit $exec
  %112:vgpr_32, dead %114:sreg_64_xexec = V_ADDC_U32_e64 %20.sub1:vreg_64, %21.sub1:vreg_64, killed %113:sreg_64_xexec, 0, implicit $exec

to

  %19:vreg_64 = V_SET_INACTIVE_B64 killed %105:vreg_64, 0, implicit-def dead $scc, implicit $exec
  %187:vgpr_32 = V_MOV_B32_dpp %18.sub1:vreg_64(tied-def 0), %19.sub1:vreg_64, 273, 15, 15, 0, implicit $exec
  %20:vreg_64 = REG_SEQUENCE undef %186:vgpr_32, %subreg.sub0, %187:vgpr_32, %subreg.sub1
  %110:vgpr_32, %112:sreg_64_xexec = V_ADD_CO_U32_e64_dpp %188:vgpr_32(tied-def 0), %19.sub0:vreg_64, %19.sub0:vreg_64, 0, 273, 15, 15, 1, implicit $exec
  %111:vgpr_32, dead %113:sreg_64_xexec = V_ADDC_U32_e64 %19.sub1:vreg_64, %20.sub1:vreg_64, killed %112:sreg_64_xexec, 0, implicit $exec

@jayfoad
Copy link
Contributor

jayfoad commented Jul 8, 2024

I'm not sure "NFC" is appropriate if this avoids a compiler crash.

@vikramRH vikramRH changed the title [CodeGen][NFC] Guard copy propagation in machine CSE against undefs [CodeGen] Guard copy propagation in machine CSE against undefs Jul 8, 2024
@vikramRH
Copy link
Contributor Author

ping

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with nit. Also still would like to the sample IR that got here, it probably should not happen for anything well formed

llvm/test/CodeGen/AMDGPU/machine-cse-copyprop.mir Outdated Show resolved Hide resolved
@vikramRH vikramRH force-pushed the users/vikramRH/machineCSE_undef branch from 91c6382 to 08bdad3 Compare July 11, 2024 15:18
@vikramRH
Copy link
Contributor Author

vikramRH commented Jul 11, 2024

lgtm with nit. Also still would like to the sample IR that got here, it probably should not happen for anything well formed

https://godbolt.org/z/K7fPTq8zc (basically the optimized i64 atomic add via DPP path which would be generated after #96934).

@vikramRH vikramRH merged commit 0d9d5f7 into main Jul 12, 2024
7 checks passed
@vikramRH vikramRH deleted the users/vikramRH/machineCSE_undef branch July 12, 2024 04:53
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants