Skip to content

Conversation

@arsenm
Copy link
Contributor

@arsenm arsenm commented Dec 4, 2025

At the moment the MIR tests are somewhat redundant. The waitcnt
one is needed to ensure we actually have a load, given we are
currently just emitting an error on ExternalSymbol. The asm printer
one is more redundant for the moment, since it's stressed by the IR
test. However I am planning to change the error path for the IR test,
so it will soon not be redundant.

Copy link
Contributor Author

arsenm commented Dec 4, 2025

@arsenm arsenm marked this pull request as ready for review December 4, 2025 12:46
@llvmbot llvmbot added the llvm:SelectionDAG SelectionDAGISel as well label Dec 4, 2025
@llvmbot
Copy link
Member

llvmbot commented Dec 4, 2025

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

At the moment the MIR tests are somewhat redundant. The waitcnt
one is needed to ensure we actually have a load, given we are
currently just emitting an error on ExternalSymbol. The asm printer
one is more redundant for the moment, since it's stressed by the IR
test. However I am planning to change the error path for the IR test,
so it will soon not be redundant.


Full diff: https://github.com/llvm/llvm-project/pull/170657.diff

13 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/TargetInstrInfo.h (+12-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (+11)
  • (modified) llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp (+2)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp (+10)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp (+4-1)
  • (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+12)
  • (modified) llvm/lib/Target/AMDGPU/SIISelLowering.h (+2)
  • (modified) llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp (+1-1)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+8)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+2)
  • (added) llvm/test/CodeGen/AMDGPU/llvm.deoptimize.ll (+16)
  • (added) llvm/test/CodeGen/AMDGPU/statepoint-asm-printer.mir (+40)
  • (added) llvm/test/CodeGen/AMDGPU/statepoint-insert-waitcnts.mir (+64)
diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index 18142c2c0adf3..bdd9fee795e08 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -2350,7 +2350,18 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
 
   /// Returns the callee operand from the given \p MI.
   virtual const MachineOperand &getCalleeOperand(const MachineInstr &MI) const {
-    return MI.getOperand(0);
+    assert(MI.isCall());
+
+    switch (MI.getOpcode()) {
+    case TargetOpcode::STATEPOINT:
+    case TargetOpcode::STACKMAP:
+    case TargetOpcode::PATCHPOINT:
+      return MI.getOperand(3);
+    default:
+      return MI.getOperand(0);
+    }
+
+    llvm_unreachable("impossible call instruction");
   }
 
   /// Return the uniformity behavior of the given instruction.
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index dd8f18d3b8a6a..7998da0ea06eb 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -331,6 +331,17 @@ namespace llvm {
 MachineBasicBlock *
 TargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
                                             MachineBasicBlock *MBB) const {
+  switch (MI.getOpcode()) {
+  case TargetOpcode::STATEPOINT:
+    // As an implementation detail, STATEPOINT shares the STACKMAP format at
+    // this point in the process.  We diverge later.
+  case TargetOpcode::STACKMAP:
+  case TargetOpcode::PATCHPOINT:
+    return emitPatchPoint(MI, MBB);
+  default:
+    break;
+  }
+
 #ifndef NDEBUG
   dbgs() << "If a target marks an instruction with "
           "'usesCustomInserter', it must implement "
diff --git a/llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp
index 46a5e44374e1c..5b8cd343557fa 100644
--- a/llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp
@@ -1145,6 +1145,8 @@ void SelectionDAGBuilder::LowerCallSiteWithDeoptBundleImpl(
     const CallBase *Call, SDValue Callee, const BasicBlock *EHPadBB,
     bool VarArgDisallowed, bool ForceVoidReturnTy) {
   StatepointLoweringInfo SI(DAG);
+  SI.CLI.CB = Call;
+
   unsigned ArgBeginIndex = Call->arg_begin() - Call->op_begin();
   populateCallLoweringInfo(
       SI.CLI, Call, ArgBeginIndex, Call->arg_size(), Callee,
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp b/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
index bf9b4297bd435..99c1ab8d379d5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
@@ -406,6 +406,16 @@ void AMDGPUAsmPrinter::emitInstruction(const MachineInstr *MI) {
       return;
     }
 
+    unsigned Opc = MI->getOpcode();
+    if (LLVM_UNLIKELY(Opc == TargetOpcode::STATEPOINT ||
+                      Opc == TargetOpcode::STACKMAP ||
+                      Opc == TargetOpcode::PATCHPOINT)) {
+      LLVMContext &Ctx = MI->getMF()->getFunction().getContext();
+      Ctx.emitError("unhandled statepoint-like instruction");
+      OutStreamer->emitRawComment("unsupported statepoint/stackmap/patchpoint");
+      return;
+    }
+
     if (isVerbose())
       if (STI.getInstrInfo()->isBlockLoadStore(MI->getOpcode()))
         emitVGPRBlockComment(MI, STI.getInstrInfo(), STI.getRegisterInfo(),
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp b/llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
index b03d50f2d451d..4e664e084fb88 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
@@ -256,10 +256,13 @@ AMDGPUResourceUsageAnalysisImpl::analyzeResourceUsage(
         // Pseudo used just to encode the underlying global. Is there a better
         // way to track this?
 
+        // TODO: Some of the generic call-like pseudos do not encode the callee,
+        // so we overly conservatively treat this as an indirect call.
         const MachineOperand *CalleeOp =
             TII->getNamedOperand(MI, AMDGPU::OpName::callee);
 
-        const Function *Callee = getCalleeFunction(*CalleeOp);
+        const Function *Callee =
+            CalleeOp ? getCalleeFunction(*CalleeOp) : nullptr;
 
         auto isSameFunction = [](const MachineFunction &MF, const Function *F) {
           return F == &MF.getFunction();
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 301f2fc8dab45..0660e3a64c70e 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -275,6 +275,7 @@ SITargetLowering::SITargetLowering(const TargetMachine &TM,
   setTruncStoreAction(MVT::v16i64, MVT::v16i32, Expand);
 
   setOperationAction(ISD::GlobalAddress, {MVT::i32, MVT::i64}, Custom);
+  setOperationAction(ISD::ExternalSymbol, {MVT::i32, MVT::i64}, Custom);
 
   setOperationAction(ISD::SELECT, MVT::i1, Promote);
   setOperationAction(ISD::SELECT, MVT::i64, Custom);
@@ -6838,6 +6839,8 @@ SDValue SITargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const {
     SIMachineFunctionInfo *MFI = MF.getInfo<SIMachineFunctionInfo>();
     return LowerGlobalAddress(MFI, Op, DAG);
   }
+  case ISD::ExternalSymbol:
+    return LowerExternalSymbol(Op, DAG);
   case ISD::INTRINSIC_WO_CHAIN:
     return LowerINTRINSIC_WO_CHAIN(Op, DAG);
   case ISD::INTRINSIC_W_CHAIN:
@@ -9017,6 +9020,15 @@ SDValue SITargetLowering::LowerGlobalAddress(AMDGPUMachineFunction *MFI,
                          MachineMemOperand::MOInvariant);
 }
 
+SDValue SITargetLowering::LowerExternalSymbol(SDValue Op,
+                                              SelectionDAG &DAG) const {
+  // TODO: Handle this. It should be mostly the same as LowerGlobalAddress.
+  const Function &Fn = DAG.getMachineFunction().getFunction();
+  DAG.getContext()->diagnose(DiagnosticInfoUnsupported(
+      Fn, "unsupported external symbol", Op.getDebugLoc()));
+  return DAG.getPOISON(Op.getValueType());
+}
+
 SDValue SITargetLowering::copyToM0(SelectionDAG &DAG, SDValue Chain,
                                    const SDLoc &DL, SDValue V) const {
   // We can't use S_MOV_B32 directly, because there is no way to specify m0 as
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.h b/llvm/lib/Target/AMDGPU/SIISelLowering.h
index fb162948caf4c..e82f4528fcd09 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.h
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.h
@@ -79,6 +79,8 @@ class SITargetLowering final : public AMDGPUTargetLowering {
 
   SDValue LowerGlobalAddress(AMDGPUMachineFunction *MFI, SDValue Op,
                              SelectionDAG &DAG) const override;
+  SDValue LowerExternalSymbol(SDValue Op, SelectionDAG &DAG) const;
+
   SDValue lowerImplicitZextParam(SelectionDAG &DAG, SDValue Op,
                                  MVT VT, unsigned Offset) const;
   SDValue lowerImage(SDValue Op, const AMDGPU::ImageDimIntrinsicInfo *Intr,
diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 64a29e1841245..9e744c9187c48 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -1955,7 +1955,7 @@ bool SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI,
       // load). We also need to check WAW dependency with saved PC.
       Wait = AMDGPU::Waitcnt();
 
-      const auto &CallAddrOp = *TII->getNamedOperand(MI, AMDGPU::OpName::src0);
+      const MachineOperand &CallAddrOp = TII->getCalleeOperand(MI);
       if (CallAddrOp.isReg()) {
         RegInterval CallAddrOpInterval =
             ScoreBrackets.getRegInterval(&MI, CallAddrOp);
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 7535407741f1f..ef011236642f8 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -10510,6 +10510,14 @@ unsigned SIInstrInfo::getInstrLatency(const InstrItineraryData *ItinData,
   return SchedModel.computeInstrLatency(&MI);
 }
 
+const MachineOperand &
+SIInstrInfo::getCalleeOperand(const MachineInstr &MI) const {
+  if (const MachineOperand *CallAddrOp =
+          getNamedOperand(MI, AMDGPU::OpName::src0))
+    return *CallAddrOp;
+  return TargetInstrInfo::getCalleeOperand(MI);
+}
+
 InstructionUniformity
 SIInstrInfo::getGenericInstructionUniformity(const MachineInstr &MI) const {
   const MachineRegisterInfo &MRI = MI.getMF()->getRegInfo();
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
index b1d6563bf3c0b..7d7cbda644b77 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
@@ -1641,6 +1641,8 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo {
                            const MachineInstr &MI,
                            unsigned *PredCost = nullptr) const override;
 
+  const MachineOperand &getCalleeOperand(const MachineInstr &MI) const override;
+
   InstructionUniformity
   getInstructionUniformity(const MachineInstr &MI) const final;
 
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.deoptimize.ll b/llvm/test/CodeGen/AMDGPU/llvm.deoptimize.ll
new file mode 100644
index 0000000000000..ef8c00bb3b4b0
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/llvm.deoptimize.ll
@@ -0,0 +1,16 @@
+; RUN: not llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 < %s 2> %t.err | FileCheck %s
+; RUN: FileCheck -check-prefix=ERR %s < %t.err
+
+; ERR: error: <unknown>:0:0: in function caller_0 i32 (): unsupported external symbol
+; ERR: error: unhandled statepoint-like instruction
+
+; CHECK: ;unsupported statepoint/stackmap/patchpoint
+declare i32 @llvm.experimental.deoptimize.i32(...)
+declare i8  @llvm.experimental.deoptimize.i8(...)
+
+define i32 @caller_0() {
+entry:
+  %v = call i32(...) @llvm.experimental.deoptimize.i32() [ "deopt"(i32 0) ]
+  ret i32 %v
+}
+
diff --git a/llvm/test/CodeGen/AMDGPU/statepoint-asm-printer.mir b/llvm/test/CodeGen/AMDGPU/statepoint-asm-printer.mir
new file mode 100644
index 0000000000000..1c9ae91557b7e
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/statepoint-asm-printer.mir
@@ -0,0 +1,40 @@
+# RUN: not llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -start-before=amdgpu-asm-printer -o - %s 2> %t.err | FileCheck %s
+# RUN: FileCheck -check-prefix=ERR %s < %t.err
+
+# CHECK: ;unsupported statepoint/stackmap/patchpoint
+# ERR: error: unhandled statepoint-like instruction
+
+---
+name:            test_statepoint
+tracksRegLiveness: true
+frameInfo:
+  stackSize:       16
+  maxAlignment:    4
+  adjustsStack:    true
+  hasCalls:        true
+  maxCallFrameSize: 0
+  isCalleeSavedInfoValid: true
+stack:
+  - { id: 0, type: spill-slot, size: 4, alignment: 4 }
+machineFunctionInfo:
+  hasSpilledSGPRs: true
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  frameOffsetReg:  '$sgpr33'
+  stackPtrOffsetReg: '$sgpr32'
+body:             |
+  bb.0.entry:
+    liveins: $sgpr12, $sgpr13, $sgpr14, $sgpr15, $sgpr30, $sgpr31, $vgpr31, $sgpr4_sgpr5, $sgpr6_sgpr7, $sgpr8_sgpr9, $sgpr10_sgpr11
+
+    S_WAITCNT 0
+    $sgpr16 = S_MOV_B32 $sgpr33
+    $sgpr33 = S_MOV_B32 $sgpr32
+    $sgpr18_sgpr19 = S_OR_SAVEEXEC_B64 -1, implicit-def $exec, implicit-def dead $scc, implicit $exec
+    BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr33, 0, 0, 0, implicit $exec :: (store (s32) into %stack.0, addrspace 5)
+    $exec = S_MOV_B64 killed $sgpr18_sgpr19
+    $vgpr40 = V_WRITELANE_B32 killed $sgpr16, 2, undef $vgpr40
+    $vgpr40 = V_WRITELANE_B32 $sgpr30, 0, $vgpr40
+    $sgpr32 = frame-setup S_ADDK_I32 $sgpr32, 1024, implicit-def dead $scc
+    $vgpr40 = V_WRITELANE_B32 $sgpr31, 1, $vgpr40
+    STATEPOINT 2882400015, 0, 11, undef renamable $sgpr4_sgpr5, 0, killed $sgpr4_sgpr5, killed $sgpr6_sgpr7, killed $sgpr8_sgpr9, killed $sgpr10_sgpr11, killed $sgpr12, killed $sgpr13, killed $sgpr14, killed $sgpr15, killed $vgpr31, $sgpr0_sgpr1_sgpr2_sgpr3, 2, 0, 2, 0, 2, 1, 2, 0, 2, 0, 2, 0, 2, 0, csr_amdgpu
+...
+
diff --git a/llvm/test/CodeGen/AMDGPU/statepoint-insert-waitcnts.mir b/llvm/test/CodeGen/AMDGPU/statepoint-insert-waitcnts.mir
new file mode 100644
index 0000000000000..00b1f6e33412f
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/statepoint-insert-waitcnts.mir
@@ -0,0 +1,64 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 6
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -run-pass=si-insert-waitcnts -o - %s | FileCheck %s
+
+# Make sure the waitcnt pass doesn't crash on statepoint
+# pseudoinstructions, and handlest he wait for the callee operand
+# correctly.
+
+---
+name:            test_wait_statepoint_callee
+tracksRegLiveness: true
+frameInfo:
+  stackSize:       16
+  maxAlignment:    4
+  adjustsStack:    true
+  hasCalls:        true
+  maxCallFrameSize: 0
+  isCalleeSavedInfoValid: true
+stack:
+  - { id: 0, offset: 4, size: 4, alignment: 4 }
+  - { id: 1, type: spill-slot, size: 4, alignment: 4 }
+  - { id: 2, type: spill-slot, size: 4, alignment: 4, stack-id: sgpr-spill }
+machineFunctionInfo:
+  hasSpilledSGPRs: true
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  frameOffsetReg:  '$sgpr33'
+  stackPtrOffsetReg: '$sgpr32'
+  spillPhysVGPRs:
+    - '$vgpr40'
+  wwmReservedRegs:
+    - '$vgpr40'
+  scavengeFI:      '%stack.0'
+body:             |
+  bb.0:
+    liveins: $sgpr12, $sgpr13, $sgpr14, $sgpr15, $sgpr30, $sgpr31, $vgpr31, $sgpr4_sgpr5, $sgpr6_sgpr7, $sgpr8_sgpr9, $sgpr10_sgpr11, $sgpr12_sgpr13
+
+    ; CHECK-LABEL: name: test_wait_statepoint_callee
+    ; CHECK: liveins: $sgpr12, $sgpr13, $sgpr14, $sgpr15, $sgpr30, $sgpr31, $vgpr31, $sgpr4_sgpr5, $sgpr6_sgpr7, $sgpr8_sgpr9, $sgpr10_sgpr11, $sgpr12_sgpr13
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: S_WAITCNT 0
+    ; CHECK-NEXT: $sgpr16 = S_MOV_B32 $sgpr33
+    ; CHECK-NEXT: $sgpr33 = S_MOV_B32 $sgpr32
+    ; CHECK-NEXT: $sgpr18_sgpr19 = S_OR_SAVEEXEC_B64 -1, implicit-def $exec, implicit-def dead $scc, implicit $exec
+    ; CHECK-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr33, 0, 0, 0, implicit $exec :: (store (s32) into %stack.1, addrspace 5)
+    ; CHECK-NEXT: $exec = S_MOV_B64 killed $sgpr18_sgpr19
+    ; CHECK-NEXT: $vgpr40 = V_WRITELANE_B32 killed $sgpr16, 2, undef $vgpr40
+    ; CHECK-NEXT: $vgpr40 = V_WRITELANE_B32 $sgpr30, 0, $vgpr40
+    ; CHECK-NEXT: $sgpr32 = frame-setup S_ADDK_I32 $sgpr32, 1024, implicit-def dead $scc
+    ; CHECK-NEXT: $vgpr40 = V_WRITELANE_B32 $sgpr31, 1, $vgpr40
+    ; CHECK-NEXT: $sgpr14_sgpr15 = S_LOAD_DWORDX2_IMM $sgpr12_sgpr13, 0, 0
+    ; CHECK-NEXT: S_WAITCNT 49279
+    ; CHECK-NEXT: STATEPOINT 2882400015, 0, 11, renamable $sgpr14_sgpr15, 0, killed $sgpr4_sgpr5, killed $sgpr6_sgpr7, killed $sgpr8_sgpr9, killed $sgpr10_sgpr11, killed $sgpr12, killed $sgpr13, killed $sgpr14, killed $sgpr15, killed $vgpr31, $sgpr0_sgpr1_sgpr2_sgpr3, 2, 0, 2, 0, 2, 1, 2, 0, 2, 0, 2, 0, 2, 0, csr_amdgpu
+    $sgpr16 = S_MOV_B32 $sgpr33
+    $sgpr33 = S_MOV_B32 $sgpr32
+    $sgpr18_sgpr19 = S_OR_SAVEEXEC_B64 -1, implicit-def $exec, implicit-def dead $scc, implicit $exec
+    BUFFER_STORE_DWORD_OFFSET $vgpr40, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr33, 0, 0, 0, implicit $exec :: (store (s32) into %stack.1, addrspace 5)
+    $exec = S_MOV_B64 killed $sgpr18_sgpr19
+    $vgpr40 = V_WRITELANE_B32 killed $sgpr16, 2, undef $vgpr40
+    $vgpr40 = V_WRITELANE_B32 $sgpr30, 0, $vgpr40
+    $sgpr32 = frame-setup S_ADDK_I32 $sgpr32, 1024, implicit-def dead $scc
+    $vgpr40 = V_WRITELANE_B32 $sgpr31, 1, $vgpr40
+    $sgpr14_sgpr15 = S_LOAD_DWORDX2_IMM $sgpr12_sgpr13, 0, 0
+    STATEPOINT 2882400015, 0, 11, renamable $sgpr14_sgpr15, 0, killed $sgpr4_sgpr5, killed $sgpr6_sgpr7, killed $sgpr8_sgpr9, killed $sgpr10_sgpr11, killed $sgpr12, killed $sgpr13, killed $sgpr14, killed $sgpr15, killed $vgpr31, $sgpr0_sgpr1_sgpr2_sgpr3, 2, 0, 2, 0, 2, 1, 2, 0, 2, 0, 2, 0, 2, 0, csr_amdgpu
+
+...

@arsenm arsenm force-pushed the users/arsenm/amdgpu/create-dummy-call-sequence-error-function branch from 6d54816 to ce03c68 Compare December 4, 2025 17:11
@arsenm arsenm force-pushed the users/arsenm/amdgpu/avoid-crashing-statepoint-like-insts branch from ecf5f9d to 5c91afe Compare December 4, 2025 17:11
Base automatically changed from users/arsenm/amdgpu/create-dummy-call-sequence-error-function to main December 4, 2025 17:37
Copy link
Contributor

@ro-i ro-i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I currently don't really know anything specific about stackmap stuff.
Your ISel choices match those in X86TargetLowering tho, so seems kinda proven

@arsenm arsenm force-pushed the users/arsenm/amdgpu/avoid-crashing-statepoint-like-insts branch from 5c91afe to 764e922 Compare December 4, 2025 18:35
At the moment the MIR tests are somewhat redundant. The waitcnt
one is needed to ensure we actually have a load, given we are
currently just emitting an error on ExternalSymbol. The asm printer
one is more redundant for the moment, since it's stressed by the IR
test. However I am planning to change the error path for the IR test,
so it will soon not be redundant.
@arsenm arsenm force-pushed the users/arsenm/amdgpu/avoid-crashing-statepoint-like-insts branch from 764e922 to 42a3179 Compare December 5, 2025 22:44
@arsenm
Copy link
Contributor Author

arsenm commented Dec 8, 2025

ping

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AMDGPU llvm:SelectionDAG SelectionDAGISel as well

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants