Merged
40 changes: 40 additions & 0 deletions llvm/test/CodeGen/AMDGPU/wait-xcnt.mir
@@ -945,6 +945,46 @@ body: |
$vgpr0 = V_MOV_B32_e32 0, implicit $exec
...

# FIXME: Missing S_WAIT_XCNT before overwriting vgpr0.
---
name: wait_kmcnt_with_outstanding_vmem_2
tracksRegLiveness: true
machineFunctionInfo:
isEntryFunction: true
body: |
; GCN-LABEL: name: wait_kmcnt_with_outstanding_vmem_2
; GCN: bb.0:
; GCN-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000)
; GCN-NEXT: liveins: $vgpr0_vgpr1, $sgpr0_sgpr1, $scc
; GCN-NEXT: {{ $}}
; GCN-NEXT: $sgpr2 = S_LOAD_DWORD_IMM $sgpr0_sgpr1, 0, 0
; GCN-NEXT: S_CBRANCH_SCC1 %bb.2, implicit $scc
; GCN-NEXT: {{ $}}
; GCN-NEXT: bb.1:
; GCN-NEXT: successors: %bb.2(0x80000000)
; GCN-NEXT: liveins: $vgpr0_vgpr1, $sgpr2
; GCN-NEXT: {{ $}}
; GCN-NEXT: $vgpr2 = GLOBAL_LOAD_DWORD $vgpr0_vgpr1, 0, 0, implicit $exec
; GCN-NEXT: {{ $}}
; GCN-NEXT: bb.2:
; GCN-NEXT: liveins: $sgpr2
; GCN-NEXT: {{ $}}
; GCN-NEXT: S_WAIT_KMCNT 0
; GCN-NEXT: $sgpr2 = S_MOV_B32 $sgpr2
; GCN-NEXT: $vgpr0 = V_MOV_B32_e32 0, implicit $exec
bb.0:
liveins: $vgpr0_vgpr1, $sgpr0_sgpr1, $scc
$sgpr2 = S_LOAD_DWORD_IMM $sgpr0_sgpr1, 0, 0
S_CBRANCH_SCC1 %bb.2, implicit $scc
bb.1:
liveins: $vgpr0_vgpr1, $sgpr2
$vgpr2 = GLOBAL_LOAD_DWORD $vgpr0_vgpr1, 0, 0, implicit $exec
bb.2:
@easyonaadit (Contributor), Oct 8, 2025:

Does the hardware insert an implicit XCNT wait between VMEM and SMEM instructions that span basic blocks?
In this case, does the GLOBAL_LOAD imply that an implicit XCNT wait for the S_LOAD has been inserted?
What about cases where control flow doesn't go through bb.1? Should the compiler be conservative and insert an S_WAIT_XCNT at the start of a basic block with outstanding VMEM/SMEM events?

Contributor Author:

The hardware doesn't know anything about basic blocks, and I think even branches don't have any effect on xcnt insertion.

Yes, the compiler has to be conservative and assume that control flow may or may not have gone through bb.1. The way to handle this is for the merged state at bb.2 to have both SMEM_GROUP and VMEM_GROUP pending, even though a normal unmerged state would never have both events pending at the same time.
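
To illustrate the merge described above, here is a minimal toy model in Python. It is only a sketch, not the data structures or logic of the actual wait-count insertion pass. The `XcntState` class and its `issue` and `merge` methods are hypothetical names; only the event names `SMEM_GROUP` and `VMEM_GROUP` come from the comment. The model assumes that in straight-line code a newly issued group replaces the previous one, and that a join point takes the union of its predecessors' pending events.

```python
from dataclasses import dataclass

# Event names taken from the review comment; everything else is a
# hypothetical toy model, not the real pass's implementation.
SMEM_GROUP = "SMEM_GROUP"
VMEM_GROUP = "VMEM_GROUP"


@dataclass(frozen=True)
class XcntState:
    """Set of XCNT-related events that may still be outstanding."""

    pending: frozenset = frozenset()

    def issue(self, event):
        # Assumption of this sketch: along a single path, issuing one group
        # replaces the other, so an unmerged state holds at most one event kind.
        return XcntState(frozenset({event}))

    def merge(self, other):
        # At a join point the compiler cannot know which path was taken,
        # so it conservatively keeps every event pending on any predecessor.
        return XcntState(self.pending | other.pending)


# Mirror the control flow of wait_kmcnt_with_outstanding_vmem_2.
after_bb0 = XcntState().issue(SMEM_GROUP)  # S_LOAD_DWORD_IMM in bb.0
after_bb1 = after_bb0.issue(VMEM_GROUP)    # GLOBAL_LOAD_DWORD in bb.1
at_bb2 = after_bb0.merge(after_bb1)        # bb.2 is reached from bb.0 and bb.1
```

In this model `at_bb2.pending` contains both `SMEM_GROUP` and `VMEM_GROUP`, a combination that neither predecessor state has on its own, so a pass working from the merged state would know a wait may be needed before `$vgpr0` is overwritten in bb.2.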

liveins: $sgpr2
$sgpr2 = S_MOV_B32 $sgpr2
$vgpr0 = V_MOV_B32_e32 0, implicit $exec
...

---
name: wait_loadcnt_with_outstanding_smem
tracksRegLiveness: true