[AMDGPU] Add structural stall heuristic to scheduling strategies #169617

kerbowa · 2025-11-26T06:58:49Z

Implement a structural stall heuristic that considers both resource
hazards and latency constraints when selecting instructions from the
pending queue.

Add getStructuralStallCycles() to GCNSchedStrategy, which computes the
number of cycles an instruction must wait due to:
- Resource conflicts on unbuffered resources (from the SchedModel)
- Sequence-dependent hazards (from GCNHazardRecognizer)

Add getHazardWaitStates() to GCNHazardRecognizer, which returns the
number of wait states until all hazards for an instruction are resolved,
providing cycle-accurate hazard information for scheduling heuristics.
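
As a rough illustration of the idea (not the code in this patch), the two
stall sources could be combined as sketched below. Everything here is a
hypothetical stand-in: PendingInstr, resourceStallCycles(),
structuralStallCycles() and pickLeastStalled() are not LLVM APIs. The sketch
also assumes that one wait state costs one cycle and that the two stalls
overlap in time, so the larger of the two bounds the wait.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a pending-queue entry; not an LLVM type.
struct PendingInstr {
  unsigned ResourceId;       // index of the unbuffered resource it needs
  unsigned HazardWaitStates; // wait states still required by hazards
};

// Cycles until the needed unbuffered resource is free.
// ReservedUntil[R] is the first cycle at which resource R is available.
unsigned resourceStallCycles(const PendingInstr &I,
                             const std::vector<unsigned> &ReservedUntil,
                             unsigned CurrCycle) {
  unsigned FreeAt = ReservedUntil[I.ResourceId];
  return FreeAt > CurrCycle ? FreeAt - CurrCycle : 0;
}

// Assumption: both stalls elapse concurrently, so the larger one dominates.
unsigned structuralStallCycles(const PendingInstr &I,
                               const std::vector<unsigned> &ReservedUntil,
                               unsigned CurrCycle) {
  return std::max(resourceStallCycles(I, ReservedUntil, CurrCycle),
                  I.HazardWaitStates);
}

// Return the index of the pending instruction with the fewest structural
// stall cycles. Ties keep the earliest entry. Pending must be non-empty.
std::size_t pickLeastStalled(const std::vector<PendingInstr> &Pending,
                             const std::vector<unsigned> &ReservedUntil,
                             unsigned CurrCycle) {
  assert(!Pending.empty() && "need at least one pending instruction");
  std::size_t Best = 0;
  for (std::size_t Idx = 1; Idx < Pending.size(); ++Idx) {
    if (structuralStallCycles(Pending[Idx], ReservedUntil, CurrCycle) <
        structuralStallCycles(Pending[Best], ReservedUntil, CurrCycle))
      Best = Idx;
  }
  return Best;
}
```

In the actual strategy this would be one heuristic among several in
tryPendingCandidate(), not the sole selection criterion.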

kerbowa · 2025-11-26T06:59:06Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

github-actions · 2025-11-26T07:00:32Z

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:

git-clang-format --diff origin/main HEAD --extensions h,cpp -- llvm/lib/Target/AMDGPU/AMDGPUMLSchedStrategy.cpp llvm/lib/Target/AMDGPU/AMDGPUMLSchedStrategy.h llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp llvm/lib/Target/AMDGPU/GCNSchedStrategy.h --diff_from_common_commit

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.

diff --git a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
index 048eeecac..10989b4e6 100644
--- a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+++ b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
@@ -68,7 +68,8 @@ protected:
   /// invisible to scheduling heuristics. However, in certain scenarios (such as
   /// avoiding register spilling), it may be beneficial to consider scheduling
   /// these not-yet-ready instructions.
-  virtual bool tryPendingCandidate(SchedCandidate &Cand, SchedCandidate &TryCand,
+  virtual bool tryPendingCandidate(SchedCandidate &Cand,
+                                   SchedCandidate &TryCand,
                                    SchedBoundary *Zone) const;
 
   void printCandidateDecision(const SchedCandidate &Current,

github-actions · 2025-11-26T07:49:54Z

🐧 Linux x64 Test Results

166036 tests passed
2837 tests skipped
3 tests failed

Failed Tests

LLVM

LLVM.CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir

Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -start-before=machine-scheduler -verify-misched -o - /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir | /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck -check-prefix=GCN /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -start-before=machine-scheduler -verify-misched -o - /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir
# note: command had no output on stdout or stderr
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck -check-prefix=GCN /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir
# .---command stderr------------
# | /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir:10:14: error: GCN-NEXT: is not on the line after the previous match
# |  ; GCN-NEXT: ; implicit-def: $vgpr25
# |              ^
# | <stdin>:17:2: note: 'next' match was here
# |  ; implicit-def: $vgpr25
# |  ^
# | <stdin>:9:25: note: previous match ended here
# |  ; implicit-def: $vgpr16
# |                         ^
# | <stdin>:10:1: note: non-matching line after previous match is here
# |  s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
# | ^
# | 
# | Input file: <stdin>
# | Check file: /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |          .
# |          .
# |          .
# |         12:  ; implicit-def: $vgpr18 
# |         13:  s_lshl_b32 s18, s17, 7 
# |         14:  ; implicit-def: $vgpr20 
# |         15:  v_add_lshl_u32 v69, v18, s18, 1 
# |         16:  v_add_u32_e32 v18, s17, v20 
# |         17:  ; implicit-def: $vgpr25 
# | next:10      !~~~~~~~~~~~~~~~~~~~~~~  error: match on wrong line
# |         18:  v_and_b32_e32 v18, 0x1fffffff, v18 
# |         19:  ; implicit-def: $sgpr16 
# |         20:  v_mul_lo_u32 v18, v18, s16 
# |         21:  v_lshl_add_u32 v25, s17, 4, v25 
# |         22:  ; implicit-def: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7_sgpr8_sgpr9_sgpr10_sgpr11_sgpr12_sgpr13_sgpr14_sgpr15 
# |          .
# |          .
# |          .
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

LLVM.CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll

Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -mtriple=amdgcn -mcpu=gfx90a -misched-cluster=0  < /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll | /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck -check-prefix=GCN /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -mtriple=amdgcn -mcpu=gfx90a -misched-cluster=0
# note: command had no output on stdout or stderr
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck -check-prefix=GCN /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
# .---command stderr------------
# | /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll:675:13: error: GCN-NEXT: is not on the line after the previous match
# | ; GCN-NEXT: s_waitcnt lgkmcnt(8)
# |             ^
# | <stdin>:538:2: note: 'next' match was here
# |  s_waitcnt lgkmcnt(8)
# |  ^
# | <stdin>:535:59: note: previous match ended here
# |  ; sched_group_barrier mask(0x00000100) size(40) SyncID(0)
# |                                                           ^
# | <stdin>:536:1: note: non-matching line after previous match is here
# |  v_mfma_f32_32x32x1f32 a[0:31], v2, v1, a[0:31]
# | ^
# | 
# | Input file: <stdin>
# | Check file: /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |           .
# |           .
# |           .
# |         533:  v_mfma_f32_32x32x1f32 a[32:63], v2, v1, a[32:63] 
# |         534:  v_add_u32_e32 v0, s1, v0 
# |         535:  ; sched_group_barrier mask(0x00000100) size(40) SyncID(0) 
# |         536:  v_mfma_f32_32x32x1f32 a[0:31], v2, v1, a[0:31] 
# |         537:  v_mfma_f32_32x32x1f32 a[64:95], v2, v1, a[64:95] 
# |         538:  s_waitcnt lgkmcnt(8) 
# | next:675      !~~~~~~~~~~~~~~~~~~~  error: match on wrong line
# |         539:  v_mfma_f32_32x32x1f32 a[96:127], v2, v1, a[96:127] 
# |         540:  s_waitcnt lgkmcnt(0) 
# |         541:  v_mfma_f32_32x32x1f32 a[128:159], v2, v1, a[128:159] 
# |         542:  s_nop 11 
# |         543:  ds_write_b128 v0, a[60:63] offset:112 
# |           .
# |           .
# |           .
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

LLVM.CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr.ll

Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -verify-machineinstrs -mcpu=gfx942 -amdgpu-mfma-vgpr-form < /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr.ll | /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr.ll
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -verify-machineinstrs -mcpu=gfx942 -amdgpu-mfma-vgpr-form
# note: command had no output on stdout or stderr
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr.ll
# .---command stderr------------
# | /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr.ll:372:15: error: CHECK-NEXT: expected string not found in input
# | ; CHECK-NEXT: v_mov_b64_e32 v[26:27], s[4:5]
# |               ^
# | <stdin>:890:18: note: scanning from here
# |  s_mov_b32 s5, s4
# |                  ^
# | <stdin>:891:2: note: possible intended match here
# |  v_mov_b64_e32 v[24:25], s[4:5]
# |  ^
# | 
# | Input file: <stdin>
# | Check file: /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr.ll
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             .
# |             .
# |             .
# |           885:  .p2align 8 
# |           886:  .type illegal_mfma_after_rewrite,@function 
# |           887: illegal_mfma_after_rewrite: ; @illegal_mfma_after_rewrite 
# |           888: ; %bb.0: ; %entry 
# |           889:  s_mov_b32 s4, 0 
# |           890:  s_mov_b32 s5, s4 
# | next:372'0                      X error: no match found
# |           891:  v_mov_b64_e32 v[24:25], s[4:5] 
# | next:372'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | next:372'1      ?                               possible intended match
# |           892:  ;;#ASMSTART 
# | next:372'0     ~~~~~~~~~~~~~
# |           893:  ; def s[0:3] 
# | next:372'0     ~~~~~~~~~~~~~~
# |           894:  ;;#ASMEND 
# | next:372'0     ~~~~~~~~~~~
# |           895:  v_mov_b32_e32 v16, 0x7fc00000 
# | next:372'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           896:  v_mov_b64_e32 v[6:7], s[2:3] 
# | next:372'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |             .
# |             .
# |             .
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

If these failures are unrelated to your changes (for example tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the infrastructure label.

kerbowa mentioned this pull request Nov 26, 2025

[AMDGPU] Add scaffolding for ML focused scheduling strategy #169616

Draft

kerbowa requested a review from jrbyrnes November 26, 2025 06:59
