@kerbowa kerbowa commented Nov 26, 2025

Implements a structural stall heuristic that considers both resource
hazards and latency constraints when selecting instructions from the
pending queue.

  • Add getStructuralStallCycles() to GCNSchedStrategy that computes the
    number of cycles an instruction must wait due to:

    • Resource conflicts on unbuffered resources (from the SchedModel)
    • Sequence-dependent hazards (from GCNHazardRecognizer)
  • Add getHazardWaitStates() to GCNHazardRecognizer that returns the number
    of wait states until all hazards for an instruction are resolved,
    providing cycle-accurate hazard information for scheduling heuristics.
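
As a rough illustration of the heuristic described above, the following is a minimal standalone sketch, not the code from this PR. It uses toy stand-in types rather than LLVM classes, and every name in it (`ToyZone`, `ToyInstr`, `ResourceUse`, `resourceStallCycles`, `structuralStallCycles`) is hypothetical. It also assumes that one hazard wait state corresponds to one scheduler cycle and that the two stall sources are combined with `max`, since both waits would elapse concurrently; the actual patch may combine them differently.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-ins for illustration only; these are NOT LLVM types.
struct ResourceUse {
  unsigned ResourceIdx; // index into ToyZone::NextFreeCycle
  bool Unbuffered;      // only unbuffered resources stall issue in this model
};

struct ToyInstr {
  std::vector<ResourceUse> Uses;
  // Stand-in for the value a hazard recognizer query such as
  // getHazardWaitStates() would report (assumed: 1 wait state == 1 cycle).
  unsigned HazardWaitStates;
};

struct ToyZone {
  unsigned CurrCycle;                  // current scheduling cycle
  std::vector<unsigned> NextFreeCycle; // per resource: first cycle it is free
};

// Cycles until every unbuffered resource the instruction needs is free.
unsigned resourceStallCycles(const ToyZone &Zone, const ToyInstr &MI) {
  unsigned Stall = 0;
  for (const ResourceUse &U : MI.Uses) {
    if (!U.Unbuffered)
      continue; // buffered resources do not block issue in this model
    assert(U.ResourceIdx < Zone.NextFreeCycle.size() && "unknown resource");
    unsigned Free = Zone.NextFreeCycle[U.ResourceIdx];
    if (Free > Zone.CurrCycle)
      Stall = std::max(Stall, Free - Zone.CurrCycle);
  }
  return Stall;
}

// Assumption: the instruction can issue once BOTH constraints are satisfied,
// so the structural stall is the larger of the two waits.
unsigned structuralStallCycles(const ToyZone &Zone, const ToyInstr &MI) {
  return std::max(resourceStallCycles(Zone, MI), MI.HazardWaitStates);
}
```

In this model, an instruction that needs an unbuffered resource which frees up at cycle 14, evaluated at cycle 10 with 2 outstanding hazard wait states, is reported as stalled for 4 cycles; a pending-queue heuristic could then prefer candidates with a smaller value.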


kerbowa commented Nov 26, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite.

@github-actions

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions h,cpp -- llvm/lib/Target/AMDGPU/AMDGPUMLSchedStrategy.cpp llvm/lib/Target/AMDGPU/AMDGPUMLSchedStrategy.h llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp llvm/lib/Target/AMDGPU/GCNSchedStrategy.h --diff_from_common_commit

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.
diff --git a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
index 048eeecac..10989b4e6 100644
--- a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+++ b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
@@ -68,7 +68,8 @@ protected:
   /// invisible to scheduling heuristics. However, in certain scenarios (such as
   /// avoiding register spilling), it may be beneficial to consider scheduling
   /// these not-yet-ready instructions.
-  virtual bool tryPendingCandidate(SchedCandidate &Cand, SchedCandidate &TryCand,
+  virtual bool tryPendingCandidate(SchedCandidate &Cand,
+                                   SchedCandidate &TryCand,
                                    SchedBoundary *Zone) const;
 
   void printCandidateDecision(const SchedCandidate &Current,

@github-actions

🐧 Linux x64 Test Results

  • 166036 tests passed
  • 2837 tests skipped
  • 3 tests failed

Failed Tests

LLVM

LLVM.CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -start-before=machine-scheduler -verify-misched -o - /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir | /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck -check-prefix=GCN /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -start-before=machine-scheduler -verify-misched -o - /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir
# note: command had no output on stdout or stderr
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck -check-prefix=GCN /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir
# .---command stderr------------
# | /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir:10:14: error: GCN-NEXT: is not on the line after the previous match
# |  ; GCN-NEXT: ; implicit-def: $vgpr25
# |              ^
# | <stdin>:17:2: note: 'next' match was here
# |  ; implicit-def: $vgpr25
# |  ^
# | <stdin>:9:25: note: previous match ended here
# |  ; implicit-def: $vgpr16
# |                         ^
# | <stdin>:10:1: note: non-matching line after previous match is here
# |  s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
# | ^
# | 
# | Input file: <stdin>
# | Check file: /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.exp.large.mir
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |          .
# |          .
# |          .
# |         12:  ; implicit-def: $vgpr18 
# |         13:  s_lshl_b32 s18, s17, 7 
# |         14:  ; implicit-def: $vgpr20 
# |         15:  v_add_lshl_u32 v69, v18, s18, 1 
# |         16:  v_add_u32_e32 v18, s17, v20 
# |         17:  ; implicit-def: $vgpr25 
# | next:10      !~~~~~~~~~~~~~~~~~~~~~~  error: match on wrong line
# |         18:  v_and_b32_e32 v18, 0x1fffffff, v18 
# |         19:  ; implicit-def: $sgpr16 
# |         20:  v_mul_lo_u32 v18, v18, s16 
# |         21:  v_lshl_add_u32 v25, s17, 4, v25 
# |         22:  ; implicit-def: $sgpr0_sgpr1_sgpr2_sgpr3_sgpr4_sgpr5_sgpr6_sgpr7_sgpr8_sgpr9_sgpr10_sgpr11_sgpr12_sgpr13_sgpr14_sgpr15 
# |          .
# |          .
# |          .
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

LLVM.CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -mtriple=amdgcn -mcpu=gfx90a -misched-cluster=0  < /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll | /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck -check-prefix=GCN /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -mtriple=amdgcn -mcpu=gfx90a -misched-cluster=0
# note: command had no output on stdout or stderr
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck -check-prefix=GCN /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
# .---command stderr------------
# | /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll:675:13: error: GCN-NEXT: is not on the line after the previous match
# | ; GCN-NEXT: s_waitcnt lgkmcnt(8)
# |             ^
# | <stdin>:538:2: note: 'next' match was here
# |  s_waitcnt lgkmcnt(8)
# |  ^
# | <stdin>:535:59: note: previous match ended here
# |  ; sched_group_barrier mask(0x00000100) size(40) SyncID(0)
# |                                                           ^
# | <stdin>:536:1: note: non-matching line after previous match is here
# |  v_mfma_f32_32x32x1f32 a[0:31], v2, v1, a[0:31]
# | ^
# | 
# | Input file: <stdin>
# | Check file: /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |           .
# |           .
# |           .
# |         533:  v_mfma_f32_32x32x1f32 a[32:63], v2, v1, a[32:63] 
# |         534:  v_add_u32_e32 v0, s1, v0 
# |         535:  ; sched_group_barrier mask(0x00000100) size(40) SyncID(0) 
# |         536:  v_mfma_f32_32x32x1f32 a[0:31], v2, v1, a[0:31] 
# |         537:  v_mfma_f32_32x32x1f32 a[64:95], v2, v1, a[64:95] 
# |         538:  s_waitcnt lgkmcnt(8) 
# | next:675      !~~~~~~~~~~~~~~~~~~~  error: match on wrong line
# |         539:  v_mfma_f32_32x32x1f32 a[96:127], v2, v1, a[96:127] 
# |         540:  s_waitcnt lgkmcnt(0) 
# |         541:  v_mfma_f32_32x32x1f32 a[128:159], v2, v1, a[128:159] 
# |         542:  s_nop 11 
# |         543:  ds_write_b128 v0, a[60:63] offset:112 
# |           .
# |           .
# |           .
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

LLVM.CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr.ll
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -verify-machineinstrs -mcpu=gfx942 -amdgpu-mfma-vgpr-form < /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr.ll | /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr.ll
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/llc -verify-machineinstrs -mcpu=gfx942 -amdgpu-mfma-vgpr-form
# note: command had no output on stdout or stderr
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr.ll
# .---command stderr------------
# | /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr.ll:372:15: error: CHECK-NEXT: expected string not found in input
# | ; CHECK-NEXT: v_mov_b64_e32 v[26:27], s[4:5]
# |               ^
# | <stdin>:890:18: note: scanning from here
# |  s_mov_b32 s5, s4
# |                  ^
# | <stdin>:891:2: note: possible intended match here
# |  v_mov_b64_e32 v[24:25], s[4:5]
# |  ^
# | 
# | Input file: <stdin>
# | Check file: /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr.ll
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             .
# |             .
# |             .
# |           885:  .p2align 8 
# |           886:  .type illegal_mfma_after_rewrite,@function 
# |           887: illegal_mfma_after_rewrite: ; @illegal_mfma_after_rewrite 
# |           888: ; %bb.0: ; %entry 
# |           889:  s_mov_b32 s4, 0 
# |           890:  s_mov_b32 s5, s4 
# | next:372'0                      X error: no match found
# |           891:  v_mov_b64_e32 v[24:25], s[4:5] 
# | next:372'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | next:372'1      ?                               possible intended match
# |           892:  ;;#ASMSTART 
# | next:372'0     ~~~~~~~~~~~~~
# |           893:  ; def s[0:3] 
# | next:372'0     ~~~~~~~~~~~~~~
# |           894:  ;;#ASMEND 
# | next:372'0     ~~~~~~~~~~~
# |           895:  v_mov_b32_e32 v16, 0x7fc00000 
# | next:372'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |           896:  v_mov_b64_e32 v[6:7], s[2:3] 
# | next:372'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |             .
# |             .
# |             .
# | >>>>>>
# `-----------------------------
# error: command failed with exit status: 1

--

If these failures are unrelated to your changes (for example tests are broken or flaky at HEAD), please open an issue at https://github.com/llvm/llvm-project/issues and add the infrastructure label.
