Bump compiler/amd-llvm with hotswap COMGR cherry-picks #5966

Closed
suryajasper wants to merge 2 commits into ROCm:main from suryajasper:hotswap-llvm-bump

@suryajasper

Member

Summary

Cherry-pick 19 hotswap COMGR commits from amd-staging onto TheRock's
current compiler/amd-llvm pin (2abe93d58c8). The LLVM/Clang/LLD
compiler baseline is unchanged — only amd/comgr/ files are modified.

The branch users/sujasper/therock-hotswap-cherrypick-v2 on
ROCm/llvm-project contains the cherry-picks. A follow-up PR will bump
rocm-systems and add the THEROCK_BUILD_COMGR_HOTSWAP option to
actually enable the hotswap build.

What is HotSwap?

HotSwap is an AMDGPU binary translation system in COMGR that rewrites
GPU kernel code objects at load time for gfx1250 B0-to-A0 stepping
compatibility. The patcher is inert on non-gfx1250 targets — gfx942
CI tests are unaffected.
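The "inert on non-gfx1250 targets" property can be pictured as a target gate that runs before any rewriting. The following is a minimal sketch under stated assumptions, not the COMGR implementation: `processorOf` and `needsHotswapPatching` are hypothetical names, and the sketch assumes the target is available as an ISA name string of the form `amdgcn-amd-amdhsa--gfx942:sramecc+:xnack-`. How the real patcher identifies the target is not shown in this PR.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: pull the processor name out of an ISA name such as
// "amdgcn-amd-amdhsa--gfx1250" or "amdgcn-amd-amdhsa--gfx942:sramecc+:xnack-".
// A bare processor name ("gfx1250") is accepted as well.
inline std::string processorOf(std::string_view isaName) {
  const std::size_t dashes = isaName.rfind("--");
  const std::string_view rest =
      (dashes == std::string_view::npos) ? isaName : isaName.substr(dashes + 2);
  // Drop any ":feature+/-" suffixes; substr(0, npos) keeps the whole string.
  return std::string(rest.substr(0, rest.find(':')));
}

// Hypothetical gate: only gfx1250 code objects are candidates for rewriting.
// Every other target is returned to the caller untouched.
inline bool needsHotswapPatching(std::string_view isaName) {
  return processorOf(isaName) == "gfx1250";
}
```

With a gate of this shape, a gfx942 code object never reaches the rewriting code, which is why the gfx94X CI tests would see no behavioral change.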

Cherry-picked commits

All 19 commits are already merged upstream on amd-staging:

| # | amd-staging SHA | PR | Description |
|---|---|---|---|
| 1 | fb7d9a433cd9 | #2476 | Guard operand access in bumpNextWaitDscnt |
| 2 | 75c6512d6baa | #2379 | WMMA instruction splitting for B0-to-A0 |
| 3 | 7587b27c7078 | #2438 | Transpiler scaffolding + COMGR_ENABLE_HOTSWAP_TRANSPILE option |
| 4 | dc693256ad2c | #2491 | Dispatcher vtable + .def registry, Windows support |
| 5 | 7fee2c7f8cc1 | #2370 | Tensor load trampoline patches |
| 6 | 3b177f1c39b4 | #2584 | Preserve drain s_wait_dscnt in bumpNextWaitDscnt |
| 7 | 4e092e80d6ef | #2583 | ElfView::getKernelLdsSize helper |
| 8 | aac106b7146e | #2302 | DS_ADDTID trampoline patch |
| 9 | 412f238f2435 | #2281 | Split non-stride64 DS 2-address ops |
| 10 | 52000a48805a | #2328 | F32 ↔ UE5M3 scratch patches |
| 11 | 4299d1f6e6c5 | #2844 | NFC: use find_last_unset_in |
| 12 | 42ece7d0b405 | #2731 | Scaled WMMA scratch patches |
| 13 | 1e764caf137a | #2873 | Read SGPR count from msgpack metadata |
| 14 | da9e9b4749ae | #2929 | Shared ELF test helper |
| 15 | cdc39e017156 | #2936 | HSA_TOOLS_LIB runtime tool |
| 16 | 19b1bfadeb12 | #2945 | Drain after ds_*_2addr split |
| 17 | d31eb22b0d18 | #2947 | Skip cluster_load with SADDR addressing |
| 18 | 840a2a0fe321 | #2956 | Add missing deps to hotswap unit-tests |
| 19 | 2ad4c8ac0d5d | #2526 | Fix LLVM include paths in RaiserScaffoldingTests |

Why cherry-pick instead of bumping to amd-staging HEAD?

amd-staging is ~5800 commits ahead of TheRock's current pin. A previous
attempt (PR #5949) showed that the full bump introduces intractable test
regressions (hipRTC clearLLVMOptions() bug, SPIRV translator
incompatibility) unrelated to hotswap. Cherry-picking keeps the compiler
baseline identical and isolates validation to just the hotswap code.

Local validation (MI300X, gfx942)

  • 50/50 hotswap COMGR lit tests pass
  • Build succeeds with COMGR_ENABLE_HOTSWAP_TRANSPILE=ON and HOTSWAP_BUILD_TOOL=ON
  • Both libamd_comgr.so and libamd_comgr_hotswap_tool.so produced
  • Hotswap patcher is inert on gfx942 — no behavioral change for gfx94X tests

Test plan

  • CI compiler-runtime stage builds successfully
  • CI runtime-tests stage builds successfully
  • gfx94X-dcgpu hip-tests pass with no regressions
  • Windows build not broken (hotswap tool is not built without the enable flag)

🤖 Generated with Claude Code

Update compiler/amd-llvm to users/sujasper/therock-hotswap-cherrypick-v2:
TheRock main pin (2abe93d58c8) + 19 cherry-picked hotswap COMGR commits
from amd-staging. The LLVM/Clang/LLD compiler baseline is unchanged.

Cherry-picked commits (oldest first):

 1. fb7d9a433cd9 [Comgr] Guard operand access in bumpNextWaitDscnt (ROCm#2476)
 2. 75c6512d6baa [AMDGPU] comgr: WMMA instruction splitting (ROCm#2379)
 3. 7587b27c7078 [Comgr][hotswap] Transpiler scaffolding (ROCm#2438)
    — adds COMGR_ENABLE_HOTSWAP_TRANSPILE option and src/hotswap/ subdir
 4. dc693256ad2c [AMDGPU] comgr: Dispatcher vtable + .def registry (ROCm#2491)
 5. 7fee2c7f8cc1 [AMDGPU] comgr: Tensor load trampoline patches (ROCm#2370)
 6. 3b177f1c39b4 [AMDGPU] comgr: Preserve drain s_wait_dscnt (ROCm#2584)
 7. 4e092e80d6ef [AMDGPU] comgr: ElfView::getKernelLdsSize helper (ROCm#2583)
 8. aac106b7146e [AMDGPU] comgr: DS_ADDTID trampoline patch (ROCm#2302)
 9. 412f238f2435 [AMDGPU] comgr: Split non-stride64 DS 2-addr ops (ROCm#2281)
10. 52000a48805a [AMDGPU] comgr: F32 <-> UE5M3 scratch patches (ROCm#2328)
11. 4299d1f6e6c5 [NFC][Comgr][Hotswap] Use find_last_unset_in (ROCm#2844)
12. 42ece7d0b405 [AMDGPU] comgr: Scaled WMMA scratch patches (ROCm#2731)
13. 1e764caf137a [AMDGPU] comgr: Read SGPR from msgpack metadata (ROCm#2873)
14. da9e9b4749ae [Comgr][hotswap][test] Shared ELF test helper (ROCm#2929)
15. cdc39e017156 [Comgr][hotswap] HSA_TOOLS_LIB runtime tool (ROCm#2936)
16. 19b1bfadeb12 [Comgr][hotswap] Drain after ds_*_2addr split (ROCm#2945)
17. d31eb22b0d18 [Comgr][hotswap] Skip cluster_load SADDR (ROCm#2947)
18. 840a2a0fe321 [Comgr] Add missing deps to hotswap unit-tests (ROCm#2956)
19. 2ad4c8ac0d5d Fix LLVM include paths in RaiserScaffoldingTests (ROCm#2526)

All commits are already merged upstream on amd-staging. Only amd/comgr/
files are touched — no LLVM, Clang, or LLD changes. The hotswap patcher
is inert on non-gfx1250 targets (gfx942 CI tests are unaffected).

Validated locally on MI300X (gfx942): 50/50 hotswap lit tests pass.

Co-Authored-By: Claude <noreply@anthropic.com>

@lamb-j lamb-j left a comment

Contributor

This prepares the A0-B0 portion of hotswap.

The feature is currently flag-gated and OFF by default. A follow-up PR to TheRock will turn it ON.

Cherry-pick ROCm/llvm-project#2964 onto the hotswap branch: replaces
llvm::StringRef and llvm::ELF usage in comgr-hotswap-tool-detect.h with
std::string and raw ELF constants. This ensures the hotswap tool can
dlopen cleanly when COMGR is built with fully-hidden static LLVM symbols.

Co-Authored-By: Claude <noreply@anthropic.com>
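To illustrate what "std::string and raw ELF constants" can look like in a header that must not pull in LLVM symbols, here is a small sketch. It is not the contents of comgr-hotswap-tool-detect.h: the namespace `elfconst` and the function `looksLikeAmdgpuElf` are hypothetical, and what the real header detects is not shown in this PR. The constants themselves come from the ELF format (magic bytes, `EI_CLASS`, `EI_DATA`, the `e_machine` field at byte offset 18) and from the registered `EM_AMDGPU` machine value of 224.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Raw ELF constants spelled out locally so that no llvm::ELF header
// (and therefore no LLVM symbol) is needed.
namespace elfconst {
constexpr unsigned char kMagic[4] = {0x7f, 'E', 'L', 'F'};
constexpr std::size_t kEiClass = 4;         // index of EI_CLASS in e_ident
constexpr std::size_t kEiData = 5;          // index of EI_DATA in e_ident
constexpr unsigned char kClass64 = 2;       // ELFCLASS64
constexpr unsigned char kDataLsb = 1;       // ELFDATA2LSB (little-endian)
constexpr std::size_t kMachineOffset = 18;  // e_machine, 2 bytes
constexpr std::uint16_t kEmAmdgpu = 224;    // EM_AMDGPU
constexpr std::size_t kMinHeader = 20;      // bytes needed to read e_machine
} // namespace elfconst

// Hypothetical check: does this buffer start with a 64-bit little-endian
// ELF header whose machine type is AMDGPU? Takes std::string rather than
// llvm::StringRef so the header depends only on the standard library.
inline bool looksLikeAmdgpuElf(const std::string &bytes) {
  using namespace elfconst;
  if (bytes.size() < kMinHeader)
    return false;
  for (std::size_t i = 0; i < 4; ++i)
    if (static_cast<unsigned char>(bytes[i]) != kMagic[i])
      return false;
  if (static_cast<unsigned char>(bytes[kEiClass]) != kClass64)
    return false;
  if (static_cast<unsigned char>(bytes[kEiData]) != kDataLsb)
    return false;
  // Assemble the little-endian 16-bit e_machine field byte by byte.
  const auto lo = static_cast<unsigned char>(bytes[kMachineOffset]);
  const auto hi = static_cast<unsigned char>(bytes[kMachineOffset + 1]);
  const auto machine = static_cast<std::uint16_t>(lo | (hi << 8));
  return machine == kEmAmdgpu;
}
```

Because a function like this references only standard-library types, a shared library that contains it does not need any LLVM symbol to be visible when it is loaded with dlopen. That is the property the commit message describes for builds with fully hidden static LLVM symbols.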
@lamb-j

lamb-j commented Jun 19, 2026

Contributor

See #5989

@lamb-j lamb-j closed this Jun 19, 2026
@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage Jun 19, 2026