[Bug]: CK WMMA kernels fail on gfx1150 — amd_wmma.hpp missing from gfx115x enablement #3724

@Peterc3-dev

Description

CK WMMA kernels crash at runtime on gfx1150 (Ryzen AI 9 HX 370, Radeon 890M, RDNA 3.5 Strix Point):

hip_code_object.cpp:400: StatCO::getStatFunc: Assertion 'err == hipSuccess' failed

Root Cause

PR #2065 added gfx115x compile-time support to ck.hpp, device_prop.hpp, and ck_tile/core/config.hpp, but missed include/ck/utility/amd_wmma.hpp.

In ROCm 7.2.x, amd_wmma.hpp lines 12-14:

#if defined(__gfx1100__) || defined(__gfx1101__) || defined(__gfx1102__) || \
    defined(__gfx1103__) || defined(__gfx11_generic__)
#define __gfx11__

gfx1150/1151/1152 are missing, so __gfx11__ is never defined when compiling for these targets. All __builtin_amdgcn_wmma_* intrinsics are gated on __gfx11__, so the WMMA wrappers compile to no-ops (the else branch assigns to ignore). MIOpen still selects a CK WMMA solver because is_gfx11_supported() correctly returns true from ck.hpp, but the selected kernel contains no WMMA instructions and the code object fails to load.
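To illustrate the gating pattern described above, here is a minimal host-side sketch. It is not CK's code: `demo_wmma_or_noop` and `demo_check_noop_path` are hypothetical names, the multiply-add stands in for a `__builtin_amdgcn_wmma_*` call, and `std::ignore` stands in for CK's own `ignore`. When `__gfx11__` is not defined (as on a host compiler, or on gfx1150 with the release header), only the no-op branch is compiled.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical wrapper mirroring the "#if defined(__gfx11__) ... #else ignore = ..." shape.
// Returns true only when the "real" branch was compiled in.
inline bool demo_wmma_or_noop(float a, float b, float& acc) {
#if defined(__gfx11__)
    acc += a * b;      // stand-in for a __builtin_amdgcn_wmma_* intrinsic
    return true;
#else
    std::ignore = a;   // arguments are discarded, nothing is computed
    std::ignore = b;
    std::ignore = acc;
    return false;
#endif
}

// Self-check: without __gfx11__, the accumulator must come back untouched.
inline void demo_check_noop_path() {
    float acc = 1.0f;
    const bool used = demo_wmma_or_noop(2.0f, 3.0f, acc);
    (void)used;
#if !defined(__gfx11__)
    assert(!used && acc == 1.0f);
#else
    assert(used && acc == 7.0f);
#endif
}
```

The point of the sketch is that the no-op branch still compiles cleanly, so nothing flags the missing target at build time; the problem only surfaces later, at runtime.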

The Fix Already Exists on develop

On develop, amd_wmma.hpp correctly includes gfx115x:

#if defined(__gfx1100__) || defined(__gfx1101__) || defined(__gfx1102__) || \
    defined(__gfx1103__) || defined(__gfx1150__) || defined(__gfx1151__) || \
    defined(__gfx1152__) || defined(__gfx1153__) || defined(__gfx11_generic__)
#define __gfx11__

Request: Backport this fix to release/rocm-rel-7.2.
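The difference between the two gates can be checked mechanically. The sketch below is only a model: the target names are copied from the two snippets quoted in this issue (with the surrounding underscores dropped), while `kDemoRelease72`, `kDemoDevelop`, and `demo_defines_gfx11` are hypothetical names introduced for illustration. It assumes a C++17 compiler.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Targets that define __gfx11__ in the ROCm 7.2.x header, per the snippet quoted above.
constexpr std::array<std::string_view, 5> kDemoRelease72 = {
    "gfx1100", "gfx1101", "gfx1102", "gfx1103", "gfx11_generic"};

// Targets that define __gfx11__ on develop, per the snippet quoted above.
constexpr std::array<std::string_view, 9> kDemoDevelop = {
    "gfx1100", "gfx1101", "gfx1102", "gfx1103", "gfx1150",
    "gfx1151", "gfx1152", "gfx1153", "gfx11_generic"};

// True if compiling for `target` would define __gfx11__ under the given gate.
template <std::size_t N>
constexpr bool demo_defines_gfx11(const std::array<std::string_view, N>& gate,
                                  std::string_view target) {
    for (std::string_view entry : gate) {
        if (entry == target) return true;
    }
    return false;
}

// gfx1150 is only covered once the develop gate is in place.
static_assert(!demo_defines_gfx11(kDemoRelease72, "gfx1150"), "release gate omits gfx1150");
static_assert(demo_defines_gfx11(kDemoDevelop, "gfx1150"), "develop gate covers gfx1150");
```

The same check applies to gfx1151, gfx1152, and gfx1153: each fails against the release list and passes against the develop list.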

Reproduction

  1. Build CK with -DGPU_TARGETS=gfx1150 on ROCm 7.2.0
  2. Run any WMMA-based kernel (e.g., MIOpen CK convolution solver)
  3. Observe StatCO::getStatFunc assertion failure

Environment

  • Hardware: AMD Ryzen AI 9 HX 370, Radeon 890M (gfx1150)
  • OS: CachyOS, kernel 6.19.5
  • ROCm: 7.2.0
  • MIOpen: Built from amd-develop with gfx1150 in legacy CK whitelist
