hipFuncGetAttributes returns inaccurate maxThreadsPerBlock #1662
Comments
here's a reproducer:
returns
on gfx-900 with hcc (HIP on Ubuntu 16.04, hip_base version 2.8.19361.4078-rocm-rel-2.9-6-cbe6b65)
mangupta pushed a commit that referenced this issue on Mar 18, 2020
This PR ensures both that the maxThreadsPerBlock returned by hipFuncGetAttributes is a multiple of the warp size and that the register usage of the maximum block does not exceed the number of available registers. Fixes #1662
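The two constraints described in that commit message can be illustrated with a small host-side sketch. This is not the code from the referenced commit: the helper name `clampMaxThreadsPerBlock` and all of its parameters are hypothetical, and the register model is deliberately simplified to a single per-thread register count against a single pool, without separating SGPRs from VGPRs.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, NOT the actual HIP implementation.
// hwMaxThreads  : device limit on threads per block (assumed, caller-supplied)
// warpSize      : wavefront/warp size of the device (assumed, caller-supplied)
// regsPerThread : registers the kernel uses per thread (simplified model)
// availableRegs : registers available to one block (simplified model)
inline int clampMaxThreadsPerBlock(int hwMaxThreads, int warpSize,
                                   int regsPerThread, int availableRegs) {
    assert(warpSize > 0 && hwMaxThreads >= 0 && availableRegs >= 0);

    int limit = hwMaxThreads;
    if (regsPerThread > 0) {
        // Largest thread count whose total register use still fits.
        limit = std::min(limit, availableRegs / regsPerThread);
    }
    // Round down so the result is a whole number of warps.
    return (limit / warpSize) * warpSize;
}
```

For instance, with the 65536 registers mentioned in the issue, a warp size of 64, and a kernel assumed to use 100 registers per thread, the register bound is 655 threads, which rounds down to 640.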
With the HCC backend, using hipFuncGetAttributes on a __global__ function does not always return an accurate upper bound of the maximum block size. Neither does it return a multiple of the warp size. As a consequence, when iterating over possible kernel parameters allowed by attr.maxThreadsPerBlock, I get an error:
### HCC STATUS_CHECK Error: HSA_STATUS_ERROR_INVALID_ISA (0x100f) at file:mcwamp_hsa.cpp line:1194
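For context, a sweep of this kind might enumerate candidate block sizes roughly as sketched below. The helper `candidateBlockSizes` is hypothetical: `reportedMax` stands in for the value read from attr.maxThreadsPerBlock and `warpSize` for the device's warp size. Restricting the sweep to whole warps only sidesteps the non-multiple problem; it does not help if the reported maximum itself is too large for the kernel's register usage.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper for a block-size sweep, not part of HIP.
// reportedMax : stand-in for attr.maxThreadsPerBlock (caller-supplied)
// warpSize    : warp/wavefront size of the device (caller-supplied)
inline std::vector<int> candidateBlockSizes(int reportedMax, int warpSize) {
    assert(warpSize > 0);
    std::vector<int> sizes;
    // Only whole-warp block sizes that do not exceed the reported maximum.
    for (int n = warpSize; n <= reportedMax; n += warpSize) {
        sizes.push_back(n);
    }
    return sizes;
}
```

Each returned size would then be tried as the block dimension of a launch, with the launch status checked after every attempt.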
I believe this is because the code in hip_module.cpp hardcodes a fixed number of registers per CU (65536) and does not distinguish between the SGPR and VGPR usage of a kernel function. Instead, it should do something more sensible, as in hipOccupancyMaxPotentialBlockSize.