@okakarpa
Collaborator

Cherry-pick of #2328

If compiling with HIPCC (i.e., `__HIPCC__` is
[defined](https://rocm.docs.amd.com/projects/HIP/en/docs-develop/how-to/hip_porting_guide.html#compiler-defines-summary)):
* Define `C10_WARP_SIZE` to be non-constexpr `at::cuda::warp_size()` for
the host-compilation pass (as opposed to the `static constexpr int
C10_WARP_SIZE = 1;` set in
538a57d)
* Define `C10_WARP_SIZE` to be constexpr `64` for `__GFX9__`, and `32`
otherwise, for the device-compilation pass

If not compiling with HIPCC:
* Define `C10_WARP_SIZE` to be non-constexpr `at::cuda::warp_size()` 

For host-compilation cases where we need a constexpr value of the warp size
(e.g., launch bounds), use `C10_WARP_SIZE_STATIC`, defined as `64` (for
launch bounds it is better to err on the side of 64)
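
The selection logic described above could be sketched roughly as follows. This is an illustration only, not the actual c10 header: the `at::cuda::warp_size()` body here is a hypothetical stand-in that returns a fixed value so the snippet compiles without ROCm, `__HIP_DEVICE_COMPILE__` is assumed as the way to tell the device pass from the host pass, and `kLaunchBoundThreads` and `warps_needed` are made-up names for the usage example.

```cpp
#include <cassert>

namespace at { namespace cuda {
// Hypothetical stand-in: the real at::cuda::warp_size() queries the current
// device at runtime. A fixed 64 is assumed here purely for illustration.
inline int warp_size() { return 64; }
}}  // namespace at::cuda

#if defined(__HIPCC__)
  #if defined(__HIP_DEVICE_COMPILE__)
    // Device-compilation pass: the target architecture is known, so the
    // warp size can be a compile-time constant.
    #if defined(__GFX9__)
      #define C10_WARP_SIZE 64
    #else
      #define C10_WARP_SIZE 32
    #endif
  #else
    // Host-compilation pass: not a constant expression, resolved at runtime.
    #define C10_WARP_SIZE (at::cuda::warp_size())
  #endif
#else
  // Not compiling with HIPCC: also resolved at runtime.
  #define C10_WARP_SIZE (at::cuda::warp_size())
#endif

// Compile-time value for host code that needs a constant (e.g. launch bounds).
#define C10_WARP_SIZE_STATIC 64

// Usage on the host: a constant expression must use the static macro...
constexpr int kLaunchBoundThreads = C10_WARP_SIZE_STATIC * 4;

// ...while ordinary arithmetic can use the runtime value.
inline int warps_needed(int threads) {
  const int ws = C10_WARP_SIZE;
  assert(ws > 0);
  return (threads + ws - 1) / ws;  // round up to whole warps
}
```

In host code, something like `constexpr int n = C10_WARP_SIZE;` would no longer compile under this scheme, which is the case `C10_WARP_SIZE_STATIC` is meant to cover.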

Fixes SWDEV-542227

---------

Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>
@rocm-repo-management-api

rocm-repo-management-api bot commented Jul 14, 2025

Jenkins build for f2c120147200e966fda0c9ab6256bb4c56ddc677 commit finished as FAILURE

@jithunnair-amd jithunnair-amd changed the title from "[AUTOGENERATED] [rocm7.0_internal_testing] [release/2.6] Improve C10_WARP_SIZE compatibility" to "[AUTOGENERATED] [rocm7.0_internal_testing] Improve C10_WARP_SIZE compatibility" Jul 16, 2025
@jithunnair-amd jithunnair-amd marked this pull request as ready for review July 16, 2025 04:12
@jithunnair-amd jithunnair-amd merged commit 0d083bb into rocm7.0_internal_testing Jul 16, 2025
0 of 2 checks passed
@jithunnair-amd jithunnair-amd deleted the autogenerated/rocm7.0_internal_testing_cherry-pick_pr-2328 branch July 16, 2025 04:12