
Conversation

@okakarpa (Collaborator)

Cherry-pick of #256

* replace c10_warp_size in fused rope

* replace c10_warp_size in fused softmax

* replace c10_warp_size in group batch norm

* replace c10_warp_size in multiheadattention

* replace c10_warp_size in transducer

* replace c10_warp_size in xentropy

* replace c10_warp_size in sync batch normalization

* replace c10_warp_size in group batch norm

* replace warp_size in multihead attention
@okakarpa mentioned this pull request Jul 15, 2025
@amd-sriram self-assigned this Jul 15, 2025
@amd-sriram marked this pull request as ready for review July 15, 2025 16:17
@amd-sriram merged commit f956a66 into release/1.6.0 Jul 15, 2025
@amd-sriram deleted the autogenerated/release/1.6.0_cherry-pick_pr-256 branch July 15, 2025 16:17
jithunnair-amd pushed a commit to ROCm/pytorch that referenced this pull request Jul 22, 2025
Fixing the C10_WARP_SIZE issue by replacing the macros with at::cuda::warp_size() - ROCm/apex#243

[Replaced warpsize with C10_WARP_SIZE](ROCm/apex@7d9850a) - ROCm/apex#252

Fix all warp size in apex - ROCm/apex#259

Fixes: https://ontrack-internal.amd.com/browse/SWDEV-541725
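
For context, the referenced commits replace the compile-time C10_WARP_SIZE macro with the runtime query at::cuda::warp_size() from ATen/cuda/CUDAContext.h. Below is a minimal sketch of that pattern, not the actual apex code; the function name and launch shape are illustrative assumptions.

```cpp
// Sketch only: on ROCm the wavefront size is 32 or 64 depending on the GPU,
// so a compile-time constant such as C10_WARP_SIZE can be wrong for the
// device actually in use. Host code instead queries the current device at
// runtime via at::cuda::warp_size().
#include <ATen/ATen.h>
#include <ATen/cuda/CUDAContext.h>

void launch_rowwise_reduction(const at::Tensor& input) {
  // Runtime warp/wavefront size of the current device.
  const int warp_size = at::cuda::warp_size();

  // Hypothetical configuration: one warp per row, a few rows per block.
  const int rows_per_block = 4;
  dim3 block(warp_size, rows_per_block);
  dim3 grid((input.size(0) + rows_per_block - 1) / rows_per_block);

  // rowwise_reduction_kernel<<<grid, block, 0,
  //     at::cuda::getCurrentCUDAStream()>>>(...);  // illustrative launch
}
```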