@amd-sriram amd-sriram commented Jul 8, 2025

1. [master] Added AITER as a submodule and used it in fused_rope.py - ROCm/apex@d533e3f

2. Stopped using warpSize as a constexpr in nhwc_batch_norm_kernel.h. Each use of the variable is replaced with the correct value for the architecture we're running on. - ROCm/apex@7f38d9d

3. Replaced c10_warp_size with platform-based warp_size values.
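For reviewers unfamiliar with the warp size issue, here is a minimal sketch of what "platform-based warp size" means on the host side. It is an illustration only, not the code from the commits above: `warp_size_for_arch` and `warps_per_block` are hypothetical names, and the mapping assumes gfx9* (GCN/CDNA) parts use 64-wide wavefronts while everything else (RDNA gfx10 and later, NVIDIA sm_*) uses 32.

```cpp
#include <string>

// Hypothetical helper, not part of apex or PyTorch: map a GPU architecture
// name to the warp (wavefront) width to use in host-side launch math.
// Assumption: gfx9* (GCN/CDNA) runs 64-wide wavefronts; every other name
// (RDNA gfx10+, NVIDIA sm_*) is treated as 32 wide.
inline int warp_size_for_arch(const std::string& arch) {
    const std::string wave64_prefix = "gfx9";
    const bool is_wave64 =
        arch.compare(0, wave64_prefix.size(), wave64_prefix) == 0;
    return is_wave64 ? 64 : 32;
}

// Number of warps needed to cover `threads` threads, rounded up.
inline int warps_per_block(int threads, int warp_size) {
    return (threads + warp_size - 1) / warp_size;
}
```

The point of the sketch is that the width is looked up at run time, so a value such as `warps_per_block(512, warp_size_for_arch("gfx90a"))` cannot be baked in as a compile-time constant that is shared across architectures.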

Fixes

  1. https://ontrack-internal.amd.com/browse/SWDEV-496182
  2. https://ontrack-internal.amd.com/browse/SWDEV-541770
  3. https://ontrack-internal.amd.com/browse/SWDEV-541725

@amd-sriram amd-sriram self-assigned this Jul 8, 2025
@pruthvistony pruthvistony merged commit c3f758e into rocm7.0_internal_testing Jul 8, 2025
0 of 2 checks passed
@pruthvistony pruthvistony deleted the amd-sriram-patch-4 branch July 8, 2025 16:09