@jerrymannil
Collaborator

@jerrymannil jerrymannil commented Aug 26, 2025

This change removes the need for fences in global_reduce by converting the stores to reduce_buffer[] into atomics with return. This is crucial for performance on architectures with split caches (e.g. MI300), where fences are inherently costly.
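For readers who do not know the pattern, here is a rough host-side C++ analogy of a "last block finishes the reduction" scheme, in which each worker publishes its partial result with a value-returning atomic instead of a plain store followed by a separate fence. It is only a sketch under stated assumptions: `staged_reduce`, `staging`, and `arrived` are hypothetical names, each `std::thread` stands in for one GPU thread block, and none of this is taken from the actual kernel code or models GPU cache behaviour.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Host-side analogy only: each std::thread plays the role of one GPU block.
// All names are hypothetical and do not come from the PyTorch sources.
long long staged_reduce(const std::vector<int>& input, std::size_t num_blocks) {
  assert(num_blocks > 0);

  // Stand-in for reduce_buffer[]: one staging slot per block.
  std::vector<std::atomic<long long>> staging(num_blocks);
  for (auto& slot : staging) slot.store(0, std::memory_order_relaxed);

  // Stand-in for the arrival counter that identifies the last block.
  std::atomic<std::size_t> arrived{0};
  long long result = 0;  // written only by the last block to arrive

  auto block = [&](std::size_t b) {
    // Step 1: each block reduces its own strided slice of the input.
    long long partial = 0;
    for (std::size_t i = b; i < input.size(); i += num_blocks) {
      partial += input[i];
    }

    // Step 2: publish the partial with an atomic that returns a value
    // (exchange), rather than a plain store plus a standalone fence.
    long long previous = staging[b].exchange(partial, std::memory_order_relaxed);
    (void)previous;  // the returned value is not needed in this sketch

    // Step 3: announce arrival. In this host analogy the acq_rel counter
    // update is what orders the staging writes before the final reads.
    std::size_t ticket = arrived.fetch_add(1, std::memory_order_acq_rel);
    if (ticket + 1 == num_blocks) {
      // Last block to arrive combines every staged partial.
      long long total = 0;
      for (auto& slot : staging) total += slot.load(std::memory_order_relaxed);
      result = total;
    }
  };

  std::vector<std::thread> workers;
  for (std::size_t b = 0; b < num_blocks; ++b) workers.emplace_back(block, b);
  for (auto& w : workers) w.join();
  return result;
}
```

Calling `staged_reduce({1, 2, 3, 4}, 2)` returns 10. The sketch only shows the shape of the pattern; the performance argument in the description is specific to GPU memory hierarchies and cannot be reproduced on a CPU.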

cherry-pick of pytorch#161180

Cherry-picked to release/2.8 branch via #2585

Cherry-picked to rocm7.1_internal_testing branch via #2586

@jerrymannil jerrymannil self-assigned this Aug 26, 2025
@jerrymannil jerrymannil merged commit cbf75ac into release/2.7 Aug 26, 2025
@jerrymannil jerrymannil deleted the jerrymannil_patch branch August 26, 2025 20:47
@jerrymannil
Collaborator Author

! cherry-pick --onto release/2.8 rocm7.1_internal_testing

dhonnappa-amd pushed a commit that referenced this pull request Aug 27, 2025
This change removes the need for fences in global_reduce by converting the
stores to reduce_buffer[] into atomics with return. This is crucial for
performance on architectures with split caches (e.g. MI300), where fences
are inherently costly.

cherry-pick of pytorch#161180
dhonnappa-amd pushed a commit that referenced this pull request Aug 27, 2025
This change removes the need for fences in global_reduce by converting the
stores to reduce_buffer[] into atomics with return. This is crucial for
performance on architectures with split caches (e.g. MI300), where fences
are inherently costly.

cherry-pick of pytorch#161180
@dhonnappa-amd

Created branch autogenerated/release/2.8_cherry-pick_pr-2584 and #2585

Created branch autogenerated/rocm7.1_internal_testing_cherry-pick_pr-2584 and #2586

Comment processed by Build

jerrymannil added a commit that referenced this pull request Aug 27, 2025
Cherry-pick of #2584

Co-authored-by: Jerry Mannil <65309407+jerrymannil@users.noreply.github.com>
jerrymannil added a commit that referenced this pull request Aug 27, 2025
…uce (#2586)

Cherry-pick of #2584

Co-authored-by: Jerry Mannil <65309407+jerrymannil@users.noreply.github.com>
jerrymannil added a commit that referenced this pull request Sep 5, 2025
…uce (#2586)

Cherry-pick of #2584

Co-authored-by: Jerry Mannil <65309407+jerrymannil@users.noreply.github.com>

3 participants