dhonnappa-amd

Cherry-pick of #2597

cherry-pick of pytorch#161700

Our compiler generates inefficient code for the offsetCalc in
certain situations. The root cause has not yet been identified. For
now, specializing the loop unrolling on 'dims' noticeably improves perf.
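
For readers unfamiliar with the technique, here is a minimal host-side C++ sketch of what specializing an offset calculation on 'dims' can look like. It is not the code from this PR: `OffsetCalc`, `kMaxDims`, `offset_generic`, `offset_fixed`, and `offset_specialized` are hypothetical names chosen for illustration, and the set of specialized cases (1 to 3) is an assumption. The idea is that a loop whose trip count is a template parameter can be fully unrolled by the compiler, while the generic loop has to check a runtime `dims` on every iteration.

```cpp
#include <array>
#include <cstdint>

// Hypothetical upper bound on tensor rank, for illustration only.
constexpr int kMaxDims = 8;

// Hypothetical stand-in for an offset calculator: per-dimension sizes
// and strides, plus the number of dimensions actually in use.
struct OffsetCalc {
  int dims;
  std::array<uint32_t, kMaxDims> sizes;
  std::array<uint32_t, kMaxDims> strides;
};

// Generic path: the loop bound is a compile-time maximum, but the exit
// condition depends on the runtime value of `dims`.
inline uint32_t offset_generic(const OffsetCalc& c, uint32_t linear_idx) {
  uint32_t offset = 0;
  for (int d = 0; d < kMaxDims; ++d) {
    if (d == c.dims) break;
    offset += (linear_idx % c.sizes[d]) * c.strides[d];
    linear_idx /= c.sizes[d];
  }
  return offset;
}

// Specialized path: `Dims` is known at compile time, so the compiler can
// fully unroll the loop and drop the per-iteration exit check.
template <int Dims>
inline uint32_t offset_fixed(const OffsetCalc& c, uint32_t linear_idx) {
  uint32_t offset = 0;
  for (int d = 0; d < Dims; ++d) {
    offset += (linear_idx % c.sizes[d]) * c.strides[d];
    linear_idx /= c.sizes[d];
  }
  return offset;
}

// Dispatch on the runtime `dims` to a fixed-trip-count instantiation,
// falling back to the generic loop for ranks without a specialization.
inline uint32_t offset_specialized(const OffsetCalc& c, uint32_t linear_idx) {
  switch (c.dims) {
    case 1: return offset_fixed<1>(c, linear_idx);
    case 2: return offset_fixed<2>(c, linear_idx);
    case 3: return offset_fixed<3>(c, linear_idx);
    default: return offset_generic(c, linear_idx);
  }
}
```

Both paths compute the same offset for any valid input; only the shape of the generated loop differs. Whether this helps on a given compiler and target has to be measured.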

Fixes SWDEV-545713, SWDEV-545710
@jerrymannil jerrymannil marked this pull request as ready for review September 3, 2025 20:17
@jerrymannil jerrymannil self-assigned this Sep 3, 2025
@jerrymannil jerrymannil merged commit 681e60e into rocm7.1_internal_testing Sep 3, 2025
1 check passed
@jerrymannil jerrymannil deleted the autogenerated/rocm7.1_internal_testing_cherry-pick_pr-2597 branch September 3, 2025 20:17
jerrymannil added a commit that referenced this pull request Sep 5, 2025
…ptimization (#2600)

Cherry-pick of #2597

Co-authored-by: Jerry Mannil <65309407+jerrymannil@users.noreply.github.com>