@okakarpa
Collaborator

Cherry-pick of #2374

…2432) (#2374)

Cherry-pick of pytorch@e4adf5d

We need -fgpu-rdc for projects such as DeepEP + rocSHMEM. The default of
-fno-gpu-rdc doesn't work for such cases.

As per pytorch#152432 (comment):
"rocshmem shares the same global variable in different files, as deepEP
uses CUDAExtention to build the project
https://github.com/deepseek-ai/DeepEP/blob/65e2a700f0330f3fb1c26f49a0250d1f9d0ac1e3/setup.py#L51
and depends on rocshmem, this -fgpu-rdc is needed. The current logic in
Pytorch prevents users from overriding this flag."
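
The override behaviour described above can be illustrated with a minimal sketch. This is not PyTorch's actual implementation: the function name `resolve_hipcc_flags` and both constants are hypothetical, chosen for illustration only. The sketch assumes the default is spelled `-fno-gpu-rdc` (the clang spelling) and that a user-supplied `-fgpu-rdc` should suppress it.

```python
# Hypothetical sketch of "let the user override the RDC default".
# Names below are illustrative assumptions, not PyTorch internals.
DEFAULT_RDC_FLAG = "-fno-gpu-rdc"   # assumed default: relocatable device code off
OVERRIDE_RDC_FLAG = "-fgpu-rdc"     # user opt-in: relocatable device code on


def resolve_hipcc_flags(user_flags):
    """Return a copy of user_flags, adding the no-RDC default only when
    the user has not already chosen an RDC setting."""
    flags = list(user_flags)
    if OVERRIDE_RDC_FLAG not in flags and DEFAULT_RDC_FLAG not in flags:
        flags.append(DEFAULT_RDC_FLAG)
    return flags


if __name__ == "__main__":
    # No RDC preference given: the default is appended.
    print(resolve_hipcc_flags(["-O3"]))
    # User asked for RDC (e.g. a DeepEP + rocSHMEM build): default is skipped.
    print(resolve_hipcc_flags(["-O3", "-fgpu-rdc"]))
```

Under this sketch, a project that shares device-side globals across translation units would pass `-fgpu-rdc` in its own compile arguments and would no longer have the opposite flag forced on it.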

Pull Request resolved: pytorch#152432
Approved by: https://github.com/jeffdaily

Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
@rocm-repo-management-api

Jenkins build for 0ce51328ec7f35ac40fee649e0e245352c806846 commit is in progress
Links: Blue Ocean view / Build artifacts

@jerrymannil jerrymannil marked this pull request as ready for review July 16, 2025 17:26
@jerrymannil jerrymannil merged commit a97f45c into rocm7.0_internal_testing Jul 16, 2025
0 of 2 checks passed
@jerrymannil jerrymannil deleted the autogenerated/rocm7.0_internal_testing_cherry-pick_pr-2374 branch July 16, 2025 17:27