[symm_mem] Move all symm mem code into a dedicated folder #155573

fduwjj · 2025-06-10T17:02:03Z

Stack from ghstack (oldest at bottom):

-> [symm_mem] Move all symm mem code into a dedicated folder #155573

We have reached a point where many files relate to symmetric memory and they are scattered around the C++ side. Let's first move all symmetric-memory-related code into a separate folder. We can refactor further later if needed.

cc @H-Huang @awgu @wanchaol @fegin @wz337 @wconstab @d4l3k

[ghstack-poisoned]

pytorch-bot · 2025-06-10T17:02:06Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/155573

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 2 Pending

As of commit 4dc5fd2 with merge base 31405a6:
💚 Looks good so far! There are no failures yet. 💚

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

⏳ pull / linux-jammy-py3-clang12-executorch / build (gh) (#150261)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

torch/csrc/distributed/c10d/symm_mem/intra_node_comm.hpp


ghstack-source-id: 4f5d3c0 Pull Request resolved: #155573

fegin

stamp to unblock

fduwjj · 2025-06-10T19:45:30Z

@pytorchbot merge

d4l3k

LGTM

pytorchmergebot · 2025-06-10T19:47:29Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

…#155823) We moved all symm_mem code into a folder ([CudaDMAConnectivity](#155573)) but forgot to update the CMakeList entry for CudaDMAConnectivity. Users see the error `RuntimeError: DMA connectivity detector for cuda over nvlink is not available` when calling `torch.distributed.init_process_group(backend=backend)`. This PR should fix it. Pull Request resolved: #155823 Approved by: https://github.com/Skylion007
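The follow-up above suggests that build-file references can go stale after a folder move. As a minimal sketch, and not the actual fix in #155823, a check along these lines could flag listed source paths that no longer exist on disk. The function name `find_stale_sources` and the assumption that the referenced paths are already available as a plain list are hypothetical; nothing here parses a real CMake file.

```python
from pathlib import Path


def find_stale_sources(repo_root, listed_sources):
    """Return the listed relative paths that are not files under repo_root.

    Hypothetical helper for illustration only: `listed_sources` stands in
    for whatever paths a build file references after a code move.
    """
    root = Path(repo_root)
    # Keep the original order so the report matches the build file.
    return [rel for rel in listed_sources if not (root / rel).is_file()]
```

Running such a check in CI after a move would report any path that still points at the old location, which a build system may otherwise skip silently.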

[symm_mem] Move all symm mem code into a dedicated folder

276380b

[ghstack-poisoned]

pytorch-bot bot added the oncall: distributed and release notes: distributed (c10d) labels Jun 10, 2025

fduwjj marked this pull request as draft June 10, 2025 17:02

Skylion007 reviewed Jun 10, 2025

torch/csrc/distributed/c10d/symm_mem/intra_node_comm.hpp (Outdated)

Update on "[symm_mem] Move all symm mem code into a dedicated folder"

696b4b9

cc H-Huang awgu wanchaol fegin wz337 wconstab d4l3k [ghstack-poisoned]

fduwjj added the ciflow/trunk label Jun 10, 2025

fduwjj requested review from d4l3k, fegin, kwen2501, ngimel and wconstab June 10, 2025 17:18

fduwjj marked this pull request as ready for review June 10, 2025 17:18

fduwjj added a commit that referenced this pull request Jun 10, 2025

[symm_mem] Move all symm mem code into a dedicated folder

5bd95e5

ghstack-source-id: 4f5d3c0 Pull Request resolved: #155573

fegin approved these changes Jun 10, 2025

d4l3k approved these changes Jun 10, 2025

pytorchmergebot added the merging label Jun 10, 2025

pytorchmergebot added the Merged label Jun 10, 2025

pytorchmergebot closed this in ffc6cbf Jun 10, 2025

pytorchmergebot removed the merging label Jun 10, 2025

This was referenced Jun 11, 2025

[SymmMem] Enable NVSHMEM for Triton #155506

Closed

[symm_mem] Add nccl as a backend for symmetric memory #155740

Closed

[symm_mem] Update CMakeList to reflect code moving a dedicated folder #155823

Closed

github-actions bot deleted the gh/fduwjj/145/head branch July 13, 2025 02:21

5 participants
