
Conversation

@XilunWu (Contributor) commented Mar 27, 2024

Stack from ghstack (oldest at bottom):

Summary
Always wrap the local tensor into a `LocalShardsWrapper`. This is for uniformity and makes it easier to adopt DTensor as a wrapper for representing local shard(s). To support more tensor ops on `LocalShardsWrapper`, users need to extend its `__torch_dispatch__`.
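
As a rough illustration of that extension point, here is a minimal sketch of a shard-holding tensor subclass routing ops through `__torch_dispatch__`. This is not the actual `LocalShardsWrapper` implementation; the class name, the handled ops, and the use of the private `_make_wrapper_subclass` helper are assumptions made only for the sketch.

```
import torch

class MyLocalShards(torch.Tensor):  # hypothetical stand-in, not LocalShardsWrapper
    @staticmethod
    def __new__(cls, shards):
        # Report the row-concatenated size as the wrapper's logical size
        # (assumes 2-D shards stacked along dim 0).
        rows = sum(s.shape[0] for s in shards)
        wrapper = torch.Tensor._make_wrapper_subclass(
            cls, (rows, shards[0].shape[1]), dtype=shards[0].dtype
        )
        wrapper._shards = list(shards)
        return wrapper

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        # Each op the wrapper should support gets an explicit handler here.
        if func is torch.ops.aten.detach.default:
            (self_,) = args
            return MyLocalShards([s.detach() for s in self_._shards])
        if func is torch.ops.aten.mul.Tensor:
            self_, other = args  # assumes `other` is a plain scalar or tensor
            return MyLocalShards([s * other for s in self_._shards])
        raise NotImplementedError(f"{func} is not supported by this wrapper yet")
```

Supporting an additional op would mean adding one more branch to this dispatch table, which is the kind of extension the summary refers to.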

Test
torchrun --standalone --nnodes=1 --nproc-per-node=4 torch/distributed/_tensor/examples/torchrec_sharding_example.py -e row-wise-even

Result

Row-wise even sharding example in DTensor
         Col 0-15
-------  ----------
Row 0-1  cuda:0
Row 2-3  cuda:1
Row 4-5  cuda:2
Row 6-7  cuda:3
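
For reference, here is a hedged sketch of how a row-wise (dim 0) sharded DTensor matching the output above can be built with the public DTensor APIs. The 8x16 global tensor and the 1-D mesh of 4 GPUs are assumptions read off the printed table; this is not the example script's actual code.

```
# Run under: torchrun --standalone --nnodes=1 --nproc-per-node=4 <this_file>.py
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed._tensor import distribute_tensor, Shard

mesh = init_device_mesh("cuda", (4,))          # 1-D mesh over 4 GPUs
global_tensor = torch.arange(8 * 16, dtype=torch.float32).reshape(8, 16)

# Shard(0) splits rows evenly: rank 0 holds rows 0-1, rank 1 rows 2-3, and so on,
# matching the table printed by the row-wise-even example.
dtensor = distribute_tensor(global_tensor, mesh, placements=[Shard(0)])

local = dtensor.to_local()                     # the 2x16 shard owned by this rank
print(f"rank {mesh.get_rank()}: local shape {tuple(local.shape)} on {local.device}")
```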

cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang


pytorch-bot bot commented Mar 27, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/122843

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit cc5525a with merge base e70bf23:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@wanchaol (Collaborator) left a comment


Please address the previous PR comment as it might change how this PR is implemented, and re-request review once done :)

…dsWrapper"



**Summary**
Always wrap the local tensor into a `LocalShardsWrapper`. This is for uniformity and makes it easier to adopt DTensor as a wrapper for representing local shard(s). To support more tensor ops on `LocalShardsWrapper`, users need to extend its `__torch_dispatch__`.

**Test**
`torchrun --standalone --nnodes=1 --nproc-per-node=4 torch/distributed/_tensor/examples/torchrec_sharding_example.py -e row-wise-even`

**Result**
```
Row-wise even sharding example in DTensor
         Col 0-15
-------  ----------
Row 0-1  cuda:0
Row 2-3  cuda:1
Row 4-5  cuda:2
Row 6-7  cuda:3
```
cc mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse H-Huang kwen2501 awgu penguinwu fegin wanchaol fduwjj wz337 tianyu-l wconstab yf225 chauhang

[ghstack-poisoned]
@wz337 (Contributor) left a comment


LGTM!

@XilunWu added the topic: not user facing label Apr 16, 2024
pytorchmergebot pushed a commit that referenced this pull request Apr 16, 2024
…ticipating ranks only (#122853)

**Summary**
We wrap DTensor's local tensor in `LocalShardsWrapper` for torchrec's table-wise sharding. The exception is non-participating ranks: there, the local tensor stays an empty `torch.Tensor` object. This design avoids the complexity of supporting the empty-tensor case in `LocalShardsWrapper`.
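
A minimal sketch of that rule, using a hypothetical stand-in class rather than the real `LocalShardsWrapper` so the snippet stays self-contained; the helper name and the single-owner-rank example are assumptions.

```
import torch

class ShardsWrapperStandIn:
    """Stand-in for LocalShardsWrapper, only to keep this sketch self-contained."""
    def __init__(self, shards):
        self.shards = shards

def local_tensor_for(rank, participating_ranks, shard):
    # Ranks that own (part of) the table wrap their local shard(s); all other
    # ranks keep a plain empty tensor, so the wrapper never needs an
    # empty-shard code path.
    if rank in participating_ranks:
        return ShardsWrapperStandIn([shard])
    return torch.empty(0)

# e.g. a table placed only on rank 0 of a 4-rank job:
for r in range(4):
    t = local_tensor_for(r, {0}, torch.randn(8, 16))
    print(r, type(t).__name__)
```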

**Test**
`torchrun --standalone --nnodes=1 --nproc-per-node=4 torch/distributed/_tensor/examples/torchrec_sharding_example.py -e table-wise`

Pull Request resolved: #122853
Approved by: https://github.com/wz337
ghstack dependencies: #120265, #121392, #122843
sanketpurandare pushed a commit to sanketpurandare/pytorch that referenced this pull request Apr 22, 2024
…ytorch#122843)


Pull Request resolved: pytorch#122843
Approved by: https://github.com/wz337
ghstack dependencies: pytorch#120265, pytorch#121392
sanketpurandare pushed a commit to sanketpurandare/pytorch that referenced this pull request Apr 22, 2024
…ticipating ranks only (pytorch#122853)

petrex pushed a commit to petrex/pytorch that referenced this pull request May 3, 2024
…ytorch#122843)

petrex pushed a commit to petrex/pytorch that referenced this pull request May 3, 2024
…ticipating ranks only (pytorch#122853)

github-actions bot deleted the gh/XilunWu/74/head branch May 31, 2024 01:54

Labels: ciflow/inductor, Merged, oncall: distributed, topic: not user facing
