[dtensor][2/N] add torchrec table-wise sharding example #120265

XilunWu · 2024-02-21T00:21:25Z

Stack from ghstack (oldest at bottom):

**Summary**
This PR starts this effort by adding an example test that represents TorchRec's `ShardingType.TABLE_WISE` using DTensor.

**Test**
`torchrun --standalone --nnodes=1 --nproc-per-node=4 torch/distributed/_tensor/examples/torchrec_sharding_example.py -e table-wise`

cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang

[ghstack-poisoned]

pytorch-bot · 2024-02-21T00:21:29Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/120265

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e469287 with merge base e70bf23:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


XilunWu · 2024-03-05T10:49:04Z

TBD: what is the best way to represent table-wise sharding on non-participating ranks? In other words, should every rank use the same sub-mesh? There are two options:

- We call `from_local` on an empty tensor with sub-mesh `[current_rank]`.
- We call `from_local` on an empty tensor with sub-mesh `[actual_hold_table_rank]`.
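The two `from_local` choices above differ only in which sub-mesh a non-participating rank passes. The following is a plain-Python sketch of that difference; it makes no torch calls. The function names, the option labels, and the `(0,)` shape used for the empty tensor are assumptions made for illustration and are not taken from the PR.

```python
from typing import List, Tuple


def submesh_for_rank(rank: int, owner_rank: int, option: str) -> List[int]:
    """Return the ranks that would form the sub-mesh passed to from_local.

    option "current_rank": each rank builds a one-rank sub-mesh of itself.
    option "owner_rank":   every rank uses the sub-mesh of the rank that
                           actually holds the table.
    """
    if option == "current_rank":
        return [rank]
    if option == "owner_rank":
        return [owner_rank]
    raise ValueError(f"unknown option: {option}")


def local_shape(rank: int, owner_rank: int, table_shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Shape of the local tensor: the full table on the owner, empty elsewhere.

    (0,) is an assumed stand-in for the shape of an empty tensor.
    """
    return table_shape if rank == owner_rank else (0,)


if __name__ == "__main__":
    world_size, owner = 4, 2  # hypothetical: rank 2 holds the table
    for option in ("current_rank", "owner_rank"):
        meshes = [submesh_for_rank(r, owner, option) for r in range(world_size)]
        print(option, meshes)
```

With the `current_rank` option, each rank ends up with a different sub-mesh; with the `owner_rank` option, all ranks agree on one sub-mesh, which is the property the question asks about.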


wanchaol

lgtm!


**Test**
`torchrun --standalone --nnodes=1 --nproc-per-node=4 torch/distributed/_tensor/examples/torchrec_sharding_example.py -e row-wise-uneven`

Pull Request resolved: #121392
Approved by: https://github.com/wanchaol
ghstack dependencies: #120265

…122843)

**Summary**
Always wrap the local tensor in a `LocalShardsWrapper`. This keeps the representation uniform and makes it easier to adopt DTensor as a wrapper for local shard(s). To support more tensor ops on `LocalShardsWrapper`, users need to extend its `__torch_dispatch__`.

**Test**
`torchrun --standalone --nnodes=1 --nproc-per-node=4 torch/distributed/_tensor/examples/torchrec_sharding_example.py -e row-wise-even`

**Result**
```
Row-wise even sharding example in DTensor
         Col 0-15
-------  ----------
Row 0-1  cuda:0
Row 2-3  cuda:1
Row 4-5  cuda:2
Row 6-7  cuda:3
```

Pull Request resolved: #122843
Approved by: https://github.com/wz337
ghstack dependencies: #120265, #121392
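The row ranges in the Result for #122843 follow from splitting the rows evenly across ranks. Below is a minimal plain-Python sketch of that arithmetic. It does not call DTensor; the function name is hypothetical, and the 8-row, 4-rank setup is inferred from the printed table and the `--nproc-per-node=4` flag.

```python
from typing import List, Tuple


def even_row_shards(num_rows: int, world_size: int) -> List[Tuple[int, int]]:
    """Return the inclusive (first_row, last_row) range owned by each rank."""
    if num_rows % world_size != 0:
        raise ValueError("even sharding assumes num_rows is divisible by world_size")
    rows_per_rank = num_rows // world_size
    return [
        (rank * rows_per_rank, (rank + 1) * rows_per_rank - 1)
        for rank in range(world_size)
    ]


if __name__ == "__main__":
    # Assumes one GPU per rank, matching the device names in the Result table.
    for rank, (first, last) in enumerate(even_row_shards(8, 4)):
        print(f"Row {first}-{last}  cuda:{rank}")
```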

…ticipating ranks only (#122853)

**Summary**
We wrap DTensor's local tensor in `LocalShardsWrapper` for torchrec's table-wise sharding. The exception is non-participating ranks, where the local tensor is an empty `torch.Tensor` object. This design avoids the complexity of supporting the empty-tensor case in `LocalShardsWrapper`.

**Test**
`torchrun --standalone --nnodes=1 --nproc-per-node=4 torch/distributed/_tensor/examples/torchrec_sharding_example.py -e table-wise`

Pull Request resolved: #122853
Approved by: https://github.com/wz337
ghstack dependencies: #120265, #121392, #122843
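The participating versus non-participating split described in #122853 can be sketched without torch as follows. `ShardsStandIn` is a hypothetical stand-in for `LocalShardsWrapper` that records only shard shapes and offsets, and the `EMPTY` string stands in for an empty `torch.Tensor`. Neither reflects the real class's fields or constructor.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class ShardsStandIn:
    """Hypothetical stand-in for LocalShardsWrapper (shapes and offsets only)."""
    shapes: List[Tuple[int, int]]
    offsets: List[Tuple[int, int]]


# Placeholder for the plain empty tensor held by non-participating ranks.
EMPTY = "empty torch.Tensor"


def local_repr(rank: int, owner_rank: int,
               table_shape: Tuple[int, int]) -> Union[ShardsStandIn, str]:
    """Describe what a rank holds locally for one table-wise sharded table."""
    if rank == owner_rank:
        # Participating rank: the whole table is one shard at offset (0, 0).
        return ShardsStandIn(shapes=[table_shape], offsets=[(0, 0)])
    # Non-participating rank: no wrapper, just an empty tensor.
    return EMPTY


if __name__ == "__main__":
    for rank in range(4):
        # Hypothetical setup: rank 1 holds an 8 x 16 table.
        print(rank, local_repr(rank, owner_rank=1, table_shape=(8, 16)))
```

Keeping the non-participating case outside the wrapper means the stand-in never has to represent zero shards, which mirrors the stated reason for the design.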


[dtensor][2/N] add torchrec table-wise sharding example

7c4764e

[ghstack-poisoned]

XilunWu mentioned this pull request Feb 21, 2024

[dtensor][1/N] add torchrec even row-wise sharding example #120260

Closed

pytorch-bot bot added the topic: not user facing topic category label Feb 21, 2024

github-actions bot added the oncall: distributed Add this issue/PR to distributed oncall triage queue label Feb 21, 2024

Update on "[dtensor][2/N] add torchrec table-wise sharding example"

cb5bcff


Update on "[dtensor][2/N] add torchrec table-wise sharding example"

f623cd0


XilunWu added a commit that referenced this pull request Feb 21, 2024

[dtensor][2/N] add torchrec table-wise sharding example

345e490

ghstack-source-id: 8906dba Pull Request resolved: #120265

Update on "[dtensor][2/N] add torchrec table-wise sharding example"

deb47d8


github-actions bot added the ciflow/inductor label Feb 22, 2024

Update on "[dtensor][2/N] add torchrec table-wise sharding example"

c20ac6e


XilunWu added a commit that referenced this pull request Feb 22, 2024

[dtensor][2/N] add torchrec table-wise sharding example

1884307

ghstack-source-id: 98babaa Pull Request resolved: #120265

XilunWu requested review from wanchaol and zainhuda February 22, 2024 10:06

XilunWu mentioned this pull request Mar 5, 2024

[dtensor][debug] allow visualize_sharding to print header #121179

Closed

Update on "[dtensor][2/N] add torchrec table-wise sharding example"

a24c6cd


This was referenced Mar 5, 2024

[dtensor][debug] have visualize_sharding correctly print for sub-mesh DTensor #121216

Closed

[dtensor][debug] force visualize_sharding not to print for empty tensors #121217

Closed

XilunWu added a commit that referenced this pull request Mar 5, 2024

[dtensor][2/N] add torchrec table-wise sharding example

6811c66

ghstack-source-id: 5e9a2be Pull Request resolved: #120265

XilunWu mentioned this pull request Mar 7, 2024

[dtensor][debug] visualize_sharding only compute offset on the first rank in mesh #121385

Closed

XilunWu added a commit that referenced this pull request Mar 7, 2024

[dtensor][2/N] add torchrec table-wise sharding example

1a38ca2

ghstack-source-id: d9d7877 Pull Request resolved: #120265

XilunWu mentioned this pull request Mar 7, 2024

[dtensor][3/N] add torchrec row-wise uneven sharding example #121392

Closed

XilunWu added 3 commits March 7, 2024 23:09

wanchaol approved these changes Mar 26, 2024


This was referenced Mar 27, 2024

[dtensor][4/N] have row-wise sharding always use LocalShardsWrapper #122843

Closed

[dtensor][5/N] have table-wise sharding use LocalShardsWrapper on participating ranks only #122853

Closed

pytorchmergebot closed this in 9d3543d Apr 16, 2024

pytorchmergebot added the Merged label Apr 16, 2024

github-actions bot deleted the gh/XilunWu/63/head branch May 31, 2024 01:54
