Contributor

@XilunWu XilunWu commented Feb 21, 2024

Stack from ghstack (oldest at bottom):

Summary
This PR serves as a start of this effort by adding an example test that represents TorchRec's ShardingType.TABLE_WISE using DTensor.

Test
torchrun --standalone --nnodes=1 --nproc-per-node=4 torch/distributed/_tensor/examples/torchrec_sharding_example.py -e table-wise

cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang

pytorch-bot bot commented Feb 21, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/120265

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e469287 with merge base e70bf23:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions github-actions bot added the oncall: distributed Add this issue/PR to distributed oncall triage queue label Feb 21, 2024
XilunWu added a commit that referenced this pull request Feb 21, 2024
XilunWu added a commit that referenced this pull request Feb 22, 2024
@XilunWu XilunWu requested review from wanchaol and zainhuda February 22, 2024 10:06
Contributor Author

XilunWu commented Mar 5, 2024

TBD: What is the best way to represent table-wise sharding on non-participating ranks? That is, should all ranks use the same sub-mesh?

  1. We call from_local on an empty tensor with submesh [current_rank]
  2. We call from_local on an empty tensor with submesh [actual_hold_table_rank]
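
To make the two options concrete, here is a minimal pure-Python sketch of which sub-mesh each rank would end up with for a single table. It does not call DTensor at all; `submesh_per_rank`, the world size of 4, and the owning rank 1 are hypothetical names and values chosen only for illustration.

```python
from typing import Dict, List


def submesh_per_rank(world_size: int, owner_rank: int, option: int) -> Dict[int, List[int]]:
    """Map each rank to the sub-mesh (list of ranks) it would use for one table.

    Hypothetical helper for illustration only; not part of DTensor or TorchRec.
    """
    if option == 1:
        # Option 1: every rank builds a sub-mesh containing only itself, so a
        # non-participating rank wraps its empty tensor on [current_rank].
        return {rank: [rank] for rank in range(world_size)}
    if option == 2:
        # Option 2: every rank uses the sub-mesh of the rank that holds the table.
        return {rank: [owner_rank] for rank in range(world_size)}
    raise ValueError(f"unknown option: {option}")


# Assumed setup: 4 ranks, table held by rank 1.
print(submesh_per_rank(4, owner_rank=1, option=1))
print(submesh_per_rank(4, owner_rank=1, option=2))
```

Under option 1 the sub-mesh differs from rank to rank, while under option 2 it is identical everywhere, which is the trade-off the question is asking about.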

XilunWu added 3 commits March 7, 2024 23:09
Collaborator

@wanchaol wanchaol left a comment

lgtm!

pytorchmergebot pushed a commit that referenced this pull request Apr 16, 2024
**Test**
`torchrun --standalone --nnodes=1 --nproc-per-node=4 torch/distributed/_tensor/examples/torchrec_sharding_example.py -e row-wise-uneven`

Pull Request resolved: #121392
Approved by: https://github.com/wanchaol
ghstack dependencies: #120265
pytorchmergebot pushed a commit that referenced this pull request Apr 16, 2024
…122843)

**Summary**
Always wrap the local tensor in a `LocalShardsWrapper`. This keeps the representation uniform and makes it easier to adopt DTensor as a wrapper for representing local shard(s). To support more tensor ops on `LocalShardsWrapper`, users need to extend its `__torch_dispatch__`.

**Test**
`torchrun --standalone --nnodes=1 --nproc-per-node=4 torch/distributed/_tensor/examples/torchrec_sharding_example.py -e row-wise-even`

**Result**
```
Row-wise even sharding example in DTensor
         Col 0-15
-------  ----------
Row 0-1  cuda:0
Row 2-3  cuda:1
Row 4-5  cuda:2
Row 6-7  cuda:3
```
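
The layout in the table above can be reproduced with a small pure-Python sketch. It assumes 8 rows split evenly over 4 ranks, as the table shows; `row_wise_even_shards` is a hypothetical helper for illustration, not a function from the example script or from DTensor.

```python
from typing import List, Tuple


def row_wise_even_shards(num_rows: int, world_size: int) -> List[Tuple[int, int]]:
    """Return the inclusive (first_row, last_row) range owned by each rank.

    Hypothetical helper: assumes the rows divide evenly across ranks.
    """
    if num_rows % world_size != 0:
        raise ValueError("even row-wise sharding needs num_rows divisible by world_size")
    rows_per_rank = num_rows // world_size
    return [
        (rank * rows_per_rank, (rank + 1) * rows_per_rank - 1)
        for rank in range(world_size)
    ]


# 8 rows over 4 ranks, matching the table above.
for rank, (first, last) in enumerate(row_wise_even_shards(8, 4)):
    print(f"Row {first}-{last}  cuda:{rank}")
```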

Pull Request resolved: #122843
Approved by: https://github.com/wz337
ghstack dependencies: #120265, #121392
pytorchmergebot pushed a commit that referenced this pull request Apr 16, 2024
…ticipating ranks only (#122853)

**Summary**
We wrap DTensor's local tensor in `LocalShardsWrapper` for TorchRec's table-wise sharding. The exception is non-participating ranks, where the local tensor is an empty `torch.Tensor` object. This design avoids the complexity of supporting the empty-tensor case in `LocalShardsWrapper`.

**Test**
`torchrun --standalone --nnodes=1 --nproc-per-node=4 torch/distributed/_tensor/examples/torchrec_sharding_example.py -e table-wise`

Pull Request resolved: #122853
Approved by: https://github.com/wz337
ghstack dependencies: #120265, #121392, #122843
sanketpurandare pushed a commit to sanketpurandare/pytorch that referenced this pull request Apr 22, 2024
Pull Request resolved: pytorch#120265
Approved by: https://github.com/wanchaol
sanketpurandare pushed a commit to sanketpurandare/pytorch that referenced this pull request Apr 22, 2024
…#121392)

sanketpurandare pushed a commit to sanketpurandare/pytorch that referenced this pull request Apr 22, 2024
…ytorch#122843)

sanketpurandare pushed a commit to sanketpurandare/pytorch that referenced this pull request Apr 22, 2024
…ticipating ranks only (pytorch#122853)

petrex pushed a commit to petrex/pytorch that referenced this pull request May 3, 2024
pytorch-bot bot pushed a commit that referenced this pull request May 3, 2024
pytorch-bot bot pushed a commit that referenced this pull request May 3, 2024
…122843)

pytorch-bot bot pushed a commit that referenced this pull request May 3, 2024
…ticipating ranks only (#122853)

@github-actions github-actions bot deleted the gh/XilunWu/63/head branch May 31, 2024 01:54
Labels

ciflow/inductor, Merged, oncall: distributed, topic: not user facing

3 participants