[PT-D][Tensor parallelism] Add documentations for TP #94421

Closed

fduwjj wants to merge 4 commits into gh/fduwjj/71/base from gh/fduwjj/71/head

Contributor

fduwjj commented Feb 8, 2023 •

edited

Stack from ghstack (oldest at bottom):

-> [PT-D][Tensor parallelism] Add documentations for TP #94421

This is far from completed and we will definitely polish it down the road.


          [PT-D][Tensor parallelism] Add documentations for TP

7c58fdd

[ghstack-poisoned]

pytorch-bot bot commented Feb 8, 2023 •

edited

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/94421

✅ No Failures

As of commit 8d99683:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

fduwjj added a commit that referenced this pull request


          [PT-D][Tensor parallelism] Add documentations for TP

91f0d8c

ghstack-source-id: 06caff66222400a86a803e30ca9d09afe24a6aba
Pull Request resolved: #94421

fduwjj marked this pull request as draft

February 8, 2023 18:24

fduwjj added the release notes: distributed (dtensor) label

fduwjj marked this pull request as ready for review

February 8, 2023 18:26

fduwjj added the ciflow/trunk label

fduwjj requested a review from wanchaol

February 8, 2023 18:27

fduwjj marked this pull request as draft

February 8, 2023 18:29


          Update on "[PT-D][Tensor parallelism] Add documentations for TP"

ef98f57

[ghstack-poisoned]

fduwjj added a commit that referenced this pull request


          [PT-D][Tensor parallelism] Add documentations for TP

7efb0aa

ghstack-source-id: f105a1ade4ccac6ea9a8337d34e1830d12099132
Pull Request resolved: #94421

fduwjj marked this pull request as ready for review

February 8, 2023 20:17

fduwjj requested review from wz337, kumpera and XilunWu

February 8, 2023 20:17

wz337 approved these changes

Contributor

wz337 left a comment

LGTM

wanchaol reviewed

Contributor

wanchaol left a comment

First pass: let's add the experimental line, since this is a prototype release.

docs/source/distributed.tensor.parallel.rst

+                :members:
+              We also enabled 2D parallelism to integrate with ``FullyShardedDataParallel``.
+              Users just need to call the following API explicitly:

Contributor

wanchaol Feb 8, 2023

I remember we have an FSDP extension. Does TP automatically register the extension now?

Also, I wonder if we should give a small code snippet showing what 2-D parallelism looks like.

Contributor Author

fduwjj Feb 8, 2023 •

edited

The registration happens in is_available. Let me send a follow-up PR for this one.
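
As a rough illustration of the 2-D layout discussed in this thread, the sketch below shows how ranks could be grouped when tensor parallelism is combined with ``FullyShardedDataParallel``. It is plain Python and does not use any API from this PR. The helper name `build_2d_groups`, the 8-rank world size, and the convention that each TP group is a block of contiguous ranks are all assumptions made for the example.

```python
from typing import List, Tuple


def build_2d_groups(world_size: int, tp_size: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Hypothetical helper: split ranks into TP groups and data-parallel groups.

    Assumes ranks form a (dp_size x tp_size) grid in row-major order, so each
    row is one tensor-parallel group and each column is one data-parallel
    (FSDP) group. This convention is an assumption, not taken from the PR.
    """
    if tp_size <= 0 or world_size % tp_size != 0:
        raise ValueError("world_size must be a positive multiple of tp_size")
    dp_size = world_size // tp_size
    # Rows of the grid: contiguous ranks that shard a layer's weights together.
    tp_groups = [list(range(d * tp_size, (d + 1) * tp_size)) for d in range(dp_size)]
    # Columns of the grid: ranks holding the same TP shard, sharded again by FSDP.
    dp_groups = [list(range(t, world_size, tp_size)) for t in range(tp_size)]
    return tp_groups, dp_groups


tp_groups, dp_groups = build_2d_groups(world_size=8, tp_size=4)
print("TP groups:", tp_groups)
print("DP groups:", dp_groups)
```

In a real job each of these rank lists would back a separate process group; how those groups are created and handed to TP and FSDP depends on the APIs documented in this PR and is not shown here.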

docs/source/distributed.tensor.parallel.rst



		.. currentmodule:: torch.distributed.tensor.parallel.fsdp
		.. autofunction:: is_available

Contributor

wanchaol Feb 8, 2023

Do we really need to add this API to the docs? I recall is_available was introduced when this code lived in tau, but now that it's in PyTorch, I think FSDP should always be available?

Contributor Author

fduwjj Feb 8, 2023

Yes, because of 2D hook registration.

Contributor Author

fduwjj Feb 8, 2023

Will send a follow-up PR to address the naming of this one.


          Update on "[PT-D][Tensor parallelism] Add documentations for TP"

596c896


This is far from completed and we will definitely polish it down the road.


[ghstack-poisoned]

fduwjj requested review from mrshenli, zhaojuanmao, rohan-varma, H-Huang, awgu and kwen2501 as code owners

February 8, 2023 22:37

fduwjj added a commit that referenced this pull request


          [PT-D][Tensor parallelism] Add documentations for TP

034fc59

ghstack-source-id: 8fa4b414901367acdd0ac76b0844a1eb59ee5c67
Pull Request resolved: #94421

Contributor Author

fduwjj commented Feb 8, 2023

@pytorchbot rebase

Collaborator

pytorchmergebot commented Feb 8, 2023

@pytorchbot successfully started a rebase job. Check the current status here


          Update on "[PT-D][Tensor parallelism] Add documentations for TP"

8d99683


This is far from completed and we will definitely polish it down the road.


[ghstack-poisoned]

Collaborator

pytorchmergebot commented Feb 8, 2023

Successfully rebased gh/fduwjj/71/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/94421)

pytorchmergebot pushed a commit that referenced this pull request


          [PT-D][Tensor parallelism] Add documentations for TP

eacbb63

ghstack-source-id: d03f0b1bf33d5f3f662d1f574a35828bbc336330
Pull Request resolved: #94421

Contributor Author

fduwjj commented Feb 8, 2023

@pytorchbot merge

Collaborator

pytorchmergebot commented Feb 8, 2023

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

pytorchmergebot added the Merged label

pytorchmergebot closed this in

41e3189

facebook-github-bot deleted the gh/fduwjj/71/head branch

June 8, 2023 17:12

Reviewers

wanchaol: left review comments

wz337: approved these changes

kumpera: awaiting requested review

XilunWu: awaiting requested review

mrshenli: awaiting requested review

zhaojuanmao: awaiting requested review

rohan-varma: awaiting requested review

H-Huang: awaiting requested review

awgu: awaiting requested review

kwen2501: awaiting requested review