Move DTensor from tau to PyTorch #576
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this issue on Nov 19, 2022:
This is part of TP Beta Release efforts. ref: pytorch/PiPPy#576. Pull Request resolved: #89242. Approved by: https://github.com/wanchaol
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this issue on Nov 22, 2022:
This is part of TP Beta Release efforts. ref: pytorch/PiPPy#576. Pull Request resolved: #89467. Approved by: https://github.com/wanchaol
This was referenced Nov 28, 2022
kulinseth pushed a commit to kulinseth/pytorch that referenced this issue on Dec 10, 2022:
This is part of TP Beta Release efforts. ref: pytorch/PiPPy#576. Pull Request resolved: pytorch#89242. Approved by: https://github.com/wanchaol
kulinseth pushed a commit to kulinseth/pytorch that referenced this issue on Dec 10, 2022:
…h#89467) This is part of TP Beta Release efforts. ref: pytorch/PiPPy#576. Pull Request resolved: pytorch#89467. Approved by: https://github.com/wanchaol
We plan to move the DTensor implementation from pytorch/tau to pytorch/pytorch. DTensor has been developed in the pytorch/tau repo during this half. Working in an out-of-tree repo has allowed us to move fast and prototype features with very short turnaround times, but we want to move to core for the following reasons:
Detailed folder structure after the move
We will move the DTensor implementation from tau to PyTorch. Concretely, this involves:
tau/spmd/tensor -> torch/distributed/_tensor
tau/spmd/tensor/parallel -> torch/distributed/tensor_parallel in subsequent PRs.
tau/test/spmd/tensor -> test/distributed/_tensor
tau/spmd/testing/common_utils.py -> torch/testing/_internal/common_dtensor.py
Note: we no longer need the lagging_op_db pieces after the move to PyTorch, so only comm_utils.py is needed.
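To make the mapping above concrete, here is a minimal sketch of how an old tau path could be translated to its new location. The table is copied from the list above; the helper name `map_path` and the example file names are hypothetical and are not part of the migration plan. Because tau/spmd/tensor/parallel is nested inside tau/spmd/tensor, the sketch tries the longest source prefix first.

```python
from pathlib import PurePosixPath

# Old -> new locations, copied from the folder list in this issue.
PATH_MAP = {
    "tau/spmd/tensor": "torch/distributed/_tensor",
    "tau/spmd/tensor/parallel": "torch/distributed/tensor_parallel",
    "tau/test/spmd/tensor": "test/distributed/_tensor",
    "tau/spmd/testing/common_utils.py": "torch/testing/_internal/common_dtensor.py",
}


def map_path(old_path: str) -> str:
    """Translate a path in tau to its planned location in PyTorch (hypothetical helper)."""
    old = PurePosixPath(old_path)
    # Longest prefix first, so the nested parallel/ folder is not
    # captured by the more general tau/spmd/tensor rule.
    for src in sorted(PATH_MAP, key=len, reverse=True):
        try:
            rest = old.relative_to(src)
        except ValueError:
            continue  # src is not a prefix of old_path
        return str(PurePosixPath(PATH_MAP[src]) / rest)
    raise KeyError(f"{old_path} is not covered by the move")
```

For example, a hypothetical file tau/spmd/tensor/parallel/style.py would map to torch/distributed/tensor_parallel/style.py rather than to a path under torch/distributed/_tensor.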
All of the imports in tau will be preserved: we will leave an otherwise empty folder tau/spmd/tensor/ with an __init__.py file that still imports the DTensor public APIs, and keep this folder as an experimental area where the compiler stack can make quick hacks such as op support and experimental features.
Move Logistics
Phase 1
The DTensor source of truth will remain pytorch/tau, so continue submitting PRs to this repo even after the initial PRs land.
Until Phase 2, any new changes will be copied over by a script (rsync); we will run the script before Phase 2 begins.
After these initial PRs land, DTensor will be working in core. The initial PR will have testing disabled, and we will then enable tests in smaller follow-up PRs.
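The Phase 1 sync is only described as "a script (rsync)", so the following is a sketch of what such a script might look like, not the script the team actually used. It assumes local checkouts named tau/ and pytorch/ side by side, and it assumes the nested parallel/ folder is excluded from the main copy because that folder moves in subsequent PRs. The function name `rsync_command` is hypothetical; `--archive`, `--dry-run`, and `--exclude` are standard rsync options.

```python
import shlex

# (source dir, destination dir, exclude patterns).
# Checkout locations and the parallel/ exclusion are assumptions.
SYNC_DIRS = [
    ("tau/spmd/tensor/", "pytorch/torch/distributed/_tensor/", ["/parallel/"]),
    ("tau/test/spmd/tensor/", "pytorch/test/distributed/_tensor/", []),
]


def rsync_command(src, dst, excludes, dry_run=True):
    """Build (but do not run) an rsync invocation as an argument list."""
    cmd = ["rsync", "--archive"]
    if dry_run:
        # Only report what would be copied; nothing is written.
        cmd.append("--dry-run")
    cmd += [f"--exclude={pattern}" for pattern in excludes]
    cmd += [src, dst]
    return cmd


# Print the commands for review instead of executing them.
for src, dst, excludes in SYNC_DIRS:
    print(shlex.join(rsync_command(src, dst, excludes)))
```

Defaulting to a dry run keeps the sketch safe to try: the printed commands can be inspected first and then run with `dry_run=False` through something like `subprocess.run`.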
Phase 2 (target date 11/25/2022)
At this switch-over point, the source of truth will become PyTorch core.
The tau/spmd/tensor/ folder will remain with an __init__.py file that still imports the DTensor public APIs, and will serve as an experimental folder where the compiler stack can make quick hacks such as op support and experimental features.
How you can help
To be discussed
Shall we continue using pytorch/tau for DTensor issue tracking together with the spmd compiler stack?