[FSDP] Another fix for `DTensor`, `use_orig_params=True` #89845
Conversation
[ghstack-poisoned]
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/89845
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 5ff0f38. This comment was automatically generated by Dr. CI and updates every 15 minutes.
ghstack-source-id: 6198bcfb367bc123a32a0e131c9b56dd64421584 Pull Request resolved: #89845
@pytorchbot rebase -s
@pytorchbot successfully started a rebase job. Check the current status here.
The issue for `test_2d_parallel.py` is that `DTensor` does not support the idiom `param.data = view` where `view` is a `DTensor`. To work around this, we do not preserve the parameter variable `param` and instead create a new parameter variable altogether via `nn.Parameter(view)`. Preserving the parameter variable when unsharded was not a strict requirement -- it just made sense to do that if we are already doing that when _sharded_, where it _is_ a strict requirement to support the optimizer step. The sharded case is not an issue for 2D because sharded implies local tensor, not `DTensor`. [ghstack-poisoned]
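To make the two idioms concrete, here is a minimal sketch of the difference between swapping a parameter's storage in place and rebuilding the parameter object. The toy module, tensor names, and the plain-tensor stand-in for the unsharded view are hypothetical illustrations, not the actual FSDP code; with a real `DTensor` view, the first idiom is the one that fails.

```python
# Minimal sketch (hypothetical toy module and names) of the two ways to point a
# parameter at an unsharded view. With plain torch.Tensor both work; when the
# view is a DTensor, the `param.data = view` idiom is not supported, so this PR
# rebuilds the parameter via nn.Parameter(view) instead.
import torch
import torch.nn as nn

module = nn.Linear(4, 4, bias=False)   # hypothetical toy module
param = module.weight
view = torch.empty_like(param)          # stand-in for the unsharded flat-parameter view
                                        # (in 2D parallel this would be a DTensor)

# Idiom 1: keep the same Parameter object and swap its underlying data.
# Preserves the parameter variable, but relies on `param.data = view`,
# which DTensor does not support.
param.data = view

# Idiom 2 (the workaround in this PR): create a new Parameter wrapping the view
# and re-register it on the module, giving up on preserving the original variable.
module.weight = nn.Parameter(view)
```

The trade-off is exactly the one described above: idiom 2 does not preserve the original parameter variable, which only matters in the sharded case (where the optimizer holds references), and the sharded case never involves a `DTensor` view.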
Successfully rebased.
ghstack-source-id: 364d0f87a9b29618fa5c3ea81954d1d3d1046781 Pull Request resolved: #89845
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
This pull request has been reverted by 6efedfd. To re-land this change, please open another pull request, assign the same reviewers, fix the CI failures that caused the revert, and make sure that the failing CI runs on the PR by applying the proper ciflow label (e.g., ciflow/trunk).
We will re-land this once the `DTensor` / `torchgen` import issue (#90106) is resolved.
The issue for `test_2d_parallel.py` is that `DTensor` does not support the idiom `param.data = view` where `view` is a `DTensor`. To work around this, we do not preserve the parameter variable `param` and instead create a new parameter variable altogether via `nn.Parameter(view)`. Preserving the parameter variable when unsharded was not a strict requirement -- it just made sense to do that if we are already doing that when _sharded_, where it _is_ a strict requirement to support the optimizer step. The sharded case is not an issue for 2D because sharded implies local tensor, not `DTensor`. Pull Request resolved: pytorch#89845 Approved by: https://github.com/zhaojuanmao
This is a reland of #89845 with nothing changed. This should avoid the internal breakage now that `DTensor` does not import `torchgen` (#90106). Pull Request resolved: #90562 Approved by: https://github.com/fduwjj
Stack from ghstack (oldest at bottom):

[FSDP] Another fix for `DTensor`, `use_orig_params=True` #89845

The issue for `test_2d_parallel.py` is that `DTensor` does not support the idiom `param.data = view` where `view` is a `DTensor`. To work around this, we do not preserve the parameter variable `param` and instead create a new parameter variable altogether via `nn.Parameter(view)`. Preserving the parameter variable when unsharded was not a strict requirement -- it just made sense to do that if we are already doing that when _sharded_, where it _is_ a strict requirement to support the optimizer step. The sharded case is not an issue for 2D because sharded implies local tensor, not `DTensor`.