[FSDP][optim_state_dict][1/N] Restructure _optim_state_dict to prepare the support of use_orig_param #89898
Conversation
…e the support of use_orig_param [ghstack-poisoned]
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/89898
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 Failure as of commit 18d5617.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
…t to prepare the support of use_orig_param" [ghstack-poisoned]
…t to prepare the support of use_orig_param" **Motivation:** Restructure some APIs in _optim_state_dict.py to allow better future extension, mostly for supporting use_orig_params. NO logic change in this PR. [ghstack-poisoned]
Thanks for this reorganization!
:class:`list`.

def _map_param_id_to_optim_keys(
    optim_state_dict: Dict[str, Any],
    group: Optional[dist.ProcessGroup],
General comment: While refactoring the optimizer state dict, it could be worthwhile to keep an eye out for process group usage (mainly looking for any accidental default process group usage). This is in preparation for HSDP, where every process group usage should now map to the intra-node process group from HSDP.
cc: @rohan-varma
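To make the concern concrete, here is a minimal, purely hypothetical sketch (not code from this PR) of the pattern being asked for: each optimizer state dict helper takes an explicit `group` argument and threads it through every collective, so nothing silently falls back to the default world group and HSDP can later substitute its intra-node process group. The helper name and structure below are assumptions for illustration.

```python
from typing import Any, Dict, List, Optional

import torch.distributed as dist

# Hypothetical helper illustrating explicit process-group threading.
def _gather_optim_state(
    local_state: Dict[str, Any],
    group: Optional[dist.ProcessGroup] = None,
) -> List[Any]:
    # Resolve the group once at the boundary; every collective below passes
    # `group` explicitly, so HSDP could later hand in its intra-node group.
    group = group if group is not None else dist.group.WORLD
    world_size = dist.get_world_size(group)
    gathered: List[Any] = [None for _ in range(world_size)]
    dist.all_gather_object(gathered, local_state, group=group)
    return gathered
```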
return r0_param_id_to_optim_state_key, optim_state_key_to_param_id
def _process_param_groups(
nit: I am wondering, should we call this `_unflatten_param_groups` to be more specific?
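For illustration only, a hypothetical sketch of what an `_unflatten_param_groups`-style helper could do: rewrite each saved param group so that flat-parameter IDs are replaced by the fully qualified names of the original parameters. The helper name, the `flat_param_id_to_fqns` mapping, and the structure are assumptions for this sketch, not the actual implementation in _optim_state_dict.py.

```python
from typing import Any, Dict, List

# Illustrative sketch: expand flat-parameter IDs in each param group into the
# original (unflattened) parameter names they were constructed from.
def _unflatten_param_groups(
    param_groups: List[Dict[str, Any]],
    flat_param_id_to_fqns: Dict[int, List[str]],
) -> List[Dict[str, Any]]:
    unflat_groups: List[Dict[str, Any]] = []
    for group in param_groups:
        # Copy hyperparameters (lr, weight_decay, ...) unchanged.
        unflat_group = {k: v for k, v in group.items() if k != "params"}
        # Replace every flat-parameter ID with its original parameter names.
        unflat_group["params"] = [
            fqn
            for flat_id in group["params"]
            for fqn in flat_param_id_to_fqns[flat_id]
        ]
        unflat_groups.append(unflat_group)
    return unflat_groups
```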
@pytorchbot merge -f "The failing test is unrelated"
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
…e the support of use_orig_param (pytorch#89898)
**Motivation:** Restructure some APIs in _optim_state_dict.py to allow better future extension, mostly for supporting use_orig_params. NO logic change in this PR.
Pull Request resolved: pytorch#89898
Approved by: https://github.com/awgu
Stack from ghstack (oldest at bottom):
Motivation:
Restructure some APIs in _optim_state_dict.py to allow better future extension, mostly for supporting use_orig_params. NO logic change in this PR.
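As context, a rough usage sketch of the public FSDP optimizer state dict path that these internal helpers back. It assumes `torch.distributed` is already initialized on each rank and that this PyTorch build exposes the `use_orig_params` flag on FSDP; the training loop is elided. The `use_orig_params=True` case is what the restructuring prepares to support.

```python
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Sketch under the assumptions above, not part of this PR.
def checkpoint_optim_state(model: nn.Module):
    fsdp_model = FSDP(model.cuda(), use_orig_params=True)
    optim = torch.optim.Adam(fsdp_model.parameters(), lr=1e-3)
    # ... run some training steps here ...
    # Consolidate the sharded optimizer state into a single, unflattened
    # state dict on rank 0.
    full_osd = FSDP.full_optim_state_dict(fsdp_model, optim)
    return full_osd
```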