[FSDP] Add set_state_dict_type API to setup state_dict_type without using context manager #86243

Closed
wants to merge 8 commits into from

Conversation

fegin (Contributor) commented Oct 4, 2022

Stack from ghstack (oldest at bottom):

`FSDP.state_dict_type` is a context manager. However, users may want to decide which state_dict type is going to be used during initialization. `set_state_dict_type` allows users to do so.
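
The difference between the two styles can be illustrated with a minimal, torch-free sketch. Everything here is an assumption made for illustration: `ToyFSDP`, the string-valued types, the `_state_dict_type` attribute, and the fact that the setter returns the previous value are hypothetical stand-ins, not the actual FSDP implementation.

```python
from contextlib import contextmanager


class ToyFSDP:
    """Hypothetical stand-in for an FSDP-wrapped module (not the real class)."""

    def __init__(self):
        # Default chosen for this sketch only.
        self._state_dict_type = "FULL_STATE_DICT"


def set_state_dict_type(module, new_type):
    """Persistently switch the type and return the previous one."""
    prev = module._state_dict_type
    module._state_dict_type = new_type
    return prev


@contextmanager
def state_dict_type(module, new_type):
    """Temporarily switch the type, restoring the old one on exit."""
    prev = set_state_dict_type(module, new_type)
    try:
        yield
    finally:
        set_state_dict_type(module, prev)


model = ToyFSDP()

# Context-manager style: the setting only holds inside the block.
with state_dict_type(model, "SHARDED_STATE_DICT"):
    inside = model._state_dict_type
after_block = model._state_dict_type

# Setter style: chosen once (for example right after construction) and kept.
set_state_dict_type(model, "LOCAL_STATE_DICT")
after_setter = model._state_dict_type
```

In this sketch the context manager is built on top of the setter, which is one plausible way to keep the two entry points consistent; whether the PR structures it exactly this way is not shown in the conversation.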

Differential Revision: D40083670

cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu

…sing context manager

`FSDP.state_dict_type` is a context manager. However, users may want to decide which state_dict type is going to be used during initialization. `set_state_dict_type` allows users to do so.

Differential Revision: [D40083670](https://our.internmc.facebook.com/intern/diff/D40083670/)

[ghstack-poisoned]
pytorch-bot bot commented Oct 4, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/86243

❌ 4 Failures

As of commit 4de33ef:

The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: distributed (sharded) release notes category label Oct 4, 2022
fegin added a commit that referenced this pull request Oct 4, 2022
ghstack-source-id: 169371556
Pull Request resolved: #86243
@facebook-github-bot facebook-github-bot added the oncall: distributed Add this issue/PR to distributed oncall triage queue label Oct 4, 2022
rohan-varma (Member) left a comment

LGTM, it just looks like there is a valid lint failure. Thanks for adding this!


prev_state_dict_type = None
prev_state_dict_config = None
# Use default config a state_dict config is not set.
Member

"if a state_dict config is not set"
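
The fallback that the quoted code comment refers to can be sketched as below. This is only an illustration under stated assumptions: `ToyStateDictType`, `ToyFullConfig`, `ToyShardedConfig`, `DEFAULT_CONFIGS`, and `resolve_config` are hypothetical names, and the type check is an extra design choice of this sketch rather than something shown in the PR.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ToyStateDictType(Enum):
    FULL = auto()
    SHARDED = auto()


@dataclass
class ToyFullConfig:
    offload_to_cpu: bool = False  # hypothetical option, for illustration only


@dataclass
class ToyShardedConfig:
    pass


# Hypothetical mapping from each type to its default config class.
DEFAULT_CONFIGS = {
    ToyStateDictType.FULL: ToyFullConfig,
    ToyStateDictType.SHARDED: ToyShardedConfig,
}


def resolve_config(state_dict_type, state_dict_config=None):
    """Return the config to use, falling back to the type's default."""
    expected_cls = DEFAULT_CONFIGS[state_dict_type]
    # Use the default config if a state_dict config is not set.
    if state_dict_config is None:
        return expected_cls()
    # Reject a config that does not match the requested type.
    if not isinstance(state_dict_config, expected_cls):
        raise TypeError(
            f"expected {expected_cls.__name__}, "
            f"got {type(state_dict_config).__name__}"
        )
    return state_dict_config


default_cfg = resolve_config(ToyStateDictType.FULL)
custom_cfg = resolve_config(
    ToyStateDictType.FULL, ToyFullConfig(offload_to_cpu=True)
)
```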

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 4, 2022
fegin added a commit that referenced this pull request Oct 5, 2022
ghstack-source-id: 169453751
fegin added a commit that referenced this pull request Oct 13, 2022
ghstack-source-id: 170308364
@@ -2111,6 +2104,46 @@ def _get_training_state(
)
return next(iter(training_states))

@staticmethod
def set_state_dict_type(
Contributor

Should we add a docstring?

fegin added a commit that referenced this pull request Oct 17, 2022
ghstack-source-id: 170670792
fegin added a commit that referenced this pull request Oct 18, 2022
ghstack-source-id: 170765562
fegin (Contributor Author) commented Oct 18, 2022

@pytorchbot merge

pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


pytorchmergebot (Collaborator)

Merge failed

Reason: The following mandatory check(s) failed (Rule Distributed):


fegin added a commit that referenced this pull request Oct 18, 2022
ghstack-source-id: 170829792
fegin added a commit that referenced this pull request Oct 19, 2022
ghstack-source-id: 170860829
fegin added a commit that referenced this pull request Oct 19, 2022
ghstack-source-id: 170899517
fegin (Contributor Author) commented Oct 19, 2022

@pytorchbot merge

pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


pytorchmergebot (Collaborator)

Merge failed

Reason: The following mandatory check(s) failed (Rule Distributed):


fegin (Contributor Author) commented Oct 19, 2022

@pytorchbot merge -f "failing tests are not related"

pytorchmergebot (Collaborator)

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).


@github-actions

Hey @fegin.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

sgrigory pushed a commit to sgrigory/pytorch that referenced this pull request Oct 28, 2022
Pull Request resolved: pytorch#86243
Approved by: https://github.com/rohan-varma
@facebook-github-bot facebook-github-bot deleted the gh/fegin/31/head branch June 8, 2023 17:15
Labels
ciflow/trunk Trigger trunk jobs on your pull request cla signed Merged oncall: distributed Add this issue/PR to distributed oncall triage queue release notes: distributed (sharded) release notes category
5 participants