@fegin fegin commented Feb 10, 2023

Stack from ghstack (oldest at bottom):

After #88913, user-defined parameter states will be pickled. For a `FlatParameter`, this means `_local_shard` will also be pickled. Since `state_dict` and `load_state_dict` only require the tensor, returning the full `FlatParameter` gives no extra benefit. This PR changes the behavior to return a view of the `FlatParameter` instead.
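
To illustrate why a view helps, here is a minimal sketch, not the FSDP code itself. `ToyFlatParam` is a hypothetical stand-in for `FlatParameter`, and the `_local_shard` attribute is attached by hand purely for illustration. It assumes only that the class subclasses `nn.Parameter`, in which case `view` hands back a plain `torch.Tensor` that shares storage but carries none of the extra Python attributes.

```python
import torch
import torch.nn as nn


class ToyFlatParam(nn.Parameter):
    """Hypothetical stand-in for FlatParameter; not the real FSDP class."""


# Build the toy parameter and attach an extra attribute, loosely mimicking
# the `_local_shard` attribute mentioned in the description.
param = ToyFlatParam(torch.arange(6.0))
param._local_shard = torch.arange(6.0)

# Taking a view produces a plain torch.Tensor that aliases the same storage
# and does not carry the attributes attached to the parameter object.
view = param.view(6)

print(type(param).__name__)                  # class of the original object
print(type(view).__name__)                   # class of the view
print(hasattr(view, "_local_shard"))         # is the extra attribute present?
print(view.data_ptr() == param.data_ptr())   # same underlying memory?
```

Because the view has no extra attributes, there is nothing beyond the tensor data for a serializer to pick up, which is the property the description relies on.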

Differential Revision: [D43205127](https://our.internmc.facebook.com/intern/diff/D43205127/)

pytorch-bot bot commented Feb 10, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/94637

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 Failures

As of commit 8b2f0e3:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@rohan-varma rohan-varma left a comment

thanks for the fix!

    if valid_data_size > 0:
        if flat_param._shard_numel_padded > 0:
            flat_param = flat_param.narrow(0, 0, valid_data_size)
        flat_param = flat_param.view(valid_data_size)
nice, maybe add a comment saying this will make it return a tensor that can be properly serialized?
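
For readers following along, the trimming logic under review can be sketched in isolation, with a comment along the lines suggested above. This is a sketch under stated assumptions: `trim_padding` is a hypothetical helper name, and `numel_padded` is passed in as a plain integer in place of `flat_param._shard_numel_padded`. It assumes the `view` call applies whether or not padding is present.

```python
import torch


def trim_padding(flat: torch.Tensor, valid_data_size: int, numel_padded: int) -> torch.Tensor:
    """Hypothetical helper: drop trailing padding and return a 1-D view."""
    if valid_data_size > 0:
        if numel_padded > 0:
            # Keep only the first `valid_data_size` elements along dim 0.
            flat = flat.narrow(0, 0, valid_data_size)
        # Return a view rather than the original object, so the caller
        # receives a tensor without extra attributes that can be serialized.
        flat = flat.view(valid_data_size)
    return flat


# 6 valid elements followed by 2 elements of padding (illustrative values).
padded = torch.arange(8.0)
trimmed = trim_padding(padded, valid_data_size=6, numel_padded=2)

# Without padding, the result is still a view over the same storage.
unpadded = torch.arange(6.0)
same = trim_padding(unpadded, valid_data_size=6, numel_padded=0)

print(trimmed.shape, same.shape)
```

Neither `narrow` nor `view` copies data, so the returned tensor aliases the input's storage in both branches.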

@fegin fegin added the ciflow/trunk label (Trigger trunk jobs on your pull request) Feb 10, 2023
fegin added a commit that referenced this pull request Feb 12, 2023
…pickling errors

Pull Request resolved: #94637

ghstack-source-id: 179983735

fegin commented Feb 12, 2023

@pytorchbot merge -f "The failing tests are not related."

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

Labels

ciflow/trunk (Trigger trunk jobs on your pull request)
Merged
release notes: distributed (fsdp) (release notes category)

4 participants