sync AveragedModel buffers when use_buffers=False #84054
Conversation
✅ No Failures (18 Pending) as of commit 8dea46b (more details on the Dr. CI page). 💚 Looks good so far! There are no failures yet. This comment was automatically generated by Dr. CI. Please report bugs/suggestions to the (internal) Dr. CI Users group.
Sorry for the delay in this review @RangiLyu, thanks for doing this!
/easycla As part of the transition to the PyTorch Foundation, this project now requires contributions be covered under the new CLA. See #85559 for additional details. This comment will trigger a new check of this PR. If you are already covered, you will simply see a new "EasyCLA" check that passes. If you are not covered, a bot will leave a new comment with a link to sign.
Hi sorry--since it's approved, feel free to comment with `@pytorchbot merge`
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: This PR is too stale; the last push date was more than 3 days ago. Please rebase and try again. You can rebase by leaving the following comment on this PR: `@pytorchbot rebase`
@pytorchbot rebase
You don't have permissions to rebase this PR; only people with write permissions may rebase PRs.
@pytorchbot rebase
@pytorchbot successfully started a rebase job. Check the current status here.
Successfully rebased 8dea46b to ad4cd0e.
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/84054. Note: Links to docs will display an error until the docs builds have been completed. ✅ No Failures as of commit ad4cd0e. This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot merge
Just set up the job so I didn't forget. Sorry about that @RangiLyu, and thanks for the PR.
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Hey @RangiLyu.
Fixes #84053
As described in the issue, `AveragedModel` deep copies the model during initialization, which means that the buffers in the averaged model cannot be updated together with the source model. One solution is to copy the buffers from the source model every time `update_parameters` is called.
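
For illustration, here is a minimal, hypothetical usage sketch of the intended effect (this is not the patch itself, and it assumes a PyTorch build that includes this fix): with `use_buffers=False`, `update_parameters` also copies the source model's buffers into the averaged model, so running statistics such as BatchNorm's stay in sync instead of remaining frozen at their deep-copied values.

```python
import torch
import torch.nn as nn
from torch.optim.swa_utils import AveragedModel

# The wrapped module contains a BatchNorm layer, whose running
# statistics live in buffers (running_mean, running_var, ...).
model = nn.Sequential(nn.Linear(4, 4), nn.BatchNorm1d(4))
swa_model = AveragedModel(model, use_buffers=False)

# A few forward passes in train mode update BatchNorm's running stats
# on the source model only.
model.train()
for _ in range(3):
    model(torch.randn(8, 4))

# With this change, update_parameters() also copies the source model's
# buffers into the averaged model (rather than leaving them stale).
swa_model.update_parameters(model)

# The averaged model's buffers should now match the source model's.
for b_avg, b_src in zip(swa_model.module.buffers(), model.buffers()):
    assert torch.equal(b_avg, b_src)
```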