[export] optimize unflattener #115364
Unflattening was slow on the APS FM model (which has thousands of nn.EmbeddingBag modules). A quick look at the profile shows that 75% of the unflattening time was spent copying the node list, which is immutable and globally shared. Passing it around as a tuple instead of copying it yields a 4x speedup.
Differential Revision: [D51929775](https://our.internmc.facebook.com/intern/diff/D51929775/)
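To illustrate why the copy dominated, here is a minimal sketch, not the actual unflattener code. `CopyingFrame`, `SharingFrame`, and `build_frames` are hypothetical names, and the sizes are made up. It assumes each submodule gets a frame object that receives the full node list: copying costs O(number of nodes) per frame, while storing a reference to a shared tuple costs O(1) and is safe because a tuple cannot be mutated by any holder.

```python
import timeit
from typing import List, Sequence, Tuple


class CopyingFrame:
    """Hypothetical per-module frame that takes its own copy of the node list."""

    def __init__(self, nodes: Sequence[str]) -> None:
        # O(len(nodes)) work and memory for every frame created.
        self.nodes: List[str] = list(nodes)


class SharingFrame:
    """Hypothetical per-module frame that shares one immutable tuple."""

    def __init__(self, nodes: Tuple[str, ...]) -> None:
        # O(1): only a reference is stored.
        self.nodes = nodes


def build_frames(frame_cls, nodes, num_modules):
    """Create one frame per (hypothetical) submodule."""
    return [frame_cls(nodes) for _ in range(num_modules)]


# Illustrative sizes only; strings stand in for graph nodes.
nodes = tuple(f"node_{i}" for i in range(2_000))
num_modules = 500

copying = build_frames(CopyingFrame, nodes, num_modules)
sharing = build_frames(SharingFrame, nodes, num_modules)

# Every sharing frame points at the same tuple object; no copying frame does.
all_shared = all(frame.nodes is nodes for frame in sharing)
any_copy_shared = any(frame.nodes is nodes for frame in copying)

t_copy = timeit.timeit(lambda: build_frames(CopyingFrame, nodes, num_modules), number=3)
t_share = timeit.timeit(lambda: build_frames(SharingFrame, nodes, num_modules), number=3)
print(f"copying: {t_copy:.4f}s  sharing: {t_share:.4f}s")
```

The absolute timings depend on the machine and the sizes chosen, so they say nothing about the 4x figure reported above; the sketch only shows that the copying cost grows with the number of frames times the number of nodes.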
@pytorchbot merge (Initiating merge automatically since Phabricator Diff has merged)
Pull Request resolved: pytorch#115364
Approved by: https://github.com/zhxchen17