Fix ONNXProgram.save to use torch.load(..., mmap=True) for large models #117295
During ONNXProgram.save, the implicit/explicit state_dict passed in must be loaded into memory in order to read each initializer and create an external tensor proto for it. This PR ensures torch.load uses memory-mapping (mmap=True) to support large models that cannot fit in memory.
Dr. CI: ✅ No failures as of commit d5c3f17 with merge base b4a3563. This comment was automatically generated by Dr. CI and updates every 15 minutes.
During ONNXProgram.save, the implicit/explicit state_dict passed in must be loaded into memory in order to read each initializer and create an external tensor proto for it. This PR ensures torch.load uses memory-mapping to support large models that cannot fit in memory.
ghstack-source-id: 5b1b31e
Pull Request resolved: #117295
Very nice. Do you have a case study showcasing the efficiency/effectiveness before and after?
- extra_state_dict = torch.load(path)
+ # Loads checkpoint using memory-map on CPU to succeed with large models
+ extra_state_dict = torch.load(
+     path, map_location="cpu", mmap=True, weights_only=True
Noticed that weights_only=True was added. Is this intended?
It is a security feature. When weights_only=False, malicious pickled checkpoints can execute code on the machine.
If the checkpoint can be loaded with weights_only=True, we should do so, but I am still experimenting with it.
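To make the risk concrete, here is a standard-library sketch of why unrestricted unpickling is dangerous and how an allowlist blocks it. This is an analogy only, not PyTorch's actual weights_only implementation; the Payload and AllowlistUnpickler classes and the allowlist contents are hypothetical.

```python
import io
import pickle


class Payload:
    """Stand-in for a malicious object hidden inside a checkpoint."""

    def __reduce__(self):
        # Unpickling calls eval("6 * 7"); a real attack could call os.system instead.
        return (eval, ("6 * 7",))


class AllowlistUnpickler(pickle.Unpickler):
    """Resolves only allowlisted globals (similar in spirit to weights_only=True)."""

    ALLOWED = {("collections", "OrderedDict")}  # assumption: example allowlist

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


blob = pickle.dumps(Payload())

# Plain unpickling executes the embedded call and returns its result.
unsafe_result = pickle.loads(blob)

# The restricted unpickler refuses to resolve builtins.eval, so nothing runs.
try:
    AllowlistUnpickler(io.BytesIO(blob)).load()
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(unsafe_result, blocked)
```

The design trade-off is the one raised in this thread: a restricted loader rejects any checkpoint that contains objects outside its allowlist, so it can only be used when the checkpoint holds plain weights.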
Not yet.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).