[Activation Checkpointing] Investigate pin_memory for CPU offload #86097

Open
rohan-varma opened this issue Oct 3, 2022 · 0 comments
Labels
module: checkpoint (Related to torch.utils.checkpoint)
oncall: distributed (Add this issue/PR to distributed oncall triage queue)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

rohan-varma commented Oct 3, 2022

🚀 The feature, motivation and pitch

@awgu made a good point in #85459 (comment): we shouldn't assume we have unlimited space in the pinned memory region. Right now save_on_cpu hardcodes pin_memory=True. We should investigate the performance implications of this and improve our intuition.
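As a starting point for that investigation, here is a minimal timing sketch that compares the two settings. It assumes the public torch.autograd.graph.save_on_cpu context manager and its pin_memory argument; the helper name time_offload, the toy matmul workload, and the tensor sizes are hypothetical and chosen only for illustration. It does not measure pinned-region pressure, only per-step latency, and it falls back to unpinned CPU copies when no CUDA device is available.

```python
import time

import torch
from torch.autograd.graph import save_on_cpu


def time_offload(pin_memory: bool, size: int = 1024, iters: int = 5) -> float:
    """Return average seconds per forward+backward step with saved
    activations offloaded to CPU (hypothetical helper, toy workload)."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Pinned host memory is only meaningful when a CUDA device is present.
    pin = pin_memory and device == "cuda"

    x = torch.randn(size, size, device=device, requires_grad=True)
    w = torch.randn(size, size, device=device, requires_grad=True)

    def step() -> None:
        # Tensors saved for backward inside this block are moved to CPU
        # and copied back to the original device during backward.
        with save_on_cpu(pin_memory=pin):
            loss = (x @ w).relu().sum()
        loss.backward()
        x.grad = None
        w.grad = None

    step()  # warm-up, so allocation costs do not dominate the timing
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        step()
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters


if __name__ == "__main__":
    for flag in (True, False):
        print(f"pin_memory={flag}: {time_offload(flag):.4f} s/step")
```

A more complete study would also vary the total size of the offloaded activations, since the concern raised above is about exhausting pinned memory rather than about raw copy speed.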

Alternatives

No response

Additional context

No response

cc @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @gqchen @aazzolini @osalpekar @jiayisuse @SciPioneer @H-Huang @kwen2501

@rohan-varma added the oncall: distributed, module: checkpoint, and triaged labels on Oct 3, 2022