
Add default hooks to save tensors on CPU #61928

Closed
Varal7 wants to merge 34 commits

Conversation

@Varal7 (Contributor) commented on Jul 20, 2021

Stack from ghstack:

Fixes #57100.
Creates a context manager `torch.autograd.graph.save_on_cpu()` under which all tensors saved during the forward pass are moved* to CPU, then copied back to the appropriate device for the backward pass.

*If the tensor is already on CPU, the entire operation is a no-op.

If the user so desires, the tensor is copied into pinned memory during packing so that unpacking can be done asynchronously.

With the current PR, the hooks are registered globally, across threads. In the near future, we want to make these hooks thread-local.

See the benchmark and the note about training large models in the comments below.
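A minimal sketch of the intended usage, assuming a CUDA device is available (it falls back to CPU otherwise); the toy tensors are illustrative only:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.randn(5, requires_grad=True, device=device)
b = torch.randn(5, requires_grad=True, device=device)

# Under save_on_cpu(), tensors saved for backward (here: a and b, saved by the
# multiplication) are kept on CPU; pin_memory=True requests pinned staging
# buffers so the copy back to the device during backward can be asynchronous.
with torch.autograd.graph.save_on_cpu(pin_memory=True):
    c = a * b

c.sum().backward()  # saved tensors are copied back to `device` here
```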

Differential Revision: D29848526

@facebook-github-bot (Contributor) commented on Jul 20, 2021


💊 CI failures summary and remediations

As of commit 679572c (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.


Fix #57100.
Creates a context manager `torch.autograd.graph.save_on_cpu()` under which all tensors saved during the forward pass are actually copied to cpu, then copied back to the appropriate device for the backward pass.


[ghstack-poisoned]
Varal7 added a commit that referenced this pull request Jul 20, 2021
Fix #57100

ghstack-source-id: 2d3c9f987a9f865078184f8dc04ad075c9a2bdac
Pull Request resolved: #61928
Varal7 added a commit that referenced this pull request Jul 21, 2021
Fix #57100

ghstack-source-id: 9285a37caf37f90fd65b309235d7027504ce4ca5
Pull Request resolved: #61928
Varal7 added a commit that referenced this pull request Jul 21, 2021
Fix #57100

ghstack-source-id: a9e56df9653a4240b4d382f7dbec3080a3a07a3d
Pull Request resolved: #61928
@Varal7 Varal7 marked this pull request as draft July 22, 2021 14:01
Varal7 added a commit that referenced this pull request Aug 3, 2021
This PR adds docstrings for CPU hooks introduced in #61928.

Also uncomments the warning about pinned memory in CUDA semantics docs

ghstack-source-id: 4c86248f5818546f5c82117a306e61d328389bb7
Pull Request resolved: #62410
@Varal7 (Contributor, Author) commented on Aug 3, 2021

@Varal7 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Fixes #57100.
Creates a function `torch.autograd.graph.set_save_on_cpu_hooks()` which can be used to register default hooks under which all tensors saved during the forward pass are moved* to CPU, then copied back to the appropriate device for the backward pass.

*If the tensor is already on CPU, the entire operation is a no-op.

If the tensor is on GPU, it is copied into pinned memory during packing so that unpacking can be done asynchronously.

With the current PR, hooks are set with `torch.autograd.graph.set_save_on_cpu_hooks()` and unset with `torch.autograd.graph.reset_saved_tensors_default_hooks()`. In the near future, we want to make these hooks thread-local and expose a context manager `torch.autograd.graph.save_on_cpu`.

See [benchmark](#61928 (comment)) and [note about training large models](#61928 (comment))
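For illustration, a hand-rolled pack/unpack pair that approximates the behavior described above, written against the generic `torch.autograd.graph.saved_tensors_hooks` context manager rather than the global registration this PR adds; the names `pack_to_cpu` and `unpack_from_cpu` are placeholders, not identifiers from this PR:

```python
import torch

def pack_to_cpu(tensor):
    if tensor.device.type == "cpu":
        return (tensor.device, tensor)          # already on CPU: no-op
    packed = torch.empty(
        tensor.size(),
        dtype=tensor.dtype,
        layout=tensor.layout,
        pin_memory=(torch.cuda.is_available() and not tensor.is_sparse))
    packed.copy_(tensor)                        # single device -> pinned-CPU copy
    return (tensor.device, packed)

def unpack_from_cpu(packed):
    device, tensor = packed
    # non_blocking=True lets the copy from pinned memory back to the GPU
    # overlap with other work during the backward pass.
    return tensor.to(device, non_blocking=True)

with torch.autograd.graph.saved_tensors_hooks(pack_to_cpu, unpack_from_cpu):
    x = torch.randn(3, requires_grad=True)
    y = (x * x).sum()                           # x is packed (saved) to CPU here
y.backward()                                    # x is unpacked back for the gradient
```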

Differential Revision: [D29848526](https://our.internmc.facebook.com/intern/diff/D29848526)



[ghstack-poisoned]
Varal7 added a commit that referenced this pull request Aug 3, 2021
This PR adds docstrings for CPU hooks introduced in #61928.

Also uncomments the warning about pinned memory in CUDA semantics docs.

Depends on: #62361.

For now docstrings are an orphan page at https://docs-preview.pytorch.org/62410/generated/torch.autograd.graph.set_save_on_cpu_hooks.html#torch-autograd-graph-set-save-on-cpu-hooks

Differential Revision: [D29990129](https://our.internmc.facebook.com/intern/diff/D29990129)

[ghstack-poisoned]
@Varal7 (Contributor, Author) commented on Aug 3, 2021

@Varal7 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@Varal7 Varal7 requested a review from zou3519 August 3, 2021 15:20
Comment on lines +16 to +21
```python
storage = torch.empty(
    tensor.size(),
    dtype=tensor.dtype,
    layout=tensor.layout,
    pin_memory=(torch.cuda.is_available() and not tensor.is_sparse))
storage.copy_(tensor)
```
Contributor:

Why can't we do `storage = tensor.to(device='cpu', pin_memory=True)`?

Also, nit: we should rename `storage` to something else; `storage` can be confused with `torch.Storage`.

Contributor (Author):

I don't think `pin_memory` is an accepted argument of `torch.Tensor.to`.

OK for the `storage` rename.

Contributor:

> I don't think `pin_memory` is an accepted argument of `torch.Tensor.to`.

Good point, thanks for the clarification.

Member:

Would something like `storage = tensor.to("cpu", non_blocking=True).pin_memory()` work?

Collaborator:

That would mean two copies.
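A small sketch of the trade-off discussed in this thread, assuming a CUDA device: chaining `.to("cpu")` and `.pin_memory()` materializes two host buffers, while allocating pinned memory up front (as the snippet under review does) needs only one device-to-host copy:

```python
import torch

assert torch.cuda.is_available()  # this illustration assumes a CUDA device
t = torch.randn(1024, device="cuda")

# Two copies: GPU -> pageable CPU buffer, then pageable -> pinned CPU buffer.
two_copies = t.to("cpu", non_blocking=True).pin_memory()

# One copy: allocate the pinned CPU buffer up front, then copy GPU -> pinned,
# which is the approach taken in the snippet under review.
one_copy = torch.empty(t.size(), dtype=t.dtype, layout=t.layout, pin_memory=True)
one_copy.copy_(t)
```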

@facebook-github-bot (Contributor):

@Varal7 merged this pull request in 9beb279.

IvanKobzarev added a commit that referenced this pull request Aug 3, 2021
This reverts commit 9beb279.

ghstack-source-id: f4ad32c72c88e0e6b618f0535de52dec684f3871
Pull Request resolved: #62693
facebook-github-bot pushed a commit that referenced this pull request Aug 4, 2021
Summary:
Pull Request resolved: #62410

This PR adds docstrings for CPU hooks introduced in #61928.

Also uncomments the warning about pinned memory in CUDA semantics docs.

Depends on: #62361.

For now docstrings are an orphan page at https://docs-preview.pytorch.org/62410/generated/torch.autograd.graph.set_save_on_cpu_hooks.html#torch-autograd-graph-set-save-on-cpu-hooks

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D29990129

Pulled By: Varal7

fbshipit-source-id: 7a98eeee6a0abb11e2c2d9169cd1aa35ad7ba3f4
@facebook-github-bot facebook-github-bot deleted the gh/varal7/18/head branch August 7, 2021 14:17