Adding pin_memory kwarg to zeros, ones, empty, ... tensor constructors#18952

Closed
VitalyFedyunin wants to merge 14 commits into pytorch:master from VitalyFedyunin:pin_memory_try_3

Conversation

@VitalyFedyunin (Contributor) commented Apr 5, 2019

Make it possible to construct a pinned-memory tensor without creating a storage first and without calling the pin_memory() function. It is also faster, since no copy operation is needed.

Supported functions:

```python
torch.rand_like(t, pin_memory=True)
torch.randn_like(t, pin_memory=True)
torch.empty_like(t, pin_memory=True)
torch.full_like(t, 4, pin_memory=True)
torch.zeros_like(t, pin_memory=True)
torch.ones_like(t, pin_memory=True)
torch.tensor([10, 11], pin_memory=True)
torch.randn(3, 5, pin_memory=True)
torch.rand(3, pin_memory=True)
torch.zeros(3, pin_memory=True)
torch.randperm(3, pin_memory=True)
torch.empty(6, pin_memory=True)
torch.ones(6, pin_memory=True)
torch.eye(6, pin_memory=True)
torch.arange(3, 5, pin_memory=True)
```
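For instance, the new keyword argument allocates pinned (page-locked) host memory in one step, which can then feed an asynchronous host-to-device copy (a minimal sketch, not from this PR; pinning host memory requires a CUDA-capable build, so the snippet falls back to pageable memory otherwise):

```python
import torch

if torch.cuda.is_available():
    # One-step allocation of page-locked host memory: no intermediate
    # pageable tensor and no extra copy, unlike t.pin_memory().
    t = torch.empty(1024, pin_memory=True)
    assert t.is_pinned()
    # Pinned memory allows the host-to-device copy to run asynchronously.
    gpu_t = t.to("cuda", non_blocking=True)
else:
    # Without a CUDA runtime, pinning is unavailable; allocate normally.
    t = torch.empty(1024)
    assert not t.is_pinned()
```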

Part of the bigger `Remove Storage` plan.

Now compatible with both TorchScript forms:

```python
_1 = torch.zeros([10], dtype=6, layout=0, device=torch.device("cpu"), pin_memory=False)
```

and

```python
_1 = torch.zeros([10], dtype=6, layout=0, device=torch.device("cpu"))
```

The same was verified for all similar functions (`rand_like`, `empty_like`, and others).
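As a rough illustration of the compatibility claim (a hypothetical sketch, not a test from this PR): a function scripted after this change serializes the `torch.zeros` call with `pin_memory` filled in from its default, while graphs saved before the change simply omit the argument and resolve against the same schema.

```python
import torch

@torch.jit.script
def make_zeros() -> torch.Tensor:
    # The compiled call resolves against the aten::zeros schema, where
    # pin_memory defaults to False, so older serialized graphs that do
    # not mention the argument keep working.
    return torch.zeros(10, dtype=torch.float32)

assert make_zeros().shape == torch.Size([10])
```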

This is a fixed version of #18455.

…s. (pytorch#18455)

Pull Request resolved: pytorch#18455

Reviewed By: ezyang

Differential Revision: D14672084

Pulled By: VitalyFedyunin

fbshipit-source-id: 9d0997ec00f59500ee018f8b851934d334012124
@facebook-github-bot facebook-github-bot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Apr 5, 2019
@facebook-github-bot left a comment
@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@VitalyFedyunin (Author) commented

@pytorchbot retest this please


@VitalyFedyunin VitalyFedyunin changed the title [WIP] Adding pin_memory kwarg to zeros, ones, empty, ... tensor constructors Adding pin_memory kwarg to zeros, ones, empty, ... tensor constructors Apr 5, 2019
```diff
  CUDA: _cudnn_rnn_backward

- - func: _cudnn_init_dropout_state(float dropout, bool train, int dropout_seed, *, ScalarType dtype, Layout layout, Device device) -> Tensor
+ - func: _cudnn_init_dropout_state(float dropout, bool train, int dropout_seed, *, ScalarType dtype, Layout layout, Device device, bool pin_memory=False) -> Tensor
```
Contributor:
This one isn't optional, but the other ones are. What's the difference?

Contributor:
Doesn't the fact that these are not optional imply that you might break JIT scripts which mention any of the non-optional variants here?

Contributor (Author):

I wish I could make it optional; however, the native-function and JIT generators only know how to wrap either all non-optional arguments or all optional arguments into TensorOptions. There is no easy way to get a combination of the two without a major rewrite, which makes no sense since we are migrating off this generation anyway.

Contributor:
I think my real question is, why weren't all of these optional in the first place?

Contributor (Author):

This is part of the bigger discussion about what we do with TensorOptions.

Contributor:

> Doesn't the fact that these are not optional imply that you might break JIT scripts which mention any of the non-optional variants here?

Answering my own question: no, JIT scripts should not be broken, because all occurrences of pin_memory are given default arguments.

I think there should still be a Note discussing the inconsistency of pin_memory at these sites. Put it in native_functions.yaml (or maybe the README in native).

Contributor:

There's a lot of discussion about this function when it doesn't even matter (outside of not breaking the JIT): no one calls this function directly. So presumably we can just do whatever makes our job easier here (giving everything a default, not using TensorOptions, etc.), as long as it doesn't break the JIT.

Contributor:
That's my fault, I stuck the PR comment on the very first occurrence that had this problem. If you look at the rest of native_functions.yaml there are a bunch of other, more public, functions that look this way too. (In the end it's a moot point, I don't believe there is any BC-breakage here, regardless of the status of the function.)

@ezyang (Contributor) commented Apr 11, 2019

Wasn't there going to be a test for TorchScript BC in this case?

@ezyang left a comment
see comments

@VitalyFedyunin (Author) commented:

> Wasn't there going to be a test for TorchScript BC in this case?

Done manually with binary files.

@ezyang ezyang self-requested a review April 11, 2019 17:50
@ezyang commented Apr 11, 2019

> Done manually with binary files.

The general expectation is that if you did manual testing to verify that a change worked, you should describe in the PR how exactly you tested it ;) (That's what "Test Plan" in FB infrastructure is all about.) But I think there's a case to be made for actually having a real test in the test suite for this case, so that we can avoid breaking it in the future. What ended up being difficult about constructing an ad hoc JIT IR with the missing kwarg field for this?

@VitalyFedyunin (Author) commented:

> What ended up being difficult about constructing an ad hoc JIT IR with the missing kwarg field for this?

This type of test would cover only this particular tiny BC case (much easier to test manually). Making a proper test would require either storing a binary or inlining all the zipped files inside the test code (including the attributes.pkl binary).

@ezyang commented Apr 11, 2019

> This type of test would cover only this particular tiny BC case (much easier to test manually)

I mean, sure, it will only cover this particular BC, but the point of adding regression tests when a regression happens is that we have some updated priors that this particular aspect of the system is likely to break. And it doesn't seem unreasonable to me that it will break exactly the same way the next time we need to add another option to TensorOptions.
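A lightweight in-memory round trip could serve as such a regression test (a hypothetical sketch; unlike the binary-artifact approach discussed above, it only exercises serialization against the current schema rather than against an old one):

```python
import io
import torch

@torch.jit.script
def zeros_fn() -> torch.Tensor:
    return torch.zeros(5)

# Serialize the scripted function, then load it back. Loading re-resolves
# the saved torch.zeros call against the current schema, so a defaulted
# pin_memory argument must still match.
buf = io.BytesIO()
torch.jit.save(zeros_fn, buf)
buf.seek(0)
loaded = torch.jit.load(buf)
assert torch.equal(loaded(), torch.zeros(5))
```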

@VitalyFedyunin (Author) commented:

> This type of test would cover only this particular tiny BC case (much easier to test manually)

> I mean, sure, it will only cover this particular BC, but the point of adding regression tests when a regression happens is that we have some updated priors that this particular aspect of the system is likely to break. And it doesn't seem unreasonable to me that it will break exactly the same way the next time we need to add another option to TensorOptions.

Will land #19174 first (need stamp ;) )

@ezyang commented Apr 11, 2019

You don't have to wait for #19174 to land first before landing this one; your assurance is good enough for me :)

@ezyang left a comment
I didn't heavily rereview the Python arg parser code, let me know if you want a careful audit.


facebook-github-bot pushed a commit that referenced this pull request Apr 15, 2019
…models. (#19174)

Summary:
Helps to test #18952
Pull Request resolved: #19174

Differential Revision: D14899474

Pulled By: VitalyFedyunin

fbshipit-source-id: a4854ad44da28bd0f5115ca316e6078cbfe29d0d


@VitalyFedyunin merged this pull request in 1c5073f.

zdevito pushed a commit to zdevito/ATen that referenced this pull request Apr 16, 2019
…s (#18952)

Pull Request resolved: pytorch/pytorch#18952

Differential Revision: D14801792

Pulled By: VitalyFedyunin

fbshipit-source-id: 8dbc61078ff7a637d0ecdb95d4e98f704d5450ba
zhangguanheng66 pushed a commit to zhangguanheng66/pytorch that referenced this pull request May 6, 2019
…models. (pytorch#19174)

zhangguanheng66 pushed a commit to zhangguanheng66/pytorch that referenced this pull request May 6, 2019
pytorch#18952)

@Majdoddin commented:

@VitalyFedyunin What is the status of this PR, please? What stops it from going into the stable API?
