Require passing version_counter and allow_tensor_metadata_change to shallow_copy_and_detach() #20496
Conversation
Force-pushed from 2f2470f to 9769e08
I'm pretty confused. This is the second time this PR has added a parameter to the function and always passed in the same value. What was the point of adding the parameter, then? Is this setting up some future improvement?
aten/src/ATen/OpaqueTensorImpl.h
Outdated
@@ -80,13 +80,14 @@ struct CAFFE2_API OpaqueTensorImpl : public TensorImpl {
 // 1. the AutogradMeta pointer, because it is unique for each Variable.
 // 2. the version counter, because although it lives in TensorImpl, the version counter is managed
 // by autograd, and the call sites of `shallow_copy_and_detach()` (from autograd) should decide what
-// the version counter should be for each new TensorImpl. See NOTE [ Version Counter Sharing ] for details.
+// the version counter should be for each new TensorImpl, by passing the correct version counter as
I don't know what "the version counter should be for each new TensorImpl" is supposed to mean.
I still don't know what this means. Can you just say something like "the version counter is set to the passed-in `version_counter`"?
torch/csrc/autograd/variable.cpp
Outdated
@@ -170,7 +170,7 @@ void Variable::Impl::set_data(const at::Tensor &new_data) {
   device_opt_ = new_data.device();
   type_id_ = new_data.dispatch_type().type_id();

-  auto new_data_impl_copy = new_data.getIntrusivePtr()->shallow_copy_and_detach();
+  auto new_data_impl_copy = new_data.getIntrusivePtr()->shallow_copy_and_detach(/*version_counter=*/0);
I don't understand -- don't you want to pass in exactly what `saved_version_counter` is below?
Sorry it was an oversight -- fixed.
aten/src/ATen/OpaqueTensorImpl.h
Outdated
@@ -80,13 +80,14 @@ struct CAFFE2_API OpaqueTensorImpl : public TensorImpl {
 // 1. the AutogradMeta pointer, because it is unique for each Variable.
 // 2. the version counter, because although it lives in TensorImpl, the version counter is managed
 // by autograd, and the call sites of `shallow_copy_and_detach()` (from autograd) should decide what
-// the version counter should be for each new TensorImpl. See NOTE [ Version Counter Sharing ] for details.
+// the version counter should be for each new TensorImpl, by passing the correct version counter as
I still don't know what this means. Can you just say something like "the version counter is set to the passed-in `version_counter`"?
torch/csrc/autograd/variable.h
Outdated
@@ -608,7 +607,7 @@ inline Variable make_variable(
       !data.is_variable(),
       "Must not create a new variable from a variable, use its .data()");
   if (data.defined()) {
-    auto data_impl_copy = data.getIntrusivePtr()->shallow_copy_and_detach();
+    auto data_impl_copy = data.getIntrusivePtr()->shallow_copy_and_detach(/*version_counter=*/0);
     data_impl_copy->set_allow_tensor_metadata_change(allow_tensor_metadata_change);
Are there cases where we need to not `set_allow_tensor_metadata_change` and then set it later? Or is it always "set" correctly at the `shallow_copy_and_detach` step?
It's always "set" correctly at the `shallow_copy_and_detach` step except for this one case (pytorch/torch/csrc/autograd/variable.h, lines 587 to 601 in da3e74b):
inline Variable make_variable_consuming(
    at::Tensor data,
    bool requires_grad = false,
    bool allow_tensor_metadata_change = true) {
  TORCH_CHECK(
      !data.is_variable(),
      "Must not create a new variable from a variable, use its .data()");
  if (data.defined()) {
    AT_ASSERT(data.getIntrusivePtr().use_count() == 1);
    data.unsafeGetTensorImpl()->set_allow_tensor_metadata_change(allow_tensor_metadata_change);
    auto autograd_meta = c10::guts::make_unique<Variable::AutogradMeta>();
    return Variable(c10::make_intrusive<Variable::Impl>(std::move(data), std::move(autograd_meta), requires_grad));
  }
  return Variable();
}
I think we should add `allow_tensor_metadata_change` as another parameter to `shallow_copy_and_detach()`, to zip up this API further.
Update: I added the `allow_tensor_metadata_change` parameter to `shallow_copy_and_detach()`.
@@ -99,6 +100,7 @@ c10::intrusive_ptr<TensorImpl> shallow_copy_and_detach() const override {
   impl->is_contiguous_ = is_contiguous_;
   impl->is_wrapped_number_ = is_wrapped_number_;
   impl->reserved_ = reserved_;
+  impl->set_version_counter(version_counter);
Are there now calls to `set_version_counter` that aren't contained in `shallow_copy_and_detach`?
There are still two of them:

- https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/variable.cpp#L196. For this one, because `diff_view_meta->base_` can be set to `base` or `base.base()` depending on whether `base` is already a view, we don't know the right value for the view's version counter when we call `shallow_copy_and_detach` in https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/variable.h#L549, and we have to wait until this step to set the version counter to the correct value.
- https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/saved_variable.cpp#L87. For this one, since the `shallow_copy_and_detach` call is inside `make_variable()`, we can either allow passing `version_counter` as a parameter to `make_variable()`, or treat this as a one-off case and allow calling `set_version_counter` outside of `shallow_copy_and_detach`.
Ok, it sounds like both of these cases should go away soon:
- We shouldn't need make_variable, because everything will be a variable. And if we have a similar API -- we can just pass the version counter, as you mentioned.
- I'm not sure why we can't just save the variable instead of the tensor in the future -- maybe there is some infinite loop thing?
- Sounds great, yes we will be able to simplify this after we make everything a variable.
- I think currently we don't save the original variable because of https://bit.ly/2w367de (I wasn't able to use the Github link in this comment).
I added the task for looking into this in #13638.
Force-pushed from 35023e2 to 4549960
@yf225 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
…hallow_copy_and_detach() (#20496)

Summary: Previously, the caller of `shallow_copy_and_detach()` was responsible for deciding whether the shallow copy should share the source TensorImpl's version counter or have its own new version counter. However, since this decision is crucial for ensuring the correctness of the shallow copy's version counter, we want to require users of `shallow_copy_and_detach()` to pass a version counter to the function call, so that they make the decision at the time of API usage, not as an afterthought.

For similar reasons, we want to require users of `shallow_copy_and_detach()` to pass `allow_tensor_metadata_change` to the function call, so that they decide whether the TensorImpl shallow copy should allow tensor metadata change at the time of API usage, not as an afterthought.

Pull Request resolved: pytorch/pytorch#20496
Differential Revision: D15363620
Pulled By: yf225
fbshipit-source-id: a65e74738b10452668d6dc644b43aad5b3d8c9e6