
Output of non-inplace, non-CompositeImplicitAutograd op has TensorImpl > 1 or StorageImpl use_count != 1 #60426

Open
soulitzer opened this issue Jun 22, 2021 · 5 comments
Labels
module: autograd Related to torch.autograd, and the autograd engine in general triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@soulitzer
Contributor

soulitzer commented Jun 22, 2021

Codegen checks are being added in #60286 to VariableType to enforce that, for non-inplace, non-CompositeImplicitAutograd ops, the output has a TensorImpl use_count of 1, and that non-view ops have outputs with a StorageImpl use_count of 1. A hedged sketch of what such a check might look like follows the notes below.

Notes:

  • StorageImpl tests are ordered before TensorImpl tests, so a StorageImpl failure may hide a TensorImpl failure
  • If TensorImpl use_count > 1, we can always construct a test that would cause the StorageImpl check to fail as well (though such a test may not actually exist in the real world).
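
A minimal sketch of what such a per-output check could look like, assuming ATen's C++ `Tensor::use_count()` and `Storage::use_count()` accessors; the names and exact structure here are illustrative, not the code that the VariableType codegen actually emits:

```cpp
#include <ATen/ATen.h>
#include <c10/util/Exception.h>

// Hypothetical helper mirroring the described checks (not the generated code).
void check_output_use_counts(const at::Tensor& result, bool is_view_op) {
  // The StorageImpl check runs first (as noted above), so a storage failure
  // can hide a TensorImpl failure.
  if (!is_view_op && result.has_storage()) {
    TORCH_INTERNAL_ASSERT(result.storage().use_count() == 1,
        "output StorageImpl has use_count != 1");
  }
  // A freshly created output should be the sole owner of its TensorImpl.
  TORCH_INTERNAL_ASSERT(result.use_count() == 1,
      "output TensorImpl has use_count > 1");
}
```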

These functions return tensors with TensorImpl use_count > 1:

`_embedding_bag`, `_embedding_bag_forward_only`, `lu_unpack`, `_cudnn_rnn_backward`, `q_per_channel_scales`, `q_per_channel_zero_points`

These non-view functions return tensors with StorageImpl use_count != 1:

`thnn_conv2d_forward`, `slow_conv3d_forward`, `channel_shuffle`

Functions that return tensors with StorageImpl use_count != 1 (which may also have TensorImpl use_count > 1):

 `dequantize_self`, `cudnn_rnn`

Edit: There were more entries in these lists, but the convolution backward and batch_norm failures are actually due to the use_count being 0, i.e., the returned tensor is undefined because of the output mask. This is expected, so in these cases (in fact, in all cases) we should just relax the check to use_count <= 1.
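
As a small, self-contained illustration of that use_count-0 case (this uses ATen's public C++ API and is not the generated check itself):

```cpp
#include <ATen/ATen.h>
#include <iostream>

int main() {
  at::Tensor undefined;                  // what a masked-out backward output returns
  at::Tensor fresh = at::zeros({2, 2});  // a normally allocated output

  std::cout << undefined.defined() << " " << undefined.use_count() << "\n";  // 0 0
  std::cout << fresh.defined() << " " << fresh.use_count() << "\n";          // 1 1
  // Hence the relaxed check: use_count() <= 1 accepts both cases.
  return 0;
}
```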

cc @ezyang @albanD @zou3519 @gqchen @pearu @nikitaved @soulitzer @lezcano @Varal7

@albanD
Collaborator

albanD commented Jun 22, 2021

Is there a simple answer to why these fail? Like they return some Tensors that are expected to be used somewhere else?

@VitalyFedyunin VitalyFedyunin added module: autograd Related to torch.autograd, and the autograd engine in general triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module labels Jun 24, 2021
@soulitzer soulitzer self-assigned this Jul 1, 2021
@soulitzer soulitzer changed the title Output of non-inplace, non-CompositeImplicitAutograd op has TensorImpl use_count > 1 Output of non-inplace, non-CompositeImplicitAutograd op has TensorImpl or StorageImpl use_count != 1 Jul 6, 2021
facebook-github-bot pushed a commit that referenced this issue Jul 7, 2021
… and `.*conv.*_backward.*` (#61139)

Summary:
Temporary fix for fb-internal tests. This and similar failures are being discussed here:
#60426

Applies the below changes:
 - This may seem counterintuitive because the storage check comes before the tensor check, but if the TensorImpl use count is not enforced, we should also not enforce the storage use count. If an op returns one of its inputs as-is, that input may already be aliased with another tensor, and hence would have a StorageImpl use count greater than one.
 - Also clarify in the description that use_count is not necessarily > 1; the op may, but does not necessarily, return one of its inputs as-is.
 - Allow usage of regex in the skip list

Pull Request resolved: #61139

Reviewed By: malfet, Varal7

Differential Revision: D29564917

Pulled By: soulitzer

fbshipit-source-id: 806b7177117a573dd12f161cc80dcadac892f9d0
facebook-github-bot pushed a commit that referenced this issue Jul 8, 2021
Summary:
Fixes a test failure encountered internally

For context see: #60426

Pull Request resolved: #61386

Reviewed By: malfet

Differential Revision: D29601031

Pulled By: soulitzer

fbshipit-source-id: 3592ca45a01e7bbaa804ab5404338191154f0fbc
facebook-github-bot pushed a commit that referenced this issue Jul 9, 2021
Summary:
Previously we required the tensor use count to be exactly 1. We should actually allow the use count to be zero as well: the use count is zero when an undefined tensor is returned, which is common in backward functions that have multiple outputs.

In this PR I also remove some entries from the skip list that should be covered by this change: they return multiple tensors AND are backward functions. Batch norm is also known to return undefined tensors when `training=False`.

Related issue: #60426

Pull Request resolved: #61414

Reviewed By: albanD

Differential Revision: D29614687

Pulled By: soulitzer

fbshipit-source-id: ab0892aed4bd1346b50b0a9552ffcc3287ac96af
@soulitzer soulitzer changed the title Output of non-inplace, non-CompositeImplicitAutograd op has TensorImpl or StorageImpl use_count != 1 Output of non-inplace, non-CompositeImplicitAutograd op has TensorImpl > 1 or StorageImpl use_count != 1 Jul 12, 2021
@soulitzer
Contributor Author

soulitzer commented Jul 17, 2021

  • For `q_per_channel_scales` and `q_per_channel_zero_points`, the returned scale and zero_point tensors are also saved as fields of the Quantizer. This is mostly OK because quantized tensors don't require grad. We should fix the alias information for the JIT, though.
  • For dequantize_self, the original tensor is returned as-is if it's already a float tensor. This should not be problematic because the backward is not implemented anyway, but we should fix the alias information for the JIT.
  • `lu_unpack` returns a tuple of undefined tensors when unpack_data and unpack_pivots are both false. This is already addressed, so we can just remove it from the list.
  • `channel_shuffle` can cause issues because the output may be aliased with the input, but there is no view_info; see "Ensure channel_shuffle does not returns a view" #61794.
  • `_embedding_bag` and `_embedding_bag_forward_only`: in `_embedding_bag_cpu_impl_out`, the statement `max_indices = bag_size` is executed in certain modes, so two outputs share a TensorImpl (see the sketch after this list). This can be fixed easily, but it's not problematic either because the tensors are integral. (Maybe alias information can be added here as well to indicate that the two outputs are possibly aliased.)
  • _cudnn_rnn is fine because the output with use_count > 1 is only used internally (only so that it could be saved for backward).
  • `_cudnn_rnn_backward` can sometimes return undefined tensors; this has already been addressed.
  • `thnn_conv2d_forward`, `slow_conv3d_forward`
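
A hedged illustration (not the actual `_embedding_bag` code) of how returning one output as another, as in the `max_indices = bag_size` case above, yields a TensorImpl use_count > 1:

```cpp
#include <ATen/ATen.h>
#include <iostream>

int main() {
  at::Tensor bag_size = at::zeros({8}, at::kLong);
  at::Tensor max_indices = bag_size;  // copies the intrusive_ptr: both refer to one TensorImpl

  std::cout << bag_size.use_count() << "\n";           // 2
  std::cout << bag_size.is_same(max_indices) << "\n";  // 1 (same TensorImpl)
  return 0;
}
```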

@albanD
Collaborator

albanD commented Jul 19, 2021

This is OK because quantized tensors don't require grad.

Note that the alias information is pretty important for the JIT as well.
So if the alias information is wrong, we should flag these, as it might still be a problem even when autograd is not involved.

@soulitzer
Contributor Author

Good point.
Looking at `_cudnn_rnn` more closely: it currently takes an optional tensor parameter `weight_buf`, which gets unwrapped into a possibly undefined tensor. If the tensor is defined, a new tensor referencing the same TensorImpl is returned; otherwise, a zero-filled tensor is returned. A sketch of this pattern is below.
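
A hedged sketch of the pattern just described (not the actual `_cudnn_rnn` code; the helper name and signature are hypothetical):

```cpp
#include <ATen/ATen.h>
#include <c10/util/Optional.h>

// Hypothetical helper: unwrap an optional weight buffer, or allocate a fresh one.
at::Tensor unwrap_or_zeros(const c10::optional<at::Tensor>& weight_buf,
                           at::IntArrayRef fallback_size) {
  if (weight_buf.has_value() && weight_buf->defined()) {
    // Returns a new Tensor handle that references the caller's TensorImpl,
    // so that TensorImpl's use_count exceeds 1.
    return *weight_buf;
  }
  // Otherwise return a freshly allocated, zero-filled tensor (use_count == 1).
  return at::zeros(fallback_size);
}
```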

Discussion of next steps:

  • The output doesn't always alias the input and the input is undefined in that case, so not sure if it is possible to mark as a view. It might require some more effort to properly address this
  • For the purposes of JIT it seems fine to update the alias info
  • From the autograd side, there are no concerns with set_history and overwriting grad_fn since the weight_buf input does not have gradient (this also implies that double backward is not possible since weight_buf is an intermediate output saved for backward, so that is not a concern either).

@albanD
Collaborator

albanD commented Jul 20, 2021

so not sure if it is possible to mark as a view

For the non-AutogradExplicit functions, it is OK to be conservative with the alias info. For example, things like .contiguous() or .reshape() always mark the input/output as aliased even though that might not always be true. The same approach should be used here, I think.
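
A hedged illustration of that conservative aliasing, assuming ATen's C++ API: `.reshape()` only returns a view when no copy is needed, yet its alias annotation treats input and output as always aliased.

```cpp
#include <ATen/ATen.h>
#include <iostream>

int main() {
  at::Tensor x = at::arange(6).view({2, 3});

  at::Tensor a = x.reshape({6});      // contiguous input: returned as a view of x
  at::Tensor b = x.t().reshape({6});  // non-contiguous input: materialized as a copy

  std::cout << a.is_alias_of(x) << "\n";  // 1
  std::cout << b.is_alias_of(x) << "\n";  // 0
  return 0;
}
```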
