
Output of non-inplace, non-CompositeImplicitAutograd op has TensorImpl > 1 or StorageImpl use_count != 1 #60426

Open
soulitzer opened this issue Jun 22, 2021 · 5 comments
Labels
module: autograd Related to torch.autograd, and the autograd engine in general triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@soulitzer
Contributor

soulitzer commented Jun 22, 2021

Codegen checks are being added in #60286 to VariableType to enforce that, for non-inplace, non-CompositeImplicitAutograd ops, the output has a TensorImpl use_count of 1, and that non-view ops have outputs with a StorageImpl use_count of 1. A hedged sketch of what such a check might look like follows the notes below.

Notes:

  • StorageImpl tests are ordered before TensorImpl tests, so a StorageImpl failure may hide a TensorImpl failure
  • If TensorImpl use_count > 1, we can always construct a test that would cause the StorageImpl check to fail as well (though such a test may not actually exist in the real world).
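
A minimal sketch of what such a per-output check could look like, assuming ATen's C++ `Tensor::use_count()` and `Storage::use_count()` accessors; the names and exact structure here are illustrative, not the code that the VariableType codegen actually emits:

```cpp
#include <ATen/ATen.h>
#include <c10/util/Exception.h>

// Hypothetical helper mirroring the described checks (not the generated code).
void check_output_use_counts(const at::Tensor& result, bool is_view_op) {
  // The StorageImpl check runs first (as noted above), so a storage failure
  // can hide a TensorImpl failure.
  if (!is_view_op && result.has_storage()) {
    TORCH_INTERNAL_ASSERT(result.storage().use_count() == 1,
        "output StorageImpl has use_count != 1");
  }
  // A freshly created output should be the sole owner of its TensorImpl.
  TORCH_INTERNAL_ASSERT(result.use_count() == 1,
      "output TensorImpl has use_count > 1");
}
```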

These functions return tensors with TensorImpl use_count > 1:

`_embedding_bag`, `_embedding_bag_forward_only`, `lu_unpack`, `_cudnn_rnn_backward`, `q_per_channel_scales`, `q_per_channel_zero_points`

These non-view functions return tensors with StorageImpl use_count != 1:

`thnn_conv2d_forward`, `slow_conv3d_forward`, `channel_shuffle`

Functions that return tensors with StorageImpl use_count != 1 (which may also have TensorImpl use_count > 1):

 `dequantize_self`, `cudnn_rnn`

Edit: There were more entries in these lists, but the convolution backward and batch_norm failures are actually due to the use_count being 0, i.e., the returned tensor is undefined because of the output mask. This is expected, so in these cases (in fact, in all cases) we should just relax the check to use_count <= 1.
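
As a small, self-contained illustration of that use_count-0 case (this uses ATen's public C++ API and is not the generated check itself):

```cpp
#include <ATen/ATen.h>
#include <iostream>

int main() {
  at::Tensor undefined;                  // what a masked-out backward output returns
  at::Tensor fresh = at::zeros({2, 2});  // a normally allocated output

  std::cout << undefined.defined() << " " << undefined.use_count() << "\n";  // 0 0
  std::cout << fresh.defined() << " " << fresh.use_count() << "\n";          // 1 1
  // Hence the relaxed check: use_count() <= 1 accepts both cases.
  return 0;
}
```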

cc @ezyang @albanD @zou3519 @gqchen @pearu @nikitaved @soulitzer @lezcano @Varal7

@albanD
Collaborator

albanD commented Jun 22, 2021

Is there a simple answer to why these fail? Like they return some Tensors that are expected to be used somewhere else?

@VitalyFedyunin VitalyFedyunin added module: autograd Related to torch.autograd, and the autograd engine in general triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module labels Jun 24, 2021
@soulitzer soulitzer self-assigned this Jul 1, 2021
@soulitzer soulitzer changed the title Output of non-inplace, non-CompositeImplicitAutograd op has TensorImpl use_count > 1 Output of non-inplace, non-CompositeImplicitAutograd op has TensorImpl or StorageImpl use_count != 1 Jul 6, 2021
facebook-github-bot pushed a commit that referenced this issue Jul 7, 2021
… and `.*conv.*_backward.*` (#61139)

Summary:
Temporary fix for fb-internal tests. This and similar failures are being discussed here:
#60426

Applies the below changes:
 - This may seem counterintuitive because the storage check comes before the tensor check, but if the TensorImpl use count is not enforced, we should also not enforce the storage use count. If an op returns one of its inputs as-is, that input may already be aliased with another tensor, and hence would have a StorageImpl use count greater than one.
 - Also clarify in the description that use_count is not necessarily > 1; the op may, but does not necessarily, return one of its inputs as-is.
 - Allow usage of regex in the skip list

Pull Request resolved: #61139

Reviewed By: malfet, Varal7

Differential Revision: D29564917

Pulled By: soulitzer

fbshipit-source-id: 806b7177117a573dd12f161cc80dcadac892f9d0
facebook-github-bot pushed a commit that referenced this issue Jul 8, 2021
Summary:
Fixes a test failure encountered internally

For context see: #60426

Pull Request resolved: #61386

Reviewed By: malfet

Differential Revision: D29601031

Pulled By: soulitzer

fbshipit-source-id: 3592ca45a01e7bbaa804ab5404338191154f0fbc
facebook-github-bot pushed a commit that referenced this issue Jul 9, 2021
Summary:
Previously we required the tensor use count to be exactly 1. We should actually allow the use count to be zero as well: the use count is zero when an undefined tensor is returned, which is common in backward functions that have multiple outputs.

In this PR I also remove some entries from the skip list that should be covered by this change: they return multiple tensors AND are backward functions. Batch norm is also known to return undefined tensors when `training=False`.

Related issue: #60426

Pull Request resolved: #61414

Reviewed By: albanD

Differential Revision: D29614687

Pulled By: soulitzer

fbshipit-source-id: ab0892aed4bd1346b50b0a9552ffcc3287ac96af
@soulitzer soulitzer changed the title Output of non-inplace, non-CompositeImplicitAutograd op has TensorImpl or StorageImpl use_count != 1 Output of non-inplace, non-CompositeImplicitAutograd op has TensorImpl > 1 or StorageImpl use_count != 1 Jul 12, 2021
@soulitzer
Contributor Author

soulitzer commented Jul 17, 2021

  • For `q_per_channel_scales` and `q_per_channel_zero_points`, the returned scale and zero_point tensors are also saved as fields of the Quantizer. This is mostly OK because quantized tensors don't require grad. We should fix the alias information for the JIT, though.
  • For dequantize_self, the original tensor is returned as-is if it's already a float tensor. This should not be problematic because the backward is not implemented anyway, but we should fix the alias information for the JIT.
  • `lu_unpack` returns a tuple of undefined tensors when unpack_data and unpack_pivots are both false. This is already addressed, so we can just remove it from the list.
  • `channel_shuffle` can cause issues because the output may be aliased with the input, but there is no view_info; see "Ensure channel_shuffle does not returns a view" #61794.
  • `_embedding_bag` and `_embedding_bag_forward_only`: in `_embedding_bag_cpu_impl_out`, the statement `max_indices = bag_size` is executed in certain modes, so two outputs share a TensorImpl (see the sketch after this list). This can be fixed easily, but it's not problematic either because the tensors are integral. (Maybe alias information can be added here as well to indicate that the two outputs are possibly aliased.)
  • _cudnn_rnn is fine because the output with use_count > 1 is only used internally (only so that it could be saved for backward).
  • `_cudnn_rnn_backward` can sometimes return undefined tensors; this has already been addressed.
  • `thnn_conv2d_forward`, `slow_conv3d_forward`
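
A hedged illustration (not the actual `_embedding_bag` code) of how returning one output as another, as in the `max_indices = bag_size` case above, yields a TensorImpl use_count > 1:

```cpp
#include <ATen/ATen.h>
#include <iostream>

int main() {
  at::Tensor bag_size = at::zeros({8}, at::kLong);
  at::Tensor max_indices = bag_size;  // copies the intrusive_ptr: both refer to one TensorImpl

  std::cout << bag_size.use_count() << "\n";           // 2
  std::cout << bag_size.is_same(max_indices) << "\n";  // 1 (same TensorImpl)
  return 0;
}
```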

@albanD
Collaborator

albanD commented Jul 19, 2021

This is OK because quantized tensors don't require grad.

Note that the alias information is pretty important for the JIT as well.
So if the alias information is wrong, we should flag these, as it might still be a problem even when autograd is not involved.

@soulitzer
Contributor Author

Good point.
Looking at `_cudnn_rnn` more closely: it currently takes an optional tensor parameter `weight_buf`, which gets unwrapped into a possibly undefined tensor. If the tensor is defined, a new tensor referencing the same TensorImpl is returned; otherwise, a zero-filled tensor is returned. A sketch of this pattern is below.
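
A hedged sketch of the pattern just described (not the actual `_cudnn_rnn` code; the helper name and signature are hypothetical):

```cpp
#include <ATen/ATen.h>
#include <c10/util/Optional.h>

// Hypothetical helper: unwrap an optional weight buffer, or allocate a fresh one.
at::Tensor unwrap_or_zeros(const c10::optional<at::Tensor>& weight_buf,
                           at::IntArrayRef fallback_size) {
  if (weight_buf.has_value() && weight_buf->defined()) {
    // Returns a new Tensor handle that references the caller's TensorImpl,
    // so that TensorImpl's use_count exceeds 1.
    return *weight_buf;
  }
  // Otherwise return a freshly allocated, zero-filled tensor (use_count == 1).
  return at::zeros(fallback_size);
}
```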

Discussion of next steps:

  • The output doesn't always alias the input and the input is undefined in that case, so not sure if it is possible to mark as a view. It might require some more effort to properly address this
  • For the purposes of JIT it seems fine to update the alias info
  • From the autograd side, there are no concerns with set_history and overwriting grad_fn since the weight_buf input does not have gradient (this also implies that double backward is not possible since weight_buf is an intermediate output saved for backward, so that is not a concern either).

@albanD
Collaborator

albanD commented Jul 20, 2021

so not sure if it is possible to mark as a view

For the non-AutogradExplicit functions, it is OK to be conservative with the alias info. For example, things like .contiguous() or .reshape() always mark the input/output as aliased even though that might not always be true. The same approach should be used here, I think.
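
A hedged illustration of that conservative aliasing, assuming ATen's C++ API: `.reshape()` only returns a view when no copy is needed, yet its alias annotation treats input and output as always aliased.

```cpp
#include <ATen/ATen.h>
#include <iostream>

int main() {
  at::Tensor x = at::arange(6).view({2, 3});

  at::Tensor a = x.reshape({6});      // contiguous input: returned as a view of x
  at::Tensor b = x.t().reshape({6});  // non-contiguous input: materialized as a copy

  std::cout << a.is_alias_of(x) << "\n";  // 1
  std::cout << b.is_alias_of(x) << "\n";  // 0
  return 0;
}
```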
