Output of non-inplace, non-CompositeImplicitAutograd op has TensorImpl use_count > 1 or StorageImpl use_count != 1 #60426
Is there a simple answer to why these fail? Like, do they return some tensors that are expected to be used somewhere else?
… and `.*conv.*_backward.*` (#61139)

Summary: Temporary fix for fb-internal tests. This and similar failures are being discussed in #60426. Applies the changes below:
- If TensorImpl use count is not enforced, also do not enforce storage use count. This may seem counterintuitive because the storage check comes before the tensor check, but if an op returns one of its inputs as-is, that input may already be aliased with another tensor, and hence would have a StorageImpl use count greater than one.
- Also clarify in the description that use_count is not necessarily > 1: an op may, but does not necessarily, return one of its inputs as-is.
- Allow usage of regex in the skip list.

Pull Request resolved: #61139
Reviewed By: malfet, Varal7
Differential Revision: D29564917
Pulled By: soulitzer
fbshipit-source-id: 806b7177117a573dd12f161cc80dcadac892f9d0
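To make the aliasing point concrete, here is a minimal C++ sketch (not from the PR; it only illustrates the condition described above) using the public ATen accessors `Tensor::use_count()` and `Storage::use_count()`. A tensor that is a view of another has a fresh TensorImpl but a shared StorageImpl, so an op returning it as-is returns an output whose storage use count is already greater than one:

```cpp
#include <ATen/ATen.h>
#include <iostream>

int main() {
  at::Tensor base = at::ones({2, 3});
  // `alias` gets its own TensorImpl but shares `base`'s StorageImpl.
  at::Tensor alias = base.view({6});

  std::cout << alias.is_alias_of(base) << "\n";      // expected: 1 (true)
  std::cout << alias.use_count() << "\n";            // expected: 1 (fresh TensorImpl)
  std::cout << alias.storage().use_count() << "\n";  // expected: 2 (shared with `base`)

  // An op that returns `alias` as-is hands back a tensor whose StorageImpl
  // use count is > 1, which is why the storage check cannot be enforced
  // once the TensorImpl check is skipped.
  return 0;
}
```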
Summary: Previously we required the tensor use count to be exactly 1. We should actually allow a use count of zero as well. Use count is zero when an undefined tensor is returned, which is common in backward functions that have multiple outputs. In this PR I also remove some entries from the skip list that should be covered by this change: they return multiple tensors AND are backward functions. Batch norm is also known to return undefined tensors when `training=False`. Related issue: #60426

Pull Request resolved: #61414
Reviewed By: albanD
Differential Revision: D29614687
Pulled By: soulitzer
fbshipit-source-id: ab0892aed4bd1346b50b0a9552ffcc3287ac96af
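For reference, a small sketch of the undefined-tensor case described above (illustrative only): a default-constructed `at::Tensor` is undefined and reports a TensorImpl use count of 0, which is exactly what multi-output backward functions return for gradients masked out by the output mask:

```cpp
#include <ATen/ATen.h>
#include <cassert>

int main() {
  // A default-constructed tensor is "undefined": it does not own a
  // refcounted TensorImpl of its own.
  at::Tensor undef;
  assert(!undef.defined());
  assert(undef.use_count() == 0);  // use count of 0, per the summary above

  // Backward functions with an output mask return such tensors for
  // gradients the caller did not request, so the check must accept
  // use_count == 0 as well as use_count == 1, i.e. use_count <= 1.
  return 0;
}
```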
Note that the alias information is pretty important for the JIT as well.
Good point. Discussion of next steps:
For the non-AutogradExplicit functions, it is OK to be conservative with the alias info. For example, things like `.contiguous()` or `.reshape()` always mark the input/output as aliased even though that might not always be true. The same approach should be used here, I think.
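A short sketch of why the conservative annotation is needed (my illustration, assuming standard ATen semantics): `reshape()` returns a view when the input's strides allow it and a copy otherwise, so whether input and output actually alias is only known at runtime, and the schema must assume they might:

```cpp
#include <ATen/ATen.h>
#include <iostream>

int main() {
  at::Tensor a = at::ones({2, 3});

  // Contiguous input: reshape can return a view, so the output aliases.
  at::Tensor b = a.reshape({6});
  std::cout << b.is_alias_of(a) << "\n";  // expected: 1 (true)

  // Non-contiguous (transposed) input: reshape must copy, so no alias.
  at::Tensor c = a.t().reshape({6});
  std::cout << c.is_alias_of(a) << "\n";  // expected: 0 (false)

  // Because the outcome depends on runtime strides, the alias annotation
  // conservatively marks reshape's output as aliasing its input.
  return 0;
}
```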
Codegen checks are being added in #60286 in `VariableType` to enforce that, for non-inplace, non-CompositeImplicitAutograd ops, the output has TensorImpl with `use_count` of 1, and that non-view ops have output with StorageImpl `use_count` of 1.

Notes:

These functions return tensors with TensorImpl `use_count > 1`:

These 'non-view' functions return tensors with StorageImpl `use_count != 1`:

Functions that return tensors with StorageImpl `use_count != 1` (which may also have TensorImpl `use_count > 1`):

Edit: There were more entries in these lists, but all the convolution backward failures + batch_norm failures are actually due to the use_count being equal to 0 (i.e., the returned tensor is undefined) from the output mask. This should be expected, so in these cases (all cases) we should just relax the test to test for `use_count <= 1`.
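Putting the relaxed invariant together, here is a hypothetical helper mirroring what such a check would assert; the real generated code in `VariableType` differs, and `check_use_counts` is illustrative only:

```cpp
#include <ATen/ATen.h>
#include <stdexcept>

// Illustrative only: the relaxed invariant discussed above, applied to a
// freshly returned output of a non-inplace, non-CompositeImplicitAutograd op.
void check_use_counts(const at::Tensor& out, bool is_view_op) {
  // use_count == 0 covers undefined outputs (masked-out gradients);
  // use_count == 1 is the normal freshly-allocated case.
  if (out.use_count() > 1) {
    throw std::runtime_error("output TensorImpl use_count > 1");
  }
  // Storage is only checked for non-view ops; an undefined tensor has no
  // storage, so it is guarded by defined().
  if (!is_view_op && out.defined() && out.storage().use_count() > 1) {
    throw std::runtime_error("output StorageImpl use_count > 1");
  }
}
```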
cc @ezyang @albanD @zou3519 @gqchen @pearu @nikitaved @soulitzer @lezcano @Varal7