New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC-0010: TensorRef #16
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
we unsafely use reclaim/release to ensure that no refcount bump occurs | ||
on construction/destruction of the object. We verified with Godbolt | ||
that the compiler is able to eliminate the `base_` destructor. This | ||
makes it possible to take out a `const Tensor&`, which makes it easier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as long as the Tensor
is purely internal to TensorRef
, this could be safe since we have control over the invariants and where refcounting occurs. But allowing users to get a const Tensor&
out of it means that the "internal" Tensor object could be copied outside of your control and you have to be much more careful about correctness, especially since Tensor doesn't seem to know if it's borrowed or not. I saw one of your earlier proposals had Tensor know about this, maybe we should do that?
Or, if the const Tensor&
conversion is only there to avoid having to rewrite all ops at once, maybe there's a regex/codemod that would allow us to rewrite them more easily and then not allow TensorRef -> const Tensor& conversions? Or, if we manage to eliminate most but not all call sites, we might be able to make it a TensorRef -> Tensor (by value) conversion and be ok with the refcount if somebody does that. Seems like a less dangerous design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But allowing users to get a const Tensor& out of it means that the "internal" Tensor object could be copied outside of your control and you have to be much more careful about correctness
A copy of const Tensor&
to Tensor
is OK, because this always induces a refcount bump! Most of the bad situations @swolchok was able to come with required you to have a mutable reference/pointer to the internal Tensor, which we just forbid here.
Or, if the const Tensor& conversion is only there to avoid having to rewrite all ops at once, maybe there's a regex/codemod that would allow us to rewrite them more easily and then not allow TensorRef -> const Tensor& conversions?
Yeah. I allude to this in the second alternative proposal, where for Intel ABI reasons, it is much better if the long term API doesn't permit conversion to const Tensor&
(so we can make the class trivial). So ideally we'd get rid of this BC crutch eventually.
* Instead of trying to force the above implementation of `TensorRef` to | ||
be used everywhere, it could instead be a utility class used in | ||
limited situations to improve interoperability with code that expects | ||
a `const Tensor&` when you don't have a `Tensor` available. The true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow why there are two objects - a trivial and a nontrivial one. Can you give some more details?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nontrivial TensorRef class is given in this proposal. The trivial class is:
class TrivialTensorRef {
TensorImpl* impl_;
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I see. I like the trivial one much better from a design point of view ;) But I guess you agree on that point and the non-trivial one only exists for the bc reasons discussed above.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
A variation of this has been implemented in pytorch/pytorch#55685 |
As with the |
This proposal introduces a new class
TensorRef
, to replace allplaces in our codebase where we currently use
const Tensor&
. Thedistinguishing characteristics of this class are:
c10::ArrayRef
orstd::string_view
)const Tensor&
(i.e., it can be introduced incrementally)
Rendered
Signed-off-by: Edward Z. Yang ezyang@fb.com