New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[te] Fix bugs with shift operators #49271
Conversation
Two things: 1. These throw exceptions in their constructor, which causes a segfault (*), so move the exceptions to ::make. 2. They technically support FP types but the rules are complicated so let's not bother. (*) The reason for the segfault: all Exprs including these inherit from KernelScopedObject, whose constructor adds the object to a list for destruction at the end of the containing KernelArena's lifetime. But if the derived-class constructor throws, the object is deleted even though it's still in the KernelArena's list. So when the KernelArena is itself deleted, it double-frees the pointer and dies. I've also fixed And, Or, and Xor in this diff. Differential Revision: [D25512052](https://our.internmc.facebook.com/intern/diff/D25512052/) [ghstack-poisoned]
Two things: 1. These throw exceptions in their constructor, which causes a segfault (*), so move the exceptions to ::make. 2. They technically support FP types but the rules are complicated so let's not bother. (*) The reason for the segfault: all Exprs including these inherit from KernelScopedObject, whose constructor adds the object to a list for destruction at the end of the containing KernelArena's lifetime. But if the derived-class constructor throws, the object is deleted even though it's still in the KernelArena's list. So when the KernelArena is itself deleted, it double-frees the pointer and dies. I've also fixed And, Or, and Xor in this diff. Differential Revision: [D25512052](https://our.internmc.facebook.com/intern/diff/D25512052/) ghstack-source-id: 118453866 Pull Request resolved: #49271
💊 CI failures summary and remediationsAs of commit a7bd0b6 (more details on the Dr. CI page):
🕵️ 3 new failures recognized by patternsThe following CI failures do not appear to be due to upstream breakages: pytorch_linux_xenial_py3_6_gcc5_4_build (1/3)Step: "(Optional) Merge target branch" (full log | diagnosis details | 🔁 rerun)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (lhs->dtype() != rhs->dtype()) { | ||
throw malformed_input("bad dtype in And"); | ||
if (lhs.dtype() != rhs.dtype()) { | ||
throw malformed_input("lhs/rhs dtype mismatch"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of curiosity: these throws used to be in a c-tor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, throwing in an Expr
ctor is bad news because of the way KernelScopedObject
and KernelArena
work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
if (lhs->dtype() != rhs->dtype()) { | ||
throw malformed_input("bad dtype in And"); | ||
if (lhs.dtype() != rhs.dtype()) { | ||
throw malformed_input("lhs/rhs dtype mismatch"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to double check, we can't do `(uint64_t)42 << (uint8_t)5 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the TE dsl requires operand types to match, and when coming from pytorch we enforce that at the kernel
translation layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
Two things: 1. These throw exceptions in their constructor, which causes a segfault (*), so move the exceptions to ::make. 2. They technically support FP types but the rules are complicated so let's not bother. (*) The reason for the segfault: all Exprs including these inherit from KernelScopedObject, whose constructor adds the object to a list for destruction at the end of the containing KernelArena's lifetime. But if the derived-class constructor throws, the object is deleted even though it's still in the KernelArena's list. So when the KernelArena is itself deleted, it double-frees the pointer and dies. I've also fixed And, Or, and Xor in this diff. Differential Revision: [D25512052](https://our.internmc.facebook.com/intern/diff/D25512052/) [ghstack-poisoned]
Pull Request resolved: #49271 Two things: 1. These throw exceptions in their constructor, which causes a segfault (*), so move the exceptions to ::make. 2. They technically support FP types but the rules are complicated so let's not bother. (*) The reason for the segfault: all Exprs including these inherit from KernelScopedObject, whose constructor adds the object to a list for destruction at the end of the containing KernelArena's lifetime. But if the derived-class constructor throws, the object is deleted even though it's still in the KernelArena's list. So when the KernelArena is itself deleted, it double-frees the pointer and dies. I've also fixed And, Or, and Xor in this diff. ghstack-source-id: 118467480 Differential Revision: [D25512052](https://our.internmc.facebook.com/intern/diff/D25512052/)
throw malformed_input("bad dtype in Lshift"); | ||
} | ||
} | ||
: BitwiseOpNode(lhs, rhs, IRNodeType::kLshift) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where can I find the definition of whatever operation that maps to kLshift
? Intuitively I thought shift operators would take a tensor as 1st operand and an integer as 2nd operand (e.g., tensorB = tensorA << 1
). Or does it really support sth like tensorB = tensorA << tensorC
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
native_functions.yaml is a good source - it lists all (or majority?) of the ops that we have in pytorch.
Another useful list is what ops we accept from the fuser: https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/passes/tensorexpr_fuser.cpp#L137-L140
And one more useful place is what we know how to lower to TE:
https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/tensorexpr/kernel.cpp#L879
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ZolotukhinM lshift
is defined in native_functions.yaml
. And it supports the 2nd operand to be of tensor type or scalar type.
- func: __ilshift__.Scalar(Tensor(a!) self, Scalar other) -> Tensor(a!)
use_c10_dispatcher: full
variants: method
dispatch:
CPU, CUDA: __ilshift__
- func: __ilshift__.Tensor(Tensor(a!) self, Tensor other) -> Tensor(a!)
use_c10_dispatcher: full
variants: method
dispatch:
CPU, CUDA: __ilshift__
In the constructor of kLshift
, we assume both operands to be tensor types. What happens if the 2nd operand is a scalar? Is it promoted to tensor types somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Lshift
constructor actually just takes Expr
s, which can either be a scalar or a tensor element. For the scalar case we'd generate something like lhs(i) << scalar
, for the tensor case we'd have lhs(i) << rhs(i)
. (where i
is standing in for whatever indexing expression we have for the tensor).
In building the TE from a jit::Graph (kernel.cpp
) we have a lookup function called tensorOrConstant
that will lookup a jit::Value
in a map and return the appropriate Expr
, whether it's a tensor or a constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, also, I don't think we handle non-constant scalars in the TE codegen. I don't remember why but it wouldn't surprise me if there are a ton of corner cases since scalars are coming from the python type system.
Two things: 1. These throw exceptions in their constructor, which causes a segfault (*), so move the exceptions to ::make. 2. They technically support FP types but the rules are complicated so let's not bother. (*) The reason for the segfault: all Exprs including these inherit from KernelScopedObject, whose constructor adds the object to a list for destruction at the end of the containing KernelArena's lifetime. But if the derived-class constructor throws, the object is deleted even though it's still in the KernelArena's list. So when the KernelArena is itself deleted, it double-frees the pointer and dies. I've also fixed And, Or, and Xor in this diff. Differential Revision: [D25512052](https://our.internmc.facebook.com/intern/diff/D25512052/) [ghstack-poisoned]
Two things: 1. These throw exceptions in their constructor, which causes a segfault (*), so move the exceptions to ::make. 2. They technically support FP types but the rules are complicated so let's not bother. (*) The reason for the segfault: all Exprs including these inherit from KernelScopedObject, whose constructor adds the object to a list for destruction at the end of the containing KernelArena's lifetime. But if the derived-class constructor throws, the object is deleted even though it's still in the KernelArena's list. So when the KernelArena is itself deleted, it double-frees the pointer and dies. I've also fixed And, Or, and Xor in this diff. Differential Revision: [D25512052](https://our.internmc.facebook.com/intern/diff/D25512052/) [ghstack-poisoned]
Pull Request resolved: #49271 Two things: 1. These throw exceptions in their constructor, which causes a segfault (*), so move the exceptions to ::make. 2. They technically support FP types but the rules are complicated so let's not bother. (*) The reason for the segfault: all Exprs including these inherit from KernelScopedObject, whose constructor adds the object to a list for destruction at the end of the containing KernelArena's lifetime. But if the derived-class constructor throws, the object is deleted even though it's still in the KernelArena's list. So when the KernelArena is itself deleted, it double-frees the pointer and dies. I've also fixed And, Or, and Xor in this diff. ghstack-source-id: 118556876 Differential Revision: [D25512052](https://our.internmc.facebook.com/intern/diff/D25512052/)
Two things: 1. These throw exceptions in their constructor, which causes a segfault (*), so move the exceptions to ::make. 2. They technically support FP types but the rules are complicated so let's not bother. (*) The reason for the segfault: all Exprs including these inherit from KernelScopedObject, whose constructor adds the object to a list for destruction at the end of the containing KernelArena's lifetime. But if the derived-class constructor throws, the object is deleted even though it's still in the KernelArena's list. So when the KernelArena is itself deleted, it double-frees the pointer and dies. I've also fixed And, Or, and Xor in this diff. Differential Revision: [D25512052](https://our.internmc.facebook.com/intern/diff/D25512052/) [ghstack-poisoned]
Two things: 1. These throw exceptions in their constructor, which causes a segfault (*), so move the exceptions to ::make. 2. They technically support FP types but the rules are complicated so let's not bother. (*) The reason for the segfault: all Exprs including these inherit from KernelScopedObject, whose constructor adds the object to a list for destruction at the end of the containing KernelArena's lifetime. But if the derived-class constructor throws, the object is deleted even though it's still in the KernelArena's list. So when the KernelArena is itself deleted, it double-frees the pointer and dies. I've also fixed And, Or, and Xor in this diff. Differential Revision: [D25512052](https://our.internmc.facebook.com/intern/diff/D25512052/) [ghstack-poisoned]
Pull Request resolved: #49271 Two things: 1. These throw exceptions in their constructor, which causes a segfault (*), so move the exceptions to ::make. 2. They technically support FP types but the rules are complicated so let's not bother. (*) The reason for the segfault: all Exprs including these inherit from KernelScopedObject, whose constructor adds the object to a list for destruction at the end of the containing KernelArena's lifetime. But if the derived-class constructor throws, the object is deleted even though it's still in the KernelArena's list. So when the KernelArena is itself deleted, it double-frees the pointer and dies. I've also fixed And, Or, and Xor in this diff. ghstack-source-id: 118594998 Differential Revision: [D25512052](https://our.internmc.facebook.com/intern/diff/D25512052/)
Summary: Pull Request resolved: pytorch#49271 Two things: 1. These throw exceptions in their constructor, which causes a segfault (*), so move the exceptions to ::make. 2. They technically support FP types but the rules are complicated so let's not bother. (*) The reason for the segfault: all Exprs including these inherit from KernelScopedObject, whose constructor adds the object to a list for destruction at the end of the containing KernelArena's lifetime. But if the derived-class constructor throws, the object is deleted even though it's still in the KernelArena's list. So when the KernelArena is itself deleted, it double-frees the pointer and dies. I've also fixed And, Or, and Xor in this diff. ghstack-source-id: 118594998 Test Plan: `buck test //caffe2/test:jit` Differential Revision: D25512052 fbshipit-source-id: f3ca16f208c427cd3d740e8971302d8d504240fb
This pull request has been merged in f4e15c4. |
Summary: Pull Request resolved: pytorch#49396 Pull Request resolved: pytorch#49271 Two things: 1. These throw exceptions in their constructor, which causes a segfault (*), so move the exceptions to ::make. 2. They technically support FP types but the rules are complicated so let's not bother. (*) The reason for the segfault: all Exprs including these inherit from KernelScopedObject, whose constructor adds the object to a list for destruction at the end of the containing KernelArena's lifetime. But if the derived-class constructor throws, the object is deleted even though it's still in the KernelArena's list. So when the KernelArena is itself deleted, it double-frees the pointer and dies. I've also fixed And, Or, and Xor in this diff. ghstack-source-id: 118594998 Test Plan: `buck test //caffe2/test:jit` Reviewed By: bwasti Differential Revision: D25512052 fbshipit-source-id: 42670b3be0cc1600dc5cda6811f7f270a2c88bba
Stack from ghstack:
Two things:
move the exceptions to ::make.
bother.
(*) The reason for the segfault: all Exprs including these inherit from
KernelScopedObject, whose constructor adds the object to a list for destruction
at the end of the containing KernelArena's lifetime. But if the derived-class
constructor throws, the object is deleted even though it's still in the
KernelArena's list. So when the KernelArena is itself deleted, it double-frees
the pointer and dies. I've also fixed And, Or, and Xor in this diff.
Differential Revision: D25512052