-
Notifications
You must be signed in to change notification settings - Fork 618
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix layout broadcasting arithm ops #4951
Fix layout broadcasting arithm ops #4951
Conversation
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
!build |
CI MESSAGE: [9043057]: BUILD STARTED |
CI MESSAGE: [9043057]: BUILD FAILED |
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
!build |
CI MESSAGE: [9048568]: BUILD STARTED |
CI MESSAGE: [9048568]: BUILD PASSED |
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
* @brief Checks if the child is a (possibly improper) suffix of the parent. | ||
* The child must not be longer than the parent. | ||
*/ | ||
inline bool HasSuffix(const TensorLayout &parent, const TensorLayout &child) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More common name: EndsWith
inline bool HasSuffix(const TensorLayout &parent, const TensorLayout &child) { | |
inline bool EndsWith(const TensorLayout &parent, const TensorLayout &child) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, nice name.
* The child must not be longer than the parent. | ||
*/ | ||
inline bool HasSuffix(const TensorLayout &parent, const TensorLayout &child) { | ||
assert(parent.size() >= child.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary and actually harmful. Obviously a longer string is not a suffix of a shorter one.
assert(parent.size() >= child.size()); | |
if (parent.size() < child.size()) | |
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
!build |
CI MESSAGE: [9058724]: BUILD STARTED |
if (!types_layout_inferred_) { | ||
result_type_id_ = PropagateTypes<Backend>(*expr_, ws); | ||
result_layout_ = GetCommonLayout<Backend>(*expr_, ws); | ||
// sample_ndim is assumed to be constant between iterations. | ||
if (result_layout_.size() != result_shape_.sample_dim()) { | ||
// If the largest dimentional tensors do not have layout set but some |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// If the largest dimentional tensors do not have layout set but some | |
// If the tensors with the most dimensions do not have a layout set but some |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -134,7 +134,7 @@ TEST(ArithmeticOpsTest, TreePropagationLayoutError) { | |||
ptr->Resize({{1}, {2}}, DALI_INT32); | |||
} | |||
in[0]->SetLayout(TensorLayout()); | |||
in[1]->SetLayout(TensorLayout("HW")); | |||
in[1]->SetLayout(TensorLayout("DH")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we should test both cases - the old one should not throw, but instead extend the layout to "DHW"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have tests for that in python though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the case.
/** | ||
* @brief Checks if the child is a (possibly improper) suffix of the parent. | ||
*/ | ||
inline bool EndsWith(const TensorLayout &parent, const TensorLayout &child) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick:
parent
and child
don't capture the relation of the arguments well.
inline bool EndsWith(const TensorLayout &parent, const TensorLayout &child) { | |
inline bool EndsWith(const TensorLayout &what, const TensorLayout &with) { |
or
inline bool EndsWith(const TensorLayout &parent, const TensorLayout &child) { | |
inline bool EndsWith(const TensorLayout &whole, const TensorLayout &suffix) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
!build |
CI MESSAGE: [9059365]: BUILD STARTED |
CI MESSAGE: [9059365]: BUILD PASSED |
Fix layout broadcasting in arithm ops * Allow layouts of different lenghts as long as they share common suffix * Use empty layout if the inferred one is too short --------- Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Category:
Bug fix (non-breaking change which fixes an issue)
Description:
Without this change, it is possible to produce tensors whose layout does not match their dimensionality.
The arithm ops layout propagation works by picking just any non-empty layout and making sure that any other operands have either empty or equal layout. This leads to two issues:
a
of shape (1, 2, 3, 4) and empty layout, andb
of shape (4,) and layout "C". The output ends up to have 4dims and layout "C".The fix is to 1. check if the picked layout is not too short and if so, discard it, 2. allow for different layouts as long as one if suffix of the other.
Additional information:
It would be nice to follow-up the layout propagation with more involved mechanism that would:
Affected modules and functionalities:
Key points relevant for the review:
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: DALI-3556