[Unity][Op] Dynamic Strided Slice #14548
Conversation
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.
Generated by tvm-bot
include/tvm/topi/transform.h
Outdated
for (int i = 0; i < ndim; i++) {
  length = if_then_else(indices[0] == i, data->shape[i], length);
}
return GetLength(begin(indices), end(indices), strides(indices), length);
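For context, the per-axis extent that GetLength ultimately computes can be sketched in plain Python. Here `slice_length` is a hypothetical helper, not the actual TOPI code, and it assumes positive strides:

```python
def slice_length(begin, end, stride):
    """Number of elements produced by [begin:end:stride],
    i.e. ceil((end - begin) / stride), clamped at zero."""
    assert stride > 0, "negative strides not handled in this sketch"
    return max(0, -((begin - end) // stride))
```

For positive strides this matches Python's own slicing, e.g. `slice_length(1, 4, 2)` equals `len(range(1, 4, 2))`.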
Through the internal discussion with @junrushao, we confirmed that this should comply with WIP PR #14278.
Not sure what you meant above. Is the whole point to avoid a repeated compute definition for the shape func like this, i.e., to automatically derive a shape func from the original te compute alone?
Since this PR lets the legalizer define the output shape of dynamic slice and pass it to TOPI, we wanted to make sure this is not a breaking change for PR #14278, as that PR leverages the output-computation logic in TOPI. Since there are already operators, like reshape, for which TOPI takes an output shape defined elsewhere, we checked that this change won't add any extra complication to his PR.
I think the question is whether or not we consider shape funcs a part of the "op definition". This is the case for Relay, and it is one of the reasons adding a new op in Relay is complicated. I don't know what the goal of #14278 is, but the current discussion in this PR suggests shape funcs need not be implemented under topi. They seem more like "implementation details" of the legalizer.
So unless #14278 is aiming to generate an op's legalizer definition, I think this work is completely decoupled from #14278.
One of the main goals of #14278 is to automate the currently repetitive and tedious op-registration process, including legalization, struct_info, etc. Unlike the current flow, which requires us to look at multiple different sites and do the work manually, #14278 will let us put all of that information in a single place and have the parser handle the rest. I believe the legalization function should live there, too.
I think the question is whether or not we consider shape funcs a part of "op definition".
You brought up very good points. I see the shape func as part of the op definition, just like the legalization function for each operator. When #14278 lands, the shape function can be registered together with the rest, and only when necessary, so that we don't complicate op registration unnecessarily.
but the current discussion in this PR suggests shape funcs need not be implemented under topi.
I forgot to mention earlier, but #14278 deduces the output shape of an operator by using the existing shape-computation logic in TOPI to automatically generate the struct_info logic. I'm not sure if this is a hard requirement, though. @junrushao, would it be okay to put the shape computation elsewhere, for example in the legalizer?
The one of the main goals of #14278 is to automate the current repetitive & tedious op registration process including legalization
ok, but you are introducing shape_func_dynamic_strided_slice, which looks like a repetitive definition of the output-shape logic to me. Are you saying that #14278 is going to remove that definition? Otherwise I still don't get what you meant when you said this PR "complies" with #14278.
Probably what's not clear to me is this: is #14278 going to auto-generate shape funcs? Otherwise, I don't understand how #14278 and shape-func definitions will work together, but I can leave that question for after #14278 lands.
// The output shape will depend on the runtime value in begin/end/stride tensors.
// TODO(tvm-team): Extract more compile-time info when those tensors are constants.
return TensorStructInfo(data_sinfo->dtype, n_axis);
I'm a bit concerned about this. It seems we can only express slicing with either "all static" or "all dynamic" axes. But partially static / dynamic slicing is very common in practice (e.g., slicing only along the dynamic "batch" (the number of detected boxes) dimension in object detection models). If the shape func needs to be implemented via topi, I don't see a way to express such a partially-static output shape.
For Relay I added a hacky workaround #8165 for this issue.
Yes, that is a current limitation of this PR, so I left it as a TODO.
Actually, I have a question regarding #8165. Can a te::Tensor contain symbolic variables? Maybe I missed it, but I couldn't find any relevant test cases.
Please correct me if I'm wrong, but based on my current understanding, the data within a te::Tensor will be concrete values rather than symbolic variables. So I cannot see how a partially static output shape would be represented if we go through the F1->F2 route below.
// F1
inline te::Tensor dynamic_strided_slice(const te::Tensor& x, const te::Tensor& begin,
const te::Tensor& end, const te::Tensor& strides, ...)
{
// ...
Array<PrimExpr> begin_expr, end_expr, strides_expr;
for (int64_t i = 0; i < num_dynamic_axes; ++i) {
auto ind = make_const(index_dtype, i);
begin_expr.push_back(begin(ind)); // <- Can `begin(ind)` be symbolic in practice?
end_expr.push_back(end(ind));
strides_expr.push_back(strides(ind));
}
// Call F2
return dynamic_strided_slice(x, begin_expr, end_expr, strides_expr, name, tag);
}
// F2
inline Tensor dynamic_strided_slice(const Tensor& x, const Array<PrimExpr>& begin,
const Array<PrimExpr>& end, const Array<PrimExpr>& strides, ...)
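The F1-vs-F2 distinction under discussion can be illustrated with plain-Python stand-ins (all names below are hypothetical, not TVM APIs): going through F1, every element read back from the begin/end/strides tensors is an opaque runtime value, so no output dim can be proven static; calling F2 directly with constant exprs lets the dims fold to integers.

```python
OPAQUE = object()  # stand-in for a value only known at runtime

def out_dims_via_f1(num_slice_axes):
    # F1 route: begin/end/strides live in tensors, so each element is
    # an opaque runtime read and every output dim is dynamic.
    return [OPAQUE] * num_slice_axes

def out_dims_via_f2(begin, end, strides):
    # F2 route with constant exprs: ceil((end - begin) / stride)
    # folds to a plain integer, i.e. a static dim (positive strides assumed).
    return [-((b - e) // s) for b, e, s in zip(begin, end, strides)]
```

This is only a model of the information loss, of course; in TVM the F1 reads produce PrimExprs whose constness cannot be recovered from the tensor.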
I think te::Tensor can be created from symbolic values, but we cannot extract such values from the tensor.
Interesting. Is there any example code that I can try?
Also, if we can create a te::Tensor with symbolic variables, may I ask why we cannot extract those values? Can't we access the value by using the index as in the above example?
Assuming the "symbolic variable" you mentioned is just another PrimExpr, I expect we can fill a tensor with that value, just like any other PrimExpr. Why not try the "Can begin(ind) be symbolic in practice?" question empirically?
Can't we access the value by using the index in the above example?
What I meant was: as soon as we put a symbolic expression into a TE tensor, we lose all symbolic-specific information. You can index it, but what you get is an opaque value, so we cannot exploit any symbolic information about it.
So an output shape represented by a TE Tensor loses symbolic or constant shape information. That's why I think it is better to directly generate the shape func via TIR.
# 1. Insert shape function
output_shape = bb.normalize(
    bb.call_te(
        topi.shape_func_dynamic_strided_slice,
Could there be an alternative way to implement shape funcs, given that a te tensor cannot express "constant"-ness? See my related comment above as well.
I considered two options, TE and TIR, and chose TE since it is easier to write.
I did not consider the hybrid script since I did not want users to learn a new thing only for the shape function.
Would hybrid script help in this case? I'm open to other suggestions as well.
I was thinking about TVMScript. Why not implement shape func in TVMScript, which should be able to represent partially-static shape as well?
Correct me if I'm wrong, but don't we need an Array<PrimExpr> to represent the partially-static shape? If so, I don't think this is currently supported in TVMScript. AFAIK, we usually carry those forms around within the op attributes.
Also, if we go with TVMScript, do we require users to write the TIR PrimFunc by hand?
If this is true, I don't think this is currently supported in TVMScript
hmm interesting. Don't we create TVMScript with shapes like (n, 16) all the time? I hope the output of a shape func can also be expressed like that.
Also, if we go with TVMScript, do we require users to write the TIR primfunc by hand?
Yes, but writing in TVMScript is only required if a user wants to exploit a "partially-static" shape. I think this is a relatively rare and advanced use case; a te.compute-based definition should always be possible.
And other than dynamic slicing, I cannot think of other ops that may want to implement their shape func in TVMScript. Maybe reshape and other shape-changing ops might apply, but I haven't met a real-world use case.
I agree that "partially-static" cases are common. However, writing TIR manually might make it more challenging for contributors to develop data-dependent ops, so simplifying the shape-function implementation could be beneficial. I think several ops like reshape or max_pool2d could be used as partially-dynamic, such as allowing dynamic or partially-dynamic strides in max_pool, even though I haven't encountered real-world use cases.
@masahi Yeah, I believe that is the shape field in struct info. In this case, we need to pass an Array<PrimExpr> as an argument of call_tir for the shape func. I think a later transition to PrimValue would be helpful here.
Based on our discussion today, we decided to revisit this when #14278 lands. Left a TODO comment.
call.args[0],
call.args[1],
call.args[2],
call.args[3],
Two questions:
- Why does shape_func_dynamic_strided_slice need to be implemented in topi / C++? This seems like a one-off function; why not just create the te.compute here in Python?
- Are we introducing a standard API for shape funcs (like Relay), or is each op going to pick whatever API it sees fit (like this example)? In the latter case I don't see any reason shape funcs need to belong to topi.
Are we introducing a standard API for shape func (like Relay), or is each op going to pick any API as it sees fit (like this example)? In the latter case I don't see any reason shape funcs need to belong to topi.
The latter case.
Great point. I put it on the TOPI side because I viewed TOPI as the registry of TE definitions, where developers can easily check related implementations in a single file. I don't have any strong preference here, so I'm open to hearing what folks prefer.
Since there is no standard API for shape funcs, and there is no "registration" for shape funcs specifically, I think it is better not to introduce a rigid convention, in particular that shape funcs be implemented in topi / C++.
Since they are only used by the legalizer, I view shape funcs in Relax simply as implementation details, one-off functions. Developers should be able to express data-dependent output-shape computation logic in any way, anywhere, including in TVMScript.
I think this is a very good point. We can continue this discussion here: #14548 (comment)
const te::Tensor& end, const te::Tensor& strides,
Array<PrimExpr> output_shape,
std::string name = "T_strided_slice_dynamic",
std::string tag = kInjective) {
Why can't we use the existing definition of dynamic_strided_slice?
tvm/include/tvm/topi/transform.h
Lines 648 to 651 in 4e07a8e
inline Tensor dynamic_strided_slice(const Tensor& x, const Array<PrimExpr>& begin,
                                    const Array<PrimExpr>& end, const Array<PrimExpr>& strides,
                                    std::string name = "T_dynamic_strided_slice",
                                    std::string tag = kInjective) {
i.e., Why can't it create the output shape inside this function?
Sorry, maybe I'm confused about the existing implementation of dynamic_strided_slice and its relation to the strided_slice shape func.
If dynamic_strided_slice can calculate the right output shape from its dynamic input, why do we need a shape func for it at all? And why does this new definition of dynamic_strided_slice need to have its output shape calculated by the shape func ahead of time?
No worries! :)
So, do you have an answer to my original question? Why can't we simply create the output shape by
Array<PrimExpr> out_shape;
for (size_t i = 0; i < num_slice_axes; ++i) {
auto d = indexdiv(end[i] - begin[i], strides[i]);
if (d->IsInstance<tvm::IntImmNode>()) {
// Preserve static dimension if possible
out_shape.push_back(d);
} else {
out_shape.push_back(tvm::tir::Var("dim"));
}
}
?
Sorry if I'm missing something obvious.
My apologies. It seems I misread your comment.
The problem was how to insert the match-cast logic and bind those variables. I couldn't see a way to insert them on the TOPI side, so I did it in the legalizer, where it became explicit and straightforward.
tvm/python/tvm/relax/transform/legalize_ops/index.py
Lines 77 to 88 in 4901e3d
# 2. Convert tensor to shape and match cast with new symbolic vars
# Get shape length
ndim = int(output_shape.struct_info.shape[0])
output_shape = bb.emit(
    Call(
        ExternFunc("vm.builtin.tensor_to_shape"),
        [output_shape],
        sinfo_args=[ShapeStructInfo(ndim=ndim)],
    )
)
output_shape_vars = [tir.Var("s", "int64") for i in range(ndim)]
bb.match_cast(output_shape, ShapeStructInfo(output_shape_vars))
This, again, sounds odd to me as a legalizer vs. topi distinction. Since dynamic_strided_slice and the shape func are only meant to be used by the legalizer, why not just define them here? There is no need to be concerned about the "TOPI side".
Based on our discussion today, I moved the shape func to legalizer.
@sunggg Can you send a follow-up PR? cc @tqchen
I also noticed this and disabled the tests temporarily in my unity merge, @sunggg please follow up.
Hi @sunggg, I noticed this PR failed the tests of
You can reproduce it with the latest unity branch.
Could you please fix it if you have time? PS: The other failing tests in the same file are not related to this PR, and I've fixed them in the branch
This PR brings dynamic strided slice, which will be the first stab at data-dependent ops in Unity.
Overview
It consists of three parts (Op, TOPI, Legalization) and their test cases.
Data-dependent ops, like dynamic strided slice, can be tricky because we cannot automatically deduce their output shapes.
In such cases, we cannot lower them, since the TE infra requires a concrete output shape, defined at least with symbolic variables. Therefore, manual shape-function registration is inevitable for those operators, to let the compiler know how to compute their output shapes.
With this PR, users can register the shape func in TOPI and insert it with the match-cast mechanism during legalization.
It's worth noting that for data-dependent ops, the current TOPI creates symbolic variables whenever the shape value of a certain dimension is unknown, and a later mechanism handles the binding and so on (see link).
With this PR, however, the legalizer is in charge of creating symbolic variables and binding them explicitly, and then passes the output shape to TOPI so that TOPI can simply use it to define its compute.
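The two-step flow described above (a shape func computes the output shape at runtime, the legalizer binds it, and the compute then consumes a known output shape) can be mimicked in NumPy. `shape_func` and `dynamic_strided_slice` below are hypothetical stand-ins for the TOPI/legalizer pieces, and only positive strides are handled:

```python
import numpy as np

def shape_func(data, begin, end, strides):
    # Step 1: derive one output extent per axis from the runtime
    # begin/end/strides values, clamping end to the data extent.
    return tuple(max(0, -((b - min(e, d)) // s))
                 for d, b, e, s in zip(data.shape, begin, end, strides))

def dynamic_strided_slice(data, begin, end, strides, out_shape):
    # Step 2: with the output shape known, the slice itself is a
    # plain injective gather, like a te.compute definition.
    out = np.empty(out_shape, dtype=data.dtype)
    for idx in np.ndindex(*out_shape):
        src = tuple(b + i * s for i, b, s in zip(idx, begin, strides))
        out[idx] = data[src]
    return out
```

In the real legalizer, step 1's result is a shape tensor converted via `vm.builtin.tensor_to_shape` and match-cast to fresh symbolic variables before step 2 runs.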
Through the internal discussion with @junrushao, we confirmed that this should comply with WIP PR #14278.
Also, since this requires a change in the existing topi::dynamic_strided_slice (see link), this PR creates a relax namespace and implements its own version.

Notes
The existing relax.strided_slice op covers non-data-dependent scenarios. It is still useful since it can cover the current limitation of the relax.dynamic_strided_slice op: its shape analysis may not be informative enough for its users, like tensor_to_shape, which requires a known integer shape at compile time. We will revisit their unification when we have a better understanding in the future.
This PR migrates topi::dynamic_strided_slice to relax standards without fixing its current limitation. Therefore, it expects preprocessing of the begin/end/strides tensors to make them the same length as the dimension of the data tensor. PR "[Unity][Op] Introduce ScatterElement op" #14493 would be helpful for such pre-processing.

Discussion
relax.strided_slice can already cover shape dynamism for the data tensor. Any suggestion would be very helpful.

cc. @yongwww @jwfromm @psrivas2 @slyubomirsky @junrushao @tqchen @MasterJH5574 @masahi