
[Unity][Op] Dynamic Strided Slice #14548

Merged (8 commits into apache:unity, Apr 12, 2023)
Conversation

sunggg (Contributor) commented Apr 9, 2023

This PR brings dynamic strided slice, which will be the first stab at data-dependent ops in Unity.

Overview

It consists of three parts (Op, TOPI, Legalization) and their test cases.

Data-dependent ops, like dynamic strided slice, can be tricky because we cannot automatically deduce their output shape.
In such cases, we cannot lower them, since the TE infrastructure requires a concrete output shape, defined at least in terms of symbolic variables. Therefore, manual shape function registration is inevitable for these operators, to let the compiler know how to compute their output shapes.

With this PR, users can register the shape func in TOPI and insert it via the match cast mechanism during legalization.
It's worth noting that for data-dependent ops, the current TOPI creates symbolic variables whenever the shape value of a certain dimension is unknown, and a later mechanism then handles the binding and so on. (see link)
With this PR, however, the legalizer is in charge of creating the symbolic variables and binding them explicitly, and then passes the output shape to TOPI so that TOPI can simply use it to define its compute.

@register_legalize("relax.dynamic_strided_slice")
def _dynamic_strided_slice(bb: BlockBuilder, call: Call) -> Expr:
    # 1. Insert shape function
    output_shape = bb.normalize(
        bb.call_te(
            topi.shape_func_dynamic_strided_slice,
            call.args[0],
            call.args[1],
            call.args[2],
            call.args[3],
        )
    )
    # 2. Convert tensor to shape and match cast with new symbolic vars
    # Get shape length
    ndim = int(output_shape.struct_info.shape[0])
    output_shape = bb.emit(
        Call(
            ExternFunc("vm.builtin.tensor_to_shape"),
            [output_shape],
            sinfo_args=[ShapeStructInfo(ndim=ndim)],
        )
    )
    output_shape_vars = [tir.Var("s", "int64") for i in range(ndim)]
    bb.match_cast(output_shape, ShapeStructInfo(output_shape_vars))

    # 3. Pass the output shape vars to TOPI
    return bb.call_te(
        topi.dynamic_strided_slice,
        call.args[0],
        call.args[1],
        call.args[2],
        call.args[3],
        output_shape=output_shape_vars,
    )
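For intuition, the three steps above can be mimicked with concrete integers in plain Python (an illustrative sketch only; `legalize_flow` and the `s{i}` names are hypothetical stand-ins, not TVM API, and positive strides are assumed):

```python
def legalize_flow(begin, end, strides):
    """Emulate, with concrete ints, what the legalizer does symbolically."""
    # Step 1 ("shape func"): per-axis output length, ceil((end - begin) / stride)
    shape_tensor = [(e - b + s - 1) // s for b, e, s in zip(begin, end, strides)]
    # Step 2 (tensor_to_shape + match_cast): bind each runtime value to a
    # fresh named variable; the real code binds tir.Var("s", "int64") vars
    return {f"s{i}": v for i, v in enumerate(shape_tensor)}
    # Step 3 would hand these bound vars to the TOPI compute as output_shape
```

For example, `legalize_flow([1, 0], [7, 9], [2, 3])` yields the bindings `{"s0": 3, "s1": 3}`.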

Through internal discussion with @junrushao, we confirmed that this complies with the WIP PR #14278.
Also, since this requires changes to the existing topi::dynamic_strided_slice (see link), this PR creates a relax namespace and implements its own version.

Notes

  • Currently, in relax, we already have a relax.strided_slice op that covers non-data-dependent scenarios. It is still useful since it covers a current limitation of the relax.dynamic_strided_slice op: its shape analysis may not be informative enough for users like tensor_to_shape, which requires integer shapes known at compile time.
    We will revisit their unification when we have a better understanding in the future.
  • This PR adapts Relay's topi::dynamic_strided_slice to relax standards without fixing its current limitations. Therefore, it expects preprocessing of the begin/end/strides tensors so that their length equals the rank of the data tensor. PR [Unity][Op] introduce ScatterElement op #14493 would be helpful for such preprocessing.
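The preprocessing mentioned in the last bullet can be sketched as follows (a hypothetical helper, not the actual #14493 API; it pads begin/end/strides with full-slice defaults for the trailing axes):

```python
def pad_slice_args(begin, end, strides, data_shape):
    # Hypothetical sketch: extend begin/end/strides until their length
    # matches the rank of the data tensor, slicing untouched axes fully.
    rank = len(data_shape)
    b = list(begin) + [0] * (rank - len(begin))
    e = list(end) + [data_shape[i] for i in range(len(end), rank)]
    s = list(strides) + [1] * (rank - len(strides))
    return b, e, s
```

For a rank-3 tensor of shape (10, 20, 30), padding a single-axis slice gives begin [1, 0, 0], end [4, 20, 30], strides [1, 1, 1].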

Discussion

  • Currently, I named it "dynamic" strided slice only for simplicity, following tradition. Strictly speaking, though, data-dependent strided slice is more accurate, since relax.strided_slice can already cover shape dynamism in the data tensor. Any suggestion would be very helpful.
  • To maintain the relax version of TOPI, I created the relax namespace for now. Would there be a better way?

cc. @yongwww @jwfromm @psrivas2 @slyubomirsky @junrushao @tqchen @MasterJH5574 @masahi

tvm-bot (Collaborator) commented Apr 9, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

for (int i = 0; i < ndim; i++) {
  length = if_then_else(indices[0] == i, data->shape[i], length);
}
return GetLength(begin(indices), end(indices), strides(indices), length);
masahi (Member) commented Apr 10, 2023

Through the internal discussion with @junrushao, we confirmed that this should comply with WIP PR #14278.

Not sure what you meant by the above. Is the whole point being able to avoid a repeated compute definition for the shape func like this, i.e., automatically deriving a shape func from the original te compute alone?

sunggg (Contributor, Author) commented:

Since this PR lets the legalizer define the output shape of dynamic slice and pass it to TOPI, we wanted to make sure this is not a breaking change for PR #14278, which leverages the output-computation logic in TOPI. We checked that, since there are already operators like reshape whose output shape TOPI takes as defined elsewhere, this change won't add any extra complication to that PR.

masahi (Member) commented Apr 10, 2023

I think the question is whether or not we consider shape funcs a part of the "op definition". This is the case for Relay, and it is one of the reasons adding a new op in Relay is complicated. I don't know what the goal of #14278 is, but the current discussion in this PR suggests shape funcs need not be implemented under topi. They seem more like "implementation details" of the legalizer.

So unless #14278 is aiming to generate an op legalizer definition, I think this work is completely decoupled from #14278.

sunggg (Contributor, Author) commented:

One of the main goals of #14278 is to automate the currently repetitive and tedious op registration process, including legalization, struct_info, etc. Unlike the current flow, which requires us to look at multiple different sites and do manual work, #14278 will let us put all that information in a single place so that the parser handles the rest. I believe the legalization function should live there, too.

I think the question is whether or not we consider shape funcs a part of "op definition".

You brought up very good points. I see the shape func as part of the op definition, just like the legalization function for each operator. When #14278 lands, the shape function can be registered together only when necessary, so that we don't complicate op registration unnecessarily.

but the current discussion in this PR suggests shape funcs need not be implemented under topi.

I forgot to mention earlier, but #14278 deduces the output shape of an operator by using the existing shape-computation logic in TOPI to automatically generate the struct_info logic. Not sure if this is a hard requirement, though. @junrushao, would it be okay to put the shape computation elsewhere, for example in the legalizer?

masahi (Member) commented Apr 11, 2023

The one of the main goals of #14278 is to automate the current repetitive & tedious op registration process including legalization

OK, but you are introducing shape_func_dynamic_strided_slice, which looks like a repeated definition of output-shape logic to me. Are you saying that #14278 is going to remove that definition? Otherwise I still don't get what you mean when you say this PR "complies" with #14278.

Probably what's not clear to me is this: is #14278 going to auto-generate shape funcs? Otherwise I don't understand how #14278 and shape-func definitions will work together, but I can leave that question for the future, after #14278 lands.


// The output shape will depend on the runtime value in begin/end/stride tensors.
// TODO(tvm-team): Extract more compile-time info when those tensors are constants.
return TensorStructInfo(data_sinfo->dtype, n_axis);
Member commented:

I'm a bit concerned about this. It seems we can only express slicing with either "all static" or "all dynamic" axes. But partially static / dynamic slicing is very common in practice (e.g., slicing only along the dynamic "batch" dimension, i.e. the number of detected boxes, in object detection models). If the shape func needs to be implemented via topi, I don't see a way to express such a partially-static output shape.

For Relay I added a hacky workaround, #8165, for this issue.

sunggg (Contributor, Author) commented Apr 10, 2023

Yes, that is a current limitation of this PR, so I left it as a TODO.

Actually, I have a question regarding #8165. Can te::Tensor contain symbolic variables? Maybe I missed it, but I couldn't find any relevant test cases.
Please correct me if I'm wrong; based on my current understanding, the data within a te::Tensor will be concrete values rather than symbolic variables. So I cannot see how a partially static output shape would be represented if we go through the F1->F2 route below.

// F1
inline te::Tensor dynamic_strided_slice(const te::Tensor& x, const te::Tensor& begin,
                                        const te::Tensor& end, const te::Tensor& strides, ...)
{
  // ...
  Array<PrimExpr> begin_expr, end_expr, strides_expr;
  for (int64_t i = 0; i < num_dynamic_axes; ++i) {
    auto ind = make_const(index_dtype, i);
    begin_expr.push_back(begin(ind));  // <- Can `begin(ind)` be symbolic in practice?
    end_expr.push_back(end(ind));
    strides_expr.push_back(strides(ind));
  }
  // Call F2
  return dynamic_strided_slice(x, begin_expr, end_expr, strides_expr, name, tag);
}

// F2
inline Tensor dynamic_strided_slice(const Tensor& x, const Array<PrimExpr>& begin,
                                    const Array<PrimExpr>& end, const Array<PrimExpr>& strides, ...)

Member commented:

I think te::Tensor can be created from symbolic values, but we cannot extract such values from the tensor.

sunggg (Contributor, Author) commented:

Interesting. Is there any example code that I can try?
Also, if we can create a te::Tensor with symbolic variables, may I ask why we cannot extract those values? Can't we access the value by using the index, as in the above example?

masahi (Member) commented Apr 11, 2023

Assuming the "symbolic variable" you mentioned is just another PrimExpr, I expect we can fill in a tensor with that value, just like any other PrimExpr. Why not try the "Can begin(ind) be symbolic in practice?" experiment?

Can't we access the value by using the index in the above example?

What I meant was: as soon as we put a symbolic expression in a TE tensor, we lose all symbolic-specific information. You can index it, but what you get is an opaque value. So we cannot exploit any symbolic information about it.

So an output shape represented by a TE Tensor loses symbolic or constant shape information. That's why I think it is better to generate the shape func directly via TIR.

# 1. Insert shape function
output_shape = bb.normalize(
    bb.call_te(
        topi.shape_func_dynamic_strided_slice,
masahi (Member) commented Apr 10, 2023

Could there be an alternative way to implement shape funcs? Because te tensor cannot express "constant"-ness. See my related comment above as well.

sunggg (Contributor, Author) commented:

I considered two options, TE and TIR, and chose TE since it is easier to write.
I did not consider the hybrid script since I did not want users to learn a new thing only for the shape function.
Would hybrid script help in this case? I'm open to other suggestions as well.

Member commented:

I was thinking about TVMScript. Why not implement shape func in TVMScript, which should be able to represent partially-static shape as well?

sunggg (Contributor, Author) commented:

Correct me if I'm wrong, but don't we need Array<PrimExpr> to represent a partially-static shape? If so, I don't think this is currently supported in TVMScript. AFAIK, we usually carry those forms around within op attributes.

Also, if we go with TVMScript, do we require users to write the TIR primfunc by hand?

Member commented:

If this is true, I don't think this is currently supported in TVMScript

Hmm, interesting. Don't we create TVMScript with shapes like (n, 16) all the time? I hope the output of a shape func can also be expressed like that.

Also, if we go with TVMScript, do we require users to write the TIR primfunc by hand?

Yes, but writing in TVMScript is only required if a user wants to exploit a "partially-static" shape. I think this is a relatively rare and advanced use case. A te.compute-based definition should always be possible.

And other than dynamic slicing, I cannot think of other ops that may want to implement their shape func in TVMScript. Maybe reshape and other shape-changing ops might apply, but I haven't met a real-world use case.

Member commented:

I agree that "partially-static" cases are common. However, writing TIR manually might make it more challenging for contributors to develop data-dependent ops, so simplifying the shape function implementation could be beneficial. I think several ops like reshape or max_pool2d could be used in a partially-dynamic way, such as allowing dynamic or partially-dynamic strides in max_pool, even though I haven't encountered real-world use cases.

sunggg (Contributor, Author) commented Apr 11, 2023

@masahi Yeah, I believe that is the shape field in struct info. In this case, we need to pass Array<PrimExpr> as an argument of call_tir for the shape func. I think the later transition to PrimValue would be helpful here.

sunggg (Contributor, Author) commented:

Based on our discussion today, we decided to revisit this when #14278 lands. Left a TODO comment.

call.args[0],
call.args[1],
call.args[2],
call.args[3],
masahi (Member) commented Apr 10, 2023

Two questions:

  • Why does shape_func_dynamic_strided_slice need to be implemented in topi / C++? This seems like a one-off function; why not just create the te.compute here in Python?
  • Are we introducing a standard API for shape funcs (like Relay), or is each op going to pick any API as it sees fit (like this example)? In the latter case I don't see any reason shape funcs need to belong to topi.

sunggg (Contributor, Author) commented:

Are we introducing a standard API for shape func (like Relay), or is each op going to pick any API as it sees fit (like this example)? In the latter case I don't see any reason shape funcs need to belong to topi.

The latter case.

Great point. I put it on the TOPI side because I viewed TOPI as the registry of TEs, where developers can easily check related implementations in a single file. I don't have a strong preference here, so I'm open to hearing what folks prefer.

Member commented:

Since there is no standard API for shape funcs, and no "registration" for shape funcs specifically, I think it is better not to introduce a rigid convention, in particular that shape funcs be implemented in topi / C++.

Since they are only used by the legalizer, I view shape funcs in Relax simply as implementation details, one-off functions. So developers should be able to express data-dependent output-shape computation logic in any way, anywhere, including TVMScript.

sunggg (Contributor, Author) commented:

I think this is a very good point. We can continue this discussion here: #14548 (comment)

const te::Tensor& end, const te::Tensor& strides,
Array<PrimExpr> output_shape,
std::string name = "T_strided_slice_dynamic",
std::string tag = kInjective) {
masahi (Member) commented Apr 11, 2023

Why can't we use the existing definition of dynamic_strided_slice?

inline Tensor dynamic_strided_slice(const Tensor& x, const Array<PrimExpr>& begin,
                                    const Array<PrimExpr>& end, const Array<PrimExpr>& strides,
                                    std::string name = "T_dynamic_strided_slice",
                                    std::string tag = kInjective) {

i.e., Why can't it create the output shape inside this function?

masahi (Member) commented Apr 11, 2023

Sorry, maybe I'm confused about the existing implementation of dynamic_strided_slice and its relation to the strided_slice shape func.

If dynamic_strided_slice can calculate the right output shape from its dynamic input, why do we need a shape func for it at all... And why does this new definition of dynamic_strided_slice need its output shape calculated by the shape func ahead of time?

sunggg (Contributor, Author) commented:

No worries! :)

Member commented:

So, do you have an answer to my original question? Why can't we simply create the output shape by

  Array<PrimExpr> out_shape;

  for (size_t i = 0; i < num_slice_axes; ++i) {
    auto d = indexdiv(end[i] - begin[i], strides[i]);
    if (d->IsInstance<tvm::IntImmNode>()) {
      // Preserve static dimension if possible
      out_shape.push_back(d);
    } else {
      out_shape.push_back(tvm::tir::Var("dim"));
    }
  }

?

Sorry if I'm missing something obvious.
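The static-dimension-preserving idea in that snippet can be sketched in plain Python (a hypothetical helper, not TVM API; `None` stands in for a value not known at compile time, a string stands in for a fresh tvm::tir::Var, and positive strides are assumed):

```python
def mixed_out_shape(begin, end, strides):
    # Keep a dimension static when begin/end/stride are all known;
    # otherwise fall back to a fresh symbolic placeholder.
    out = []
    for i, (b, e, s) in enumerate(zip(begin, end, strides)):
        if None not in (b, e, s):
            out.append((e - b + s - 1) // s)  # static dim preserved
        else:
            out.append(f"dim{i}")  # placeholder for tvm::tir::Var("dim")
    return out
```

For instance, `mixed_out_shape([0, 0], [10, None], [2, 1])` gives `[5, "dim1"]`: the first axis stays static while only the second becomes symbolic.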

sunggg (Contributor, Author) commented:

My apologies. It seems I misread your comment.

The problem was how to insert the match cast logic and bind those variables. I couldn't see a way to insert them on the TOPI side. So I did it in the legalizer, where it became explicit and straightforward.

# 2. Convert tensor to shape and match cast with new symbolic vars
# Get shape length
ndim = int(output_shape.struct_info.shape[0])
output_shape = bb.emit(
    Call(
        ExternFunc("vm.builtin.tensor_to_shape"),
        [output_shape],
        sinfo_args=[ShapeStructInfo(ndim=ndim)],
    )
)
output_shape_vars = [tir.Var("s", "int64") for i in range(ndim)]
bb.match_cast(output_shape, ShapeStructInfo(output_shape_vars))

masahi (Member) commented Apr 11, 2023

This, again, sounds odd to me: the legalizer vs. topi distinction. Since dynamic_strided_slice and the shape func are only meant to be used by the legalizer, why not just define them here? There is no need to be concerned about the "TOPI side".

sunggg (Contributor, Author) commented:

Based on our discussion today, I moved the shape func to the legalizer.

sunggg (Contributor, Author) commented Apr 12, 2023

@masahi @yongwww Reflected your feedback. Would you take another look?

@masahi masahi merged commit 7db0b98 into apache:unity Apr 12, 2023
masahi (Member) commented Apr 13, 2023

@sunggg unity CI has been broken after merging this PR:
https://github.com/apache/tvm/commits/unity
https://ci.tlcpack.ai/blue/organizations/jenkins/tvm-unity/detail/PR-14612/1/pipeline

Can you send a follow-up PR?

cc @tqchen

tqchen (Member) commented Apr 13, 2023

I also noticed this and disabled the tests temporarily in my unity merge. @sunggg, please follow up.

Hzfengsy (Member) commented Dec 5, 2023

Hi @sunggg, I noticed this PR fails the test_relax_dynamic_strided_slice tests, saying

AttributeError: module 'tvm.topi' has no attribute 'shape_func_dynamic_strided_slice'

You can reproduce it on the latest unity branch:

pytest tests/python/topi/test_topi_transform.py::test_relax_dynamic_strided_slice

Could you please fix it if you have time?

PS: The other failing tests in the same file are not related to this PR, and I've fixed them in the branch.
