
[Discuss][Parsing][IR] Parsing _ shape dimensions as tir::Any #259

Open
slyubomirsky opened this issue Sep 28, 2022 · 10 comments

Presently, any wildcard dimension in a shape annotation (written as _) is parsed as a fresh shape variable.

For example, let us consider the following function:

@tvm.script.ir_module
class TestShape:
    @R.function
    def main(x : Tensor((m, n, _), _)):
        s = (m, n)
        return s

If we dump its AST using relax.testing.dump_ast, we see that the shape of x is parsed like so:

shape_=ShapeExpr(
    values=[
        PrimExpr(value=`m: int64`),
        PrimExpr(value=`n: int64`),
        PrimExpr(value=`v_: int64`)
    ]
)

In particular, the _ dimension is parsed as an ordinary TIR variable with a mangled name. Semantically, this still works out, because the variable is never used again.

However, I think it might make more sense to parse wildcard dimensions as tir::Any. This would avoid adding variables to the namespace that are never meant to be used and would convey the intent that the dimension is not meant to be checked. Additionally, it would make it easier to generate shape expressions programmatically (generating an Any node to represent a wildcard instead of an ordinary variable with an unused name). It's a minor issue, but it could be helpful for writing passes and dealing with shape expressions in general.
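
For concreteness, here is a minimal sketch contrasting the two ways of building the shape (m, n, _) programmatically. It assumes the Python bindings of this repo, where relax.ShapeExpr takes a list of PrimExprs, and that ShapeExpr would accept a tir.Any node (which this proposal would require):

import tvm
from tvm import relax, tir

m = tir.Var("m", "int64")
n = tir.Var("n", "int64")

# Current behavior: the wildcard becomes a fresh variable with a mangled
# name that is never referenced again.
shape_with_fresh_var = relax.ShapeExpr([m, n, tir.Var("v_", "int64")])

# Proposed behavior: the wildcard is an explicit Any node, so no throwaway
# name enters the namespace and the "don't check this dim" intent is
# visible in the IR.
shape_with_any = relax.ShapeExpr([m, n, tir.Any()])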

Any thoughts from those working on the parser? @Hzfengsy @MasterJH5574 @yongwww

tqchen (Contributor) commented Sep 29, 2022

One of the main challenges here is that we need to think about the language semantics. We know that matching a shape against a fresh var makes sense, and we can do equality comparisons and other symbolic reasoning through the arithmetic prover.

The code generator would likely also need to deal with scenarios involving Any (e.g., discarding the value).

The main issue to consider is what happens when we do code rewrites. Say we take a program and apply a transformation that inserts a relu on x.

Before

@tvm.script.ir_module
class TestShape:
    @R.function
    def main(x : Tensor((m, n, _), _)):
        s = (m, n)
        return s

After

@tvm.script.ir_module
class TestShape:
    @R.function
    def main(x : Tensor((m, n, v_), _)):
        v0 : Tensor((m, n, v_), _) = relu(x)
        s = (m, n)
        return (s, v0)

After relu is inserted, we want to preserve the invariant that x's shape equals v0's; as a result, the variable v_ is shared across x.shape and v0.shape. Giving Any special semantics might complicate how we write such a pass, while the fresh-variable case simply allows us to copy x.shape over.
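
To illustrate that last point, here is a hypothetical shape-handling helper for such a relu-insertion pass under the fresh-variable scheme (a sketch assuming x carries a ShapeExpr in its shape_ field, as in this repo; relu_output_shape is not a real API):

from tvm import relax

def relu_output_shape(x: relax.Expr) -> relax.ShapeExpr:
    # x.shape_ is ShapeExpr([m, n, v_]); reusing its values means the third
    # dim of v0's shape is literally the same tir.Var as x's, so the
    # equality invariant holds by construction. With Any in that slot,
    # there would be no shared identity to preserve.
    return relax.ShapeExpr(list(x.shape_.values))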

slyubomirsky (Collaborator, Author) commented Sep 29, 2022

That's an interesting example. I would argue that if it's important for two variables to match, they should probably have explicit names; a wildcard signifies that the dimension shouldn't be matched at all. In the given example, v0 could simply not be given an explicit shape annotation at all (and would therefore have the same shape expression as the RHS, including the wildcard).

It is true, though, that shape inference rules for operators may run into some complexity when dealing with tir::Any nodes as opposed to ordinary shape variables, so perhaps that is a reason not to add another node to the AST to handle the case of wildcards.

It might be reasonable to treat _ as a special-case variable name in the parser. Not having the wildcard as a special construct would also simplify the language semantics. My concern is that using wildcards as ordinary variable names could lead to some confusing errors (e.g., if the wildcard vars end up nested inside more complicated shape expressions), though users could avoid that by naming their variables.

(Semantically, the only difference between a wildcard and a fresh variable is that a wildcard is literally a no-op during matching whereas a fresh var results in a var being added to the namespace.)
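
To make that difference concrete, here is a minimal, self-contained sketch of the matching semantics just described (illustrative only, not TVM's actual implementation):

class Wildcard:
    pass

class ShapeVar:
    def __init__(self, name):
        self.name = name

def match_dim(pattern, runtime_dim, env):
    if isinstance(pattern, Wildcard):
        return True                    # no-op: nothing checked, nothing bound
    if isinstance(pattern, ShapeVar):
        if pattern.name not in env:    # fresh var: add it to the namespace
            env[pattern.name] = runtime_dim
            return True
        return env[pattern.name] == runtime_dim  # bound var: check equality
    return pattern == runtime_dim      # concrete dimension

env = {}
assert match_dim(ShapeVar("m"), 4, env)  # binds m = 4
assert match_dim(Wildcard(), 7, env)     # matches anything, binds nothing
assert env == {"m": 4}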

tqchen (Contributor) commented Sep 29, 2022

Agreed about the reasoning regarding wildcards; the main tradeoff is the cost of introducing Any and handling it, clarifying how passes deal with it, and the possible use cases around it.

In this particular example, the "before" code indeed signifies that the user does not intend to use the variable. So for code written directly by the user, having a wildcard does help signify that intention.

The main question is how follow-up passes (e.g., those that insert relu) can take that additional intention into consideration, which adds complexity to the passes. Ideally, we want to be able to keep track of invariant shape relations as much as possible. Note that the "after" code is not written by the user but generated by a pass.

If v0 implicitly had the same shape expression as the RHS, wildcard included, that could lead to missed opportunities in future optimizations: we could not prove v0.shape[2] == x.shape[2], because both of them are wildcards (Any), so we could not allow, for example, memory sharing that depends on proving the shapes equal.
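
A small sketch of that provability gap, using the arithmetic analyzer (assuming tvm.arith.Analyzer and tir.EQ behave as in mainline TVM):

import tvm
from tvm import tir

ana = tvm.arith.Analyzer()
v_ = tir.Var("v_", "int64")

# Shared fresh variable: both shapes carry the same var, so the prover can
# discharge the equality immediately.
assert ana.can_prove(tir.EQ(v_, v_))

# With wildcards, each occurrence is an opaque Any with no identity, so no
# expression relates v0.shape[2] to x.shape[2] and nothing can be proven.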

slyubomirsky (Collaborator, Author) commented

You could prove that the shapes are equal by adding a MatchShape binding, but that may be unnecessary complexity. I think the simplicity of using fresh vars is a good argument for keeping the present approach.

tqchen (Contributor) commented Sep 29, 2022

Agreed with all the analysis, great discussion :)

slyubomirsky (Collaborator, Author) commented Sep 29, 2022

I did think of one case that becomes a lot tougher without an explicit wildcard construct: annotating explicit return shapes.

If we have a function with a signature like def f(x: Tensor((m, n), "float32"), y: Tensor((o, p), "float32")) -> Tensor((m, p, _), "float32"): ..., it's a lot harder to figure out what the wildcard in the return shape should mean. The same would apply to using a var in the return shape that isn't given in the input shapes. Should we disallow such cases and require the return shape to be RuntimeDepShape, then?
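
Spelled out in the ir_module style used earlier (purely illustrative; the body is elided):

@tvm.script.ir_module
class ReturnShape:
    @R.function
    def f(x: Tensor((m, n), "float32"),
          y: Tensor((o, p), "float32")) -> Tensor((m, p, _), "float32"):
        # the third output dimension has no binding site among the inputs,
        # so a shape function would have no way to compute it
        ...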

We would have to define some semantics for dealing with the unbound shape variable in the return shape (which would matter if we try to construct "shape functions" to figure out the shapes of call results).

slyubomirsky (Collaborator, Author) commented Sep 29, 2022

Another case where this kind of complexity can come up is SeqExprs, namely in trying to assign a shape_ to them. Suppose some MatchShape bindings within the SeqExpr introduce shape variables, and those new shape variables are used in the shape_ of the SeqExpr's body expression. If we go with lexical scoping (the shape variables are scoped to the SeqExpr), then we can't give the overall SeqExpr a shape_ that includes those shape variables. We could give it RuntimeDepShape in this case, but that would lose information compared to replacing the free shape variables with wildcards.

(Arguably neither way is satisfying, but letting shape variables escape the scope in which they were defined would be hard to reason about.)
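
A rough illustration in the script style used above (the match_shape syntax and some_op are stand-ins, not exact):

@R.function
def g(x: Tensor(_, "float32")):
    # MatchShape introduces k, lexically scoped to this function body's SeqExpr
    R.match_shape(x, (k,))
    y = some_op(x)  # suppose y's shape_ is (k,)
    # under lexical scoping, the overall SeqExpr's shape_ cannot mention k,
    # so it must fall back to RuntimeDepShape (or a wildcard, if one existed)
    return y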

tqchen (Contributor) commented Sep 30, 2022

Indeed, in the case of the return type, the wildcard gives more info (about the rest of the dims); without it, the shape becomes RuntimeDepShape in that case.

slyubomirsky (Collaborator, Author) commented

Relaxing the entire shape to RuntimeDepShape when we have a wildcard dimension is feasible to implement, but I'm concerned it may lose some information compared to having an explicit wildcard for individual dimensions.

tqchen (Contributor) commented Sep 30, 2022

I agree, these are just some design tradeoffs to consider, e.g., the extra info versus the overhead it introduces into passes.
