[RFC] Meta Schedule (AutoTIR) #5
Conversation
CC: @jroesch @icemelon9 @kparzysz-quic @FrozenGene @jcf94
Sorry for the delay! I am getting back to this RFC and will be working on it for the rest of this week.
Updated according to the comments. Would you guys like to re-review? @tqchen
LGTM
@FrozenGene @comaniac @tkonolige please take another look
LGTM
Looks pretty good now. I've made some grammar suggestions.
My main remaining question is about the drawbacks. I'm sure there are some that you can think of. The biggest one I can think of is that we will have to rewrite all our existing schedules to take advantage of the new infrastructure (I think). Also, will tuning be slower if we allow users to define their own search rules?
@tkonolige thanks for the review!
Given that most of the time we use sketch generation (in Ansor's terminology) to generate schedules automatically, we can just remove most of the schedules written in TE. However, we do need to rewrite all of Ansor's sketch rules, including those defined in
Upstreaming those rules is part of the upstreaming process, so I suppose it won't be a big problem :-)
The search rule is only executed once to obtain the search space before we explore it, and it is usually fairly fast (within a second), so if we only customize our own search rules (i.e. sketch rules in Ansor, schedule rules in meta schedule), we won't observe any performance degradation.
@tkonolige @comaniac I updated the RFC with an explanation of how we replay the trace repeatedly to do random search, along with some examples of trace-based analysis. Would you guys like to take another look? Thanks a lot!
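To illustrate what replaying a trace means here, below is a toy Python sketch (not the actual TVM API; the names and data structures are invented purely for illustration): a trace is modeled as a list of instructions, some of which carry a sampled decision, and random search simply re-runs the trace with freshly sampled decisions.

```python
import random

# Toy model of trace replay (not the actual TVM API): a trace is a list of
# instructions, some of which carry a random "decision" (e.g. a tile size).
# Random search replays the same trace many times, re-sampling the decisions.
TRACE = [
    ("split_i", {"candidates": [4, 8, 16, 32]}),
    ("split_j", {"candidates": [2, 4, 8]}),
    ("vectorize", {}),
]

def replay(trace):
    """Replay one trace, re-sampling every decision; return the chosen values."""
    decisions = {}
    for name, attrs in trace:
        if "candidates" in attrs:
            decisions[name] = random.choice(attrs["candidates"])
    return decisions

# Random search: repeatedly replay the trace to draw candidates from the
# design space; the real system would then build and benchmark each one.
candidates = [replay(TRACE) for _ in range(5)]
for c in candidates:
    print(c)
```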
LGTM
@junrushao1994 thanks for the writeup here! I left a bunch of comments, approaching this from a bit of an "outsider" perspective (I haven't been following your discuss forum post as closely as I would have liked). A lot of these comments may seem a bit like I'm being dense, but in my experience, when something could be inferred from a design doc but isn't explicitly described, that's a scenario where readers are going to take away two separate ideas. Since we are now embarking on this new RFC process, we are saying we don't expect readers to follow a bunch of conversation to understand the spirit of a system's design, so I think that means we need to do our due diligence in ensuring that our RFC writeups are clear, and that we resolve controversy by clarifying language in the actual RFC rather than in comment threads.
@areusch Thank you so much for reviewing this RFC! I really appreciate that you went over the text very carefully and provided very helpful suggestions! I just finished a pass revising the RFC. Would you like to take another look? Thanks a lot!
@junrushao1994 I took another pass but didn't get through everything before I ran out of time. Here are some comments on the latest iteration.
@areusch @junrushao1994 Do you mind following up a bit on this?
Thanks everyone. It would be great to work to land this RFC :) @junrushao1994 please do your best to address the latest batch of comments from @areusch, and then I think we can merge this in.
Sorry for the late reply! I will be working on it today.
Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Cody Yu <comaniac0422@gmail.com>
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
Co-authored-by: Andrew Reusch <areusch@gmail.com>
(force-pushed from c446856 to 3edb2e9)
Thanks everyone. This RFC is now accepted. Assigning RFC number 0001.
@tqchen @junrushao1994 Adding a few comments I was in the middle of writing when this got merged; you can decide whether you want to address them. I think the rest of my comments from the latest round were addressed. I still think this would benefit from a system-level diagram, but I can no longer find that comment because GitHub has buried it.
## 2. Motivation

**Scheduling.** In TVM, both TensorIR and TE allow direct or indirect IR transformation guided by a
Can we specify more about the IR transformation here? E.g., we need to answer the question: transform from what to what? Otherwise the reader is confused unless they already know TVM.
I think it's confusing to explain because the transformation can either be from TE -> TensorIR or TensorIR -> TensorIR.
Here's a stab: "TVM initially describes all model operators using an abstract description of the computation. Such abstractions can be described either in the Tensor Expression IR (the standard prior to Meta Scheduling) or in TensorIR (as a naïve computation). Through a process known as scheduling, TVM allows transformation of these IRs into an imperative, optimized description of the implemented computation. Such transformation is guided by a developer-provided program..."
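To make this concrete for readers new to TVM, here is a minimal sketch of such a scheduling transformation using the existing TE API (standard, pre-existing TVM calls; not the API proposed in this RFC):

```python
import tvm
from tvm import te

# Abstract description of the computation: C = A + B
n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.placeholder((n,), name="B")
C = te.compute((n,), lambda i: A[i] + B[i], name="C")

# "Scheduling": transform the abstract computation into an optimized,
# imperative form, e.g. by splitting the loop over i into tiles of 32.
s = te.create_schedule(C.op)
outer, inner = s[C].split(C.op.axis[0], factor=32)

# The scheduled, imperative form can be inspected before compilation.
print(tvm.lower(s, [A, B, C], simple_mode=True))
```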
Thanks Andrew! I can send a follow-up PR!
generate high-performance schedules, it is allowed to fork the trace into two, and the design space is the union of the forked traces.

The trace is not strictly user-facing, but can be accessed and printed with the following syntax:
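(As a rough illustration of the kind of printout referred to here: the instruction names and `decision=` fields below are reconstructed from the proposal rather than quoted verbatim, and `sch` denotes the schedule object whose instructions were recorded.)

```python
# Illustrative printed trace: each line records one schedule instruction,
# and sampling instructions carry their sampled decision inline.
b0 = sch.get_block(name="C")
l1, l2 = sch.get_loops(block=b0)
v3, v4 = sch.sample_perfect_tile(loop=l1, n=2, decision=[32, 8])
l5, l6 = sch.split(loop=l1, factors=[v3, v4])
sch.reorder(l5, l2, l6)
```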
If this isn't user-facing, how is the user supposed to invoke Meta Schedule in a repeatable way? Is there a serialization mechanism provided for the implementation (in particular, the `decisions` dict below)?
Tracking issue: apache/tvm#8473