Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Meta Schedule (AutoTIR) #5

Merged
merged 1 commit into from
Jul 27, 2021

Conversation

junrushao
Copy link
Member

@junrushao junrushao commented Jun 8, 2021

Tracking issue: apache/tvm#8473

@tqchen tqchen added the status: need review RFC needs review label Jun 8, 2021
@tqchen tqchen self-assigned this Jun 8, 2021
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
@junrushao
Copy link
Member Author

CC: @jroesch @icemelon9 @kparzysz-quic @FrozenGene @jcf94

@junrushao junrushao changed the title [RFC] Meta Schedule (AutoTensorIR) [RFC] Meta Schedule (AutoTIR) Jun 9, 2021
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
@junrushao
Copy link
Member Author

Sorry for the delay! I am getting back on this RFC and will be working on it in the rest of this week

@junrushao
Copy link
Member Author

Updated according to the comments. Would you guys like to re-review? @tqchen

Copy link
Member

@tqchen tqchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LTGM

@tqchen
Copy link
Member

tqchen commented Jul 1, 2021

@FrozenGene @comaniac @tkonolige please take another look

Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
Copy link
Contributor

@tkonolige tkonolige left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good now. I've mad some grammar suggestions.

My main remaining question is drawbacks. I'm sure there are some that you can think of. The biggest one I can think of is the fact that we will have to rewrite all our existing schedules to take advantage of the new infrastructure (I think). Also, will tuning be slower if we allow users to define their own search rules?

rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
@junrushao
Copy link
Member Author

junrushao commented Jul 6, 2021

@tkonolige thanks for the review!

The biggest one I can think of is the fact that we will have to rewrite all our existing schedules to take advantage of the new infrastructure.

Given in most of the time we use sketch generation (in Ansor's terminology) to generate schedules automatically, we can just remove most of the schedules written in TE. Alternatively, we do need to rewrite all of the Ansor's sketch rules, including (defined in src/auto_scheduler/search_policy/sketch_policy_rules.h):

  • Always-Inline
  • Multi-Level-Tiling
  • Multi-Level-Tiling-with-Fusion
  • Add-Cache-Read
  • Add-Cache-Write
  • Add-RFactor
  • Simplify-Compute-with-Const-Tensor
  • Cross-Thread-Reduction
  • Special-Compute-Location

Upstreaming those rules is part of upstreaming process, so I suppose it won't be a big problem :-)

Also, will tuning be slower if we allow users to define their own search rules?

The search rule is only executed once to obtain the search space before we explore it, and it is usually fairly fast (within a second), so If we only customize our own search rule (i.e. sketch rule in Ansor, schedule rule in meta schedule), we won't observe performance degradation

@junrushao
Copy link
Member Author

@tkonolige @comaniac I updated with some explanation that we replay the trace repetitively to do random search, and some examples of trace-based analysis. Would you guys like to take another look? Thanks a lot!

Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@areusch areusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@junrushao1994 thanks for the writeup here! i left a bunch of comments, approaching this from a bit of an "outsider" perspective (i haven't been avidly following your discuss forum post as much as I would have liked). A lot of these comments may seem a bit like i'm being dense, but in my experience, when something could be inferred from a design doc but isn't explicitly described, that's a scenario where readers are going to take away two separate ideas. Since we are now embarking on this new RFC process, we are now saying we don't expect readers to follow a bunch of conversation to understand the spirit of a system's design, so I think that means we now need to do our due diligence in ensuring that our RFC writeups are clear, and that we resolve controversy by clarifying language in the actual RFC rather than on threads.

rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
@junrushao
Copy link
Member Author

@areusch Thank you sooo much for reviewing this RFC! I really appreciate it that you went over the text very carefully and provided very helpful suggestions! Just finished a pass revising the RFC. Would you like to take another look? Thanks a lot!

Copy link
Contributor

@areusch areusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@junrushao1994 i took another pass but didn't get through everything before i ran out of time. here are some comments on the latest iteration

rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
rfcs/0001-meta-schedule-autotensorir.md Outdated Show resolved Hide resolved
@tqchen
Copy link
Member

tqchen commented Jul 21, 2021

@areusch @junrushao1994 do you mind follow up a bit on this.

@tqchen
Copy link
Member

tqchen commented Jul 24, 2021

Thanks everyone. It would be great to work to land this RFC :) @junrushao1994 please try your best attempt to address the latest batch of comments from @areusch and then i think we can merge this in

@junrushao
Copy link
Member Author

Sorry for the late reply! I will be working on it today

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Cody Yu <comaniac0422@gmail.com>
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
Co-authored-by: Andrew Reusch <areusch@gmail.com>
@junrushao junrushao force-pushed the rfc-0001-meta-schedule-autotensorir branch from c446856 to 3edb2e9 Compare July 26, 2021 22:59
@junrushao
Copy link
Member Author

@areusch @tqchen Thanks for your time reviewing the RFC! Just addressed all the comments and please take another look when you guys got time :-)

@tqchen tqchen added status: accepted RFC is accepted and removed status: need review RFC needs review labels Jul 27, 2021
@tqchen
Copy link
Member

tqchen commented Jul 27, 2021

Thanks everyone. This RFC is now accepted. Assigning RFC number 0001

@tqchen tqchen merged commit 4e26cc6 into apache:main Jul 27, 2021
Copy link
Contributor

@areusch areusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tqchen @junrushao1994 adding a few comments i was in the middle of writing when this got merged, you guys can decide if you want to address i guess. i think the rest of my comments from latest round were addressed. i still think this would benefit from a system-level diagram, but i can no longer find that comment because GH has buried it.

## 2. Motivation

**Scheduling.** In TVM,
both TensorIR and TE allow direct or indirect IR transformation guided by a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we specify more about the IR transformation here? e.g. we need to answer the question: transform from what to what? otherwise you're confused unless you know TVM already.

I think it's confusing to explain because the transformation can either be from TE -> TensorIR or TensorIR -> TensorIR.

here's a stab: "TVM initially describes all model operators using an abstract description of the computation. Such abstractions can be described either in the Tensor Expression IR (the standard prior to Meta Scheduling) or in TensorIR (as a naïve computation). Through a process known as scheduling, TVM allows transformation of these IR to an imperative, optimized description of the implmented computation. Such transformation is guided by a developer-provided program..."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Andrew! I can send a follow-up PR!

generate high-performance schedules, it is allowed to fork the trace into two, and the design space is
the union of the forked traces.

The trace is not strictly user-facing, but can be accessed and printed with the following syntax:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this isn't user-facing, how is the user supposed to invoke Meta Scheduler in a repeatable way? is there a serialization mechanism provided for the implementation (in particular decisions dict below)?

MichaelJKlaiber referenced this pull request in MichaelJKlaiber/tvm-rfcs Apr 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: accepted RFC is accepted
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants