[RFC][DISCUSS] Transition to Relay #2244

Closed

tqchen opened this issue Dec 6, 2018 · 11 comments

Comments

@tqchen
Member

tqchen commented Dec 6, 2018

Dear Community:

The Relay IR (see #1673 for the original RFC, and https://docs.tvm.ai/dev/relay_intro.html for a quick intro) is the proposed next-generation NNVMv2 IR, supporting compilation of a richer set of programs and more formal optimizations. Thanks to the community's effort, the Relay compilation flow is now mostly on par with the existing NNVMv1 pipeline, with noticeable improvements. I am opening this RFC to propose transitioning to Relay as the default IR for our compiler stack, and to open a discussion on how we should do it.

Possible Transition Plan

If we agree to move forward, the general plan is to put NNVMv1 into maintenance mode after the 0.5 release and start using Relay as the default IR. This means:

  • We will still accept bug fixes for NNVMv1.
  • A backward-compatibility layer (nnvm.to_relay) will be added so that all NNVMv1-based pipelines can still make use of Relay (see the sketch after this list).
  • The graph runtime module is already shared between NNVMv1 and Relay, so there is no problem regarding backward compatibility of the runtime.
  • New compiler and IR optimizations will be added to Relay, but not necessarily to NNVMv1, to help us focus the development effort.
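
A rough sketch of how the nnvm.to_relay layer might be used; the exact function signature here is an assumption for illustration, not a settled API:

```python
import nnvm
import nnvm.testing

# Build a model with the existing NNVMv1 frontend.
sym, params = nnvm.testing.resnet.get_workload(num_layers=18)
graph = nnvm.graph.create(sym)

# Hand the NNVMv1 graph to Relay and continue with the Relay pipeline.
# (Signature assumed: graph, shape dict, dtype dict, params.)
func, params = nnvm.to_relay.to_relay(
    graph, {"data": (1, 3, 224, 224)}, {"data": "float32"}, params)
```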

As per the community guidelines, all major design decisions should be discussed in public so we can hear input from everyone in the community, so please weigh in. In particular on:

  • Why should we move to Relay, if you support the transition?
  • What are your concerns about the transition?
  • How can we do better in future improvements?
@tqchen
Member Author

tqchen commented Dec 9, 2018

cc @dmlc/tvm-team

@yzhliu
Member

yzhliu commented Dec 10, 2018

Shall we run a set of test cases to ensure there is no performance regression?

@tqchen
Member Author

tqchen commented Dec 14, 2018

@yzhliu As of now, Relay's set of optimization passes is on par with NNVM on all the benchmarks we previously ran in https://github.com/dmlc/tvm/tree/master/apps/benchmark

@masahi
Member

masahi commented Dec 14, 2018

What do we do about NNVM symbols that are called from TOPI (NCHWc conv, Winograd, etc.)?

Or more generally, how should we update the existing alter_layout registrations in TOPI that assume NNVMv1? I assume the alter_layout mechanics of NNVMv1 and Relay are not compatible, so there will be two registrations, one for v1 and another for Relay?

@tqchen
Member Author

tqchen commented Dec 14, 2018

@merrymercy is working on alter_layout, and I think he already has a solution.

@merrymercy
Member

merrymercy commented Dec 14, 2018

We will probably have to introduce something like an F argument, which can be either nnvm.sym or relay.op; see the sketch below.
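
A minimal sketch of that idea, with illustrative names only (the hook signature and op names here are assumptions, not a settled API):

```python
# The TOPI registration takes a namespace argument F, so the same code
# can emit either NNVMv1 symbols or Relay ops.
def conv2d_alter_layout(attrs, inputs, tinfos, F):
    new_attrs = {k: attrs[k] for k in attrs.keys()}
    new_attrs["data_layout"] = "NCHW16c"  # e.g. switch to an NCHWc layout

    if F.__name__ == "nnvm.symbol":
        return F.conv2d(*inputs, **new_attrs)   # NNVMv1 symbol API
    return F.nn.conv2d(*inputs, **new_attrs)    # Relay op API
```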

@TaoLv
Member

TaoLv commented Dec 17, 2018

How would this decision impact the development of MXNet? Will NNVM still be supported?

@tqchen
Member Author

tqchen commented Dec 17, 2018

This will only impact the deep learning compiler support (which MXNet does not depend on); the core NNVM will be kept stable.

@wweic
Contributor

wweic commented Dec 18, 2018

@tqchen Is it possible to release the benchmark code you wrote using Relay, so people can evaluate the effort involved in transitioning to Relay?

@jroesch
Member

jroesch commented Dec 22, 2018

I just wanted to write down my thoughts on this whole process. Sorry for the slow response; it has been a busy couple of weeks with paper deadlines and the TVM Conference.

I believe Relay offers multiple benefits as an intermediate representation, as well as quite a few compelling engineering advantages.

Most importantly, Relay has a richer computational model and is able to represent a wider set of deep learning computations, providing the ability to represent and optimize programs that contain training loops, as well as advanced models.

Relay provides a rich and flexible type system that allows us to type the computations represented today in computation graphs, as well as new operators and functionality such as control flow and recursion. We introduce a mechanism for explicit sharing via let bindings, which is useful for control flow and scoping, and we can encode loops and control structures such as map, filter, and fold.
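
As a minimal sketch of explicit sharing via a let binding, using the Relay Python API:

```python
import tvm
from tvm import relay

x = relay.var("x", shape=(10,), dtype="float32")
t = relay.var("t")  # the type of t will be inferred

# let t = x + x in t * t  -- the intermediate result is shared explicitly.
body = relay.Let(t, relay.add(x, x), relay.multiply(t, t))
func = relay.Function([x], body)
print(func)  # prints the Relay text form of the function
```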

Concurrently, @MarisaKirisame, @slyubomirsky, and I have been working on support for data types, which allows the definition of differentiable functions over lists, trees, graphs, etc., and the ability to optimize them (in both the ML and compiler senses).

@MarisaKirisame and I have been working on a new AD algorithm (not yet merged, see #2321 for the first order version) which can compute Nth order gradients over control structures and data types.

From an engineering standpoint we have made many improvements, from the new attribute system to new optimizations and passes.

The new attribute system replaces NNVM's parameters. It allows users to supply typed attribute data (instead of strings), which can be accessed from Python. Attributes are constant, compile-time information passed to operators that may be consulted during analyses such as type checking, or during code generation.
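
For example, the attributes of a call are a typed attribute node rather than a dictionary of strings, and they are readable from Python (a small sketch using the Relay Python API):

```python
import tvm
from tvm import relay

data = relay.var("data", shape=(1, 3, 224, 224))
weight = relay.var("weight", shape=(64, 3, 7, 7))
conv = relay.nn.conv2d(data, weight, strides=(2, 2), padding=(3, 3))

# conv.attrs is a typed Conv2DAttrs node, not a bag of strings.
print(conv.attrs.strides)  # [2, 2]
print(conv.attrs.padding)  # [3, 3]
```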

Another example is that Relay supports arbitrary tensor constants. We type constants with tensor types, with 0-rank tensors representing scalars. Constants and their typing rules allow us to remove specialized operators, typing, and execution support for scalars.

We can further use constants for things like parameter-specific optimization: we simply inline the parameters as constants and invoke the program optimizer. We can then treat all constant values uniformly, regardless of whether they are parameters, constants in the program, or any other kind of user-supplied information.
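
A small sketch of the idea, using the relay.transform pass names from the current Python API (which may differ from the API at the time of this RFC):

```python
import numpy as np
import tvm
from tvm import relay

x = relay.var("x", shape=(4,), dtype="float32")
w = relay.const(np.full(4, 2.0, dtype="float32"))  # a parameter inlined as a constant
b = relay.const(3.0, dtype="float32")              # a 0-rank tensor constant, i.e. a scalar

f = relay.Function([x], relay.multiply(x, relay.add(w, b)))

# Constant folding evaluates w + b at compile time and inlines the result.
mod = tvm.IRModule.from_expr(f)
mod = relay.transform.FoldConstant()(mod)
print(mod)
```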

We are also able to build on Relay infrastructure such as the program interpreter. We built a constant evaluator by just reusing Relay's interpreter: we can execute arbitrary Relay programs, including ones that use platform-optimized operators, and then inline the program's result as a constant.
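
A short sketch of evaluating an arbitrary Relay expression with the interpreter, using the current create_executor API (argument names may have differed at the time of this RFC):

```python
import numpy as np
import tvm
from tvm import relay

x = relay.var("x", shape=(2,), dtype="float32")
f = relay.Function([x], relay.add(x, relay.const(1.0)))
mod = tvm.IRModule.from_expr(f)

# The "debug" executor runs the program on the interpreter.
run = relay.create_executor("debug", mod=mod, device=tvm.cpu(), target="llvm").evaluate()
print(run(np.array([1.0, 2.0], dtype="float32")))  # [2. 3.]
```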

@joshpoll is furthering this line of thought by building a partial evaluator for Relay, which can perform constant folding/evaluation-style optimizations in the face of unknown values.

I believe these are just a few examples of how Relay is helping us do more with less work. An important avenue is making it easier for people to test and evaluate Relay.

We have an internal version of https://github.com/dmlc/tvm/tree/master/apps/benchmark for an older version of Relay which I will work to upstream to aid this effort. If anyone is interested in helping out with this process I would welcome extra hands.

Going forward we hope to leverage the hard work done by the Relay community (both here at UW and our great collaborators elsewhere) to continue providing features and engineering benefits.

My personal development roadmap for Relay is to focus on error reporting, stability, and features for training. I would love to build a detailed version of the Relay roadmap after the holidays and involve those who are interested in the community.

@tqchen
Member Author

tqchen commented Jan 14, 2019

Thanks to everyone who has participated so far. It seems the community is in favor of the transition as long as we maintain good performance regression testing and benchmarks, which I fully agree with. I will close this issue and open new ones for specific actionable items.

@tqchen tqchen closed this as completed Jan 14, 2019