[RFC][DISCUSS] Transition to Relay #2244
Comments
cc @dmlc/tvm-team
Shall we run a set of test cases to ensure there's no performance regression?
@yzhliu as of now, Relay's set of optimization passes is on par with all the previous NNVM benchmarks we ran in https://github.com/dmlc/tvm/tree/master/apps/benchmark
What do we do about NNVM symbols that are called from TOPI (NCHWc conv, Winograd, etc.)? Or more generally, how should we update the existing alter_layout registrations in TOPI that assume NNVM v1? I assume the alter_layout mechanics of NNVM v1 and Relay are not compatible, so there will be two registrations, one for v1 and another for Relay?
@merrymercy is working on alter_layout and I think he already has a solution.
Probably we have to introduce something like
How would this decision impact the development of MXNet? Will NNVM still be supported?
This will only impact the deep learning compiler support (which MXNet does not depend on); the core NNVM will be kept stable.
@tqchen Is it possible to release the benchmark code you wrote using Relay, so people can evaluate the effort of transitioning to Relay?
I just wanted to write down my thoughts on this whole process. Sorry for the slow response; it has been a busy couple of weeks with paper deadlines and the TVM Conference.

I believe Relay offers multiple benefits as an intermediate representation, as well as quite a few compelling engineering advantages. Most importantly, Relay has a richer computational model and is able to represent a wider set of deep learning computations, providing the ability to represent and optimize programs which contain training loops, as well as advanced models. Relay provides a rich and flexible type system, allowing us to type the computations represented today in computation graphs, as well as new operators and functionality such as control flow and recursion. We also introduce a mechanism for explicit sharing via let-bindings.

Concurrently, @MarisaKirisame, @slyubomirsky and I have been working on support for data types which allow the definition of differentiable functions over lists, trees, graphs, etc., and the ability to optimize them (in both the ML and the compiler sense). @MarisaKirisame and I have been working on a new AD algorithm (not yet merged; see #2321 for the first-order version) which can compute Nth-order gradients over control structures and data types.

From an engineering standpoint we have made many improvements, from the new attribute system to new optimizations and passes. The new attribute system replaces NNVM's parameters: it lets users provide typed attribute data (instead of strings), and attributes can be accessed from Python. Attributes are constant, compile-time information passed to operators that may be considered when doing analyses such as type checking or code generation. Another example is that Relay supports arbitrary tensor constants. We type constants with tensor types, with 0-rank tensors representing scalars. Constants and their typing rules allow us to remove specialized operators, typing, and execution support for scalars.
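The typed-attribute idea can be sketched in plain Python (this is a hypothetical illustration, not TVM's actual API; the `Conv2DAttrs` class and its fields are invented for the example): attributes are carried as typed values a pass can inspect directly, rather than strings it must parse.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical sketch of a typed attribute node (not TVM's real API).
# NNVM v1 stored operator parameters as strings, e.g.
#   {"channels": "64", "strides": "(1, 1)"}
# so every pass had to re-parse them. A typed node carries real values.

@dataclass(frozen=True)
class Conv2DAttrs:
    channels: int
    kernel_size: Tuple[int, int]
    strides: Tuple[int, int] = (1, 1)

attrs = Conv2DAttrs(channels=64, kernel_size=(3, 3))

# A compile-time analysis (e.g. type checking an output shape) can now
# use the attribute directly -- it is an int, not the string "64".
out_channels = attrs.channels
```

The same typed data is visible from both C++ and Python in Relay, which is what removes the string round-tripping NNVM v1 required.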
We can further use constants to do things like parameter-specific optimization: we simply inline the parameters as constants and invoke the program optimizer. We can now treat all constant values uniformly, regardless of whether they are parameters, constants in the program, or any other kind of user-supplied information.

We are also able to build on Relay infrastructure such as the program interpreter. We were able to build a constant evaluator just by using Relay's interpreter: we can execute arbitrary Relay programs, including the use of platform-optimized operators, and then inline the program's result as a constant. @joshpoll is furthering this line of thought by building a partial evaluator for Relay which can perform constant-folding/evaluation-style optimizations in the face of unknown values. I believe these are just a few examples of how Relay is helping us do more with less work.

An important avenue is making it easier for people to test and evaluate Relay. We have an internal version of https://github.com/dmlc/tvm/tree/master/apps/benchmark for an older version of Relay which I will work to upstream to aid this effort. If anyone is interested in helping out with this process, I would welcome extra hands.

Going forward we hope to leverage the hard work done by the Relay community (both here at UW and our great collaborators elsewhere) to continue providing features and engineering benefits. My personal development roadmap for Relay is to focus on error reporting, stability, and features for training. I would love to build a detailed version of the Relay roadmap after the holidays and involve those in the community who are interested.
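The inline-then-fold scheme described above can be sketched on a toy expression IR (all names here -- `Const`, `Var`, `Add`, `inline_params`, `fold` -- are invented for illustration; Relay's actual implementation evaluates subexpressions with its interpreter rather than this ad-hoc recursion):

```python
# Toy sketch of constant folding by evaluation, under assumed names.
# Step 1: inline known parameters as constants.
# Step 2: evaluate any subtree whose operands are all constants, and
#         replace it with the resulting constant.

class Const:
    def __init__(self, value):
        self.value = value

class Var:
    def __init__(self, name):
        self.name = name

class Add:
    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs

def inline_params(expr, params):
    """Replace parameter variables with constant nodes."""
    if isinstance(expr, Var) and expr.name in params:
        return Const(params[expr.name])
    if isinstance(expr, Add):
        return Add(inline_params(expr.lhs, params),
                   inline_params(expr.rhs, params))
    return expr

def fold(expr):
    """Evaluate subtrees whose operands are all constants."""
    if isinstance(expr, Add):
        lhs, rhs = fold(expr.lhs), fold(expr.rhs)
        if isinstance(lhs, Const) and isinstance(rhs, Const):
            # "Run the interpreter" on the subtree, keep only the result.
            return Const(lhs.value + rhs.value)
        return Add(lhs, rhs)
    return expr

# w + b becomes 2 + 3 after inlining, which folds to the constant 5.
expr = Add(Var("w"), Var("b"))
folded = fold(inline_params(expr, {"w": 2, "b": 3}))
```

A partial evaluator generalizes this: it folds what it can while leaving residual code for the subexpressions whose inputs remain unknown.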
Thanks to everyone who has participated so far. It seems that the community supports the transition as long as we maintain good performance regression testing and benchmarks, which I fully agree with. I will close this issue and open new ones for specific actionable items.
Dear Community:
The Relay IR (see #1673 for the original RFC, and https://docs.tvm.ai/dev/relay_intro.html for a quick intro) is the proposed next-generation NNVMv2 IR, supporting compilation of a richer set of programs and more formal optimizations. Thanks to the community's effort, the Relay compilation flow is now mostly on par with the existing NNVMv1 pipeline, with noticeable improvements. I am opening this RFC to propose transitioning to Relay as the default IR for our compiler stack, and to open discussion on how we should do it.
Possible Transition Plan
If we agree to move forward, the general plan is to put NNVMv1 into maintenance mode after the 0.5 release and start using Relay as the default IR. This means:
Per the community guidelines, all major design discussions should happen in public and hear input from everyone in the community, so please weigh in. In particular on: