[RFC][DISCUSS] Transition to Relay #2244
Comments
cc @dmlc/tvm-team
Shall we run a set of test cases to ensure there's no performance regression?
@yzhliu as of now, Relay's set of optimization passes is on par with all the previous NNVM benchmarks we ran in https://github.com/dmlc/tvm/tree/master/apps/benchmark
What do we do about NNVM symbols that are called from TOPI (NCHWc conv, Winograd, etc.)? Or more generally, how should we update the existing alter_layout registrations in TOPI that assume NNVM v1? I assume the alter_layout mechanics of NNVM v1 and Relay are not compatible, so there will be two registrations, one for v1 and another for Relay?
@merrymercy is working on alter_layout and I think he already has a solution.
Probably we have to introduce something like
How would this decision impact the development of MXNet? Will NNVM still be supported?
This will only impact the deep learning compiler support (which MXNet does not depend on); the core NNVM will be kept stable.
@tqchen Is it possible to release the benchmark code you wrote using Relay, so people can evaluate the effort of transitioning to Relay?
I just wanted to write down my thoughts on this whole process. Sorry for the slow response; it has been a busy couple of weeks with paper deadlines and the TVM Conference.

I believe Relay offers multiple benefits as an intermediate representation, as well as quite a few compelling engineering advantages. Most importantly, Relay has a richer computational model and is able to represent a wider set of deep learning computations, providing the ability to represent and optimize programs which contain training loops, as well as advanced models. Relay provides a rich and flexible type system, allowing us to type the computations represented today in computation graphs, as well as new operators and functionality such as control flow and recursion. We also introduce a mechanism for explicit sharing via let-bindings.

Concurrently, @MarisaKirisame, @slyubomirsky and I have been working on support for data types which allow the definition of differentiable functions over lists, trees, graphs, etc., and the ability to optimize them (in both the ML and the compiler sense). @MarisaKirisame and I have been working on a new AD algorithm (not yet merged; see #2321 for the first-order version) which can compute Nth-order gradients over control structures and data types.

From an engineering standpoint we have made many improvements, from the new attribute system to new optimizations and passes. The new attribute system replaces NNVM's parameters: it lets users provide typed attribute data (instead of strings), and attributes can be accessed from Python. Attributes are constant, compile-time information passed to operators that may be considered when doing analyses such as type checking or code generation. Another example is that Relay supports arbitrary tensor constants. We type constants with tensor types, with 0-rank tensors representing scalars. Constants and their typing rules allow us to remove specialized operators, typing, and execution support for scalars.
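The typed-attribute idea can be sketched in plain Python (this is a hypothetical illustration, not TVM's actual API; the `Conv2DAttrs` class and its fields are invented for the example): attributes are carried as typed values a pass can inspect directly, rather than strings it must parse.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical sketch of a typed attribute node (not TVM's real API).
# NNVM v1 stored operator parameters as strings, e.g.
#   {"channels": "64", "strides": "(1, 1)"}
# so every pass had to re-parse them. A typed node carries real values.

@dataclass(frozen=True)
class Conv2DAttrs:
    channels: int
    kernel_size: Tuple[int, int]
    strides: Tuple[int, int] = (1, 1)

attrs = Conv2DAttrs(channels=64, kernel_size=(3, 3))

# A compile-time analysis (e.g. type checking an output shape) can now
# use the attribute directly -- it is an int, not the string "64".
out_channels = attrs.channels
```

The same typed data is visible from both C++ and Python in Relay, which is what removes the string round-tripping NNVM v1 required.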
We can further use constants to do things like parameter-specific optimization: we simply inline the parameters as constants and invoke the program optimizer. We can now treat all constant values uniformly, regardless of whether they are parameters, constants in the program, or any other kind of user-supplied information.

We are also able to build on Relay infrastructure such as the program interpreter. We were able to build a constant evaluator just by using Relay's interpreter: we can execute arbitrary Relay programs, including the use of platform-optimized operators, and then inline the program's result as a constant. @joshpoll is furthering this line of thought by building a partial evaluator for Relay which can perform constant-folding/evaluation-style optimizations in the face of unknown values. I believe these are just a few examples of how Relay is helping us do more with less work.

An important avenue is making it easier for people to test and evaluate Relay. We have an internal version of https://github.com/dmlc/tvm/tree/master/apps/benchmark for an older version of Relay which I will work to upstream to aid this effort. If anyone is interested in helping out with this process, I would welcome extra hands.

Going forward we hope to leverage the hard work done by the Relay community (both here at UW and our great collaborators elsewhere) to continue providing features and engineering benefits. My personal development roadmap for Relay is to focus on error reporting, stability, and features for training. I would love to build a detailed version of the Relay roadmap after the holidays and involve those in the community who are interested.
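The inline-then-fold scheme described above can be sketched on a toy expression IR (all names here -- `Const`, `Var`, `Add`, `inline_params`, `fold` -- are invented for illustration; Relay's actual implementation evaluates subexpressions with its interpreter rather than this ad-hoc recursion):

```python
# Toy sketch of constant folding by evaluation, under assumed names.
# Step 1: inline known parameters as constants.
# Step 2: evaluate any subtree whose operands are all constants, and
#         replace it with the resulting constant.

class Const:
    def __init__(self, value):
        self.value = value

class Var:
    def __init__(self, name):
        self.name = name

class Add:
    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs

def inline_params(expr, params):
    """Replace parameter variables with constant nodes."""
    if isinstance(expr, Var) and expr.name in params:
        return Const(params[expr.name])
    if isinstance(expr, Add):
        return Add(inline_params(expr.lhs, params),
                   inline_params(expr.rhs, params))
    return expr

def fold(expr):
    """Evaluate subtrees whose operands are all constants."""
    if isinstance(expr, Add):
        lhs, rhs = fold(expr.lhs), fold(expr.rhs)
        if isinstance(lhs, Const) and isinstance(rhs, Const):
            # "Run the interpreter" on the subtree, keep only the result.
            return Const(lhs.value + rhs.value)
        return Add(lhs, rhs)
    return expr

# w + b becomes 2 + 3 after inlining, which folds to the constant 5.
expr = Add(Var("w"), Var("b"))
folded = fold(inline_params(expr, {"w": 2, "b": 3}))
```

A partial evaluator generalizes this: it folds what it can while leaving residual code for the subexpressions whose inputs remain unknown.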
Thanks to everyone who has participated so far. It seems that the community supports the transition as long as we maintain good performance regression testing and benchmarks, which I fully agree with. I will close this issue and open new ones for specific actionable items.
Dear Community:
The Relay IR (see #1673 for the original RFC, and https://docs.tvm.ai/dev/relay_intro.html for a quick intro) is the proposed next-generation NNVMv2 IR, supporting compilation of a richer set of programs and more formal optimizations. Thanks to the community's effort, the Relay compilation flow is now mostly on par with the existing NNVMv1 pipeline, with noticeable improvements. I am opening this RFC to propose transitioning to Relay as the default IR for our compiler stack, and to open discussion on how we should do it.
Possible Transition Plan
If we agree to move forward, the general plan is to put NNVMv1 into maintenance mode after the 0.5 release and start using Relay as the default IR. This means:
Per the community guidelines, all major design discussions should happen in public and hear input from everyone in the community, so please weigh in. In particular on: