
[TVM][RUNTIME] A minimum example to generate external library wrappers for DSOModule #4280

Merged · 23 commits · Nov 22, 2019

Conversation

@zhiics (Member) commented Nov 8, 2019

A minimum runtime for external codegen as mentioned in #4258

@tqchen (Member) commented Nov 8, 2019

To follow up on the discussion thread: while we understand that the current PR achieves the purpose of supporting external libraries through a different kind of wrapping, it does bring in additional code that duplicates functionality in DSOModule.

Unless there is a strong reason to do so, I think we should not introduce the new Extern runtime. Instead, let us document DSOModule's calling convention clearly and rewrite the compilers to generate functions that can be loaded by the DSO module. That is, instead of generating a function with the extern calling convention, generate a function with the following signature:

// the original foo you intended to generate
void foo_(float* a, int N, float* b, int M, float* c) {
   bar(...);
   foobar(...);
}

// DSOModule-compatible C function that can be compiled by gcc.
extern "C" int foo(TVMValue* value, int* type_code, int nargs) {
   CHECK_EQ(nargs, 3);
   DLTensor* arg0 = static_cast<DLTensor*>(value[0].v_handle);
   DLTensor* arg1 = static_cast<DLTensor*>(value[1].v_handle);
   DLTensor* arg2 = static_cast<DLTensor*>(value[2].v_handle);

   foo_(static_cast<float*>(arg0->data), arg0->shape[0],
        static_cast<float*>(arg1->data), arg1->shape[0],
        static_cast<float*>(arg2->data));
   return 0;
}

We can just return the above code as a CSourceModule, compile it with gcc, and link it with the other generated code. In this way, the generated code can enjoy all the benefits of our existing infra, pass through RPC, and be combined with other libs.

Rationale

The main rationale behind the suggestion is that we want to avoid code specialization and reduce future technical debt. As we can see, the current PR introduces special changes in several locations to support the new mode, which increases the cost of maintaining a separate loading mechanism.

In short: the best design is achieved not when there is nothing left to add, but when there is nothing left to take away.

@tqchen (Member) commented Nov 8, 2019

As a follow-up on what we might gain from external modules: it would be great to have another function that generates something like a graph spec, which can be interpreted and calls into the library functions (a bit like our graph runtime). We would implement SaveToBinary to serialize the graph spec. See also https://discuss.tvm.ai/t/standardize-graphruntime-exports-into-a-single-dll/4667

This kind of external module is closer to supporting modules like TensorRT or TF, and would be a nice example to have.

@zhiics force-pushed the external_runtime branch 2 times, most recently from 3440e04 to a2b3567 on November 11, 2019 18:16
@zhiics (Member, Author) commented Nov 11, 2019

@tqchen I removed everything but kept the C APIs as a CSourceModule. I think this could be a minimal runtime/example that we can use for external libraries that need to produce a DSO. For the ones that need artifacts like JSON (e.g., TRT), let me think a little more.

@comaniac (Contributor) commented
@tqchen could you help review again? Here is a short summary of the current runtime after applying your suggestions:

  • The runtime takes only one .so file, including host and external kernels.
  • The runtime is now unified with DSOModule, so backend developers don't need to implement their own runtime (and GetFunction) anymore. Instead, they need to unpack TVMValue in the generated kernels.
  • The example in this PR demonstrates a case in which a user backend generates third-party library calls. This is applicable to runtime-less libraries such as MKL-DNN.
  • For an external backend that has a runtime engine, such as TRT, we can still codegen external functions as in this example, construct the subgraph in that function, and invoke the TRT engine to execute it on the fly.

We will refine the other codegen PR based on this one after merging.

@tqchen tqchen changed the title [tvm][runtime] A minimum runtime for external library [TVM][RUNTIME] A minimum example to generate external library wrappers for DSOModule Nov 12, 2019
@tqchen (Member) commented Nov 12, 2019

Thanks @comaniac. I feel that, given it is already unified with DSOModule, we don't have to call it a separate extern runtime; instead, it is a way to interface external libraries with the current DSOModule runtime :)

It would still be great if we could add the other example I mentioned, where we implement a customized runtime that has its own serialization mechanism (perhaps a graph, like the graph runtime) and whose PackedFunc executes the corresponding functions by interpreting the serialized structure (e.g., the graph). This kind of example will be closer to what people want, especially when they want to serialize state other than a sequence of API calls. @zhiics

@zhiics (Member, Author) commented Nov 12, 2019

Yes, we don't need to call this a runtime now; it actually just uses DSOModule and CSourceModule. I think @comaniac was asking whether we should add this example (DSO-style external library) first and then move on to implementing the customized runtime you are mentioning. Never mind, we can work on the customized runtime and update the PR. Thanks.

@comaniac (Contributor) commented

> It would still be great if we could add the other example I mentioned, where we implement a customized runtime that has its own serialization mechanism (perhaps a graph, like the graph runtime) and whose PackedFunc executes the corresponding functions by interpreting the serialized structure (e.g., the graph). This kind of example will be closer to what people want, especially when they want to serialize state other than a sequence of API calls. @zhiics

I've added another example that mimics the graph-runtime-like programming model. This example assumes the external backend has its own runtime that accepts a graph representation such as JSON. Accordingly, the codegen serializes the subgraph to a string like this code snippet (for simplicity, I use a very simple representation instead of a complete JSON format in this example). Note that the codegen can either embed the graph in the kernel code, or put the graph in a separate file and read it back at runtime.

@comaniac (Contributor) commented

cc @tqchen

@yzhliu (Member) commented Nov 15, 2019

@tqchen would you take a look again?

@tqchen (Member) commented Nov 15, 2019

@comaniac Thanks for the update! Can you move the ExampleJSONModule into a formal module that is compiled by the TVM runtime? Normally, we will need to implement SaveToBinary and LoadFromBinary instead of pumping everything into a DSO file.

@comaniac (Contributor) commented

> @comaniac Thanks for the update! Can you move the ExampleJSONModule into a formal module that is compiled by the TVM runtime? Normally, we will need to implement SaveToBinary and LoadFromBinary instead of pumping everything into a DSO file.

The current implementation expects external codegens to write the JSON string directly into the DSO module so that we can have a unified interface. Do you mean we should also provide the following JSON module, in addition to the DSO module, for developers to use:

class JSONModule : public runtime::Module {
   void SaveToBinary();
   void LoadFromFile();
};

Then the user would end up with two JSON files if they invoke json_module.SaveToBinary() and relay.build(). Would this mean that we have to change the API of graph_runtime.create() to include the second JSON file?

Also, we will generate the following function in the .so.

// JSONModule-compatible C function that can be compiled by gcc.
extern "C" int foo(TVMValue* value, int* type_code, int nargs, std::string json_file) {
   CHECK_EQ(nargs, 3);
   DLTensor* arg0 = static_cast<DLTensor*>(value[0].v_handle);
   DLTensor* arg1 = static_cast<DLTensor*>(value[1].v_handle);
   DLTensor* arg2 = static_cast<DLTensor*>(value[2].v_handle);

   // Parse json_file and use "foo" as the key to get the subgraph JSON.
   std::string subgraph_json = ...;

   foo_(static_cast<float*>(arg0->data), arg0->shape[0],
        static_cast<float*>(arg1->data), arg1->shape[0],
        static_cast<float*>(arg2->data), subgraph_json);
   return 0;
}

So the user's codegen is expected to generate the following:

// the original foo you intended to generate
void foo_(float* a, int N, float* b, int M, float* c, std::string subgraph_json) {
    // Launch the external engine with subgraph_json and arguments.
}

@tqchen (Member) commented Nov 16, 2019

The current interfacing example already looks great. For the JSONModule use case, I was thinking about something like:

class ExampleJSONModule : public runtime::Module {
   void SaveToBinary();
   void LoadFromFile();
   PackedFunc GetFunction(name) {
      if (name == "tensortrun") {
         for each elem in json:
            if (elem.name == "func1") {
               run_trt_func1(func, elem.args);
            }
            ...
      }
   }
};

In this case, because the JSONModule directly interprets the JSON and calls into the runtime API, we no longer need the gcc-related components. This approach will work better if, say, the serialized code is a TF blob plus some metadata and we just want to call into the TF runtime.

@tqchen (Member) commented Nov 16, 2019

Also note that my comment about ExampleJSONModule is mainly meant to serve as an example of how to integrate other external runtimes. The module itself is not meant to be used in production.

@zhiics (Member, Author) commented Nov 16, 2019

@tqchen Thanks for the suggestions. I made a gtest for the simple example JSON runtime. Hopefully, this is good now. Please take a look when you have time.

@zhiics (Member, Author) commented Nov 21, 2019

@u99127 Thanks for your comment.

@tqchen No worries. We actually had the same impression that the example should be put under contrib, but it looks like the app dir is also fine, as this is really just an example. Anyway, we have now moved it to contrib and added comments to the file. Please take another look.

@tqchen (Member) left a comment

some final comments

CMakeLists.txt review comment (resolved)
@u99127 (Contributor) commented Nov 21, 2019

> @u99127 Thanks for your comment.

Thanks @zhiics and @tqchen.

@comaniac (Contributor) commented

Comment fixed.
@tqchen somehow the CI doesn't update the config.cmake to enable the example runtime. Would you take a look when you get a chance? Thanks.

@tqchen (Member) commented Nov 21, 2019

@comaniac please send another PR first with the updated Jenkinsfile. For security reasons, CI will only respect a Jenkinsfile that is already in master.

@u99127 (Contributor) commented Nov 22, 2019

There are still a couple of changes requested in the review that need to be addressed. It would be good to clean this up with a separate pull request.

@tqchen (Member) commented Nov 22, 2019

@u99127 (Contributor) commented Nov 22, 2019

> @u99127 please https://docs.tvm.ai/contribute/code_review.html#approve-and-request-changes-explicitly

Sorry about the noise; it seems I misread the pull request this morning as being merged, but now I realize it was the CI changes that got merged and not the actual PR.

@zhiics (Member, Author) commented Nov 22, 2019

@u99127 Thanks. Could you please explicitly approve if everything looks good to you? Otherwise, please let us know your comments. Thanks again.

@u99127 (Contributor) commented Nov 22, 2019

LGTM. I don't see an actual "Approve" button, if clicking that is what's expected from a workflow perspective.

Ramana

@zhiics (Member, Author) commented Nov 22, 2019

@u99127 (Contributor) left a comment

LGTM.

@comaniac (Contributor) commented

@tqchen would you mind taking a final look and merging if it's all good?
Thanks!

@tqchen tqchen merged commit e081051 into apache:master Nov 22, 2019
@tqchen (Member) commented Nov 22, 2019

Thanks @comaniac @zhiics @u99127 !

@tqchen added the "status: accepted" label and removed the "status: need update" label Nov 22, 2019
@zhiics zhiics deleted the external_runtime branch November 22, 2019 23:53
@tqchen (Member) commented Nov 23, 2019

@zhiics @comaniac please follow up to fix the Windows build error here: https://dev.azure.com/tvmai/tvm/_build/results?buildId=3417. Likely we will need to declare TVM_DLL at the function level instead.

zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Nov 26, 2019
yongwww pushed a commit to neo-ai/tvm that referenced this pull request Nov 26, 2019