[TVM][RUNTIME] A minimum example to generate external library wrappers for DSOModule #4280
Conversation
To follow up on the discussion thread: while we understand that the current PR achieves the purpose of supporting external libraries via a different kind of wrapping, it does bring additional code that duplicates functionality in DSOModule. Unless there is a strong reason to do so, I think we should not introduce the new Extern runtime. Instead, let us document the DSOModule's calling convention clearly and rewrite compilers to generate functions that can be loaded by the DSO module. That is, instead of generating a function with the extern calling convention, generate a function with the following signature:

```cpp
// the original foo you intended to generate
void foo_(float* a, int N, float* b, int M, float* c) {
  bar(...);
  foobar(...);
}

// DSOModule-compatible C function that can be compiled by gcc.
extern "C" int foo(TVMValue* value, int* type_code, int nargs) {
  CHECK_EQ(nargs, 3);
  DLTensor* arg0 = static_cast<DLTensor*>(value[0].v_handle);
  DLTensor* arg1 = static_cast<DLTensor*>(value[1].v_handle);
  DLTensor* arg2 = static_cast<DLTensor*>(value[2].v_handle);
  foo_(static_cast<float*>(arg0->data), arg0->shape[0],
       static_cast<float*>(arg1->data), arg1->shape[0],
       static_cast<float*>(arg2->data));
  return 0;
}
```

We can just return the above code as a CSourceModule, compile it with gcc, and link it with the other generated code. In this way, the generated code can enjoy all the benefits of our existing infra, pass through RPC, and be combined with other libs.

Rationale: the main rationale behind the suggestion is that we want to avoid code specialization and reduce future technical debt. As we can see, the current PR introduces special changes in several locations to support the new mode, which increases the cost of maintaining a separate loading mechanism. In short: the best design is achieved not when there is nothing to add, but when there is nothing to take away.
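To make that calling convention concrete, here is a minimal self-contained sketch. It mocks `TVMValue` and `DLTensor` with simplified stand-ins (the real definitions live in TVM's `c_runtime_api.h` and dlpack headers, and carry many more fields), and uses `assert` in place of TVM's `CHECK_EQ`, so it compiles with any C++ compiler and no TVM dependency. The three packed arguments are assumed to be the two inputs and the output tensor:

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins for TVM's real DLTensor / TVMValue;
// only the fields used below are modeled.
struct DLTensor {
  void* data;
  int64_t* shape;
};
union TVMValue {
  void* v_handle;
};

// The "original foo" from the comment above: elementwise c = a + b.
void foo_(float* a, int64_t N, float* b, int64_t M, float* c) {
  for (int64_t i = 0; i < N && i < M; ++i) c[i] = a[i] + b[i];
}

// DSOModule-style wrapper: unpack the packed TVMValue arguments
// and forward them to the typed kernel.
extern "C" int foo(TVMValue* value, int* type_code, int nargs) {
  (void)type_code;        // type checks omitted in this sketch
  assert(nargs == 3);     // stands in for CHECK_EQ(nargs, 3)
  DLTensor* arg0 = static_cast<DLTensor*>(value[0].v_handle);
  DLTensor* arg1 = static_cast<DLTensor*>(value[1].v_handle);
  DLTensor* arg2 = static_cast<DLTensor*>(value[2].v_handle);
  foo_(static_cast<float*>(arg0->data), arg0->shape[0],
       static_cast<float*>(arg1->data), arg1->shape[0],
       static_cast<float*>(arg2->data));
  return 0;
}
```

A caller packs each tensor pointer into a `TVMValue`, fills the `type_code` array, and invokes `foo(values, type_codes, 3)`, which is exactly the shape of call the DSO module dispatches after `dlopen`-ing the generated library.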
As a follow-up on what we might gain from external modules: it would be great if we had another function that generates something like a graph spec, which can be interpreted and calls into the library functions (a bit like our graph runtime). We implement SaveToBinary to serialize the graph spec. See also https://discuss.tvm.ai/t/standardize-graphruntime-exports-into-a-single-dll/4667. This kind of external module is closer to supporting modules like TensorRT or TF, and would be a nice example to have.
@tqchen I removed everything but kept the C APIs as a CSourceModule. I think this could be a minimal
@tqchen could you help review again? Here is a short summary of the current runtime after your suggestions:
We will refine the other codegen PR based on this one after merging.
Thanks @comaniac. I feel that since it is already unified with the DSOModule, we don't have to call it a separate extern runtime; instead, it is a way to interface external libraries with the current DSOModule runtime :) It would still be great if we could add the other example that I mentioned, where we implement a customized runtime that has its own serialization mechanism (perhaps a graph, like the graph runtime) and the PackedFunc executes the corresponding functions by interpreting the serialized structure (e.g., the graph). This kind of example would be closer to what people want, especially when they want to serialize state other than a sequence of API calls. @zhiics
Yes, we don't need to call this a runtime now. It actually just uses DSOModule and CSourceModule. I think @comaniac was asking whether we should add this example (DSO-style external library) first and then move on to implementing the customized runtime you are mentioning. Never mind, we can work on the customized runtime and update the PR. Thanks.
I've added another example that mimics the graph-runtime-like programming model. This example assumes the external backend has its own runtime that accepts a graph representation such as JSON. Accordingly, the codegen serializes the subgraph to a string (for simplicity, I use a very simple representation instead of a complete JSON format in this example). Note that the codegen can either embed the graph in the kernel code, or put the graph in a separate file and read it back at runtime.
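For illustration, such a "very simple representation" could be a line-oriented format along these lines (this is a hypothetical sketch, not the exact format used in the PR): one operator per line, naming the op, its output, and its inputs, so that a trivial interpreter can walk it top to bottom.

```
// one op per line: op_name  output  input1  input2
add  t0   in0  in1
mul  out  t0   in2
```

An interpreter reading this would evaluate `t0 = in0 + in1`, then `out = t0 * in2`, dispatching each op name to the external library's corresponding kernel.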
cc @tqchen
@tqchen would you take a look again? |
@comaniac Thanks for the update! Can you move the ExampleJSONModule into a formal module that is compiled by the TVM runtime? Normally we will need to implement a SaveToBinary and LoadFromBinary, instead of dumping things into a DSO file.
The current implementation expects external codegens to write the JSON string directly to the DSO module so that we can have a unified interface. Do you mean we should also provide the following JSON module, in addition to the DSO module, for developers to use?

```cpp
JSONModule : runtime::Module {
  void SaveToBinary();
  void LoadFromFile();
};
```

Now, would the user end up with two JSON files if they invoke json_module.SaveToBinary() and relay.build()? Would this mean that we have to change the API of graph_runtime.create() to include the second JSON file? Also, we would generate the following function in the .so:

```cpp
// JSONModule-compatible C function that can be compiled by gcc.
extern "C" int foo(TVMValue* value, int* type_code, int nargs,
                   std::string json_file) {
  CHECK_EQ(nargs, 3);
  DLTensor* arg0 = static_cast<DLTensor*>(value[0].v_handle);
  DLTensor* arg1 = static_cast<DLTensor*>(value[1].v_handle);
  DLTensor* arg2 = static_cast<DLTensor*>(value[2].v_handle);
  // Parse json_file and use "foo" as the key to get the subgraph JSON.
  std::string subgraph_json = ...;
  foo_(static_cast<float*>(arg0->data), arg0->shape[0],
       static_cast<float*>(arg1->data), arg1->shape[0],
       static_cast<float*>(arg2->data), subgraph_json);
  return 0;
}
```

So the user codegen would be expected to generate the following:

```cpp
// the original foo you intended to generate
void foo_(float* a, int N, float* b, int M, float* c,
          std::string subgraph_json) {
  // Launch the external engine with subgraph_json and arguments.
}
```
The current interfacing example already looks great. For the JSONModule use case, I was thinking about something like:

```cpp
ExampleJSONModule : runtime::Module {
  void SaveToBinary();
  void LoadFromFile();
  PackedFunc GetFunction(name) {
    if (name == "tensortrun") {
      for each elem in json:
        if (elem.name == "func1") {
          run_trt_func1(func, elem.args);
        }
        ...
    }
  }
};
```

In this case, because JSONModule directly interprets the JSON and calls into the runtime API, we no longer need the gcc-related components. This approach will work better if, say, the serialized code is a TF blob plus some metadata and we just want to call into the TF runtime.
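To make the interpretation style concrete, here is a self-contained sketch of such a module. It is a mock, not TVM's actual `ExampleJSONModule`: it omits the `runtime::Module` base class, `SaveToBinary`/`LoadFromBinary`, and `PackedFunc` plumbing, and stands in for external-runtime calls (e.g., TensorRT kernels) with a small registry of scalar lambdas. The line format `op dst src1 src2` and all names here are illustrative assumptions:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical sketch of a graph-interpreting module: it stores a serialized
// graph and, when run, walks it line by line, dispatching each op to a
// registered "external" function -- no gcc/DSO compilation step involved.
class ExampleJSONModule {
 public:
  explicit ExampleJSONModule(std::string graph) : graph_(std::move(graph)) {
    // Registry standing in for calls into an external runtime's kernels.
    ops_["add"] = [](float x, float y) { return x + y; };
    ops_["mul"] = [](float x, float y) { return x * y; };
  }

  // Interpret the graph: each line is "op dst src1 src2"; names map to
  // values in the environment, and the named output is returned.
  float Run(std::map<std::string, float> env, const std::string& out) {
    std::istringstream ss(graph_);
    std::string op, dst, s1, s2;
    while (ss >> op >> dst >> s1 >> s2) {
      env[dst] = ops_.at(op)(env.at(s1), env.at(s2));
    }
    return env.at(out);
  }

 private:
  std::string graph_;
  std::map<std::string, std::function<float(float, float)>> ops_;
};
```

In a real module the graph string would come from `LoadFromBinary`, and `GetFunction` would return a `PackedFunc` closing over this interpretation loop; the point of the sketch is only that execution happens by walking serialized structure, not by calling compiled symbols.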
Also note that my comment about ExampleJSONModule is mainly meant to serve as an example of ways to integrate other external runtimes. The module itself is not meant to be used in production.
@tqchen Thanks for the suggestions. I made a gtest for the simple example JSON runtime. Hopefully this is good now. Please take a look when you have time.
@u99127 Thanks for your comment. @tqchen No worries. We actually had the same impression that the example should be put under contrib, but it looks like the app dir is also fine, as this is really just an example. Anyway, we have now moved it to
some final comments
Comment fixed.
@comaniac please send another PR first with the updated Jenkinsfile. For security reasons, CI will only respect a Jenkinsfile that is already in master.
There are still a couple of changes requested in the review that need to be addressed. It would be good to clean this up with a separate pull request. |
Sorry about the noise; it seems I misread the pull request this morning as being merged, but now I realize that was because the CI changes got merged, not the actual PR.
@u99127 Thanks. Could you please explicitly approve if everything looks good to you? Otherwise, please let us know your comments. Thanks again.
LGTM. I don't see an actual "Approve" button, if clicking that is what's expected from a workflow perspective. Ramana
@u99127 I see, thank you. You can find the "button" like this: |
LGTM.
@tqchen would you mind taking a final look and merging if it's all good?
@zhiics @comaniac please follow up to fix the Windows build error here: https://dev.azure.com/tvmai/tvm/_build/results?buildId=3417. Likely we will need to declare TVM_DLL at the function level instead.
A minimum runtime for external codegen as mentioned in #4258