[RFC] Module based Model Runtime Interface #5038

FrozenGene · 2020-03-11T03:40:44Z

As discussed in https://discuss.tvm.ai/t/discuss-module-based-model-runtime-interface/5025, we want to support a module-based model runtime interface and solve the following challenges:

R1: The creation of an ML model can be context dependent. For example, the user needs to be able to specify which GPU to run the graph runtime on.
R2: We are starting to have multiple variations of the model runtime, such as RelayVM. While it does not make sense to force all model runtimes to have the same set of APIs, it would be helpful to have the same mechanism for packaging and loading.
R3: In advanced use cases, we want to be able to bundle multiple models into a single shared library.

After discussion, we have sorted out the API and reached an agreement. Here, I want to summarize the API and give an example.

# lib is a GraphRuntimeFactoryModule
# that contains json and parameters
lib = relay.build(...)

# Users can control whether the parameters are
# packaged into the library or not.
# Without packaging, you can save them separately, as before
lib_no_params = lib["remove_params"]()
with open(temp.relpath("deploy_param.params"), "wb") as fo:
    fo.write(relay.save_param_dict(lib.get_params()))

# We can export it to a shared library and load it back
lib.export_library("resnet18.so")
# lib_no_params.export_library("resnet18.so")

# load it back
lib = tvm.module.load("resnet18.so")

# Call into the factory module to create a graph runtime
# Having this additional factory create step solves R1
# Note that parameters are already set

# The key ("resnet18" here) selects the model, which helps to solve R3.
# A list of contexts may be allowed in the future:
# gmod = lib["resnet18"]([tvm.cpu(0), tvm.gpu(0)])
gmod = lib["resnet18"](tvm.cpu(0))

set_input = gmod["set_input"]
run = gmod["run"]
get_output = gmod["get_output"]

# We do not need to set the parameters here,
# as the factory has already set them
set_input(data=my_data)
run()
get_output()

# We could also use the GraphModule wrapper
gmod = graph_runtime.GraphModule(lib["resnet18"](tvm.cpu(0)))
gmod.set_input(data=my_data)
gmod.run()
gmod.get_output()

More details and the decision process can be found at: https://discuss.tvm.ai/t/discuss-module-based-model-runtime-interface/5025
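To make the factory step in the example above more concrete, here is a minimal plain-Python sketch of the pattern. It does not use TVM at all: `ToyFactoryModule`, `ToyGraphModule`, the string contexts, and the one-line "model" are all hypothetical stand-ins, chosen only to mirror the `lib["resnet18"](ctx)` and `lib["remove_params"]()` lookups shown in the RFC. It is an illustration of the idea, not the actual implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ToyGraphModule:
    """Hypothetical stand-in for a graph runtime bound to one context."""
    graph_json: str
    ctx: str
    params: Dict[str, float] = field(default_factory=dict)
    inputs: Dict[str, float] = field(default_factory=dict)
    output: float = 0.0

    def __getitem__(self, name: str) -> Callable:
        # Mirror the gmod["set_input"] lookup style from the RFC example.
        return getattr(self, name)

    def set_input(self, **kwargs: float) -> None:
        self.inputs.update(kwargs)

    def run(self) -> None:
        # Toy "model": output = weight * data + bias.
        self.output = self.params["weight"] * self.inputs["data"] + self.params["bias"]

    def get_output(self) -> float:
        return self.output


class ToyFactoryModule:
    """Hypothetical factory holding the graph description and parameters."""

    def __init__(self, name: str, graph_json: str, params: Dict[str, float]) -> None:
        self.name = name
        self.graph_json = graph_json
        self.params = dict(params)

    def get_params(self) -> Dict[str, float]:
        return dict(self.params)

    def __getitem__(self, key: str) -> Callable:
        if key == "remove_params":
            # Return a copy of the factory that carries no parameters.
            return lambda: ToyFactoryModule(self.name, self.graph_json, {})
        if key == self.name:
            # The context is supplied at creation time (R1), and the
            # parameters are already set on the returned runtime.
            return lambda ctx: ToyGraphModule(self.graph_json, ctx, dict(self.params))
        raise KeyError(key)


lib = ToyFactoryModule("resnet18", "{}", {"weight": 2.0, "bias": 1.0})
gmod = lib["resnet18"]("cpu(0)")
gmod["set_input"](data=3.0)
gmod["run"]()
result = gmod["get_output"]()
lib_no_params = lib["remove_params"]()
```

The point of the extra creation step is that the packaged artifact stays context independent, while each runtime instance is tied to whatever context the caller passes in.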

API and graph runtime / debug graph runtime core functionality support ([RUNTIME] Support module based interface runtime #5753)
Multiple models support
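
The second item relates to R3. The thread leaves the actual multi-model design to a follow-up discussion, so the following is only a rough plain-Python sketch of the general idea: one bundle, several keys, and possibly different runtime kinds behind the same lookup mechanism (R2). `ToyBundle`, `add_model`, the dictionary it returns, and the model and runtime names are all hypothetical and are not part of TVM.

```python
from typing import Callable, Dict, List


class ToyBundle:
    """Hypothetical container mapping model keys to runtime creators."""

    def __init__(self) -> None:
        self._creators: Dict[str, Callable[[str], dict]] = {}

    def add_model(self, key: str, runtime_kind: str, params: Dict[str, float]) -> None:
        def create(ctx: str) -> dict:
            # A plain dict stands in for a runtime module instance.
            return {
                "model": key,
                "runtime": runtime_kind,
                "ctx": ctx,
                "params": dict(params),
            }

        self._creators[key] = create

    def __getitem__(self, key: str) -> Callable[[str], dict]:
        # Same lookup style for every model, whatever its runtime kind.
        return self._creators[key]

    def keys(self) -> List[str]:
        return sorted(self._creators)


bundle = ToyBundle()
bundle.add_model("resnet18", "graph_runtime", {"w": 1.0})
bundle.add_model("mobilenet", "vm", {"w": 2.0})

resnet = bundle["resnet18"]("cpu(0)")
mobilenet = bundle["mobilenet"]("gpu(0)")
```

Each model keeps its own runtime-specific API after creation; only the packaging and lookup step is shared.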

tqchen · 2020-03-16T16:22:48Z

Good summary, would be great if we can land it

tqchen · 2020-05-29T21:24:23Z

@FrozenGene can we follow up on this?

FrozenGene · 2020-05-30T07:48:43Z

> @FrozenGene can we follow up on this?

Hi @tqchen, I will start working on it next Monday! Sorry for the delay; I have been busy with other things.

FrozenGene · 2020-06-09T15:00:54Z

Draft PR: #5753. There are still many things to do, but I will keep updating this PR.

tqchen · 2020-07-21T00:39:11Z

Closing for now, #5753 lands the interface, let us open a new thread for the multi model support. Thanks to @FrozenGene !

FrozenGene added the status: RFC label Mar 11, 2020

FrozenGene mentioned this issue Jul 9, 2020

[RUNTIME] Support module based interface runtime #5753

Merged

tqchen closed this as completed Jul 21, 2020
