
[BYOC] [TPAT] [TensorRT] Add the ability to automatically generate TensorRT plugins using TVM #15526

Status: Closed · wants to merge 14 commits

Conversation

@Civitasv (Contributor) commented Aug 11, 2023

TPAT: TVM Plugin Autogen Tool

Disclaimer: This PR is based on Tencent's TPAT.

Purpose: Tencent's TPAT must be used with their TVM fork, BlazerML-tvm, which has not been synchronized with upstream for a long time, and some bugs remain unresolved. In light of these issues, I decided to try integrating it into TVM.

Objective: The primary goal is to offer a clear and user-friendly API.

Architecture

(Architecture diagram omitted.)

Currently, only TensorRT is supported.

In essence, this solution is built upon the Template Engine (Jinja) in Python to create plugin templates for vendor-specific acceleration libraries. It then utilizes TVM for optimization and code generation targeting the respective platforms. The generated code is rendered and filled into the templates. Subsequently, platform-specific build commands are invoked to build the plugins, which ultimately serve as extensions for the corresponding vendor's acceleration library.
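The render-and-fill step described above can be sketched roughly as follows. This is a minimal illustration using the standard-library `string.Template` in place of Jinja (which the PR actually uses); the template text, placeholder names, plugin class name, and kernel string are all hypothetical and not taken from the PR's real templates.

```python
from string import Template

# Hypothetical skeleton of a vendor-specific (TensorRT-style) plugin source,
# with a hole where TVM-generated device code gets pasted in.
PLUGIN_TEMPLATE = Template("""
class ${plugin_name} : public IPluginV2DynamicExt {
  // TVM-generated code for the target platform goes below.
  ${generated_kernel}
};
""")

def render_plugin(plugin_name: str, generated_kernel: str) -> str:
    """Fill TVM's generated code into the plugin template."""
    return PLUGIN_TEMPLATE.substitute(
        plugin_name=plugin_name, generated_kernel=generated_kernel
    )

# Example: render a (made-up) plugin for a OneHot node.
source = render_plugin(
    "TpatOneHot", "__global__ void onehot_kernel(...) { /* ... */ }"
)
```

After rendering, the resulting source would be handed to the platform-specific build command (e.g. nvcc for TensorRT) to produce the plugin library.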

Inputs & Outputs

The entry of TPAT for TensorRT is as follows:

```python
def pipeline(
    onnx_file: str, node_names: list[str], enable_tunning: bool, work_dir: str, output_onnx: str
) -> Tuple[str, list[str]]:
```

This entry point accepts an ONNX file, a list of node names for which plugins should be generated, a flag enabling tuning, a working directory for tuning logs, and the output ONNX file path where the modified model will be stored.

After generating plugins for each node, the function returns the path of the output ONNX file along with a list of paths where the plugins are saved, facilitating subsequent loading.
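A hedged sketch of how a caller might consume the inputs and outputs described above. The `pipeline` body here is a stub standing in for the real entry point (in the PR it would be imported from the TPAT module), and the file names it returns are invented for illustration.

```python
from typing import List, Tuple

def pipeline(onnx_file: str, node_names: List[str], enable_tunning: bool,
             work_dir: str, output_onnx: str) -> Tuple[str, List[str]]:
    # Stub: the real implementation generates a TensorRT plugin per node,
    # rewrites the model to use them, and writes it to `output_onnx`.
    return output_onnx, [f"{work_dir}/tpat_{n}.so" for n in node_names]

output_model, plugin_paths = pipeline(
    "model.onnx", ["OneHot_7"], enable_tunning=False,
    work_dir="./tpat_work", output_onnx="model_tpat.onnx",
)

# Each returned plugin library would then be loaded (e.g. via ctypes.CDLL)
# before building the TensorRT engine, so TensorRT can find the plugins.
for path in plugin_paths:
    print("would load plugin:", path)
```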

TODO

  • Users should have the ability to change tuning options.
  • Add a benchmark section.
  • Currently, the frontend is Relay and the tuning method is MetaSchedule; we should provide a flexible way to support Relax and other tuning methods.
  • Consider potential improvements to the API on the C++ side. Currently I use some global variables and register global functions to retrieve them, which feels like a hack to me. I'm not very familiar with TVM's idiomatic way to do this, so please give me some advice.
  • Explore dynamic batch support. Currently only static batch is supported; the original repo supports dynamic batch, but that implementation is a bit messy, and I believe there is a more elegant way to support this feature.
  • Investigate the generation of QNN plugins for Qualcomm platforms.


@tvm-bot (Collaborator) commented Aug 11, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@Civitasv (Author) commented Aug 11, 2023

cc @tqchen @Hzfengsy @FrozenGene

@Hzfengsy (Member) commented

Thanks, @Civitasv, for this great work! A few notable things:

  1. This improvement is based on Relay, not Relax, so it should be sent to the main branch instead of the unity branch.
  2. It's an awesome and big feature; having an RFC (https://github.com/apache/tvm-rfcs) before the PR would be good.
  3. This PR is a bit large to review; could it be separated into several smaller ones, together with a tracking issue, after the RFC?

@Civitasv (Author) commented

> This is the improvement based on Relay, not Relax, so it should be sent to main branch instead of the unity branch

The final goal is to support both Relay and Relax, but I agree that currently it should be sent to the main branch.

Okay, I will write an RFC.

> This PR is a bit large to review, could it be separated into several small ones, together with a tracking issue after the RFC

I will try to separate it.

@Civitasv (Author) commented

I've already proposed an RFC. See apache/tvm-rfcs#103.

@buptqq commented Aug 14, 2023

> I've already proposed an RFC. See apache/tvm-rfcs#103.

Hi, I am the author of TPAT. If you need any help, you can contact me through this email: qianqiu@tencent.com

@Civitasv (Author) commented

> Hi, I am the author of TPAT. If you need any help, you can contact me through this email: qianqiu@tencent.com

@buptqq Thanks for your great work! It has helped me a lot. If you are still working on this project, could you review the code? I've changed a lot.

@Civitasv (Author) commented

I've improved the code; the workflow should be clear if you've read the RFC. 😄

@buptqq commented Aug 21, 2023

> @buptqq Thanks for your great work! It has helped me a lot. If you are still working on this project, could you review the code? I've changed a lot.

OK, I will review this code.
