
Auto-TensorRT engine compilation, or improved documentation for it #842

Open
fxmarty opened this issue Mar 2, 2023 · 4 comments
Labels
feature-request New feature or request onnxruntime Related to ONNX Runtime

Comments


fxmarty commented Mar 2, 2023

Feature request

For decoder models with cache, it can be painful to compile the TensorRT engine manually, as ONNX Runtime does not expose options to specify the input shapes. The engine build could perhaps be done automatically.

The current documentation only covers use_cache=False, which is not the interesting case. It could be improved to show how to pre-build the TensorRT engine with use_cache=True.

References:
https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/gpu#tensorrt-engine-build-and-warmup
microsoft/onnxruntime#13559
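One workaround today is to let ORT TRT compile the engine during a warmup run and cache it on disk, so later sessions skip the build. A minimal sketch of the provider configuration, assuming onnxruntime-gpu built with TensorRT support (the cache path and model filename are illustrative assumptions, not Optimum's actual export layout):

```python
# Sketch: TensorRT provider options that persist the compiled engine, so a
# single warmup inference with representative shapes builds it once and
# subsequent sessions reuse the cache. Option names are real ORT TRT
# provider options; the cache path is an arbitrary example.
def trt_provider_config(cache_path="./trt_cache"):
    """Build the providers argument for onnxruntime.InferenceSession."""
    options = {
        "trt_engine_cache_enable": True,   # persist compiled engines on disk
        "trt_engine_cache_path": cache_path,
    }
    # Fall back to CUDA for any nodes TensorRT cannot take.
    return [("TensorrtExecutionProvider", options), "CUDAExecutionProvider"]

providers = trt_provider_config()
# Usage (requires onnxruntime-gpu with TensorRT):
#   sess = onnxruntime.InferenceSession("decoder_with_past.onnx", providers=providers)
#   sess.run(None, dummy_inputs)  # first run builds and caches the engine
```

The first inference still pays the full engine-build cost, which is exactly why automating this (including the past-key-value inputs when use_cache=True) would help.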

Motivation

TensorRT offers very good inference performance, so lowering the barrier to using it is worthwhile.

Your contribution

I will work on it at some point.

@fxmarty fxmarty added feature-request New feature or request onnxruntime Related to ONNX Runtime labels Mar 2, 2023

fxmarty commented Apr 21, 2023

microsoft/onnxruntime#13851 will make this much easier

@puyuanOT

Is there any update on this? I also find it difficult to use TensorRT as the provider after running optimum-cli optimization.


chilo-ms commented May 30, 2023

Hi,

ONNX Runtime 1.15 ORT TRT supports explicit input shapes, meaning users can provide the shape range for every dynamic-shape input.
Please see the PR as well as the documentation for usage details.
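For readers landing here, a sketch of what the 1.15 shape-range options look like. The three option names (`trt_profile_min_shapes`, `trt_profile_opt_shapes`, `trt_profile_max_shapes`) come from the ORT TRT documentation; the input names and dimensions below are illustrative assumptions for a decoder model:

```python
# Sketch: building the ORT TRT explicit shape-range provider options.
# The profile format is a comma-separated list of "input_name:d0xd1x...".
def format_trt_profile(shapes):
    """Render {input_name: (d0, d1, ...)} into ORT TRT's profile string,
    e.g. {"input_ids": (1, 1)} -> "input_ids:1x1"."""
    return ",".join(
        f"{name}:{'x'.join(str(d) for d in dims)}" for name, dims in shapes.items()
    )

# Assumed input names/dims for a batch 1-4, sequence length 1-512 decoder.
min_shapes = {"input_ids": (1, 1), "attention_mask": (1, 1)}
max_shapes = {"input_ids": (4, 512), "attention_mask": (4, 512)}

provider_options = {
    "trt_profile_min_shapes": format_trt_profile(min_shapes),
    "trt_profile_opt_shapes": format_trt_profile(min_shapes),
    "trt_profile_max_shapes": format_trt_profile(max_shapes),
}
# Pass as:
#   InferenceSession(model, providers=[("TensorrtExecutionProvider", provider_options)])
```

With these ranges declared up front, the engine can be built once for the whole profile instead of being rebuilt whenever a new shape appears.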

Let us know if you have further questions or other feedback on ORT TRT. We want to make ORT TRT easier to use.


fxmarty commented May 31, 2023

@chilo-ms Thanks a lot, that looks great! It is not at the top of our to-do list for now, but we welcome community contributions to interface TensorrtExecutionProvider well with the ORTModel classes!
