Since 1.15, ORT TRT supports explicit input shapes, meaning users can provide a shape range for every dynamic-shape input.
Please see the PR as well as the doc for usage details.
Let us know if you have further questions or other feedback on ORT TRT. We want to make ORT TRT easier to use.
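As a minimal sketch of what passing explicit shape ranges to the ORT TRT EP looks like: the option names `trt_profile_min_shapes` / `trt_profile_opt_shapes` / `trt_profile_max_shapes` follow the ONNX Runtime TensorRT EP documentation (1.15+), but the input names and dimensions below are illustrative, not tied to any specific model.

```python
# Sketch: explicit shape ranges for the ORT TensorRT EP (ONNX Runtime >= 1.15).
# Option names come from the ORT TRT EP docs; input names/dims are illustrative.

def format_shapes(shapes):
    """Render {input_name: (d0, d1, ...)} as the 'name:d0xd1,...' string
    expected by the TRT EP profile options."""
    return ",".join(
        f"{name}:" + "x".join(str(d) for d in dims)
        for name, dims in shapes.items()
    )

provider_options = {
    "trt_profile_min_shapes": format_shapes({"input_ids": (1, 1), "attention_mask": (1, 1)}),
    "trt_profile_opt_shapes": format_shapes({"input_ids": (1, 128), "attention_mask": (1, 128)}),
    "trt_profile_max_shapes": format_shapes({"input_ids": (4, 512), "attention_mask": (4, 512)}),
    # Cache the built engine so later sessions skip the expensive build step.
    "trt_engine_cache_enable": True,
}

# With onnxruntime-gpu + TensorRT installed, these would be passed as:
#   import onnxruntime as ort
#   sess = ort.InferenceSession(
#       "model.onnx",
#       providers=[("TensorrtExecutionProvider", provider_options)],
#   )
```

The engine is then built once for the whole min/opt/max range instead of being re-built whenever an input shape changes.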
@chilo-ms Thanks a lot, that looks great! It is not at the top of our to-do list for now, but we welcome community contributions to better interface the TensorrtExecutionProvider with the ORTModel classes!
Feature request
For decoder models with cache, it can be painful to manually compile the TensorRT engine, as ONNX Runtime does not expose options to specify shapes. The engine build could perhaps be automated.
The current doc covers only use_cache=False, which is not very interesting. It could be improved to show how to pre-build the TensorRT engine with use_cache=True.
References:
https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/gpu#tensorrt-engine-build-and-warmup
microsoft/onnxruntime#13559
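For the use_cache=True case requested above, the shape profile would also have to cover the past key/value inputs. A hedged sketch of what those profile strings could look like — the input names (`past_key_values.N.key` / `.value`), the layout (batch x heads x past_len x head_dim), and the model config values are all assumptions that depend on how the decoder was exported:

```python
# Sketch: shape-range strings covering KV-cache inputs so a TRT engine could be
# pre-built once for a decoder exported with use_cache=True. Input names and
# dimension layout are illustrative and depend on the ONNX export.

NUM_LAYERS, NUM_HEADS, HEAD_DIM = 2, 12, 64  # hypothetical model config

def profile(batch, past_len):
    # One new token per decoding step, plus one KV pair per layer.
    parts = [f"input_ids:{batch}x1"]
    for layer in range(NUM_LAYERS):
        for kv in ("key", "value"):
            parts.append(
                f"past_key_values.{layer}.{kv}:"
                f"{batch}x{NUM_HEADS}x{past_len}x{HEAD_DIM}"
            )
    return ",".join(parts)

provider_options = {
    "trt_profile_min_shapes": profile(batch=1, past_len=0),
    "trt_profile_opt_shapes": profile(batch=1, past_len=128),
    "trt_profile_max_shapes": profile(batch=4, past_len=512),
}
```

Automating exactly this kind of profile generation from the exported model's input signature is what the feature request is asking for.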
Motivation
TensorRT is fast
Your contribution
Will work on it sometime.