[Feature Request] TensorRT Explicit Engine Profile Values #13851
Comments
+1

Thanks! We're looking into implementing this feature.

Great. This is really important. I had to write some (super hacky) code to find the specific input tensors I required and override the code. NVIDIA

Yes, I think we can do something like that. Our goal is to support more features of trtexec in the coming months.

@seddonm1 @fxmarty FYI, there's a PR in development for the profile override feature you requested.
Hi @jywu-msft
It could be my end and I can investigate further next week. Apart from that, I am concerned about (1) whether the whole graph can be run on TRT.
Hi @seddonm1 If possible, could you share the whole provider options string that you used? From the error message, it seems you have the wrong format for the provider option 'trt_profile_min_shapes'. You can also turn on verbose logging to see the whole log, including the provider options and the parsing of the profile string. Following is the example I tested for these explicit profiles using onnxruntime_perf_test:
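The original example command was not preserved in this thread; the sketch below mirrors the perf_test invocation from the PR description further down, assembled programmatically. The model path is a placeholder, and the `InferenceSession` call (shown commented out) assumes an onnxruntime build with the TensorRT EP:

```python
# Provider options matching the onnxruntime_perf_test flags discussed
# in this thread (shape values taken from the PR description's example).
trt_options = {
    "trt_profile_min_shapes": "imgs:1x3x384x288",
    "trt_profile_max_shapes": "imgs:32x3x384x288",
    "trt_profile_opt_shapes": "imgs:16x3x384x288",
}

# With onnxruntime built against TensorRT, these would be passed as:
# import onnxruntime as ort
# sess = ort.InferenceSession("your_model_path",
#     providers=[("TensorrtExecutionProvider", trt_options)])
print(trt_options["trt_profile_opt_shapes"])
```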
Re: "Apart from that I am concerned about (1) Whole graph can be run on TRT". I think this means that a graph with an ONNX NonMaxSuppression node will not work. This PR actually supports graphs that can be partially run on TRT; could you also try it with your model?
Hi @chilo-ms I have fixed the issue on my side (sorry to waste your time). This is a good change. I have tested with and without an ONNX NonMaxSuppression node and it works correctly. Good work 👍
Good to hear that this PR works for your model. There is a follow-up question on which we would like your opinion: how should these explicit profiles work with the engine cache? Do you find this usage easy? Really appreciate your feedback! [UPDATE 4/26] After internal discussion, we decided not to add
The previous behavior of the TRT EP was to set TRT optimization profiles for dynamic-shape inputs based on input tensor values; users could not explicitly specify the profiles. This PR enables users to specify min/max/opt profiles through three newly added provider options: `trt_profile_min_shapes`, `trt_profile_max_shapes` and `trt_profile_opt_shapes`, with the format "input1:dim1xdim2...,input2:dim3xdim4...". (Note: this is similar to the --minShapes, --maxShapes and --optShapes trtexec command-line [flags](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#trtexec-flags).) For example, if you are using onnxruntime_perf_test, you can try this: `./onnxruntime_perf_test -e tensorrt -r 1 -i "trt_profile_min_shapes|imgs:1x3x384x288 trt_profile_max_shapes|imgs:32x3x384x288 trt_profile_opt_shapes|imgs:16x3x384x288" your_model_path` If the engine cache is enabled, you still need to provide these three explicit provider options in order to use this feature; ORT TRT will compare the min/max/opt profile shapes with the ones saved in the .profile file to decide whether to rebuild the engine. Constraint for using these provider options: (1) min/max/opt profile shapes must be specified for all dynamic-shape inputs. This feature was also requested by other users: #13851
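To make the "input1:dim1xdim2...,input2:dim3xdim4..." format concrete, here is a small illustrative parser. This is a hypothetical sketch, not the actual ORT TRT parsing code; the helper name `parse_profile_shapes` is an assumption:

```python
def parse_profile_shapes(profile_str):
    """Parse a TRT EP profile string such as
    "imgs:1x3x384x288,mask:1x384x288" into
    {"imgs": [1, 3, 384, 288], "mask": [1, 384, 288]}.
    Hypothetical helper for illustration only.
    """
    shapes = {}
    for entry in profile_str.split(","):
        # rsplit guards against ':' appearing in the tensor name.
        name, dims = entry.rsplit(":", 1)
        shapes[name] = [int(d) for d in dims.split("x")]
    return shapes

# Example value from the PR description:
print(parse_profile_shapes("imgs:1x3x384x288"))
```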
Describe the feature request

The current behavior of the TensorRT runtime when using dynamic batch sizes is to use the shape of the incoming tensor to programmatically determine `OptProfileSelector::kMIN`, `OptProfileSelector::kMAX` and `OptProfileSelector::kOPT`. These values are then passed to the TensorRT engine builder to optimize the model, which can be an extremely slow process (observed to take up to ~10 minutes with some models). The problem is that, because ONNX Runtime does not necessarily have the information it needs to build a correctly shaped model, it can end up building the model multiple times.

This proposal (and I am happy to do the work) is to allow a user to optionally pass explicit values for `kMIN`, `kMAX` and `kOPT` in the `tensorrt_provider_options`.

The current algorithm is basically something like:

- If no profile file exists, set `first` to `INT_MAX` and `second` to `-INT_MAX`; else read the two values from the profile file.
- If `first` > the incoming tensor `batch_size`, then set `first` = `batch_size` and indicate the model needs a rebuild.
- If the `second` value < the incoming tensor `batch_size`, then set `second` = `batch_size` and indicate the model needs a rebuild.
- Set `kMIN` = `first`, `kMAX` = `second` and `kOPT` = `second`.
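The algorithm described above can be sketched in a few lines. This is an illustrative reconstruction of the described behavior under assumed names (`first`, `second`, `update_profile`), not the actual TRT EP implementation:

```python
INT_MAX = 2**31 - 1

def update_profile(batch_size, profile=None):
    """Track observed min/max batch sizes as described above.

    `profile` is the (first, second) pair read from the profile file,
    or None when no profile file exists yet. Returns the selector
    values and whether the engine needs a rebuild.
    """
    if profile is None:
        first, second = INT_MAX, -INT_MAX
    else:
        first, second = profile

    needs_rebuild = False
    if first > batch_size:       # new minimum observed
        first = batch_size
        needs_rebuild = True
    if second < batch_size:      # new maximum observed
        second = batch_size
        needs_rebuild = True

    # kMIN = first, kMAX = second, kOPT = second
    return {"kMIN": first, "kMAX": second, "kOPT": second}, needs_rebuild
```

Because the profile only ever widens to cover newly observed batch sizes, any batch size outside the previously seen range forces a rebuild, which is the slow path the proposal aims to avoid.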
Describe scenario use case

This proposal is to bring the ONNX Runtime TensorRT provider into similar behavior to NVIDIA DeepStream, which explicitly sets `kMIN` = 1 and sets `kMAX` and `kOPT` to a user-supplied `batch_size`.

This proposal does not intend to change existing behavior, but to allow a user to override it if they want.
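The DeepStream-style override described above amounts to a fixed profile; a one-line sketch (the function name is an assumption for illustration):

```python
def deepstream_style_profile(batch_size):
    """Fixed profile as described above: kMIN = 1,
    kMAX = kOPT = the user-supplied batch_size."""
    return {"kMIN": 1, "kMAX": batch_size, "kOPT": batch_size}
```

With this scheme the engine is built once for the declared batch range, rather than rebuilt whenever a new batch size falls outside the observed range.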