[MultiModal] Remove MultiModalOnnxPredictor and rename export_tensorrt #2983
Conversation
"tensorrt package is not installed. The package can be install via `pip install tensorrt`." | ||
) | ||
|
||
self.sess = ort.InferenceSession(onnx_model.SerializeToString(), providers=providers) |
Should the `provider` argument of `OnnxModule` be easy-to-remember strings, e.g., `tensorrt_gpu`, `onnx_gpu`, `tensorrt_cpu`, `onnx_cpu`? Then we can map these strings to the more complex providers used by `ort.InferenceSession`.
- There wouldn't be a `tensorrt_cpu`, since TensorRT only serves NVIDIA GPUs.
- As for `onnx_gpu`: AMD also provides a ROCm Execution Provider, which would make an `onnx_gpu` argument for `OnnxModule` confusing.
I would say there is a trade-off between easy-to-remember constants and transparency to the onnxruntime config.
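For illustration, a minimal sketch of the alias mapping proposed above; the alias names and the `resolve_providers` helper are hypothetical, not part of this PR:

```python
import onnxruntime as ort

# Hypothetical aliases -> onnxruntime provider lists; each list already
# encodes a fallback order that ends at the CPU provider.
_PROVIDER_ALIASES = {
    "tensorrt_gpu": ["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"],
    "onnx_gpu": ["CUDAExecutionProvider", "CPUExecutionProvider"],
    "onnx_cpu": ["CPUExecutionProvider"],
}

def resolve_providers(providers):
    """Map a friendly alias to onnxruntime providers; pass raw lists through unchanged."""
    if isinstance(providers, str):
        return _PROVIDER_ALIASES[providers]
    return providers
```

Passing a raw provider list through unchanged would preserve the transparency to the onnxruntime config mentioned above.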
```python
tail_df = dataset.test_df.tail(2)

# Load a fresh predictor and optimize it for inference
for providers in [None, ["TensorrtExecutionProvider"], ["CUDAExecutionProvider"], ["CPUExecutionProvider"]]:
```
Is it better to use `"TensorrtExecutionProvider"` instead of `["TensorrtExecutionProvider"]`?
Good question. The `providers` argument provides a fallback mechanism, so we can always put the preferred backend at the top of the list. If it is the only item in the list, the fallback mechanism won't take effect at all. For instance, if `providers=["TensorrtExecutionProvider"]` and the tensorrt package isn't installed on the system, we would raise an error.
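Concretely (a sketch using the standard onnxruntime API; `model.onnx` is a placeholder path):

```python
import onnxruntime as ort

# Preferred backend first; onnxruntime falls back down the list when a
# provider is unavailable at session-creation time.
sess = ort.InferenceSession(
    "model.onnx",  # placeholder path
    providers=["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(sess.get_providers())  # the providers that were actually selected
```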
"onnxruntime would fallback to CUDAExecutionProvider instead of using TensorrtExecutionProvider." | ||
) | ||
# TODO: Try a better workaround to lazy import tensorrt package. | ||
tensorrt_imported = False |
Seems removable?
Yes, this might be related to the import sequence with the onnxruntime package. But it's hard to ensure onnxruntime wouldn't be imported beforehand.
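A minimal sketch of the lazy-import pattern being discussed, reusing the module-level `tensorrt_imported` flag from the diff above (the `_ensure_tensorrt` helper is hypothetical):

```python
tensorrt_imported = False

def _ensure_tensorrt():
    """Import tensorrt on first use, so its libraries are loaded only when needed."""
    global tensorrt_imported
    if tensorrt_imported:
        return
    try:
        import tensorrt  # noqa: F401  -- importing loads the TensorRT shared libraries
        tensorrt_imported = True
    except ImportError:
        raise ImportError(
            "tensorrt package is not installed. The package can be installed via `pip install tensorrt`."
        )
```

The caveat in the reply still stands: if onnxruntime has already been imported elsewhere, the relative import order can no longer be controlled from here.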
LGTM
Issue #, if available:
Description of changes:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.