
[MultiModal] Remove MultiModalOnnxPredictor and rename export_tensorrt #2983

Merged
merged 9 commits into autogluon:master on Mar 10, 2023

Conversation

liangfu (Collaborator) commented Feb 28, 2023

Issue #, if available:

Description of changes:

  1. Remove MultiModalOnnxPredictor.
  2. Rename export_tensorrt -> optimize_for_inference.
  3. Enable ORT_TENSORRT_ENGINE_CACHE_ENABLE to save engine build time, since TensorRT can take a long time to optimize and build an engine (see the sketch after this list).
  4. Remove the data dependency for optimize_for_inference.
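
For item 3, a minimal sketch of how the TensorRT engine cache can be enabled through onnxruntime's environment variables; the model path and cache directory below are placeholders, and the exact wiring inside optimize_for_inference is not shown:

import os

import onnxruntime as ort

# Enable onnxruntime's TensorRT engine cache so a previously built engine is
# reused on later runs instead of being rebuilt from scratch.
os.environ["ORT_TENSORRT_ENGINE_CACHE_ENABLE"] = "1"
os.environ["ORT_TENSORRT_CACHE_PATH"] = "/tmp/trt_cache"  # placeholder cache directory

# Placeholder ONNX model path; the real model is produced by the exporter.
sess = ort.InferenceSession(
    "model.onnx",
    providers=["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"],
)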

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@liangfu liangfu marked this pull request as ready for review March 4, 2023 03:01
@liangfu liangfu added the model list checked You have updated the model list after modifying multimodal unit tests/docs label Mar 4, 2023
@liangfu liangfu changed the title [MultiModal] Remove MultiModalOnnxPredictor and unify export_onnx and export_tensorrt [MultiModal] Remove MultiModalOnnxPredictor and rename export_tensorrt Mar 4, 2023
github-actions bot commented Mar 6, 2023

Job PR-2983-9c30959 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2983/9c30959/index.html

github-actions bot commented Mar 8, 2023

Job PR-2983-9865cad is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2983/9865cad/index.html

github-actions bot commented Mar 9, 2023

Job PR-2983-60187f7 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2983/60187f7/index.html

"tensorrt package is not installed. The package can be install via `pip install tensorrt`."
)

self.sess = ort.InferenceSession(onnx_model.SerializeToString(), providers=providers)
Contributor commented:
Should the provider argument of OnnxModule be easy-to-remember strings, e.g., tensorrt_gpu, onnx_gpu, tensorrt_cpu, onnx_cpu? Then we could map these strings to the more complex provider names used by ort.InferenceSession.

liangfu (Collaborator, Author) commented Mar 10, 2023:
  1. There wouldn't be a tensorrt_cpu, since TensorRT only targets NVIDIA GPUs.
  2. Regarding onnx_gpu for OnnxModule: AMD also provides a ROCm Execution Provider, which would make an onnx_gpu argument for OnnxModule confusing.

liangfu (Collaborator, Author) commented:
I would say there is a trade-off between easy-to-remember constants and transparency with respect to the onnxruntime configuration.
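
For illustration only, a minimal sketch of the alias mapping the reviewer suggests; the names tensorrt_gpu, onnx_gpu, and onnx_cpu are hypothetical and were not adopted in this PR:

# Hypothetical short names mapped to explicit onnxruntime provider lists,
# each keeping CPUExecutionProvider as the final fallback.
_PROVIDER_ALIASES = {
    "tensorrt_gpu": ["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"],
    "onnx_gpu": ["CUDAExecutionProvider", "CPUExecutionProvider"],
    "onnx_cpu": ["CPUExecutionProvider"],
}


def resolve_providers(providers):
    """Translate a short alias into the provider list expected by ort.InferenceSession."""
    if isinstance(providers, str):
        return _PROVIDER_ALIASES[providers]
    return providers  # already an explicit provider list; pass through unchanged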

tail_df = dataset.test_df.tail(2)

# Load a fresh predictor and optimize it for inference
for providers in [None, ["TensorrtExecutionProvider"], ["CUDAExecutionProvider"], ["CPUExecutionProvider"]]:
Contributor commented:
Is it better to use "TensorrtExecutionProvider" instead of ["TensorrtExecutionProvider"]?

liangfu (Collaborator, Author) commented:
Good question.

The providers argument provides a fallback mechanism, so we can always put the preferred backend at the top of the list. If it is the only item in the list, the fallback mechanism won't take effect at all. For instance, if providers=["TensorrtExecutionProvider"] and the tensorrt package isn't installed on the system, we would raise an error.
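
A minimal sketch of the fallback behavior described above; onnx_path is a placeholder:

import onnxruntime as ort

onnx_path = "model.onnx"  # placeholder path to an exported model

# The preferred provider goes first; onnxruntime falls back to the next entry
# if a provider is not available on the current machine.
providers = ["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"]
sess = ort.InferenceSession(onnx_path, providers=providers)
print(sess.get_providers())  # the providers that were actually registered

# With a single-item list there is nothing to fall back to, so a missing
# TensorRT installation has to be surfaced as an explicit error instead.
if "TensorrtExecutionProvider" not in ort.get_available_providers():
    raise RuntimeError("TensorrtExecutionProvider is not available; install the tensorrt package.")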

"onnxruntime would fallback to CUDAExecutionProvider instead of using TensorrtExecutionProvider."
)
# TODO: Try a better workaround to lazy import tensorrt package.
tensorrt_imported = False
Contributor commented:
Seems removable?

liangfu (Collaborator, Author) commented:
Yes, this might be related to the import order with the onnxruntime package. But it's hard to ensure that onnxruntime hasn't already been imported beforehand.
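
A minimal sketch of the lazy-import workaround being discussed; whether this reliably controls the import order relative to onnxruntime is exactly the open question above:

tensorrt_imported = False


def _lazy_import_tensorrt():
    """Import tensorrt only when the TensorRT execution provider is actually requested."""
    global tensorrt_imported
    if tensorrt_imported:
        return
    try:
        # Importing tensorrt makes its shared libraries visible to onnxruntime.
        import tensorrt  # noqa: F401
    except ImportError:
        raise ImportError(
            "tensorrt package is not installed. The package can be installed via `pip install tensorrt`."
        )
    tensorrt_imported = True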

github-actions bot commented:

Job PR-2983-b3f7a19 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2983/b3f7a19/index.html

zhiqiangdon (Contributor) left a comment:

LGTM

@liangfu liangfu merged commit 0c33a4c into autogluon:master Mar 10, 2023
@liangfu liangfu deleted the remove-onnx-predictor-1 branch March 10, 2023 18:47
gradientsky pushed a commit to gradientsky/autogluon that referenced this pull request Mar 10, 2023