fix: Bugfix in TRT Engine deserialization indexing #1646

gs-olive · 2023-02-02T23:34:49Z

Description

The IO tensors and bindings within the TensorRT ICudaEngine object are not necessarily stored in index order, nor in the order in which PyTorch stores them. As a result, the check in TRTEngine that extracts the TRT binding name by index and compares it to the Torch binding name is unnecessary and, in general, incorrect, and it can be improved.

As an example, consider an engine with two inputs {“input_0”, “input_1”} and two outputs {“output_0”, “output_1”}. The Torch binding names (c10::NameList) store these as:

[“input_0”, “input_1”, “output_0”, “output_1”]

The TRT Engine binding buffer stores these names as:

[“input_0”, “input_1”, “output_1”, “output_0”]

Thus, when we use direct indexing to access the binding names, an error is encountered, despite the fact that the overall set of IO tensors is the same.
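To make the failure mode concrete, here is a minimal sketch that uses plain `std::vector<std::string>` lists as stand-ins for the two name lists. The type alias and both function names are hypothetical and are not part of Torch-TensorRT or TensorRT; the sketch only contrasts a positional comparison with an order-independent one.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the two name lists described above;
// this is a plain vector, not a TensorRT or c10 type.
using NameList = std::vector<std::string>;

// Positional check: passes only if both lists agree at every index.
bool names_match_by_position(const NameList& torch_names, const NameList& trt_names) {
  return torch_names == trt_names;
}

// Order-independent check: passes if both lists hold the same names.
// The arguments are taken by value so the callers' lists are not reordered.
bool names_match_as_set(NameList torch_names, NameList trt_names) {
  std::sort(torch_names.begin(), torch_names.end());
  std::sort(trt_names.begin(), trt_names.end());
  return torch_names == trt_names;
}
```

With the two lists from the example, `names_match_by_position` returns false because the last two entries are swapped, while `names_match_as_set` returns true.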

- Fix bug causing crash when loading serialized TRT Engines through Torch-TRT for models with multiple outputs
- Improve TRT Engine binding verification by not assuming sorted indexing in binding order
- Improve check in TRTEngine.cpp for existence of binding index

Fixes #1550
Fixes #1645

Type of change

Bug fix (non-breaking change which fixes an issue)

Checklist:

[ x ] My code follows the style guidelines of this project (You can use the linters)
[ x ] I have performed a self-review of my own code
[ x ] I have commented my code, particularly in hard-to-understand areas and hacks
[ x ] I have made corresponding changes to the documentation
[ ~ ] I have added tests to verify my fix or my feature
- Verified locally on multiple user test cases, listed under "Fixes" above
[ x ] New and existing unit tests pass locally with my changes
[ x ] I have added the relevant labels to my PR so that relevant reviewers are notified

gs-olive · 2023-02-02T23:36:27Z

core/runtime/TRTEngine.cpp

      auto trt_idx = cuda_engine->getBindingIndex(binding_name.c_str());
-      std::string engine_binded_name = cuda_engine->getIOTensorName(inputs_size + pyt_idx);


The original issue occurs here: in general, trt_idx != inputs_size + pyt_idx (for example, when the TRT Engine has multiple output tensors).
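As an illustration of resolving indices by name rather than by position, the sketch below maps each Torch-side position to an engine-side index and reports failure when a name is missing. It is not the code in this PR: `find_binding_index` and `resolve_binding_indices` are hypothetical helpers, and the engine is modeled as a plain list of names. The only behavior borrowed from TensorRT is that `ICudaEngine::getBindingIndex` returns -1 for an unknown name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the engine's name lookup. Like
// ICudaEngine::getBindingIndex, it returns -1 for an unknown name.
int find_binding_index(const std::vector<std::string>& engine_names, const std::string& name) {
  for (std::size_t i = 0; i < engine_names.size(); ++i) {
    if (engine_names[i] == name) {
      return static_cast<int>(i);
    }
  }
  return -1;
}

// For each Torch-side position pyt_idx, store the engine-side index of the
// binding with the same name. Returns false if any name is absent.
bool resolve_binding_indices(const std::vector<std::string>& torch_names,
                             const std::vector<std::string>& engine_names,
                             std::vector<int>& trt_indices) {
  trt_indices.clear();
  trt_indices.reserve(torch_names.size());
  for (const auto& name : torch_names) {
    const int trt_idx = find_binding_index(engine_names, name);
    if (trt_idx < 0) {
      return false;  // the binding does not exist in the engine
    }
    trt_indices.push_back(trt_idx);
  }
  return true;
}
```

For the binding orders in the description, the resolved indices are {0, 1, 3, 2}: the two outputs map to swapped engine positions, which is exactly the case where a positional lookup picks the wrong name.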

narendasan

LGTM

gs-olive requested review from narendasan and peri044 February 2, 2023 23:34

gs-olive self-assigned this Feb 2, 2023

facebook-github-bot added the cla signed label Feb 2, 2023

github-actions bot added the component: core and component: runtime labels

github-actions bot requested a review from bowang007 February 2, 2023 23:35

narendasan approved these changes Feb 3, 2023

narendasan merged commit d638730 into pytorch:main Feb 3, 2023
