fix: Bugfix in TRT Engine deserialization indexing #1646

Merged: 1 commit, Feb 3, 2023

Conversation

gs-olive
Collaborator

@gs-olive gs-olive commented Feb 2, 2023

Description

The IO tensors and bindings within the TensorRT ICudaEngine object are not necessarily stored in index order, nor in the order in which PyTorch stores them. As a result, the check in TRTEngine that extracts the TRT binding name by index and compares it to the Torch binding name is incorrect and can be improved.

As an example, consider an engine with two inputs: {“input_0”, “input_1”} and two outputs {“output_0”, “output_1”}. The Torch binding names (c10::NameList) store these as:

[“input_0”, “input_1”, “output_0”, “output_1”]

The TRT Engine binding buffer stores these names as:

[“input_0”, “input_1”, “output_1”, “output_0”]

Thus, when we use direct indexing to access the binding names, an error is encountered, despite the fact that the overall set of IO tensors is the same.
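The mismatch in the example above can be sketched in isolation. This is a minimal illustration using only the standard library, not the code from this PR: `same_order` and `same_names` are hypothetical helpers, and the two name lists are passed in as plain vectors rather than read from a real engine.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: compares the two lists position by position.
// This mirrors the assumption that both sides use the same ordering.
bool same_order(const std::vector<std::string>& torch_names,
                const std::vector<std::string>& trt_names) {
  return torch_names == trt_names;
}

// Hypothetical helper: compares the two lists regardless of ordering.
// Both arguments are taken by value so the caller's vectors stay untouched.
bool same_names(std::vector<std::string> torch_names,
                std::vector<std::string> trt_names) {
  std::sort(torch_names.begin(), torch_names.end());
  std::sort(trt_names.begin(), trt_names.end());
  return torch_names == trt_names;
}
```

With the two lists from the example, `same_order` reports a mismatch while `same_names` reports that both sides hold the same IO tensors, which is the distinction the verification needs to make.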

  • Fix bug causing crash when loading serialized TRT Engines through Torch-TRT for models with multiple outputs
  • Improve TRT Engine binding verification by not assuming sorted indexing in binding order
  • Improve check in TRTEngine.cpp for existence of binding index

Fixes #1550
Fixes #1645

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • [ x ] My code follows the style guidelines of this project (You can use the linters)
  • [ x ] I have performed a self-review of my own code
  • [ x ] I have commented my code, particularly in hard-to-understand areas and hacks
  • [ x ] I have made corresponding changes to the documentation
  • [ ~ ] I have added tests to verify my fix or my feature
    • Verified locally on multiple user test cases, listed under "Fixes" above
  • [ x ] New and existing unit tests pass locally with my changes
  • [ x ] I have added the relevant labels to my PR so that relevant reviewers are notified

Comment on lines 130 to -131
auto trt_idx = cuda_engine->getBindingIndex(binding_name.c_str());
std::string engine_binded_name = cuda_engine->getIOTensorName(inputs_size + pyt_idx);
Collaborator Author

@gs-olive gs-olive Feb 2, 2023

Original issue occurs here, since trt_idx != inputs_size + pyt_idx in general (for example, when there are multiple output tensors from the TRT Engine)
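A name-based lookup avoids that assumption. The sketch below is illustrative only: `MockEngine` and `map_outputs` are hypothetical stand-ins written for this example, not the TensorRT API and not the change made in this PR. It assumes a lookup that returns -1 for an unknown name, and treats any missing name as a verification failure.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the engine's binding table; not the TensorRT API.
struct MockEngine {
  std::vector<std::string> bindings;

  // Returns the position of `name`, or -1 if the engine has no such binding.
  int binding_index(const std::string& name) const {
    for (std::size_t i = 0; i < bindings.size(); ++i) {
      if (bindings[i] == name) return static_cast<int>(i);
    }
    return -1;
  }
};

// Maps each Torch-side output name to its engine-side index by name,
// without assuming the engine lists outputs in the same order.
// Returns an empty vector if any name is missing from the engine.
std::vector<int> map_outputs(const MockEngine& engine,
                             const std::vector<std::string>& torch_outputs) {
  std::vector<int> trt_indices;
  for (const auto& name : torch_outputs) {
    const int trt_idx = engine.binding_index(name);
    if (trt_idx < 0) return {};  // binding does not exist in the engine
    trt_indices.push_back(trt_idx);
  }
  return trt_indices;
}
```

For an engine holding ["input_0", "input_1", "output_1", "output_0"], the Torch output "output_0" (pyt_idx 0, inputs_size 2) maps to engine index 3 rather than 2, so the positional formula and the name lookup disagree.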

Collaborator

@narendasan narendasan left a comment

LGTM

@narendasan narendasan merged commit d638730 into pytorch:main Feb 3, 2023