TF-TRT Dynamic Shapes Feature Tracker #45481
Comments
Imported from GitHub PR #46382

This PR adds explicit batch and dynamic shape mode tests to ConvertConcat. Tagging @bixia1 for review and @DEKHTIARJonathan for visibility. Tracker: #45481

Copybara import of the project:

-- 40508ef by Tamas Bela Feher <tfeher@nvidia.com>:

TF-TRT Test ConvertConcat in dynamic shape mode

COPYBARA_INTEGRATE_REVIEW=#46382 from tfeher:trt_concat_dynamic 40508ef
PiperOrigin-RevId: 351863997
Change-Id: I65b51b9aaba5301665687a9c730945d25e657676
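The property such a test exercises can be illustrated outside TF-TRT: in dynamic shape mode the same concat operation must work for every batch size without rebuilding the engine. A minimal numpy sketch of that shape-parameterized check (the function name is hypothetical; the real tests live in the TF-TRT converter test suite):

```python
import numpy as np

def check_concat_over_batch_sizes(batch_sizes, axis=1):
    """Mimic a dynamic-shape test: the same concat logic must hold
    for every batch size, with no per-shape rebuild."""
    for b in batch_sizes:
        a = np.ones((b, 3), dtype=np.float32)
        c = np.zeros((b, 5), dtype=np.float32)
        out = np.concatenate([a, c], axis=axis)
        # Non-batch dims are concatenated; the batch dim passes through.
        assert out.shape == (b, 8)
    return True
```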
Hi tfeher, about 'Networks where the first (batch) dimension of a tensor changes within the graph (e.g. BERT)':

Some networks perform reshape operations that change the first dimension. This is done internally to express some operations more conveniently; the output is usually reshaped again to restore the expected batch size. An example is the BERT TF1 model. The dynamic shape feature of TF-TRT improves the TRT conversion of such networks.
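A typical instance of this pattern, sketched with hypothetical shapes in the spirit of BERT's multi-head attention: the batch dimension is folded together with the head dimension internally, then restored at the output.

```python
import numpy as np

batch, seq_len, num_heads, head_dim = 2, 4, 3, 5

x = np.zeros((batch, seq_len, num_heads * head_dim), dtype=np.float32)

# Internal reshape changes the first (batch) dimension ...
y = x.reshape(batch, seq_len, num_heads, head_dim)
y = y.transpose(0, 2, 1, 3).reshape(batch * num_heads, seq_len, head_dim)
assert y.shape[0] == batch * num_heads  # first dim is no longer the batch size

# ... and the output is reshaped back to the expected batch size.
z = y.reshape(batch, num_heads, seq_len, head_dim).transpose(0, 2, 1, 3)
z = z.reshape(batch, seq_len, num_heads * head_dim)
assert z.shape[0] == batch
```

In implicit batch mode the batch dimension must stay fixed through the graph, so such a subgraph cannot be converted; dynamic shape mode lifts that restriction.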
Hi @tfeher, can TF-TRT dynamic_batch now support the situation where the graph is divided into multiple TRT subgraphs?
Yes, dynamic shape mode supports graphs that have multiple TRT subgraphs. There is a known issue tensorflow/tensorrt#251, which occurs if the trt_engine_op tries to output shape tensors. Otherwise it should work.
You only need to set the shape information for the inputs of the model. From these we calculate the size of any tensor in the graph, and set the input shape information for the internal TRT subgraphs.
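The propagation described above can be sketched as a tiny shape-inference pass (a simplified illustration with hypothetical helpers; the real logic lives in TensorFlow's shape inference and TF-TRT's segment handling):

```python
def infer_shapes(input_shapes, ops):
    """Propagate shapes from the model inputs through a linear op list.
    Each op is (name, source_tensor, shape_fn) where shape_fn maps the
    source shape to the output shape. None marks a dynamic dimension."""
    shapes = dict(input_shapes)
    for name, src, shape_fn in ops:
        shapes[name] = shape_fn(shapes[src])
    return shapes

# Only the model input shape is given; internal shapes are derived from it.
shapes = infer_shapes(
    {"input": (None, 8)},  # dynamic batch dimension
    [
        ("matmul", "input", lambda s: (s[0], 16)),  # 8x16 weight matrix
        ("relu", "matmul", lambda s: s),            # elementwise, shape-preserving
    ],
)
```

Every internal TRT subgraph then receives its input shape bounds from this propagated information rather than from the user.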
PR for Conv2dBackpropInput: #51468
Introduction
The dynamic shape mode in TF-TRT utilizes TensorRT's dynamic shape feature to improve the conversion rate of networks and to handle networks with unknown input shapes efficiently. This issue tracks the ongoing development to enable TRT's dynamic shape mode through TF-TRT.
Who will benefit from this feature?
The conversion rate, and therefore the performance, will improve for the following inference problems:
Additionally, the memory usage will improve: handling input tensors with different shapes (e.g. image size, sequence length) currently requires creating a separate TRT engine for each input shape. With dynamic shape mode, a single engine can handle various input shapes.
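The memory argument can be sketched with a toy engine cache (hypothetical function, not the TF-TRT implementation): in static mode one engine is built per distinct input shape, while one dynamic-shape engine with wide enough profile bounds serves them all.

```python
def engines_needed(input_shapes, dynamic=False, bounds=None):
    """Count how many engines a toy cache would build for 1-D shapes."""
    if dynamic:
        lo, hi = bounds
        # A single engine covers every shape inside the profile bounds.
        assert all(lo <= s <= hi for s in input_shapes)
        return 1
    # Static mode: one engine per distinct input shape.
    return len(set(input_shapes))

image_sizes = [224, 256, 299, 224, 512]
static_engines = engines_needed(image_sizes)
dynamic_engines = engines_needed(image_sizes, dynamic=True, bounds=(224, 512))
```

Fewer engines means less device memory held by cached engines, at the cost of potentially less shape-specialized kernels.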
Will this change the current api? How?
Some changes in the conversion parameters will be necessary to enable/disable dynamic shape mode and to provide a way to select optimization profiles.
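Conceptually, an optimization profile gives per-input min/max bounds, and at runtime the engine uses a profile whose bounds contain the actual input shape. A simplified selection sketch (hypothetical helper; the real mechanism is TensorRT's IOptimizationProfile together with TF-TRT's profile strategies):

```python
def select_profile(profiles, shape):
    """Return the index of the first profile whose [min, max] bounds
    contain the runtime shape, or -1 if no profile matches."""
    for i, (mins, maxs) in enumerate(profiles):
        if all(lo <= d <= hi for lo, d, hi in zip(mins, shape, maxs)):
            return i
    return -1

# Two profiles for (batch, height, width) inputs.
profiles = [
    ((1, 224, 224), (8, 256, 256)),   # small images, small batches
    ((1, 257, 257), (32, 512, 512)),  # larger images, larger batches
]
```

For example, `select_profile(profiles, (4, 224, 224))` matches the first profile, while a shape outside all bounds returns -1 and would require building a new engine or falling back to native TF execution.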
Phase 1
The first phase of this work is the basic scaffolding to enable the TF-TRT converter to use TRT's dynamic shape API.
Phase 2
Enable dynamic shape mode for ops used in MobileNet, ResNet, and BERT. This includes improvements in the converters plus increasing their unit test coverage. Note that at this stage dynamic shape mode is still experimental.
Phase 3
This is a direct continuation of Phase 2. Ensure that all op converters support dynamic shape mode, and test them. We have almost 20 converters to update and test. The bulk of this work is improving the test coverage.
Phase 3+
Some converters in phase 3 were updated only for explicit batch support with static shape. Enable dynamic shape mode for them:
Additionally:
[TF:TRT] Always set binding dimension #52181
[TF:TRT] Create execution context with device memory if shape output is present in TRT 7 #52186
Phase 4
Implement calibration in dynamic shape mode. With TRT 7.1, calibration can be run in dynamic shape mode.
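INT8 calibration collects activation statistics while running representative inputs; in dynamic shape mode those calibration batches may vary in shape. A minimal sketch of range collection (illustrative only: TensorRT's entropy calibrator is more sophisticated than a plain running max):

```python
import numpy as np

def calibrate(batches):
    """Track the maximum absolute activation value over calibration
    batches of varying shapes, then derive an int8 quantization scale."""
    amax = 0.0
    for batch in batches:
        amax = max(amax, float(np.abs(batch).max()))
    return amax / 127.0  # scale so that amax maps to the int8 maximum

# Calibration batches with different sequence lengths.
batches = [np.linspace(-3.0, 3.0, n).reshape(1, n) for n in (8, 16, 32)]
scale = calibrate(batches)
```

The point is that the collected ranges, and hence the scale, are shape-independent, so one calibrated engine can serve all the profiled shapes.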
Phase 5
Test the performance of dynamic shape mode.
Phase 6
Define an API to enable dynamic shape mode and to specify optimization profiles.
Phase 7
Optional elements from Phase 6 are moved here.
Tagging @DEKHTIARJonathan and @bixia1