
TF-TRT Dynamic Shapes Feature Tracker #45481

Open
tfeher opened this issue Dec 8, 2020 · 5 comments
Assignees
Labels
comp:gpu:tensorrt Issues specific to TensorRT stat:awaiting tensorflower Status - Awaiting response from tensorflower type:feature Feature requests

Comments

@tfeher
Contributor

tfeher commented Dec 8, 2020

Introduction

The dynamic shape mode in TF-TRT uses TensorRT's dynamic shape feature to improve the conversion rate of networks and to handle networks with unknown input shapes efficiently. This issue tracks the ongoing development to enable TRT's dynamic shape mode through TF-TRT.

Who will benefit from this feature?
The conversion rate, and therefore the performance, will improve for the following inference problems:

  • Networks with unknown input shapes (e.g. fully convolutional object detection networks)
  • Networks where the first (batch) dimension of a tensor changes within the graph (e.g. BERT)
  • Networks that have subgraphs where the tensors have non-identical first dimensions

Additionally, memory usage will improve: handling input tensors with different shapes (e.g. image size, sequence length) currently requires creating a separate TRT engine for each shape, whereas in dynamic shape mode a single engine can handle various input shapes.
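To illustrate the memory argument, here is a minimal Python sketch (not the TF-TRT implementation; the class names are invented for illustration) contrasting a static-shape engine cache, which grows with every distinct input shape, with a single dynamic-shape engine whose optimization profile covers a shape range:

```python
class StaticEngineCache:
    """One engine per exact input shape (static shape mode)."""
    def __init__(self):
        self.engines = {}

    def get_engine(self, shape):
        # A new engine is built (and memory allocated) for every new shape.
        if shape not in self.engines:
            self.engines[shape] = f"engine{shape}"
        return self.engines[shape]


class DynamicEngine:
    """One engine whose profile accepts any shape in [min_shape, max_shape]."""
    def __init__(self, min_shape, max_shape):
        self.min_shape, self.max_shape = min_shape, max_shape

    def accepts(self, shape):
        return all(lo <= d <= hi for lo, d, hi in
                   zip(self.min_shape, shape, self.max_shape))


static = StaticEngineCache()
for seq_len in (64, 128, 256):
    static.get_engine((1, seq_len))
print(len(static.engines))   # 3 engines for 3 sequence lengths

dynamic = DynamicEngine(min_shape=(1, 64), max_shape=(8, 256))
print(all(dynamic.accepts((1, s)) for s in (64, 128, 256)))  # True
```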

Will this change the current API? How?

Some changes to the conversion parameters will be necessary to enable/disable dynamic shape mode and to provide a way to select optimization profiles.
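As a rough illustration of what such parameters could look like, here is a hypothetical sketch; the names (`use_dynamic_shape`, `profiles`) are illustrative assumptions, not the final TF-TRT API:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Shape = Tuple[int, ...]

@dataclass
class DynamicShapeParams:
    """Hypothetical conversion parameters for dynamic shape mode."""
    use_dynamic_shape: bool = False
    # Optimization profiles: (min, opt, max) shapes per input tensor.
    profiles: List[Tuple[Shape, Shape, Shape]] = field(default_factory=list)

params = DynamicShapeParams(
    use_dynamic_shape=True,
    profiles=[((1, 64), (4, 128), (8, 256))],  # min, opt, max for one input
)
print(params.use_dynamic_shape)  # True
```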

Phase 1

The first phase of this work is the basic scaffolding that enables the TF-TRT converter to use TRT's dynamic shape API.

  1. Add implicit batch experimental #34293
  2. Enable TF-TRT explicit batch mode #36379
  3. Improve TensorRT binding index query #36434
  4. Add TensorRT binding size dimension specification #36435
  5. Define TensorRT network with dynamic shapes #36439
  6. Add TensorRT optimization profiles #36660
  7. Execution context management for TensorRT profiles #36664
  8. TensorRT profile generation mode #36729
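Items 6 and 7 above revolve around optimization profiles and choosing an execution context at run time. A minimal sketch of that selection logic (illustrative only, not the actual TF-TRT code): each profile defines a [min, max] range per dimension, and the engine must pick an execution context whose profile covers the actual input shape.

```python
def profile_covers(profile, shape):
    """True if every dimension of `shape` lies in the profile's range."""
    mins, maxs = profile
    return len(mins) == len(shape) and all(
        lo <= d <= hi for lo, d, hi in zip(mins, shape, maxs))

def select_profile(profiles, shape):
    for idx, profile in enumerate(profiles):
        if profile_covers(profile, shape):
            return idx      # index of the execution context to use
    return -1               # no profile matches: a new engine is needed

profiles = [((1, 224, 224, 3), (8, 224, 224, 3)),   # fixed image size
            ((1, 64, 64, 3), (8, 1024, 1024, 3))]   # variable image size
print(select_profile(profiles, (4, 224, 224, 3)))   # 0
print(select_profile(profiles, (2, 512, 512, 3)))   # 1
print(select_profile(profiles, (16, 512, 512, 3)))  # -1
```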

Phase 2

Enable dynamic shape mode for the ops used in MobileNet, ResNet, and BERT. This includes improvements to the converters as well as increased unit test coverage. Note that at this stage dynamic shape mode is still experimental.

Phase 3

This is a direct continuation of Phase 2. Ensure that all op converters support dynamic shape mode, and test them. We have almost 20 converters to update and test; the bulk of this work is improving the test coverage.

Phase 3+

Some converters in Phase 3 were updated only for explicit batch support with static shapes; dynamic shape mode still needs to be enabled for them.

Phase 4

Implement calibration in dynamic shape mode. Starting with TRT 7.1, calibration can be run in dynamic shape mode.

  • TRTEngineOp: allow profile collection before calibration
  • Refine APIs: build mode + calibration + lazy calibration
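The bullets above suggest a two-step flow: collect shape profiles first, then calibrate. A hypothetical sketch of that ordering constraint (the `EngineBuilder` class and its methods are invented for illustration, not the TF-TRT API):

```python
class EngineBuilder:
    """Hypothetical builder that must see shape profiles before calibrating."""
    def __init__(self):
        self.profile_shapes = []
        self.calibrated = False

    def collect_profiles(self, input_shapes):
        # Profile collection pass: record the input shapes seen at run time.
        self.profile_shapes.extend(input_shapes)

    def calibrate(self):
        # Calibration only makes sense once the shape profiles are known.
        if not self.profile_shapes:
            raise RuntimeError("collect profiles before calibrating")
        self.calibrated = True

builder = EngineBuilder()
builder.collect_profiles([(1, 224, 224, 3), (8, 224, 224, 3)])
builder.calibrate()
print(builder.calibrated)  # True
```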

Phase 5

Test the performance of dynamic shape mode.

Phase 6

Define an API to enable dynamic shape mode and specify optimization profiles.

Phase 7

  • C++ conversion API: TF-TRT C++ conversion #52012
    Optional elements from Phase 6 are moved here.
  • Implement the UserDefined profile
  • Change the default conversion param from implicit batch mode to dynamic shape mode

Tagging @DEKHTIARJonathan and @bixia1

@tfeher tfeher added the type:feature Feature requests label Dec 8, 2020
@Saduf2019 Saduf2019 added the comp:gpu:tensorrt Issues specific to TensorRT label Dec 9, 2020
@Saduf2019 Saduf2019 assigned jvishnuvardhan and unassigned Saduf2019 Dec 9, 2020
@jvishnuvardhan jvishnuvardhan added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Dec 10, 2020
copybara-service bot pushed a commit that referenced this issue Jan 14, 2021
Imported from GitHub PR #46382

This PR adds explicit batch and dynamic shape mode tests to ConvertConcat.

Tagging @bixia1 for review and @DEKHTIARJonathan for visibility.

Tracker: #45481
Copybara import of the project:

--
40508ef by Tamas Bela Feher <tfeher@nvidia.com>:

TF-TRT Test ConvertConcat in dynamic shape mode

COPYBARA_INTEGRATE_REVIEW=#46382 from tfeher:trt_concat_dynamic 40508ef
PiperOrigin-RevId: 351863997
Change-Id: I65b51b9aaba5301665687a9c730945d25e657676
@DrXuQian

DrXuQian commented Jul 8, 2021

Hi tfeher, about "Networks where the first (batch) dimension of a tensor changes within the graph (e.g. BERT)":
Why would the first dimension change within the graph? I would assume that once the batch size is determined, the shape won't change throughout the BERT graph.

@tfeher
Contributor Author

tfeher commented Jul 9, 2021

Why would the first dimension change within the graph?

Some networks perform reshape operations that change the first dimension. This is done internally to express some operations more conveniently; the output is usually reshaped again to have the expected batch size. An example is the BERT TF1 model.

The dynamic shape feature of TF-TRT improves the TRT conversion of such networks.
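The reshape pattern described above can be sketched with numpy (an illustrative toy, not the actual BERT graph):

```python
import numpy as np

batch, seq_len, hidden = 2, 4, 8
x = np.zeros((batch, seq_len, hidden))

# Internally, the batch and sequence dims are folded together so that an
# operation can be expressed over a 2-D tensor.
folded = x.reshape(batch * seq_len, hidden)   # first dim changes: 2 -> 8
print(folded.shape)                           # (8, 8)

# The output is reshaped again to restore the expected batch size.
restored = folded.reshape(batch, seq_len, hidden)
print(restored.shape)                         # (2, 4, 8)
```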

@jiweibo

jiweibo commented Jul 29, 2021

Hi @tfeher, can TF-TRT dynamic shape mode support a graph that is divided into multiple TRT subgraphs?
And how should we set the min, max, and opt shape information for the internal subgraphs?


@tfeher
Contributor Author

tfeher commented Jul 29, 2021

can tf-trt dynamic_batch support the situation of being divided into multiple trt subgraphs now?

Yes, dynamic shape mode supports graphs that have multiple TRT subgraphs. There is a known issue (tensorflow/tensorrt#251) which occurs if a trt_engine_op tries to output shape tensors; otherwise it should work.

And how should we set the shape information of the min, max, and opt of the internal subgraph?

You only need to set the shape information for the inputs of the model. From these, we calculate the shape of every tensor in the graph and set the input shape information for the internal TRT subgraphs (trt_engine_ops).
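The idea can be sketched as follows (illustrative only; the ops and shapes are made up, not TF-TRT internals): shape inference propagates the user-supplied min/max input shapes through the graph, so internal subgraphs get their shape ranges automatically.

```python
def conv_shape(shape, out_channels):
    # Same-padded conv: spatial dims preserved, channel dim replaced.
    n, h, w, _ = shape
    return (n, h, w, out_channels)

def pool_shape(shape, stride=2):
    n, h, w, c = shape
    return (n, h // stride, w // stride, c)

def propagate(input_shape):
    """Derive the shape seen by a hypothetical internal subgraph."""
    s = conv_shape(input_shape, out_channels=32)
    return pool_shape(s)

# Only the model input range is specified by the user ...
min_in, max_in = (1, 64, 64, 3), (8, 512, 512, 3)
# ... and the internal subgraph's range follows by propagation.
print(propagate(min_in))   # (1, 32, 32, 32)
print(propagate(max_in))   # (8, 256, 256, 32)
```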

@christopherbate
Contributor

PR for Conv2dBackpropInput #51468
