
[TF:TRT] Create execution context with device memory if shape output is present in TRT 7 #52186

Merged

Conversation

tfeher
Contributor

@tfeher tfeher commented Sep 29, 2021

Workaround for TRT 7 to handle shape outputs.

Previously, we used createExecutionContextWithoutDeviceMemory to create the IExecutionContext, so that TF-TRT could manage the device memory needed for execution with that IExecutionContext. This triggers a bug in TRT 7. To work around the bug, we switch to createExecutionContext for creating the IExecutionContext when the TRTEngine has an int32 output.
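The selection logic can be sketched as follows. The names below are illustrative, not the actual TF-TRT implementation: with TRT 7, any engine that produces an int32 output takes the createExecutionContext path (TRT allocates its own device memory), otherwise TF-TRT keeps managing the memory via createExecutionContextWithoutDeviceMemory.

```cpp
#include <vector>

// Hypothetical sketch of the workaround's decision (not the real TF-TRT
// code). Each engine binding is either an input or an output and has a
// data type; the TRT 7 workaround path is taken when any output binding
// is int32.
enum class DType { kFloat32, kInt32 };

struct BindingInfo {
  bool is_input;
  DType dtype;
};

// Returns true if any engine output is int32, i.e. the context should be
// created with createExecutionContext() (device memory owned by TRT)
// instead of createExecutionContextWithoutDeviceMemory().
bool HasInt32Output(const std::vector<BindingInfo>& bindings) {
  for (const auto& b : bindings) {
    if (!b.is_input && b.dtype == DType::kInt32) return true;
  }
  return false;
}
```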

Add two test cases.

Tracker #45481

Tagging @bixia1 for review.

@google-ml-butler google-ml-butler bot added the size:S CL Change Size: Small label Sep 29, 2021
@google-cla google-cla bot added the cla: yes label Sep 29, 2021
@google-ml-butler google-ml-butler bot added the awaiting review Pull request awaiting review label Sep 29, 2021
@bixia1
Contributor

bixia1 commented Sep 29, 2021

Looks good to me. I modified the PR description with more details. Please check.

@tfeher tfeher force-pushed the trt_context_fix_for_shape_output branch from c430234 to db47424 Compare September 29, 2021 21:55
Contributor Author

@tfeher tfeher left a comment


Thanks @bixia1 for the review; I have updated the comment. I also checked the modified PR description, and it looks fine.

Review comment on tensorflow/compiler/tf2tensorrt/utils/trt_engine_utils.cc (outdated, resolved)
@gbaned gbaned self-assigned this Sep 30, 2021
@gbaned gbaned added the comp:gpu:tensorrt Issues specific to TensorRT label Sep 30, 2021
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Sep 30, 2021
@christopherbate
Contributor

Based on the code, it's not the presence of a shape tensor that matters, but that the output has int32 data type. That's not the same thing, right? A shape tensor has the additional qualification of being 0-d or 1-d.

Does this same behavior occur when index output of "topK" operation is set as output?

Is there no op that we can insert at the output to force TRT into correct behavior?

@tfeher tfeher force-pushed the trt_context_fix_for_shape_output branch from db47424 to 56b64d4 Compare September 30, 2021 07:25
@tfeher
Contributor Author

tfeher commented Sep 30, 2021

Thanks @christopherbate for your comment! You are right: we could and should be more specific when checking the output tensor. I have added checks to ensure that it is 0-d or 1-d.

Does this same behavior occur when index output of "topK" operation is set as output?

The error happens at an assertion on the shape tensor size: "Assertion failed: size == shapeTensorVolume(out.extent)"
I expect that the problem only appears with int32 outputs that are shape tensors.

Is there no op that we can insert at the output to force TRT into correct behavior?

I am not aware of any.

@bixia1
Contributor

bixia1 commented Sep 30, 2021

I am not sure whether we can/should add a check for 0-d or 1-d to filter the cases; it depends on the nature of the bug. For example, if we have this in the network:
shapeOp (4-d) -> reshape (2x2)
do we need to allocate the context with GPU memory to make this work?

@tfeher
Contributor Author

tfeher commented Sep 30, 2021

@bixia1, you are right. I have checked it, and the error happens if I insert the reshape. I will revert the change that filters for the 0-d and 1-d cases.

I will also add two unit tests that trigger the bug. To add the tests, I will need to rebase this PR onto #52181.
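The resolution above can be sketched as follows; the names are hypothetical, not the actual TF-TRT code. An output derived from a shape, e.g. shapeOp (4-d) reshaped to 2x2, has rank 2, so a filter that only treats 0-d/1-d int32 outputs as shape tensors would skip it even though the TRT 7 bug still triggers. The reverted filter and the dtype-only behavior are contrasted below.

```cpp
// Hypothetical sketch contrasting the considered rank filter with the
// dtype-only check that the PR keeps (illustrative names only).
struct OutputInfo {
  int rank;       // number of dimensions of the output tensor
  bool is_int32;  // true for int32 outputs
};

// Rank-based filter that was considered and then reverted: it only
// applies the workaround to 0-d or 1-d int32 outputs.
bool RankFilterAppliesWorkaround(const OutputInfo& out) {
  return out.is_int32 && out.rank <= 1;
}

// Final behavior: any int32 output takes the workaround path, covering
// shape tensors that were reshaped to a higher rank.
bool DtypeOnlyAppliesWorkaround(const OutputInfo& out) {
  return out.is_int32;
}
```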

@tfeher tfeher force-pushed the trt_context_fix_for_shape_output branch from 56b64d4 to ad618aa Compare September 30, 2021 20:21
@sanjoy sanjoy removed their request for review October 1, 2021 03:35
@tensorflowbutler tensorflowbutler removed the awaiting review Pull request awaiting review label Oct 3, 2021
@tfeher tfeher force-pushed the trt_context_fix_for_shape_output branch 2 times, most recently from f36cd86 to 1a3ffc0 Compare October 5, 2021 20:36
@tfeher
Contributor Author

tfeher commented Oct 5, 2021

Rebased to avoid conflicts. Ready to proceed @bixia1

@gbaned gbaned requested a review from bixia1 October 6, 2021 14:48
@google-ml-butler google-ml-butler bot added the awaiting review Pull request awaiting review label Oct 6, 2021
Contributor

@bixia1 bixia1 left a comment


Looks good to me, thanks for working on this!
Just two minor comments, and it should be good to merge after those.

@tfeher tfeher force-pushed the trt_context_fix_for_shape_output branch from 1a3ffc0 to 5d151c0 Compare October 6, 2021 22:17
Contributor Author

@tfeher tfeher left a comment


Thanks @bixia1 for the review; I have addressed the issues.

@gbaned gbaned requested a review from bixia1 October 7, 2021 10:16
@tensorflowbutler tensorflowbutler removed the awaiting review Pull request awaiting review label Oct 9, 2021
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Oct 11, 2021
PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Oct 11, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Oct 11, 2021
@copybara-service copybara-service bot merged commit 5d0277f into tensorflow:master Oct 12, 2021
@google-ml-butler google-ml-butler bot removed the ready to pull PR ready for merge process label Oct 12, 2021