
[TF:TRT] Create execution context with device memory if shape output is present in TRT 7 #52186

Merged

Conversation

tfeher
Contributor

@tfeher tfeher commented Sep 29, 2021

Workaround for TRT 7 to handle shape outputs.

Previously, we used createExecutionContextWithoutDeviceMemory to create the IExecutionContext, so that TF-TRT could manage the device memory needed for execution with that IExecutionContext. This triggers a bug in TRT 7. To work around the bug, we switch to createExecutionContext for creating the IExecutionContext when the TRTEngine has an int32 output.
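The selection logic can be sketched as follows. The names below are illustrative, not the actual TF-TRT implementation: with TRT 7, any engine that produces an int32 output takes the createExecutionContext path (TRT allocates its own device memory), otherwise TF-TRT keeps managing the memory via createExecutionContextWithoutDeviceMemory.

```cpp
#include <vector>

// Hypothetical sketch of the workaround's decision (not the real TF-TRT
// code). Each engine binding is either an input or an output and has a
// data type; the TRT 7 workaround path is taken when any output binding
// is int32.
enum class DType { kFloat32, kInt32 };

struct BindingInfo {
  bool is_input;
  DType dtype;
};

// Returns true if any engine output is int32, i.e. the context should be
// created with createExecutionContext() (device memory owned by TRT)
// instead of createExecutionContextWithoutDeviceMemory().
bool HasInt32Output(const std::vector<BindingInfo>& bindings) {
  for (const auto& b : bindings) {
    if (!b.is_input && b.dtype == DType::kInt32) return true;
  }
  return false;
}
```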

Add two test cases.

Tracker #45481

Tagging @bixia1 for review.

@google-ml-butler google-ml-butler bot added the size:S CL Change Size: Small label Sep 29, 2021
@google-cla google-cla bot added the cla: yes label Sep 29, 2021
@google-ml-butler google-ml-butler bot added the awaiting review Pull request awaiting review label Sep 29, 2021
@bixia1
Contributor

bixia1 commented Sep 29, 2021

Looks good to me. I modified the PR description with more details. Please check.

@tfeher tfeher force-pushed the trt_context_fix_for_shape_output branch from c430234 to db47424 Compare September 29, 2021 21:55
Contributor Author

@tfeher tfeher left a comment


Thanks @bixia1 for the review; I have updated the comment. I also checked the modified PR description, and it looks fine.

Review comment on tensorflow/compiler/tf2tensorrt/utils/trt_engine_utils.cc (outdated, resolved)
@gbaned gbaned self-assigned this Sep 30, 2021
@gbaned gbaned added the comp:gpu:tensorrt Issues specific to TensorRT label Sep 30, 2021
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Sep 30, 2021
@christopherbate
Contributor

Based on the code, it's not the presence of a shape tensor that matters, but that the output has int32 data type. That's not the same thing, right? A shape tensor has the additional qualification of being 0-d or 1-d.

Does this same behavior occur when index output of "topK" operation is set as output?

Is there no op that we can insert at the output to force TRT into correct behavior?

@tfeher tfeher force-pushed the trt_context_fix_for_shape_output branch from db47424 to 56b64d4 Compare September 30, 2021 07:25
@tfeher
Contributor Author

tfeher commented Sep 30, 2021

Thanks @christopherbate for your comment! You are right: we could and should be more specific when checking the output tensor. I have added checks to ensure that it is 0-d or 1-d.

Does this same behavior occur when index output of "topK" operation is set as output?

The error happens at an assertion on the shape tensor size: "Assertion failed: size == shapeTensorVolume(out.extent)"
I expect that the problem only appears with int32 outputs that are shape tensors.

Is there no op that we can insert at the output to force TRT into correct behavior?

I am not aware of any.

@bixia1
Contributor

bixia1 commented Sep 30, 2021

I am not sure whether we can/should add a check for 0-d or 1-d to filter the cases; it depends on the nature of the bug. For example, if we have this in the network:
shapeOp (4-d) -> reshape (2x2)
do we need to allocate the context with GPU memory to make this work?

@tfeher
Contributor Author

tfeher commented Sep 30, 2021

@bixia1, you are right. I have checked it, and the error happens if I insert the reshape. I will revert the change that filters for the 0-d and 1-d cases.

I will also add two unit tests that trigger the bug. To add the tests, I will need to rebase this PR onto #52181.
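The resolution above can be sketched as follows; the names are hypothetical, not the actual TF-TRT code. An output derived from a shape, e.g. shapeOp (4-d) reshaped to 2x2, has rank 2, so a filter that only treats 0-d/1-d int32 outputs as shape tensors would skip it even though the TRT 7 bug still triggers. The reverted filter and the dtype-only behavior are contrasted below.

```cpp
// Hypothetical sketch contrasting the considered rank filter with the
// dtype-only check that the PR keeps (illustrative names only).
struct OutputInfo {
  int rank;       // number of dimensions of the output tensor
  bool is_int32;  // true for int32 outputs
};

// Rank-based filter that was considered and then reverted: it only
// applies the workaround to 0-d or 1-d int32 outputs.
bool RankFilterAppliesWorkaround(const OutputInfo& out) {
  return out.is_int32 && out.rank <= 1;
}

// Final behavior: any int32 output takes the workaround path, covering
// shape tensors that were reshaped to a higher rank.
bool DtypeOnlyAppliesWorkaround(const OutputInfo& out) {
  return out.is_int32;
}
```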

@tfeher tfeher force-pushed the trt_context_fix_for_shape_output branch from 56b64d4 to ad618aa Compare September 30, 2021 20:21
@sanjoy sanjoy removed their request for review October 1, 2021 03:35
@tensorflowbutler tensorflowbutler removed the awaiting review Pull request awaiting review label Oct 3, 2021
@tfeher tfeher force-pushed the trt_context_fix_for_shape_output branch 2 times, most recently from f36cd86 to 1a3ffc0 Compare October 5, 2021 20:36
@tfeher
Contributor Author

tfeher commented Oct 5, 2021

Rebased to avoid conflicts. Ready to proceed @bixia1

@gbaned gbaned requested a review from bixia1 October 6, 2021 14:48
@google-ml-butler google-ml-butler bot added the awaiting review Pull request awaiting review label Oct 6, 2021
Contributor

@bixia1 bixia1 left a comment


Looks good to me, thanks for working on this!
Just two minor comments, and it should be good to merge after those.

@tfeher tfeher force-pushed the trt_context_fix_for_shape_output branch from 1a3ffc0 to 5d151c0 Compare October 6, 2021 22:17
Contributor Author

@tfeher tfeher left a comment


Thanks @bixia1 for the review; I have addressed the issues.

@gbaned gbaned requested a review from bixia1 October 7, 2021 10:16
@tensorflowbutler tensorflowbutler removed the awaiting review Pull request awaiting review label Oct 9, 2021
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Oct 11, 2021
PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Oct 11, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Oct 11, 2021
@copybara-service copybara-service bot merged commit 5d0277f into tensorflow:master Oct 12, 2021
@google-ml-butler google-ml-butler bot removed the ready to pull PR ready for merge process label Oct 12, 2021