[TFTRT] Add Dynamic Shape Testing for ConvertConv3D #46940

Conversation

DEKHTIARJonathan (Contributor):

@bixia1 @tfeher for review

Feature Tracker: #45481

@google-ml-butler google-ml-butler bot added the size:L CL Change Size: Large label Feb 5, 2021
@google-cla google-cla bot added the cla: yes label Feb 5, 2021
@gbaned gbaned self-assigned this Feb 5, 2021
@gbaned gbaned added the comp:gpu:tensorrt Issues specific to TensorRT label Feb 5, 2021
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Feb 5, 2021
@bixia1 bixia1 self-requested a review February 5, 2021 16:12
bixia1 (Contributor) left a comment:

Looks good to me, thank you! I see some formatting issues. Can you please run clang-format on the code, as described at https://www.tensorflow.org/community/contribute/code_style?

Scope s = Scope::NewRootScope();
auto input = ops::Placeholder(s.WithOpName("input"), DT_FLOAT);
auto filter = ops::Placeholder(s.WithOpName("weights"), DT_FLOAT);
// Get the NodeDef for Pack.
Contributor:

s/Get the/Gets a/


test->AddTestTensor("input", p.input_dims, test->get_tf_type(), p.input);
tfeher (Contributor):

Does this test work in dynamic shape mode? AFAIK TRT requires static (known) values for the channel dimension. By default, AddTestTensor replaces all input dims with -1, so this should fail in dynamic shape mode.

For the Conv2D test, I needed to fix the input dims like this, and define an extra test for the condition on the channel value.

DEKHTIARJonathan (Contributor Author):

It does work so far on my end. Let me know if you have any issues trying to reproduce this behavior.

tfeher (Contributor):

Thanks Jonathan, you are right, it does work. Unfortunately, I am also right: in general it should not work in dynamic shape mode, because TRT requires that the channel dimension is known. What happens here is the following: TRT is clever enough to deduce that a min=max shape optimization profile means the channel dim is known at engine creation time, and it treats the corresponding input dim as if it were set to that known value (instead of the -1 that we provide).

If you define a profile where min<max for the channel dimension, then TRT gives the following error message:

DefaultLogger /my_conv3d-conv:CONVOLUTION: number of channels in input tensor to a convolution layer must not be dynamic
DefaultLogger Builder failed while analyzing shapes.

I recommend the following:

  1. In the converter, check that the channel is known. We can relax this a little: allow conversion if kOptimal or kImplicitBatchModeCompatible profiles are used, because they have min=max for non-batch dimensions; reject converting the op otherwise.
  2. Adapt the test to use concrete channel dimensions.

Most importantly, I think we should adapt the unit test to check that the converters work correctly for a profile where min < max. This will be the general case in the future, and we want to be prepared for that. One option would be the following:

  1. Rebase the PR so that it contains the profile management logic from Add TF-TRT optimization profiles for testing #45588

  2. Define a ProfileStrategy::kUnitTest profile by setting min = opt = input_shape, max = min+1. Here is a sketch:

void SetUnitTestProfile(
    const std::vector<nvinfer1::Dims>& dimvec, std::vector<nvinfer1::Dims>* min,
    std::vector<nvinfer1::Dims>* opt, std::vector<nvinfer1::Dims>* max) {
  *min = dimvec;
  *opt = dimvec;
  *max = dimvec;
  for (auto& dim : *max) {
    for (int k = 0; k < dim.nbDims; k++) dim.d[k]++;
  }
}

  3. Add this to trt_shape_optimization_profiles.cc, and add a new case to the switch statement here.
  4. Make the kUnitTest profile the default profile for unit tests. (Currently the default is set here.)

  5. Check if the correct error is returned.

Steps 3-4 could be implemented in a separate PR.

My motivation to introduce a specific profile for unit test is the following:

  • BuildAndRun sees a single input shape for the network.
  • From this single shape we want to create a non-trivial profile (min != max). kImplicitBatchModeCompatible only changes the batch dim, but we want a profile that also changes the other dims.
  • Other profile options that we were considering for future implementations are derived from multiple input shapes, therefore they are not applicable.

@bixia1 please let us know if you have other ideas / preferences.

bixia1 (Contributor):

I am fine with putting steps 3-4 in a separate PR. However, instead of adding a kUnitTest strategy, I suggest you open a backdoor in TrtShapeOptimizationProfile, such as using a special test_only_field to indicate the need to modify the kOptimal mode, to serve a similar purpose (similar to what we have in the V2 converter Python class to support switching between implicit/explicit batch mode). The ProfileStrategy enums will be hoisted out to a bigger scope to support the API change for dynamic shape. I don't feel that I could get past a Google code interview if we had kUnitTest in the enums.

DEKHTIARJonathan (Contributor Author):

@tfeher I applied your change requests 1, 2 and 5: 38be62c

Can you please check and confirm that it works for you? If it does, I'll squash my commits.

tfeher (Contributor):

Thanks @DEKHTIARJonathan for the update, it looks good to me.

@DEKHTIARJonathan DEKHTIARJonathan force-pushed the jdekhtiar/dynamic_shape/ConvertConv3D branch from 38be62c to 20f5356 Compare February 23, 2021 18:48
DEKHTIARJonathan (Contributor Author):

@bixia1 : PR is squashed and good to go ;)

@DEKHTIARJonathan DEKHTIARJonathan force-pushed the jdekhtiar/dynamic_shape/ConvertConv3D branch from 20f5356 to 033135a Compare February 23, 2021 18:52
bixia1 (Contributor) commented Feb 24, 2021:

Hi, there is a comment you haven't addressed yet, https://github.com/tensorflow/tensorflow/pull/46940/files#r571100867.

I am approving it so that I can pull it in for testing, but please feel free to push another commit and squash it.

bixia1 (Contributor) left a comment:

Approving it for the purpose of testing.

PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Feb 24, 2021
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Feb 24, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Feb 24, 2021
@copybara-service copybara-service bot merged commit 110ecce into tensorflow:master Feb 24, 2021
PR Queue automation moved this from Approved by Reviewer to Merged Feb 24, 2021
@DEKHTIARJonathan DEKHTIARJonathan deleted the jdekhtiar/dynamic_shape/ConvertConv3D branch March 3, 2021 00:00