[TF:TRT] Enable TensorRT explicit precision (QDQ/QAT) support #52248
Conversation
force-pushed from 50cfcc3 to d076898
force-pushed from ea649e0 to dad56b9
force-pushed from b71e338 to 7e88f0b
@christopherbate Can you please resolve conflicts? Thanks!
force-pushed from 25ba28e to 4083baf
Conflicts resolved
All done. Let me know when you want me to squash.
force-pushed from e6482aa to dac3bc4
tensorflow/compiler/tf2tensorrt/convert/ops/quantization_ops_test.cc (review thread, resolved)
use_calibration_ = false;
}

const bool use_explicit_precision = GraphDefHasQDQNodes(item.graph);
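For context, GraphDefHasQDQNodes presumably scans the frozen GraphDef for quantize/dequantize ops. A minimal sketch of such a check (hypothetical; not necessarily the PR's actual implementation):

#include "tensorflow/core/framework/graph.pb.h"

// Returns true if any node in the graph is a TF quantize-and-dequantize op.
// The op names are standard TF ops; the helper name mirrors the call above.
bool GraphDefHasQDQNodes(const tensorflow::GraphDef& graph) {
  for (const auto& node : graph.node()) {
    if (node.op() == "QuantizeAndDequantizeV2" ||
        node.op() == "QuantizeAndDequantizeV3") {
      return true;
    }
  }
  return false;
}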
I have two questions regarding this:
(1) We don't consider the TensorRT version when enabling use_explicit_precision, which looks wrong to me. For example, a user has a model that uses QuantizeAndDequantizeV2 with TensorRT 7. It was working, but with this PR they will see an error produced by TensorRT, similar to the ones in quantization_ops_test.cc lines 435 and 440. Am I right here?
(2) A user has a model that uses all of the ops in kQuantizationOpNames with TensorRT 8. It was working fine, but with this change we only convert one of the four ops in kQuantizationOpNames. Will that be a problem?
OK yes, let me disable this for TRT < 8.0. We can use TRT 7 for very limited use cases, but EP mode likely won't be useful in the context of TF-TRT with TRT 7 at this time.
Fixed. Completely disabled the explicit QDQ tests and the Grappler option (experimental_disable_folding_quantization_emulation) for TRT < 8.0.
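The fix described above amounts to gating the check on the TensorRT version, along these lines (illustrative sketch; assumes the IS_TRT_VERSION_GE macro available in tf2tensorrt):

// Explicit precision is only attempted when building against TRT >= 8.0.
const bool use_explicit_precision =
    IS_TRT_VERSION_GE(8, 0, 0, 0) && GraphDefHasQDQNodes(item.graph);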
@bixia1 Good to squash?
@christopherbate Can you please resolve conflicts? Thanks!
force-pushed from 6400408 to d8f9bac
rebased, thanks
@christopherbate please squash and I will approve. |
squashed
force-pushed from d8f9bac to a96b152
Adds TensorRT QDQ support ("explicit precision mode"), sometimes referred to as "QAT support" after the quantization-aware training algorithm that inserts QDQ nodes. From here on, we refer to the existing non-explicit-precision pathway in the code base as the "dynamic range INT8" (DR INT8) mode and to the new mode as the QDQ INT8 mode.
In the new mode, TF QuantizeAndDequantize operations are converted to TensorRT quantization scaling layers, as sketched below. Both the new QDQ mode logic and the existing DR mode logic (ConvertQuantize) are moved into the file convert/ops/quantization_ops.cc.
.In addition, in QDQ mode it is necessary to prevent the existing Grappler optimizations invoked in
trt_convert.py
on the loaded SavedModel from folding frozenQuantizeAndDequantizeV2
operations between weighted ops (Conv, Matmul ,etc) and the weight constants. Thus, we depend on the experimental Grappler rewriter config optionexperimental_disable_folding_quantization_emulation
and will be affected if it is removed. The alternative is to allow Grappler folding of the QDQ and constant weights and inserting identity QDQ scale factors manually during TensorRT network construction, but the logic becomes extremely verbose .A test suite is added in
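For illustration, the option can be set through the standard RewriterConfig proto; a C++ sketch (the converter itself configures this from trt_convert.py):

#include "tensorflow/core/protobuf/config.pb.h"

// Keeps Grappler's constant folding from collapsing frozen QDQ nodes into
// the adjacent weight constants.
tensorflow::ConfigProto config;
config.mutable_graph_options()
    ->mutable_rewrite_options()
    ->set_experimental_disable_folding_quantization_emulation(true);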
A test suite is added in convert/ops/quantization_ops_test.cc. It builds a variety of subgraph patterns and tests them for conversion success. Because TRT QDQ mode has evolved significantly in robustness and features between TRT 7 and TRT 8, a set of test waiver/skip policies is added indicating which usage patterns are appropriate for TRT 7 versus TRT 8.
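A version-gated skip in such a suite might look like the following (hypothetical test and fixture names; assumes the IS_TRT_VERSION_GE macro and GoogleTest):

#include <gtest/gtest.h>

// Hypothetical: skip QDQ conversion tests when building against TRT < 8.0,
// where explicit precision support is too limited to be useful.
TEST(QuantizationOpsTest, QDQConvPattern) {
#if !IS_TRT_VERSION_GE(8, 0, 0, 0)
  GTEST_SKIP() << "Explicit precision (QDQ) requires TensorRT >= 8.0";
#endif
  // ... build a QuantizeAndDequantizeV2 -> Conv2D subgraph and verify that
  // conversion to a TensorRT network succeeds ...
}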