Support QDQ transformations with com.microsoft.Quantize/Dequantize ops #17127

Merged
adrianlizarraga merged 57 commits into main from adrianl/contrib-qdq-optimizations
Aug 25, 2023

@adrianlizarraga adrianlizarraga commented Aug 12, 2023

Description

  • Enables int32 support for com.microsoft.DequantizeLinear (contrib op)
  • Makes the zero_point input optional for Quantize/Dequantize contrib ops
  • Enables QDQ transformations with the Quantize/Dequantize contrib ops
  • Updates tests: EnsureUniqueDQForNodeUnitTests, QDQTransformerTests, TransposeOptimizerTests
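
As a rough illustration of what an optional zero_point means for these ops, here is a minimal pure-Python sketch of per-tensor linear quantization and dequantization. It is not the ONNX Runtime kernel: the function names, the list-based tensors, and the default `qmin`/`qmax` range (uint8) are assumptions made for the example. The only behaviour taken from the description is that zero_point may be omitted, in which case the sketch treats it as 0, and that the dequantize input may hold int32-range values.

```python
from typing import List, Optional, Sequence


def dequantize_linear(x: Sequence[int], scale: float,
                      zero_point: Optional[int] = None) -> List[float]:
    """Per-tensor dequantization: y = (x - zero_point) * scale.

    A missing zero_point is treated as 0 (assumption for this sketch).
    Python ints are unbounded, so int32-range inputs work unchanged.
    """
    zp = 0 if zero_point is None else zero_point
    return [(v - zp) * scale for v in x]


def quantize_linear(x: Sequence[float], scale: float,
                    zero_point: Optional[int] = None,
                    qmin: int = 0, qmax: int = 255) -> List[int]:
    """Per-tensor quantization: y = clamp(round(x / scale) + zero_point).

    round() rounds halves to the nearest even integer. qmin/qmax default
    to the uint8 range, which is a hypothetical choice for illustration.
    """
    zp = 0 if zero_point is None else zero_point
    return [min(qmax, max(qmin, round(v / scale) + zp)) for v in x]


# uint8 data with an explicit zero point
print(dequantize_linear([0, 128, 255], 0.5, 128))  # -> [-64.0, 0.0, 63.5]
# int32-style data (for example a quantized bias) with no zero point
print(dequantize_linear([-4, 0, 1000], 0.25))      # -> [-1.0, 0.0, 250.0]
# quantize without a zero point; 300.0 saturates at qmax
print(quantize_linear([0.0, 1.0, 300.0], 0.5))     # -> [0, 2, 255]
```

Treating an absent zero_point as 0 keeps symmetric quantization (common for int32 biases) from needing a dummy zero-valued input.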

Testing

List of tested graph transformations:

  • QDQSelectorActionTransformer
    • qdq_transformer_test.cc
  • QDQS8ToU8Transformer
    • qdq_transformer_test.cc
  • DoubleQDQPairsRemover
    • qdq_transformer_test.cc
  • IdenticalChildrenConsolidation
    • qdq_transformer_test.cc
  • QDQPropagation
    • qdq_transformer_test.cc
  • QDQFinalCleanup
    • qdq_transformer_test.cc
  • ClipQuantFusion
    • qdq_transformer_test.cc
  • ReluQuantFusion
    • qdq_transformer_test.cc
  • EnsureUniqueDQForNodeUnit
    • ensure_unique_dq_for_node_unit_test.cc
  • TransposeOptimizer
    • transpose_optimizer_test.cc
  • CommonSubexpressionElimination
    • graph_transform_test.cc
  • ConstantFolding
    • graph_transform_test.cc

Motivation and Context

We need to support mixed 16-bit/8-bit precision QDQ models. This PR is the first step in achieving this goal: we need to make QDQ contrib ops work with our optimizations/transformations.

@adrianlizarraga adrianlizarraga marked this pull request as ready for review August 14, 2023 17:21
@adrianlizarraga adrianlizarraga requested a review from a team as a code owner August 14, 2023 17:21
@adrianlizarraga adrianlizarraga requested a review from pengwa August 22, 2023 00:00
Comment thread onnxruntime/test/testdata/transform/convert_qdq_ops_to_ms_domain.py Fixed
skottmckay previously approved these changes Aug 24, 2023
Comment thread onnxruntime/test/optimizer/transpose_optimizer_test.cc Outdated
edgchen1 previously approved these changes Aug 24, 2023
@yufenglee yufenglee (Member) commented

QuantizeLinear, 1,

I think you are adding int16 support. I don't see the change.


Refers to: onnxruntime/core/graph/contrib_ops/quantization_defs.cc:144 in 375d3a2.

@adrianlizarraga adrianlizarraga (Contributor, Author) commented Aug 25, 2023

> I think you are adding int16 support. I don't see the change.

@yufenglee The work is being broken down into separate/smaller PRs. This specific PR focuses on making sure contrib QDQ ops can be optimized in the same manner as ONNX ops (please refer to the PR description for details).

The next PR (linked in the description) adds int16 support, but I'd like to get this one merged in before starting reviews on it.

@yufenglee yufenglee left a comment

:shipit:

@adrianlizarraga adrianlizarraga merged commit 5a83a67 into main Aug 25, 2023
@adrianlizarraga adrianlizarraga deleted the adrianl/contrib-qdq-optimizations branch August 25, 2023 16:57
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024
microsoft#17127)

### Description
- Enables int32 support for com.microsoft.DequantizeLinear (contrib op)
- Makes the `zero_point` input optional for Quantize/Dequantize contrib
ops
- Enables QDQ transformations with the Quantize/Dequantize contrib ops
- Update tests: EnsureUniqueDQForNodeUnitTests, QDQTransformerTests,
TransposeOptimizerTests

### Testing
List of tested graph transformations:
- [x] QDQSelectorActionTransformer
  - qdq_transformer_test.cc
- [x] QDQS8ToU8Transformer
  - qdq_transformer_test.cc
- [x] DoubleQDQPairsRemover
  - qdq_transformer_test.cc
- [x] IdenticalChildrenConsolidation
  - qdq_transformer_test.cc
- [x] QDQPropagation
  - qdq_transformer_test.cc
- [x] QDQFinalCleanup
  - qdq_transformer_test.cc
- [x] ClipQuantFusion
  - qdq_transformer_test.cc
- [x] ReluQuantFusion
  - qdq_transformer_test.cc
- [x] EnsureUniqueDQForNodeUnit 
  - ensure_unique_dq_for_node_unit_test.cc
- [x] TransposeOptimizer 
  - transpose_optimizer_test.cc
- [x] CommonSubexpressionElimination
  - graph_transform_test.cc
- [x] ConstantFolding
  - graph_transform_test.cc

### Motivation and Context
We need to [support mixed 16-bit/8-bit precision QDQ
models](microsoft#17015). This PR is
the first step in achieving this goal: we need to make QDQ contrib ops
work with our optimizations/transformations.

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>