Layout transform: Fix-up QDQ units and add constant folding #20685
Merged: adrianlizarraga merged 16 commits into main from adrianl/reenable-l1-opt-after-layout-transform on May 21, 2024
… Unsqueeze -> Transpose -> Q
adrianlizarraga changed the title from "Layout transform: Fix QDQ unit for Transpose stuck after Unsqueeze" to "Layout transform: Fix QDQ units and add constant folding" on May 16, 2024
adrianlizarraga changed the title from "Layout transform: Fix QDQ units and add constant folding" to "Layout transform: Fix-up QDQ units and add constant folding" on May 16, 2024
…does constant folding
…ub.com:microsoft/onnxruntime into adrianl/reenable-l1-opt-after-layout-transform
…'fix-ups' easier.
skottmckay reviewed on May 18, 2024
Co-authored-by: Scott McKay <skottmckay@gmail.com>
skottmckay approved these changes on May 20, 2024
adrianlizarraga deleted the adrianl/reenable-l1-opt-after-layout-transform branch on May 21, 2024 03:19
Description
Problem 1: Broken Transpose QDQ unit
Layout transform's specialized cost function aggressively pushes down transposes with channel-first or channel-last perms. This can lead to a situation where a channel-first/last Transpose gets stuck after being pushed through an Unsqueeze node that makes the Transpose's perm no longer channel-first/last. At this point, the specialized cost function defers to the default cost function, which does not see a need to continue pushing this Transpose node. This breaks the QDQ node units for both the Unsqueeze and the Transpose: DQ -> Unsqueeze -> Transpose -> Q.
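To make the "perm is no longer channel-first/last" step concrete, here is a minimal Python sketch. It is not the optimizer's code: the helper names (`channel_first_to_last_perm`, `channel_last_to_first_perm`, `is_channel_first_or_last`, `push_through_unsqueeze`) are hypothetical, and the sketch assumes the usual convention that a channel-last perm moves axis 1 to the end and that pushing a Transpose below an Unsqueeze leaves the inserted axes in place.

```python
def channel_first_to_last_perm(rank):
    # e.g. rank 4: [0, 2, 3, 1] (NCHW -> NHWC)
    return [0] + list(range(2, rank)) + [1]


def channel_last_to_first_perm(rank):
    # e.g. rank 4: [0, 3, 1, 2] (NHWC -> NCHW)
    return [0, rank - 1] + list(range(1, rank - 1))


def is_channel_first_or_last(perm):
    rank = len(perm)
    return rank >= 3 and perm in (
        channel_first_to_last_perm(rank),
        channel_last_to_first_perm(rank),
    )


def push_through_unsqueeze(perm, axes):
    """Perm for `Unsqueeze(axes) -> Transpose` that is equivalent to
    `Transpose(perm) -> Unsqueeze(axes)`. Inserted axes map to themselves."""
    new_rank = len(perm) + len(axes)
    added = set(axes)
    # Position of each original axis once the new size-1 axes are inserted.
    old_to_new = [i for i in range(new_rank) if i not in added]
    remaining = iter(perm)
    return [i if i in added else old_to_new[next(remaining)]
            for i in range(new_rank)]


perm = channel_first_to_last_perm(4)          # [0, 2, 3, 1]
new_perm = push_through_unsqueeze(perm, [0])  # [0, 1, 3, 4, 2]
print(new_perm, is_channel_first_or_last(new_perm))
```

Under these assumptions the rank-5 perm `[0, 1, 3, 4, 2]` matches neither `[0, 2, 3, 4, 1]` nor `[0, 4, 1, 2, 3]`, which is the point at which a cost function that only favors channel-first/last perms would stop pushing the Transpose.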
The transpose optimizer should insert a Q -> DQ pair between the Unsqueeze and Transpose nodes to fix both QDQ node units: DQ -> Unsqueeze -> Q[new] -> DQ[new] -> Transpose -> Q
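The fix-up can be pictured on a toy linear chain of op names. This is only an illustration under simplifying assumptions: `fix_up_qdq_units` and `BROKEN` are hypothetical names, the graph is a plain Python list rather than the optimizer's graph API, and the scale and zero-point inputs that real Q/DQ nodes need are not modeled.

```python
# Pattern left behind when the Transpose gets stuck after the Unsqueeze.
BROKEN = ["DQ", "Unsqueeze", "Transpose", "Q"]


def fix_up_qdq_units(chain):
    """Insert a Q -> DQ pair between Unsqueeze and Transpose wherever the
    broken pattern occurs, so each op sits in its own DQ -> op -> Q unit."""
    fixed = []
    i = 0
    while i < len(chain):
        if chain[i:i + 4] == BROKEN:
            fixed += ["DQ", "Unsqueeze", "Q[new]", "DQ[new]", "Transpose", "Q"]
            i += 4
        else:
            fixed.append(chain[i])
            i += 1
    return fixed


print(" -> ".join(fix_up_qdq_units(["DQ", "Unsqueeze", "Transpose", "Q"])))
```

Chains that do not contain the broken pattern are returned unchanged.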
Problem 2: Inserted Squeeze/Transpose nodes should be constant folded when possible.
The transpose optimizer inserts Squeeze (and Transpose) ops between an initializer and a DQ to counteract the effect of Unsqueezing that initializer if it is consumed by multiple nodes. This results in a graph where the inserted nodes are not in valid node units:
Original graph where two Mul nodes share a common initializer input:
Resulting graph after transpose optimization without constant folding:
Here, the circled Transpose and Squeeze nodes operate on a quantized integer type but are not in valid QDQ node units. The solution is to run constant folding, which evaluates these nodes ahead of time and replaces them with a precomputed initializer.
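As a rough illustration of what folding a Transpose and Squeeze into an initializer means, the sketch below applies both ops to a small constant stored as row-major data plus a shape. It is a pure-Python stand-in, not the optimizer's constant-folding pass; `transpose`, `squeeze`, and `fold_constants` are hypothetical helpers, and the example values and shapes are made up.

```python
from itertools import product


def squeeze(data, shape, axes):
    """Drop size-1 dims; the row-major data itself is unchanged."""
    for axis in axes:
        if shape[axis] != 1:
            raise ValueError(f"cannot squeeze axis {axis} of size {shape[axis]}")
    return list(data), [d for i, d in enumerate(shape) if i not in axes]


def transpose(data, shape, perm):
    """Permute dims of a row-major tensor: out_shape[i] == shape[perm[i]]."""
    out_shape = [shape[p] for p in perm]
    # Row-major strides of the input tensor.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    out = []
    for idx in product(*(range(d) for d in out_shape)):
        # Output axis i walks input axis perm[i].
        out.append(data[sum(idx[i] * strides[perm[i]] for i in range(len(perm)))])
    return out, out_shape


def fold_constants(initializer, ops):
    """Evaluate a chain of (op_name, attribute) pairs on a constant."""
    data, shape = initializer
    for name, attr in ops:
        if name == "Transpose":
            data, shape = transpose(data, shape, attr)
        elif name == "Squeeze":
            data, shape = squeeze(data, shape, attr)
        else:
            raise ValueError(f"unsupported op: {name}")
    return data, shape


# A shared (already unsqueezed) initializer of shape [1, 3, 2], followed by
# the inserted Transpose and Squeeze that one consumer needs.
folded = fold_constants(([1, 2, 3, 4, 5, 6], [1, 3, 2]),
                        [("Transpose", [0, 2, 1]), ("Squeeze", [0])])
print(folded)  # ([1, 3, 5, 2, 4, 6], [2, 3])
```

After folding, the consumer reads the precomputed constant directly, so no stray Transpose or Squeeze is left on the quantized path.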
Motivation and Context
Improve the layout transformation to allow more models to run on EPs that prefer the channel-last layout.