Layout transform: Fix-up QDQ units and add constant folding #20685

Merged: 16 commits merged on May 21, 2024

@adrianlizarraga adrianlizarraga commented May 15, 2024

Description

Problem 1: Broken Transpose QDQ unit

Layout transform's specialized cost function aggressively pushes down transposes with channel-first or channel-last perms. This can lead to a situation where a channel-first/last Transpose gets stuck after being pushed through an Unsqueeze node that makes the Transpose's perm no longer channel-first/last. At this point, the specialized cost function defers to the default cost function, which does not see a need to continue pushing this Transpose node. This breaks the QDQ node units for both the Unsqueeze and the Transpose: DQ -> Unsqueeze -> Transpose -> Q.

image

The transpose optimizer should insert a Q -> DQ pair between the Unsqueeze and Transpose nodes to fix both QDQ node units: DQ -> Unsqueeze -> Q[new] -> DQ[new] -> Transpose -> Q

image
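The fix-up for Problem 1 can be illustrated with a small sketch. This is not the ONNX Runtime implementation: it models the graph as a plain list of op-type strings, `fixup_qdq_chain` and `DATA_MOVEMENT_OPS` are hypothetical names, and quantization parameters (scale, zero point) are ignored entirely.

```python
# Hypothetical, simplified model: a linear chain of op types,
# not ONNX Runtime's graph API. Quantization parameters are ignored.
DATA_MOVEMENT_OPS = {"Unsqueeze", "Squeeze", "Transpose", "Reshape"}


def fixup_qdq_chain(ops):
    """Insert a Q -> DQ pair between back-to-back data-movement ops so that
    each one ends up in its own DQ -> op -> Q node unit."""
    fixed = []
    for i, op in enumerate(ops):
        fixed.append(op)
        nxt = ops[i + 1] if i + 1 < len(ops) else None
        if op in DATA_MOVEMENT_OPS and nxt in DATA_MOVEMENT_OPS:
            fixed.extend(["Q", "DQ"])  # the newly inserted pair
    return fixed


broken = ["DQ", "Unsqueeze", "Transpose", "Q"]
print(" -> ".join(fixup_qdq_chain(broken)))
# prints: DQ -> Unsqueeze -> Q -> DQ -> Transpose -> Q
```

A chain that is already a valid node unit, such as `DQ -> Transpose -> Q`, passes through this sketch unchanged.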

Problem 2: Inserted Squeeze/Transpose nodes should be constant folded when possible.

The transpose optimizer inserts Squeeze (and Transpose) ops between an initializer and a DQ to counteract the effect of Unsqueezing that initializer if it is consumed by multiple nodes. This results in a graph where the inserted nodes are not in valid node units:

Original graph where two Mul nodes share a common initializer input:
image

Resulting graph after transpose optimization without constant folding:
image

Here, the circled Transpose and Squeeze nodes operate on a quantized integer type but are not in valid QDQ node units. The solution is to run constant folding, which results in:
image
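As a rough illustration of what constant folding does here, the sketch below evaluates a Transpose and a Squeeze on a constant initializer ahead of time and returns a single folded initializer. Everything in it is an assumption made for illustration: the `(shape, flat data)` tensor representation, the helper names, the order of the two nodes, and the perm/axes values. It does not use ONNX Runtime's graph or optimizer API.

```python
from itertools import product


def transpose(shape, data, perm):
    """Transpose a row-major tensor given as (shape, flat data)."""
    new_shape = [shape[p] for p in perm]
    # Row-major strides of the input tensor.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    out = []
    for idx in product(*(range(d) for d in new_shape)):
        # Output axis k reads from input axis perm[k].
        out.append(data[sum(idx[k] * strides[perm[k]] for k in range(len(perm)))])
    return new_shape, out


def squeeze(shape, data, axes):
    """Drop size-1 axes; the flat data is unchanged."""
    assert all(shape[a] == 1 for a in axes)
    return [d for i, d in enumerate(shape) if i not in axes], data


def fold_constants(initializer, nodes):
    """Evaluate a chain of Transpose/Squeeze nodes on a constant initializer
    and return the folded initializer that would replace them."""
    shape, data = initializer
    for op, attr in nodes:
        if op == "Transpose":
            shape, data = transpose(shape, data, attr)
        elif op == "Squeeze":
            shape, data = squeeze(shape, data, attr)
        else:
            raise ValueError(f"cannot fold {op}")
    return shape, data


# Hypothetical quantized weights of shape [1, 2, 3].
weights = ([1, 2, 3], [1, 2, 3, 4, 5, 6])
folded = fold_constants(weights, [("Transpose", [0, 2, 1]), ("Squeeze", [0])])
print(folded)  # prints: ([3, 2], [1, 4, 2, 5, 3, 6])
```

After folding, the DQ would consume the precomputed initializer directly, so no Transpose or Squeeze is left operating on a quantized integer type outside a node unit.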

Motivation and Context

Improve the layout transformation to allow more models to run on EPs that prefer the channel-last layout.

@adrianlizarraga adrianlizarraga changed the title Layout transform: Fix QDQ unit for Transpose stuck after Unsqueeze Layout transform: Fix QDQ units and add constant folding May 16, 2024
@adrianlizarraga adrianlizarraga marked this pull request as ready for review May 16, 2024 08:16
@adrianlizarraga adrianlizarraga changed the title Layout transform: Fix QDQ units and add constant folding Layout transform: Fix-up QDQ units and add constant folding May 16, 2024
@adrianlizarraga adrianlizarraga added the ep:QNN issues related to QNN execution provider label May 16, 2024
@adrianlizarraga adrianlizarraga merged commit 8acf60f into main May 21, 2024
98 checks passed
@adrianlizarraga adrianlizarraga deleted the adrianl/reenable-l1-opt-after-layout-transform branch May 21, 2024 03:19
Labels
ep:QNN issues related to QNN execution provider release:1.18.1
3 participants