CVS-175447-[OVEP] Add a check for type mismatches in QDQ stripping #834
Conversation
The casting node is for the QuantizeLinear node, correct? Could you add the before and after graphs for this casting? Will it introduce any performance penalty?
Please use the output type of the DQ node directly and fix the C++ Lint warnings (C-style casts).
onnxruntime/core/providers/openvino/qdq_transformations/qdq_scales_fix.cpp
The issue that causes the use of a C-style cast for the input arg is that it is provided as const by
Let's roll back this unordered_map, which is mostly unused. As for the C-style cast, the lint warning can likely be avoided by using const_cast<T&>() instead.
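To illustrate the const_cast suggestion, here is a minimal sketch. `ToyArg`, `Rename`, and `RenameThroughConst` are hypothetical names made up for this example and are not ONNX Runtime types or functions; the sketch only shows how a C-style cast that strips const can be spelled as a `const_cast`, which is the form C++ lint checks accept.

```cpp
#include <string>

// Hypothetical stand-in for an argument object; not an ONNX Runtime type.
struct ToyArg {
  std::string name;
};

// Hypothetical function that needs a mutable reference.
void Rename(ToyArg& arg, const std::string& new_name) { arg.name = new_name; }

// The caller only has a const reference, as in the situation described above.
std::string RenameThroughConst(const ToyArg& arg) {
  // A C-style cast would read: Rename((ToyArg&)arg, "renamed");
  // const_cast states the intent explicitly and removes only constness.
  Rename(const_cast<ToyArg&>(arg), "renamed");
  return arg.name;
}
```

Note that this is only well defined when the underlying object was not itself declared const; modifying a truly const object through `const_cast` is undefined behavior.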
onnxruntime/core/providers/openvino/qdq_transformations/qdq_scales_fix.cpp
Looks good to me, thanks!
Could you open the PR for review by reviewers with write access? Please also add the Jira ticket.
When rewiring the graph after eliminating QDQ pairs, the runtime now checks whether the type matches before and after the eliminated nodes and inserts a Cast node if there is a mismatch.
19f5230 to 69f09bb
@mklimenk This feature only affects GPU and has no impact on NPU, right?
Yes, this code isn't called by the NPU path.
Pull Request Overview
This PR adds type mismatch detection and automatic Cast node insertion during QDQ (Quantize-Dequantize) pair elimination in the OpenVINO provider. The change addresses issues where floating point types differed before QuantizeLinear and after DequantizeLinear nodes.
Key Changes:
- Added type comparison logic before rewiring graph edges after QDQ elimination
- Added automatic Cast node insertion when a type mismatch between float and float16 is detected
onnxruntime/core/providers/openvino/qdq_transformations/qdq_scales_fix.cpp
onnxruntime/core/providers/openvino/qdq_transformations/qdq_scales_fix.cpp
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Thanks for making the modifications. Looks good!


Description
When rewiring the graph after eliminating QDQ pairs, the runtime now checks whether the type matches before and after the eliminated nodes and inserts a Cast node if there is a mismatch.
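The check described above can be sketched on a toy model. This is not the code from qdq_scales_fix.cpp: `ElemType`, `ToyNode`, `ToyChain`, and `StripQdq` are hypothetical names, and the graph is reduced to a linear chain of nodes. The enum values mirror the ONNX TensorProto codes for FLOAT, UINT8, and FLOAT16. The sketch removes each QuantizeLinear/DequantizeLinear pair and leaves a Cast in its place only when the type entering Q differs from the type leaving DQ.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins; values mirror ONNX TensorProto data type codes.
enum class ElemType : std::int32_t { Float = 1, UInt8 = 2, Float16 = 10 };

struct ToyNode {
  std::string op_type;
  ElemType input_type;
  ElemType output_type;
};

// A linear chain of nodes is enough to show the idea.
using ToyChain = std::vector<ToyNode>;

// Removes every adjacent QuantizeLinear -> DequantizeLinear pair.
// If the type feeding Q differs from the type DQ produced, a Cast
// node takes the place of the pair so downstream types stay valid.
ToyChain StripQdq(const ToyChain& chain) {
  ToyChain out;
  for (std::size_t i = 0; i < chain.size(); ++i) {
    const bool is_pair = i + 1 < chain.size() &&
                         chain[i].op_type == "QuantizeLinear" &&
                         chain[i + 1].op_type == "DequantizeLinear";
    if (!is_pair) {
      out.push_back(chain[i]);
      continue;
    }
    const ElemType before = chain[i].input_type;      // type entering Q
    const ElemType after = chain[i + 1].output_type;  // type leaving DQ
    if (before != after) {
      out.push_back({"Cast", before, after});
    }
    ++i;  // skip the DQ node as well
  }
  return out;
}
```

In this toy model, a chain whose Q input is float and whose DQ output is float16 ends up with a single Cast between the neighbors, while a chain with matching types is simply rewired with no extra node.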
Motivation and Context
At present, QDQ elimination assumes that the floating point type before the QuantizeLinear node is the same as the type after the following DequantizeLinear node, which produces errors when the types do not match.
If feature goes to new ABI?
Yes
Jira Ticket:
CVS-175447