
Fix InsertIOQDQ KeyError for dequantize encodings (#18622)

Open
abhinaykukkadapu wants to merge 1 commit into pytorch:main from abhinaykukkadapu:fix-insert-io-qdq-keyerror-v2

Conversation

@abhinaykukkadapu
Contributor

@abhinaykukkadapu abhinaykukkadapu commented Mar 31, 2026

Summary:
q_dq_map only contained quantize ops as keys, so when a node with a dequantize encoding (e.g. a pre-quantized LLM parameter) feeds the output node, the lookup crashes with a KeyError.

Add dequantize ops as keys in q_dq_map, mapping them to the correct dequantize target for output boundary insertion (dq.default -> dq.tensor, matching the existing quantize convention).

Since dequantize targets are now keys, _create_node transfers QCOM_QUANT_ATTRS to inserted dequant nodes. To prevent the live iterator from revisiting these nodes, iterate over a snapshot via list(graph_module.graph.nodes).
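
The two parts of the fix described above can be sketched as follows. This is an illustrative sketch only, using string stand-ins for the real exir_ops targets handled by the Qualcomm InsertIOQDQ pass; the names q_default, dq_default, and dq_tensor are placeholders, not the actual op objects.

```python
# Illustrative sketch of the fix, using string stand-ins for the real
# exir_ops targets (placeholders, not the actual op objects).
q_default, dq_default, dq_tensor = "q.default", "dq.default", "dq.tensor"

# Before the fix: only quantize ops were keys, so looking up a node that
# carries a dequantize encoding raised a KeyError.
q_dq_map = {q_default: dq_tensor}

# After the fix: dequantize ops are keys too, mapped to the dequantize
# target used at the output boundary (dq.default -> dq.tensor).
q_dq_map[dq_default] = dq_tensor
assert q_dq_map[dq_default] == dq_tensor  # lookup no longer raises

# Snapshot iteration: freezing the node list up front means dq nodes
# inserted during the walk are not revisited by the loop.
nodes = ["input", "conv", "output"]
for n in list(nodes):            # iterate over a snapshot of the nodes
    if n == "conv":
        nodes.append("conv_dq")  # inserted node, never revisited
assert nodes == ["input", "conv", "output", "conv_dq"]
```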

Fixes #17732

Differential Revision: D98977887

Pulled By: abhinaykukkadapu

@pytorch-bot

pytorch-bot bot commented Mar 31, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18622

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (3 Unrelated Failures)

As of commit 7efb4e5 with merge base 8b30cfe:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 31, 2026
@github-project-automation github-project-automation bot moved this to To triage in ExecuTorch Core Mar 31, 2026
@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@meta-codesync
Contributor

meta-codesync bot commented Mar 31, 2026

@abhinaykukkadapu has imported this pull request. If you are a Meta employee, you can view this in D98977887.

@abhinaykukkadapu
Contributor Author

abhinaykukkadapu commented Mar 31, 2026

Moving the discussion from other PR.

@haowhsu-quic

Hi @abhinaykukkadapu, the root cause of #17732 was having too many shards due to unsupported ops like #16690. I believe the solution to #17732 has already been addressed in #17194.

Thanks for taking a look. Right, this task is not a 1-to-1 match for the exact issue, but there is a comment on the task that refers to the dq entry being unavailable in the map: #17732 (comment)

Also, #17194 was reverted in #17385; this change, with the list() snapshot semantics, should pass the tests.

@meta-codesync meta-codesync bot changed the title [Qualcomm] Fix InsertIOQDQ KeyError for dequantize encodings Fix InsertIOQDQ KeyError for dequantize encodings (#18622) Apr 1, 2026
abhinaykukkadapu added a commit to abhinaykukkadapu/executorch that referenced this pull request Apr 1, 2026
@abhinaykukkadapu abhinaykukkadapu force-pushed the fix-insert-io-qdq-keyerror-v2 branch from b7f5914 to ecae50d Compare April 1, 2026 04:20
@meta-codesync
Contributor

meta-codesync bot commented Apr 1, 2026

@abhinaykukkadapu has exported this pull request. If you are a Meta employee, you can view the originating Diff in D98977887.

abhinaykukkadapu added a commit to abhinaykukkadapu/executorch that referenced this pull request Apr 1, 2026
@abhinaykukkadapu abhinaykukkadapu force-pushed the fix-insert-io-qdq-keyerror-v2 branch from ecae50d to 14d0dbb Compare April 1, 2026 06:02

# insert dq before output or fold mix_quantization q if applicable
users = list(n.users.keys())
if n.meta.get(QCOM_QUANT_ATTRS) and any(
Collaborator


Feels like we could just add a check like n.target != exir_ops.edge.quantized_decomposed.dequantize_per_tensor.tensor for any dequantize nodes that have already been added.
Then we wouldn't need the list approach (which probably has a smaller memory footprint). But I still wonder if this is really an issue? It looks like the root cause of #17782 is that they tweaked the codebase.
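
The alternative suggested here can be sketched roughly as follows. This is a hypothetical illustration: DQ_TENSOR and needs_processing are stand-ins, not names from the actual pass, and the string merely represents the exir_ops dequantize target.

```python
# Hypothetical sketch of the suggested alternative: instead of snapshotting
# the node list, skip any node whose target is already the inserted
# dequantize op, so a live iterator cannot process it a second time.
DQ_TENSOR = "dequantize_per_tensor.tensor"  # stand-in for the exir_ops target

def needs_processing(target):
    # Nodes this pass itself inserted carry the dq target; skip them.
    return target != DQ_TENSOR

assert needs_processing("aten.convolution.default")
assert not needs_processing(DQ_TENSOR)
```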

Contributor Author


The issue this diff handles is not just the snapshot of the nodes, which the list() call solves. Here is what we want:

  1. For an input, we want to pop the quant attrs and insert a q node.
  2. For an output, we insert a dq node but we don't want to pop the attrs.

For example, the target (say a conv node) has q attrs, while a weight node that feeds the output will have dq attrs; we want to insert a dq node for both without popping. I will put up a patch where we can maybe reuse q_ops.
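
The pop-vs-no-pop distinction above can be sketched like this. It is an illustrative stand-in: the helper names insert_q_for_input and insert_dq_for_output are hypothetical, and only the QCOM_QUANT_ATTRS key name comes from the actual pass (its value here is a placeholder string).

```python
# Sketch of the input vs. output handling (stand-in helpers; only the
# QCOM_QUANT_ATTRS key name mirrors the real pass).
QCOM_QUANT_ATTRS = "quant_attrs"

def insert_q_for_input(node_meta):
    # Inputs: pop the quant attrs and use them for the inserted q node.
    return node_meta.pop(QCOM_QUANT_ATTRS)

def insert_dq_for_output(node_meta):
    # Outputs: read the attrs but leave them on the node (no pop).
    return node_meta[QCOM_QUANT_ATTRS]

meta = {QCOM_QUANT_ATTRS: {"scale": 0.1}}
assert insert_dq_for_output(meta) == {"scale": 0.1}
assert QCOM_QUANT_ATTRS in meta       # still present after dq insertion
assert insert_q_for_input(meta) == {"scale": 0.1}
assert QCOM_QUANT_ATTRS not in meta   # popped after q insertion
```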

Contributor Author

@abhinaykukkadapu abhinaykukkadapu Apr 1, 2026


@haowhsu-quic I'm referring to the comments on this task; the comment is irrelevant to the task it was posted on, but it points to the root cause: #17732 (comment)

@abhinaykukkadapu abhinaykukkadapu force-pushed the fix-insert-io-qdq-keyerror-v2 branch from 14d0dbb to 7efb4e5 Compare April 1, 2026 19:07
)
meta_val = node.meta["val"]
if target in self.q_dq_map:
if target in q_ops:
Contributor Author


This is to avoid popping the quant attrs on the target node when we are inserting a dq node.


Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported meta-exported

Projects

Status: To triage

Development

Successfully merging this pull request may close these issues.

Exception: An error occurred when running the 'InsertIOQDQ' pass after the following passes: ['FoldQDQ', 'InsertRequantize']

2 participants