@MartinPavella
Collaborator

@MartinPavella MartinPavella commented Oct 15, 2025

Summary

This PR adds an edge dialect pre-processing pass to remove some Q/DQ nodes. This enables some non-delegated nodes (which run on the CPU) to run directly in int8 and avoid the QDQ compute overhead. This improves inference speed by eliminating the need to artificially quantize and dequantize input and output values.
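
As a toy illustration of why such a removal can be safe (this is not the pass from this PR; every function name below is hypothetical): assuming the dequantize and quantize nodes around an op share the same scale and zero point, and the op only moves data without changing values, the DQ -> op -> Q cluster gives the same int8 result as running the op directly on the int8 data.

```python
from typing import List


def dequantize(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Map int8 values to floats with the usual affine scheme."""
    return [(v - zero_point) * scale for v in q]


def quantize(x: List[float], scale: float, zero_point: int) -> List[int]:
    """Map floats back to int8: round, shift by the zero point, clamp."""
    return [max(-128, min(127, round(v / scale) + zero_point)) for v in x]


def data_movement_op(values: list) -> list:
    """Hypothetical stand-in for an op that only reorders elements."""
    return list(reversed(values))


def run_with_qdq(q: List[int], scale: float, zero_point: int) -> List[int]:
    """DQ -> op -> Q: the op computes in float between the two conversions."""
    floats = dequantize(q, scale, zero_point)
    return quantize(data_movement_op(floats), scale, zero_point)


def run_without_qdq(q: List[int]) -> List[int]:
    """The same op applied directly to the int8 values."""
    return data_movement_op(q)


if __name__ == "__main__":
    # Assumed quantization parameters, chosen only for the demonstration.
    int8_input = list(range(-128, 128))
    same = run_with_qdq(int8_input, 0.05, 3) == run_without_qdq(int8_input)
    print("results match:", same)
```

The equivalence depends on both assumptions. An op that changes values, or a cluster whose input and output quantization parameters differ, would need separate handling.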

Test plan

Unit tests provided.

cc @robert-kalmar

@pytorch-bot

pytorch-bot bot commented Oct 15, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15148

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit b2831c3 with merge base 3b1aeda (image):

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 15, 2025
@digantdesai digantdesai added the module: nxp Issues related to NXP Neutron NPU delegation and code under backends/nxp/ label Oct 27, 2025
@roman-janik-nxp roman-janik-nxp changed the title from "NXP Backend: Add padd to remove unnecessary Quantize/Dequantize nodes." to "NXP Backend: Add pass to remove unnecessary Quantize/Dequantize nodes." Oct 30, 2025
@MartinPavella MartinPavella force-pushed the upstream/main-nxp/EIEX-519-upstream-removeadditionalqdqclusters-pass branch from aa651f1 to 972ad89 Compare November 18, 2025 13:49
@MartinPavella
Collaborator Author

@pytorchbot label "module: nxp" "release notes: nxp"

@pytorch-bot pytorch-bot bot added the release notes: nxp Changes to the NXP Neutron backend delegate label Nov 18, 2025
@MartinPavella MartinPavella force-pushed the upstream/main-nxp/EIEX-519-upstream-removeadditionalqdqclusters-pass branch from 972ad89 to 66e43e8 Compare November 20, 2025 08:29
@MartinPavella MartinPavella marked this pull request as ready for review November 20, 2025 08:30
@MartinPavella MartinPavella force-pushed the upstream/main-nxp/EIEX-519-upstream-removeadditionalqdqclusters-pass branch from 66e43e8 to b093129 Compare November 20, 2025 09:05
@MartinPavella MartinPavella force-pushed the upstream/main-nxp/EIEX-519-upstream-removeadditionalqdqclusters-pass branch 4 times, most recently from d5ba591 to e529f83 Compare November 25, 2025 09:11
@MartinPavella MartinPavella force-pushed the upstream/main-nxp/EIEX-519-upstream-removeadditionalqdqclusters-pass branch from e529f83 to b276902 Compare November 26, 2025 07:25
@robert-kalmar
Collaborator

Update the Summary; the pass has a different intention:

This PR adds an edge dialect pre-processing pass to remove some Q/DQ nodes. This enables some non-delegated nodes (which run on the CPU) to run directly in int8 and avoid the QDQ compute overhead. This improves inference speed by eliminating the need to artificially quantize and dequantize input and output values.

@robert-kalmar robert-kalmar force-pushed the upstream/main-nxp/EIEX-519-upstream-removeadditionalqdqclusters-pass branch from b276902 to b2831c3 Compare December 2, 2025 08:19
@MartinPavella MartinPavella merged commit c00d726 into pytorch:main Dec 3, 2025
140 of 141 checks passed

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. module: nxp Issues related to NXP Neutron NPU delegation and code under backends/nxp/ release notes: nxp Changes to the NXP Neutron backend delegate

4 participants