Conversation

pytorchbot (Collaborator)
Summary:
When the XNNPACK weight cache is enabled and a model with qb4- or qc8-quantized linear weights is run, an assertion fires that is intended to ensure all data resides in the weight cache. This can be reproduced by running the XNNPACK backend linear op tests with the weight cache enabled.

The root cause appears to be that tensor scale data bypasses the weight cache, likely an oversight. This is not a correctness issue, but it causes the aforementioned assert to fail and uses marginally more memory than necessary.

This PR updates the XNNPACK compileModel call to store scale data in the weight cache instead of the unpacked_buffers list. With this change, the linear op tests pass with the weight cache enabled.
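The behavior the summary describes can be sketched with a toy model. Note that `WeightCache`, `compile_linear`, and the buffer names below are hypothetical stand-ins for illustration only, not ExecuTorch's or XNNPACK's actual API:

```python
class WeightCache:
    """Toy weight cache: owns named buffers and can tell whether a
    given buffer object is cache-managed (hypothetical stand-in)."""

    def __init__(self):
        self._entries = {}

    def insert(self, key, data):
        # Cache takes ownership and hands back the cache-owned buffer.
        self._entries[key] = bytes(data)
        return self._entries[key]

    def contains(self, buf):
        # Identity check: is this exact buffer object cache-owned?
        return any(buf is stored for stored in self._entries.values())


def compile_linear(cache, use_cache_for_scales):
    """Returns the buffers a packed linear op would reference."""
    unpacked_buffers = []  # freed after compilation in the real backend

    weights = cache.insert("linear.weight", [1, 2, 3, 4])

    if use_cache_for_scales:
        # After the fix: per-channel scale data also goes through the cache.
        scales = cache.insert("linear.weight_scales", [10, 20])
    else:
        # Before the fix: scale data bypassed the cache via the
        # unpacked_buffers list, so the "all data is in the weight
        # cache" assertion would fire.
        unpacked_buffers.append(bytes([10, 20]))
        scales = unpacked_buffers[-1]

    return [weights, scales]


def all_cached(buffers, cache):
    # The assertion the summary describes, in spirit: every buffer
    # handed to the packed op must be owned by the weight cache.
    return all(cache.contains(b) for b in buffers)
```

With `use_cache_for_scales=True`, every buffer the op references is cache-owned and the check passes; with `use_cache_for_scales=False`, the scale buffer is outside the cache and the check fails, mirroring the assertion failure described above.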

Differential Revision: D82862629

Pull Request resolved: #14448

(cherry picked from commit cf1c4bc)

pytorch-bot bot commented Sep 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14455

Note: Links to docs will display an error until the docs builds complete.

✅ No Failures

As of commit 32fe0bf with merge base e0dda90:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 22, 2025
@shoumikhin shoumikhin merged commit cae59f8 into release/1.0 Sep 22, 2025
123 of 124 checks passed
@shoumikhin shoumikhin deleted the cherry-pick-14448-by-pytorch_bot_bot_ branch September 22, 2025 16:35