Use weight cache for quantized tensor scale data #14455

pytorchbot · 2025-09-22T04:05:16Z

Summary:
Enabling the XNNPACK weight cache and running a model with qb4- or qc8-quantized linear weights triggers an assertion that is intended to make sure all data is in the weight cache. This can be reproduced by running the XNNPACK backend linear op tests with weight cache enabled.

The root cause appears to be that tensor scale data is bypassing the weight cache - likely an oversight. This isn't a correctness issue, but does cause the aforementioned assert to fail and uses marginally more memory than it otherwise needs to.

This PR updates the XNNPACK compileModel call to use the weight cache for scale data (instead of putting it in the unpacked_buffers list). With this change, the linear op tests pass with weight cache enabled.
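
To illustrate the failure mode described above, here is a minimal toy model in Python. It is not the ExecuTorch or XNNPACK code: `WeightCache`, `compile_linear`, and the `scales_use_cache` flag are hypothetical names invented for this sketch, and the real change lives in the C++ compileModel path. The sketch only assumes that an assertion checks that every tensor's data came from the cache.

```python
class WeightCache:
    """Toy stand-in for a weight cache (hypothetical, not the ExecuTorch API)."""

    def __init__(self):
        self._entries = {}

    def load(self, name, data):
        # Store each named blob once; later loads return the cached object.
        return self._entries.setdefault(name, data)

    def contains(self, data):
        # Identity check: was this exact buffer handed out by the cache?
        return any(entry is data for entry in self._entries.values())


def compile_linear(weight, scale, cache, scales_use_cache):
    """Return the list of buffers that were kept outside the cache."""
    unpacked_buffers = []
    w = cache.load("weight", weight)
    if scales_use_cache:
        # Behavior after the fix: scale data also goes through the cache.
        s = cache.load("scale", scale)
    else:
        # Behavior before the fix: scale data is kept in a side list.
        unpacked_buffers.append(scale)
        s = scale
    # Stand-in for the assertion that all data is in the weight cache.
    assert all(cache.contains(t) for t in (w, s)), "data bypassed the weight cache"
    return unpacked_buffers
```

In this toy model, calling `compile_linear(..., scales_use_cache=False)` trips the assertion because the scale buffer never entered the cache, while `scales_use_cache=True` passes and leaves the side list empty.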

Differential Revision: D82862629

Pull Request resolved: #14448 (cherry picked from commit cf1c4bc)

pytorch-bot · 2025-09-22T04:05:19Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14455

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 32fe0bf with merge base e0dda90:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Use weight cache for quantized tensor scale data

32fe0bf

pytorchbot requested a review from digantdesai as a code owner September 22, 2025 04:05

This was referenced Sep 22, 2025

[v1.0.0] Release Tracker #14288

Open

Use weight cache for quantized tensor scale data #14448

Merged

meta-cla bot added the CLA Signed label Sep 22, 2025

shoumikhin approved these changes Sep 22, 2025

shoumikhin merged commit cae59f8 into release/1.0 Sep 22, 2025
123 of 124 checks passed

shoumikhin deleted the cherry-pick-14448-by-pytorch_bot_bot_ branch September 22, 2025 16:35
