Add 16a4w_block QAT config #15878
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15878

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 2 Unrelated Failures as of commit 0a8fd5c with merge base b4d72f1.

NEW FAILURES - The following jobs have failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
BROKEN TRUNK - The following job failed but was present on the merge base: 👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
@winskuo-quic can you review and approve this diff?
    self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if x.numel() == 0:
Would it be simpler to call torchao.quantization.quant_primitives._fake_quantize_affine directly?
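For readers unfamiliar with the primitive being suggested: affine fake quantization rounds a value onto the integer grid, clamps it to the quantized range, and dequantizes it back, so the model trains against quantization error while staying in floating point. Below is a minimal pure-Python sketch of that math; it does not attempt to match torchao's actual `_fake_quantize_affine` signature (which also handles block sizes and dtypes), so treat the helper name and parameters as illustrative assumptions.

```python
def fake_quantize(xs, scale, zero_point, qmin, qmax):
    """Affine fake quantization: round to the integer grid, clamp,
    then dequantize back to float (illustrative sketch, not torchao's API)."""
    out = []
    for x in xs:
        q = round(x / scale) + zero_point      # project onto integer grid
        q = max(qmin, min(qmax, q))            # clamp to the quantized range
        out.append((q - zero_point) * scale)   # dequantize back to float
    return out

# 4-bit signed range [-8, 7]: 10.0 saturates at 7 * 0.5 = 3.5
print(fake_quantize([1.3, 10.0, -0.2], scale=0.5, zero_point=0, qmin=-8, qmax=7))
# -> [1.5, 3.5, 0.0]
```

The saturation of 10.0 to 3.5 shows why scale selection (and hence the observer feeding the FakeQuantizer) matters: values outside the representable range are clipped, not rounded.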
I think for QAT testing we can use pseudo labels generated by the FP32 model, run a few mini training steps on the fake-quant model, and then compare its outputs against the FP32 baseline (the pseudo labels) within acceptable atol/rtol thresholds as usual.
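The comparison step suggested above is the standard elementwise tolerance check. A plain-Python sketch of that check (a hypothetical `within_tolerance` helper standing in for `torch.allclose`; the sample outputs are made up for illustration):

```python
def within_tolerance(actual, expected, atol=1e-2, rtol=1e-2):
    """Elementwise |a - e| <= atol + rtol * |e|, the same criterion
    torch.allclose uses (sketch with plain lists instead of tensors)."""
    return all(abs(a - e) <= atol + rtol * abs(e)
               for a, e in zip(actual, expected))

# pseudo labels from the FP32 model vs. outputs of the fake-quant model
fp32_out = [0.50, -1.20, 3.00]
qat_out  = [0.505, -1.21, 2.99]
print(within_tolerance(qat_out, fp32_out))  # -> True
```

The combined atol + rtol form keeps the check meaningful for both near-zero values (where atol dominates) and large values (where rtol dominates).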
Summary: Introduce a FakeQuantizer subclass. It falls back to the LPBQ observer's `convert`. `_derived_bias_quant_spec` also looks for it to correctly derive the bias scale.

Reviewed By: viveknayakatmeta

Differential Revision: D87194388
Force-pushed from eb2e9f9 to 0a8fd5c
Summary: Introduce a FakeQuantizer subclass. It falls back to the LPBQ observer's `convert`. `_derived_bias_quant_spec` also looks for it to correctly derive the bias scale. Open to suggestions on how to test. Naveen launched a QAT run and it seems to produce reasonable results.
Differential Revision: D87194388
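For context on the "16a4w_block" naming: 16-bit activations with 4-bit weights quantized block-wise, i.e. each weight row is split into fixed-size blocks and one scale is derived per block rather than per tensor or per channel. A rough pure-Python sketch of per-block symmetric scale derivation (the block size and the symmetric 4-bit range are illustrative assumptions, not the exact LPBQ scheme used in this PR):

```python
def blockwise_scales(weights, block_size=4, qmax=7):
    """Derive one symmetric scale per block: max |w| in the block / qmax.
    Illustrative sketch of block-wise 4-bit weight quantization; the real
    LPBQ observer's scheme may differ."""
    scales = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        amax = max(abs(w) for w in block)
        scales.append(amax / qmax if amax > 0 else 1.0)  # guard all-zero blocks
    return scales

# two blocks of four weights; each gets its own scale
row = [3.5, -0.1, 0.25, 0.2, -7.0, 1.5, 0.0, 0.75]
print(blockwise_scales(row))  # -> [0.5, 1.0]
```

Per-block scales let an outlier in one block (like the -7.0 here) saturate only its own block's range instead of inflating the scale, and hence the rounding error, for the entire row; this is why the bias quant spec has to know how to find the derived block scales.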