Add 16a4w_block QAT config #15878
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15878

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 2 Unrelated Failures as of commit 0a8fd5c with merge base b4d72f1.

NEW FAILURES - The following jobs have failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
BROKEN TRUNK - The following job failed but was present on the merge base: 👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
@winskuo-quic can you review and approve this diff?
    self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if x.numel() == 0:
Would it be simpler to call torchao.quantization.quant_primitives._fake_quantize_affine directly?
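For readers unfamiliar with the primitive being suggested: affine fake quantization rounds a value onto the integer grid, clamps it to the quantized range, and dequantizes it back, so the model trains against quantization error while staying in floating point. Below is a minimal pure-Python sketch of that math; it does not attempt to match torchao's actual `_fake_quantize_affine` signature (which also handles block sizes and dtypes), so treat the helper name and parameters as illustrative assumptions.

```python
def fake_quantize(xs, scale, zero_point, qmin, qmax):
    """Affine fake quantization: round to the integer grid, clamp,
    then dequantize back to float (illustrative sketch, not torchao's API)."""
    out = []
    for x in xs:
        q = round(x / scale) + zero_point      # project onto integer grid
        q = max(qmin, min(qmax, q))            # clamp to the quantized range
        out.append((q - zero_point) * scale)   # dequantize back to float
    return out

# 4-bit signed range [-8, 7]: 10.0 saturates at 7 * 0.5 = 3.5
print(fake_quantize([1.3, 10.0, -0.2], scale=0.5, zero_point=0, qmin=-8, qmax=7))
# -> [1.5, 3.5, 0.0]
```

The saturation of 10.0 to 3.5 shows why scale selection (and hence the observer feeding the FakeQuantizer) matters: values outside the representable range are clipped, not rounded.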
I think for QAT testing we can use pseudo labels generated by the FP32 model, run a few mini training steps on the fake-quant model, and then compare its outputs against the FP32 baseline (the pseudo labels) within acceptable atol/rtol thresholds as usual.
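The comparison step suggested above is the standard elementwise tolerance check. A plain-Python sketch of that check (a hypothetical `within_tolerance` helper standing in for `torch.allclose`; the sample outputs are made up for illustration):

```python
def within_tolerance(actual, expected, atol=1e-2, rtol=1e-2):
    """Elementwise |a - e| <= atol + rtol * |e|, the same criterion
    torch.allclose uses (sketch with plain lists instead of tensors)."""
    return all(abs(a - e) <= atol + rtol * abs(e)
               for a, e in zip(actual, expected))

# pseudo labels from the FP32 model vs. outputs of the fake-quant model
fp32_out = [0.50, -1.20, 3.00]
qat_out  = [0.505, -1.21, 2.99]
print(within_tolerance(qat_out, fp32_out))  # -> True
```

The combined atol + rtol form keeps the check meaningful for both near-zero values (where atol dominates) and large values (where rtol dominates).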
Summary: Introduce a FakeQuantizer subclass. It falls back to the LPBQ observer's `convert`. `_derived_bias_quant_spec` also looks for it to correctly derive the bias scale.

Reviewed By: viveknayakatmeta

Differential Revision: D87194388
Force-pushed from eb2e9f9 to 0a8fd5c
Summary: Introduce a FakeQuantizer subclass. It falls back to the LPBQ observer's `convert`. `_derived_bias_quant_spec` also looks for it to correctly derive the bias scale. Open to suggestions on how to test. Naveen launched a QAT run and it seems to produce reasonable results.
Differential Revision: D87194388
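For context on the "16a4w_block" naming: 16-bit activations with 4-bit weights quantized block-wise, i.e. each weight row is split into fixed-size blocks and one scale is derived per block rather than per tensor or per channel. A rough pure-Python sketch of per-block symmetric scale derivation (the block size and the symmetric 4-bit range are illustrative assumptions, not the exact LPBQ scheme used in this PR):

```python
def blockwise_scales(weights, block_size=4, qmax=7):
    """Derive one symmetric scale per block: max |w| in the block / qmax.
    Illustrative sketch of block-wise 4-bit weight quantization; the real
    LPBQ observer's scheme may differ."""
    scales = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        amax = max(abs(w) for w in block)
        scales.append(amax / qmax if amax > 0 else 1.0)  # guard all-zero blocks
    return scales

# two blocks of four weights; each gets its own scale
row = [3.5, -0.1, 0.25, 0.2, -7.0, 1.5, 0.0, 0.75]
print(blockwise_scales(row))  # -> [0.5, 1.0]
```

Per-block scales let an outlier in one block (like the -7.0 here) saturate only its own block's range instead of inflating the scale, and hence the rounding error, for the entire row; this is why the bias quant spec has to know how to find the derived block scales.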