NXP backend: Add QAT support for NeutronQuantizer #15692
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15692
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure) As of commit 5c2ff05 with merge base 32916d3.
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk: …
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "release notes: nxp" |
|
@pytorchbot label "module: nxp" |
Force-pushed f174949 to 35fce40
Force-pushed 28ad64a to 26ceba4
Pull Request Overview
This PR adds Quantization Aware Training (QAT) support to the NeutronQuantizer backend. The implementation uses fake quantization with moving-average observers in QAT mode while maintaining backward compatibility with the existing Post-Training Quantization (PTQ) mode.
- Introduces an `is_qat` parameter throughout the quantization pipeline to control QAT vs. PTQ behavior (see the sketch after this list)
- Updates all quantization patterns to support QAT with the appropriate fake quantizers and observers
- Adds comprehensive test coverage for both QAT and PTQ modes across all test files
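A minimal sketch of how such an `is_qat` switch can choose between a fake quantizer (QAT) and a plain observer (PTQ), built on public `torch.ao.quantization` classes; the function name and the int8 range here are illustrative assumptions, not the PR's actual code:

```python
import torch
from torch.ao.quantization import (
    FakeQuantize,
    MinMaxObserver,
    MovingAverageMinMaxObserver,
)

def activation_qspec(is_qat: bool):
    """Pick the activation observer/fake-quantize constructor (illustrative)."""
    if is_qat:
        # QAT: fake quantization driven by a moving-average observer, so
        # quantization error is simulated during training.
        return FakeQuantize.with_args(
            observer=MovingAverageMinMaxObserver,
            dtype=torch.qint8,
            quant_min=-128,
            quant_max=127,
        )
    # PTQ: a plain observer that only records value ranges during calibration.
    return MinMaxObserver.with_args(dtype=torch.qint8, quant_min=-128, quant_max=127)
```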
Reviewed Changes
Copilot reviewed 31 out of 31 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| `backends/nxp/quantizer/neutron_quantizer.py` | Refactors quantization specs into functions that return QAT- or PTQ-specific specs based on the `is_qat` parameter |
| `backends/nxp/quantizer/patterns.py` | Updates the `QuantizationPattern` base class and all pattern classes to accept an `is_qat` parameter and use the appropriate observers or fake quantizers |
| `backends/nxp/quantizer/utils.py` | Adds QAT support to the `post_training_quantize` function via `prepare_qat_pt2e` and `move_exported_model_to_eval` (see the sketch after this table) |
| `backends/nxp/tests/use_qat.py` | Adds a pytest fixture to parametrize tests with QAT and PTQ modes |
| `backends/nxp/tests/executorch_pipeline.py` | Threads the `use_qat` parameter through the pipeline functions |
| `backends/nxp/tests/models.py` | Adds an MLP model for the QAT training test |
| `backends/nxp/tests/test_quantizer.py` | Adds QAT-specific tests, including a training test and a graph-comparison test |
| All other test files | Updates tests to parametrize with the `use_qat` fixture for comprehensive coverage |
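As a rough illustration of the `utils.py` change above, here is a hedged sketch of a combined PTQ/QAT entry point on the public PT2E APIs; the `quantize` name and the `train_fn` hook are hypothetical, and `export_for_training` assumes a recent PyTorch (2.5+):

```python
import torch
from torch.export import export_for_training
from torch.ao.quantization import move_exported_model_to_eval
from torch.ao.quantization.quantize_pt2e import (
    convert_pt2e,
    prepare_pt2e,
    prepare_qat_pt2e,
)

def quantize(model, example_inputs, quantizer, is_qat: bool, train_fn=None):
    # Capture a pre-dispatch graph that still supports autograd (needed for QAT).
    graph = export_for_training(model, example_inputs).module()
    if is_qat:
        prepared = prepare_qat_pt2e(graph, quantizer)
        if train_fn is not None:
            train_fn(prepared)  # fine-tune with fake quantization in the graph
        # Switch dropout/batch norm to eval behavior before conversion.
        move_exported_model_to_eval(prepared)
    else:
        prepared = prepare_pt2e(graph, quantizer)
        prepared(*example_inputs)  # PTQ: calibration pass to populate observers
    return convert_pt2e(prepared)
```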
Comments suppressed due to low confidence (4)
backends/nxp/quantizer/patterns.py:1
- [nitpick] The order of initialization should be consistent with parent-class expectations. Consider moving `super().__init__(is_qat=is_qat)` to be the first statement in `__init__`, following Python best practices for constructor ordering.
backends/nxp/quantizer/utils.py:1
- The function name `post_training_quantize` is misleading when `is_qat=True`, as QAT (Quantization Aware Training) is not post-training quantization. Consider renaming to `quantize_model` or `prepare_and_convert_model` to reflect that it handles both PTQ and QAT modes.
backends/nxp/quantizer/patterns.py:210
- This class does not call `QuantizationPattern.__init__` during initialization (`AddmmPattern.__init__` may be missing a call to the base class `__init__`).
`class AddmmPattern(QuantizationPattern):`
backends/nxp/quantizer/patterns.py:841
- This class does not call `QuantizationPattern.__init__` during initialization (`ActivationsConcatClusterPattern.__init__` may be missing a call to the base class `__init__`). A minimal illustration of the fix follows below.
`class ActivationsConcatClusterPattern(QuantizationPattern):`
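Both suppressed comments point at the same fix. A self-contained illustration, with a hypothetical `QuantizationPattern` stub standing in for the backend's real base class:

```python
class QuantizationPattern:
    # Hypothetical stub; the real base class lives in backends/nxp/quantizer/patterns.py.
    def __init__(self, is_qat: bool = False):
        self.is_qat = is_qat  # patterns read this to pick observers vs. fake quantizers

class AddmmPattern(QuantizationPattern):
    def __init__(self, is_qat: bool):
        # Call the base-class initializer first, as the nitpick suggests,
        # so self.is_qat is set before any pattern-specific setup runs.
        super().__init__(is_qat=is_qat)
        self.example_state = None  # illustrative pattern-specific state
```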
Looking back at my implementation changes, I should've renamed the …
Force-pushed 26ceba4 to 8e58be6
Force-pushed 8e58be6 to 0912c53
Pull Request Overview
Copilot reviewed 34 out of 34 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (1)
backends/nxp/quantizer/patterns.py:845
- This class does not call `QuantizationPattern.__init__` during initialization (`ActivationsConcatClusterPattern.__init__` may be missing a call to the base class `__init__`).
`class ActivationsConcatClusterPattern(QuantizationPattern):`
Force-pushed 0912c53 to 4e30d39
jirioc left a comment:
Looks good to me.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
Force-pushed 7744543 to 5c2ff05
@StrycekSimon @robert-kalmar This PR is a significant change to the backend workflow and as such should be documented in the backend docs. There is still an open PR with docs changes that precedes the QAT feature. Please merge that other PR and update the docs in this PR or in a new one.
Summary
Adds quantization-aware training (QAT) support for the NeutronQuantizer.
Test plan
New test cases covering QAT mode were added, plus a dedicated test for training a simple NN in QAT mode; a hedged sketch of such a test follows.
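For concreteness, a sketch of what such a training smoke test could look like; the inline MLP and the direct PT2E calls are illustrative assumptions rather than the PR's actual `models.py` and pipeline helpers, and `quantizer` stands in for a configured NeutronQuantizer:

```python
import torch
from torch.export import export_for_training
from torch.ao.quantization import move_exported_model_to_eval
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_qat_pt2e

def test_qat_training_smoke(quantizer):  # quantizer: e.g. a configured NeutronQuantizer
    model = torch.nn.Sequential(
        torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 4)
    )
    example_inputs = (torch.randn(4, 8),)
    prepared = prepare_qat_pt2e(
        export_for_training(model, example_inputs).module(), quantizer
    )

    # A few optimizer steps with fake quantization active in the graph.
    optimizer = torch.optim.SGD(prepared.parameters(), lr=0.01)
    for _ in range(3):
        optimizer.zero_grad()
        prepared(*example_inputs).sum().backward()
        optimizer.step()

    move_exported_model_to_eval(prepared)
    converted = convert_pt2e(prepared)
    assert converted(*example_inputs).shape == (4, 4)
```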
cc @robert-kalmar @JakeStevens @digantdesai