@manuelcandales manuelcandales commented Feb 2, 2026

This pull request adds support for quantized model testing (specifically int4 quantization) to the Metal backend module tests, using torchao. It introduces the ability to specify quantization options in the module registry, applies quantization to linear layers as needed, and updates the testing pipeline to compare quantized models against unquantized references when appropriate. A new test model with int4 quantization is also added.

Quantization Support and Configuration:

  • Added imports and availability checks for torchao and its quantization APIs, enabling conditional quantization support in tests.
  • Expanded the module registry documentation to describe new quantization-related options (qlinear, qlinear_group_size, compare_to_unquantized) and how to use them for int4 quantization.
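As a rough illustration of how such registry options might be laid out and read, here is a minimal sketch. Only the three option names (qlinear, qlinear_group_size, compare_to_unquantized) come from this description; the registry name EXAMPLE_REGISTRY, the entry names, the helper quantization_options, the tolerance keys, and all values are hypothetical.

```python
from typing import Any, Dict, Optional

# Hypothetical registry layout. Only the keys "qlinear", "qlinear_group_size"
# and "compare_to_unquantized" are named in the PR description; everything
# else here (entry names, values, tolerance keys) is assumed for illustration.
EXAMPLE_REGISTRY: Dict[str, Dict[str, Any]] = {
    "linear_nobias": {
        "description": "Linear layer without bias, no quantization",
    },
    "linear_nobias_int4": {
        "description": "Linear layer without bias, int4 weights",
        "qlinear": "int4",                # assumed spelling of the scheme
        "qlinear_group_size": 32,         # assumed group size
        "compare_to_unquantized": False,  # assumed value
        "atol": 1e-2,                     # assumed tolerance keys
        "rtol": 1e-2,
    },
}


def quantization_options(name: str) -> Dict[str, Optional[Any]]:
    """Return the quantization-related options for one registry entry.

    Entries that omit the keys are treated as unquantized.
    """
    entry = EXAMPLE_REGISTRY[name]
    return {
        "qlinear": entry.get("qlinear"),
        "qlinear_group_size": entry.get("qlinear_group_size"),
        "compare_to_unquantized": entry.get("compare_to_unquantized", False),
    }
```

Under this assumed layout, an entry without a qlinear key is left unquantized, so existing registry entries would not need to change.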

Test Model and Registry Enhancements:

  • Introduced LinearNoBiasInt4, a linear layer test model without bias and with int4 quantization, and registered it in MODULE_REGISTRY with appropriate quantization and tolerance settings.

Quantization Application Logic:

  • Extended get_model_and_inputs to accept quantization arguments and apply quantization to models as specified in the registry, and added the quantize_model helper, which performs the quantization using torchao.
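To make the role of the group size concrete, the following pure-Python sketch shows one common form of group-wise 4-bit weight quantization: symmetric, with one scale per group. It is not the torchao implementation and not the code in this PR; the function names and the symmetric scheme are assumptions chosen to illustrate why a quantized linear layer needs looser tolerances than a float one.

```python
from typing import List, Tuple


def quantize_int4_groupwise(
    weights: List[float], group_size: int
) -> Tuple[List[int], List[float]]:
    """Symmetric 4-bit quantization with one scale per group (illustrative).

    Each group of `group_size` consecutive weights shares a scale chosen so
    that the largest magnitude in the group maps to the code 7.
    """
    if group_size <= 0 or len(weights) % group_size != 0:
        raise ValueError("len(weights) must be a positive multiple of group_size")
    codes: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # An all-zero group gets an arbitrary scale of 1.0; its codes are all 0.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        # Clamp to the signed 4-bit range [-8, 7].
        codes.extend(max(-8, min(7, round(w / scale))) for w in group)
    return codes, scales


def dequantize(codes: List[int], scales: List[float], group_size: int) -> List[float]:
    """Map 4-bit codes back to floats using the scale of their group."""
    return [q * scales[i // group_size] for i, q in enumerate(codes)]


weights = [0.10, -0.70, 0.35, 0.02, 3.0, -1.5, 0.75, 0.0]
codes, scales = quantize_int4_groupwise(weights, group_size=4)
restored = dequantize(codes, scales, group_size=4)
```

In this scheme the rounding error of each weight is at most half the scale of its group, so smaller groups track the weights more closely at the cost of storing more scales.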

Testing Pipeline Updates:

  • Updated export_model_to_files to optionally compare quantized model outputs to unquantized references, handling device placement and ensuring correct output computation for quantized models.
  • Modified the main test routine to determine whether to compare quantized models to unquantized references based on the registry configuration.
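The reference selection described above can be sketched roughly as follows. This is a simplified stand-in, not the PR's code: pick_reference and outputs_close are hypothetical helpers, the default of False for compare_to_unquantized is an assumption, and plain Python lists stand in for tensors. The closeness check uses the same formula as torch.allclose, |actual - expected| <= atol + rtol * |expected|.

```python
from typing import Any, Callable, Dict, List, Sequence

Forward = Callable[[Sequence[float]], List[float]]


def pick_reference(
    entry: Dict[str, Any], quantized_fn: Forward, unquantized_fn: Forward
) -> Forward:
    """Choose which eager model produces the expected outputs.

    With compare_to_unquantized set, the backend output is checked against
    the original float model; otherwise against the quantized eager model.
    """
    if entry.get("compare_to_unquantized", False):  # default is an assumption
        return unquantized_fn
    return quantized_fn


def outputs_close(
    actual: Sequence[float], expected: Sequence[float], atol: float, rtol: float
) -> bool:
    """Element-wise check: |actual - expected| <= atol + rtol * |expected|."""
    if len(actual) != len(expected):
        return False
    return all(abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected))
```

Comparing against the unquantized model measures total quantization error and therefore generally needs looser tolerances than comparing against the quantized eager model, which isolates backend differences.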

[ghstack-poisoned]

pytorch-bot bot commented Feb 2, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17118

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 2 Unrelated Failures

As of commit fec15bc with merge base ba6de95:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

manuelcandales added a commit that referenced this pull request Feb 2, 2026
ghstack-source-id: 7873518
ghstack-comment-id: 3837492780
Pull-Request: #17118
@meta-cla meta-cla bot added the CLA Signed label Feb 2, 2026
@manuelcandales manuelcandales requested review from larryliu0820 and mergennachin and removed request for cccclai and shoumikhin February 2, 2026 21:33
manuelcandales added a commit that referenced this pull request Feb 2, 2026
ghstack-source-id: 0925784
ghstack-comment-id: 3837492780
Pull-Request: #17118
manuelcandales added a commit that referenced this pull request Feb 2, 2026
ghstack-source-id: f6fba00
ghstack-comment-id: 3837492780
Pull-Request: #17118
manuelcandales added a commit that referenced this pull request Feb 2, 2026
ghstack-source-id: 5363293
ghstack-comment-id: 3837492780
Pull-Request: #17118