@jerryzh168 jerryzh168 (Contributor) commented Sep 5, 2025

Summary:
We found that packing formats see little reuse across dtypes, so we now define a separate packing format for each dtype (int4, float8, intx) instead of a single global `packing_format` shared by all tensors. This reduces interference between the different dtype configs.

Changes

  • Moved the int4 packing format into an `Int4PackingFormat` enum
  • Renamed the `packing_format` argument in `Int4WeightOnlyConfig` and `Float8DynamicActivationInt4WeightConfig` to `int4_packing_format`
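The per-dtype enum pattern described above can be sketched as follows. This is an illustrative mock, not torchao's actual code: the enum member names (taken from the test file names in the test plan) and the `group_size` field are assumptions.

```python
from dataclasses import dataclass
from enum import Enum

# Illustrative stand-in for the per-dtype packing-format enum introduced in
# this PR; member names are assumptions based on the regression-test names.
class Int4PackingFormat(str, Enum):
    PLAIN = "plain"
    PRESHUFFLED = "preshuffled"
    MARLIN_SPARSE = "marlin_sparse"

@dataclass
class Int4WeightOnlyConfig:
    group_size: int = 128  # hypothetical field, for illustration only
    # renamed from the previously global `packing_format` argument
    int4_packing_format: Int4PackingFormat = Int4PackingFormat.PLAIN

cfg = Int4WeightOnlyConfig(int4_packing_format=Int4PackingFormat.PRESHUFFLED)
print(cfg.int4_packing_format.value)  # → preshuffled
```

Scoping the enum to int4 means a float8 or intx config can define its own packing-format enum without the variants of unrelated dtypes leaking into its type signature.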

This doesn't change the tensor subclasses, so there are no BC changes for them. It does break BC for v2 of `Int4WeightOnlyConfig`, but no official models have been saved with that config yet, so this is fine. We also didn't add BC testing for it, since the config isn't finalized yet; we'll add that later.

Test Plan:
Regression tests:
python test/quantization/quantize_/workflows/int4/test_int4_marlin_sparse_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_opaque_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_preshuffled_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_tile_packed_to_4d_tensor.py
python test/core/test_config.py
python test/integration/test_load_and_run_checkpoint.py


pytorch-bot bot commented Sep 5, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2946

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 715ec74 with merge base e7b310b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 5, 2025
@jerryzh168 jerryzh168 added the topic: improvement Use this tag if this PR is an improvement (doesn't fit into any of the other categories) label Sep 5, 2025
@jerryzh168 jerryzh168 force-pushed the int4-packing-format-refactor branch from c11aef3 to e55001b Compare September 5, 2025 19:46
@jerryzh168 jerryzh168 force-pushed the int4-packing-format-refactor branch from e55001b to 715ec74 Compare September 5, 2025 19:53
@jerryzh168 jerryzh168 merged commit 4872c4f into main Sep 5, 2025
18 checks passed
