
Conversation

brian-dellabetta
Contributor

This deprecates `TransformScheme.head_dim` in favor of `TransformScheme.block_size`, which is more meaningful now that we want to apply block-diagonal transforms with a user-configured block size.
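A minimal sketch of how such a deprecation shim might look with a pydantic `model_validator` (illustrative only; the real `TransformScheme` has additional fields, and this is not the exact compressed-tensors implementation):

```python
import warnings
from typing import Optional

from pydantic import BaseModel, model_validator


class TransformScheme(BaseModel):
    """Simplified stand-in for the real scheme; only the renamed field is shown."""

    type: str = "hadamard"
    block_size: Optional[int] = None

    @model_validator(mode="before")
    @classmethod
    def _deprecate_head_dim(cls, data):
        # Accept the legacy `head_dim` key, warn, and remap it to `block_size`.
        if isinstance(data, dict) and "head_dim" in data:
            if data.get("block_size") is not None:
                raise ValueError("set only one of `block_size` and `head_dim`")
            warnings.warn(
                "`head_dim` is deprecated, use `block_size` instead",
                DeprecationWarning,
            )
            data["block_size"] = data.pop("head_dim")
        return data
```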

To be merged in conjunction with:

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
@brian-dellabetta brian-dellabetta changed the title from "TransformScheme.block_size, deprecate head_dim" to "[transforms] TransformScheme.block_size, deprecate head_dim" on Sep 11, 2025
Contributor

@kylesayrs kylesayrs left a comment

Looks good

Contributor

@fynnsu fynnsu left a comment

Looks good! I think pydantic also has an alias option, which might be an alternative to the custom_validator. That might also enforce that both values aren't set. But this way is also fine, and I'm not sure how well alias works with the deprecation warnings.
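For reference, the alias route could look roughly like this in pydantic v2 (a sketch of the suggestion, not what was merged, and it would not emit a deprecation warning on its own):

```python
from typing import Optional

from pydantic import AliasChoices, BaseModel, Field


class TransformScheme(BaseModel):
    # Accept either `block_size` or the legacy `head_dim` key on input;
    # model_dump() writes `block_size` unless a serialization alias is configured.
    block_size: Optional[int] = Field(
        default=None,
        validation_alias=AliasChoices("block_size", "head_dim"),
    )
```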

@brian-dellabetta
Contributor Author

> Looks good! I think pydantic also has an alias option, which might be an alternative to the custom_validator. That might also enforce that both values aren't set. But this way is also fine, and I'm not sure how well alias works with the deprecation warnings.

Thanks, yeah, I played around with alias quite a bit, but it serializes to the aliased field, which I didn't want. I couldn't find a combination that would load the aliased value and save it as the new field.
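For illustration, with a validator shim like the sketch above, an old config that still sets `head_dim` would load with a warning but round-trip to the new field name:

```python
scheme = TransformScheme(type="hadamard", head_dim=128)  # emits DeprecationWarning
print(scheme.block_size)    # 128
print(scheme.model_dump())  # {'type': 'hadamard', 'block_size': 128}
```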

@kylesayrs kylesayrs merged commit 0e5df88 into main Sep 17, 2025
2 checks passed
@kylesayrs kylesayrs deleted the bdellabe/transforms-configure-hadamard-size branch September 17, 2025 14:26
brian-dellabetta added a commit to vllm-project/llm-compressor that referenced this pull request Sep 17, 2025
…field (#1806)

SUMMARY:
Resolves `INFERENG-1882`

The research community [has pointed
out](https://github.com/IST-DASLab/FP-Quant?tab=readme-ov-file#fp-format-quantization-harness)
that the rotation/transform block size is important when performing
transforms:

> Key to efficiency is that the Hadamard block size matches the
microscaling format group size (16 or 32)

This exposes a new field, `transform_block_size`, on `SpinQuantModifier` and `QuIPModifier`, allowing the user to set the block size to an arbitrary value as long as the model's `hidden_size` and `head_dim` are both evenly divisible by it.
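As a usage sketch (not part of this commit; the import path is assumed from the test layout below, and other constructor arguments are left at their defaults):

```python
from llmcompressor.modifiers.transform import QuIPModifier

# transform_block_size must evenly divide both the model's hidden_size and
# head_dim; 32 matches the microscaling group size cited above.
modifier = QuIPModifier(transform_block_size=32)
```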

- [x] Add to SpinQuant Modifier. Option to allow for different
`transform_block_size`s for R1 vs. R2 can be added at a future time.
- [x] Add to QuIPModifier. Option to allow for different
`transform_block_size`s for U vs. V can be added at a future time.

Merge in conjunction with:
 * neuralmagic/compressed-tensors#466

TEST PLAN:
`transform_block_size` added to parameterized
`tests/llmcompressor/modifiers/transform/(test_correctness.py|test_serialization.py)`

---------

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
dsikka pushed a commit to vllm-project/llm-compressor that referenced this pull request Sep 21, 2025
…field (#1806)
