feat: add SpinQuant offline rotation and integrate with PTQ pipeline #262

Merged: gavingavin99 merged 7 commits into Tencent:main from gavingavin99:dev_rotation on Mar 18, 2026
gavingavin99 (Collaborator) commented on Mar 16, 2026

feat: add SpinQuant offline rotation and integrate with PTQ pipeline

  • Add angelslim/compressor/transform/ package:
      - TransformBase abstract class and TransformFactory with @register decorator
      - SpinQuant implementation: R1/R2/R4 offline Hadamard rotation fused into weights
      - SpinQuantMapping for LLaMA/Qwen layer name resolution
      - fuse_ln_linear, center_embeddings utilities; hadamard_utils
  • Integrate transform into PTQ: TransformFactory.create() + run() is called before quantization in PTQ.__init__()
  • Extend config_parser: add TransformConfig, FullConfig.transform_config, and SlimConfigParser support for an optional transform: YAML section
  • Add Engine.prepare_compressor(transform_config=) passthrough and lm_eval()
  • Add tools/run_transform_offline.py for standalone transform + save
  • Add configs/qwen3/spinquant/ with SpinQuant + fp8_static / int4_awq examples
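The factory and ordering described in the list above can be illustrated with a minimal sketch. The class names TransformBase, TransformFactory, SpinQuant and PTQ come from the description; every signature, the `"name"` config key, and the dict standing in for a model are assumptions made for illustration and are not taken from the PR's code.

```python
from abc import ABC, abstractmethod


class TransformBase(ABC):
    """Hypothetical base class: a transform rewrites a model in place."""

    def __init__(self, model, config):
        self.model = model
        self.config = config

    @abstractmethod
    def run(self):
        """Apply the transform to self.model."""


class TransformFactory:
    """Hypothetical name-to-class registry filled in by a decorator."""

    _registry = {}

    @classmethod
    def register(cls, name):
        def decorator(transform_cls):
            cls._registry[name] = transform_cls
            return transform_cls
        return decorator

    @classmethod
    def create(cls, model, config):
        name = config["name"]  # assumed config key
        if name not in cls._registry:
            raise ValueError(f"Unknown transform: {name!r}")
        return cls._registry[name](model, config)


@TransformFactory.register("spinquant")
class SpinQuant(TransformBase):
    def run(self):
        # Placeholder for the real weight rotation: only record that it ran.
        self.model["log"].append("rotate")


class PTQ:
    """Stub showing only the ordering: transform first, then quantization."""

    def __init__(self, model, transform_config=None):
        self.model = model
        if transform_config is not None:  # the transform section is optional
            TransformFactory.create(model, transform_config).run()
        self.model["log"].append("quantize")


model = {"log": []}  # stand-in for a real model object
PTQ(model, transform_config={"name": "spinquant"})
print(model["log"])  # ['rotate', 'quantize']
```

Registering by name keeps PTQ unaware of concrete transform classes, so a config without a transform section simply skips the step.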

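As background on why an offline rotation can be fused into weights without changing the model's output, here is a small pure-Python sketch. It is not the PR's implementation: the toy layer, the helper names, and the use of a Sylvester Hadamard matrix as the rotation are assumptions chosen for illustration. It folds a per-channel norm scale into the weight first (the idea suggested by the name fuse_ln_linear), because an elementwise scale does not commute with a rotation, and then checks that rotating both the weight and the activations leaves the result unchanged.

```python
import math


def sylvester_hadamard(n):
    """Orthonormal Hadamard matrix of order n (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("Sylvester construction needs a power-of-two order")
    h = [[1.0]]
    while len(h) < n:
        # Doubling step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in h]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


# Toy layer: y = W @ (gamma * x), a norm scale followed by a linear map.
W = [[0.5, -1.0, 2.0, 0.25], [1.5, 0.0, -0.5, 1.0]]  # shape (2, 4)
gamma = [1.0, 2.0, 0.5, 4.0]
x = [[0.3], [-1.2], [0.7], [2.0]]                     # column vector, shape (4, 1)

y_ref = matmul(W, [[g * v[0]] for g, v in zip(gamma, x)])

# Step 1: fold the norm scale into the weight so no scale sits between rotations.
W_fused = [[w * g for w, g in zip(row, gamma)] for row in W]

# Step 2: rotate. The weight absorbs Q offline; activations arrive as Q^T x.
Q = sylvester_hadamard(4)
W_rot = matmul(W_fused, Q)
x_rot = matmul(transpose(Q), x)

y_rot = matmul(W_rot, x_rot)
max_err = max(abs(a[0] - b[0]) for a, b in zip(y_ref, y_rot))
print(max_err < 1e-9)  # True
```

In a full model the rotated activations are not computed at runtime; they arise because the layers writing into the same hidden stream have had their weights rotated as well.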
Comment thread tools/run.py Outdated

# Step 7: Save compressed model
def find_modules_with_hooks(model: torch.nn.Module):
    """查找并打印模型中所有带有 hook 的子模块"""
Collaborator commented:

Please write comments in English.

global:
  save_path: ./output


Collaborator commented:

qwen3-8b_int4_awq.yaml shouldn't need any changes, should it?

gavingavin99 (Collaborator, Author) replied:

It doesn't need to change. I had previously edited it to point to a different path and committed that by accident; I have now reverted it.

@@ -0,0 +1,77 @@
# coding=utf-8
Collaborator commented:

Please follow the attribution conventions: use the Tencent copyright notice, and note anything adapted from other libraries below it.

Collaborator commented:

Is there a way to reduce the line count of hadamard_utils.py? Isn't a file of nearly 100,000 lines too large?

gavingavin99 (Collaborator, Author) replied:

The file is long because some special shapes (not powers of two) need their own dedicated Hadamard kernels, and those kernel matrices are written out directly in the file, which is why the line count looks so large. Trimming it down would probably reduce generality.
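To illustrate why non-power-of-two shapes need separate handling, here is a hedged sketch. The usual doubling (Sylvester) construction only yields orders 1, 2, 4, 8, and so on, so an order such as 12 needs a different base matrix. The PR reportedly writes such matrices out literally; the sketch instead swaps in the Paley type-I construction so that it stays self-contained, and all function names are hypothetical. A larger order of the form (power of two) × (base order) can then be built with a Kronecker product.

```python
def sylvester(n):
    """Unnormalized +/-1 Hadamard matrix; only defined for power-of-two n."""
    if n < 1 or n & (n - 1):
        raise ValueError(f"order {n} is not a power of two")
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def paley(q):
    """Paley type-I Hadamard matrix of order q + 1.

    Assumes q is a prime with q % 4 == 3 (primality is not checked here).
    """
    if q % 4 != 3:
        raise ValueError("Paley type I needs q congruent to 3 mod 4")

    def chi(a):
        # Quadratic character mod q via Euler's criterion.
        a %= q
        if a == 0:
            return 0
        return 1 if pow(a, (q - 1) // 2, q) == 1 else -1

    n = q + 1
    h = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or i == 0:
                h[i][j] = 1        # diagonal and first row
            elif j == 0:
                h[i][j] = -1       # first column below the diagonal
            else:
                h[i][j] = chi(i - j)
    return h


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def is_hadamard(h):
    """Check H @ H^T == n * I using exact integer arithmetic."""
    n = len(h)
    return all(
        sum(h[i][k] * h[j][k] for k in range(n)) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )


h12 = paley(11)                # order 12: not reachable by doubling
h48 = kron(sylvester(4), h12)  # order 48 = 4 * 12
print(is_hadamard(h12), is_hadamard(h48))  # True True
```

Storing base matrices literally, as the reply describes, trades file size for not having to find a construction for every required order.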

embedding.weight.data = new_weight


# [TODO] check this function correct or not
Collaborator commented:

Should comments like this be removed?

gavingavin99 merged commit 26567b2 into Tencent:main on Mar 18, 2026
5 checks passed
Collaborator commented:

Does the mapping here apply to one specific model only, or is it meant to cover all models? Does it need to be extended?
