fix bug of gguf alg ext #1796

Merged
n1ck-guo merged 5 commits into main from hengguo/fix_gguf_ext_alg on May 12, 2026
@n1ck-guo
Contributor

Description

Please briefly describe your main changes, the motivation.

Type of Change

Bug fix

Related Issues

Fixes or relates to #

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.
  • The CUDA CI has passed. You can trigger it by commenting /azp run Unit-Test-CUDA-AutoRound.

Signed-off-by: n1ck-guo <heng.guo@intel.com>
Copilot AI review requested due to automatic review settings May 11, 2026 01:09
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes a GGUF algorithm-extension initialization bug by ensuring format-driven overrides (e.g., gguf:q2_k_s forcing data_type=int_asym_dq) are applied before constructing the quantizer, so the correct wrapper implementation is selected.

Changes:

  • Reordered the post_init() pipeline to create the quantizer after _resolve_formats() and introduced a dedicated _create_quantizer() phase.
  • Synced format-driven attribute overrides back into quantize_config prior to quantizer creation (instead of mutating an already-created quantizer).
  • Added a CUDA regression test asserting enable_alg_ext + gguf:q2_k_s selects dq_wrapper_block (DQ wrapper path).
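
The ordering issue described above can be illustrated with a minimal, self-contained sketch. It is not the AutoRound implementation: `QuantizeConfig`, `Quantizer`, `select_wrapper`, `resolve_formats`, the two `post_init_*` functions, and the fallback name `wrapper_block` are hypothetical stand-ins. Only the `gguf:q2_k_s` to `int_asym_dq` override and the `dq_wrapper_block` name come from the overview. The assumption is that the wrapper is chosen once, when the quantizer is constructed.

```python
from dataclasses import dataclass

# Hypothetical override table; only this one mapping is taken from the PR overview.
FORMAT_OVERRIDES = {"gguf:q2_k_s": {"data_type": "int_asym_dq"}}


@dataclass
class QuantizeConfig:
    """Toy stand-in for the real quantize_config (fields are assumptions)."""
    data_type: str = "int"
    enable_alg_ext: bool = True


def select_wrapper(config: QuantizeConfig) -> str:
    """Choose a wrapper name from the config as it looks right now."""
    if config.enable_alg_ext and config.data_type.endswith("_dq"):
        return "dq_wrapper_block"
    return "wrapper_block"  # hypothetical name for the non-DQ path


class Quantizer:
    def __init__(self, config: QuantizeConfig):
        # The wrapper is fixed at construction time; later changes to
        # the config do not alter this choice.
        self.wrapper = select_wrapper(config)


def resolve_formats(config: QuantizeConfig, fmt: str) -> None:
    """Write format-driven overrides back into the config."""
    for key, value in FORMAT_OVERRIDES.get(fmt, {}).items():
        setattr(config, key, value)


def post_init_buggy(fmt: str) -> Quantizer:
    config = QuantizeConfig()
    quantizer = Quantizer(config)  # created before overrides are applied
    resolve_formats(config, fmt)   # too late to affect the wrapper choice
    return quantizer


def post_init_fixed(fmt: str) -> Quantizer:
    config = QuantizeConfig()
    resolve_formats(config, fmt)   # overrides are synced into the config first
    return Quantizer(config)       # quantizer sees data_type=int_asym_dq
```

In this sketch, `post_init_buggy("gguf:q2_k_s")` ends up on the non-DQ path even though the config is updated afterwards, while `post_init_fixed("gguf:q2_k_s")` selects `dq_wrapper_block`, which is the behavior the regression test in this PR asserts.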

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| auto_round/compressors_new/base.py | Moves quantizer construction to a new phase after format resolution; adds inplace defaulting in hardware setup; updates pipeline documentation. |
| test/test_cuda/algorithms/test_alg_ext.py | Adds a regression test to ensure GGUF format overrides lead to selecting dq_wrapper_block when alg-ext is enabled. |

Comment thread auto_round/compressors_new/base.py
Comment thread auto_round/compressors_new/base.py
Comment thread auto_round/compressors_new/base.py
Comment thread auto_round/compressors_new/base.py
Comment thread test/test_cuda/algorithms/test_alg_ext.py Outdated
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Comment thread test/test_cuda/algorithms/test_alg_ext.py Outdated
Comment thread test/test_cuda/algorithms/test_alg_ext.py Outdated
n1ck-guo added 3 commits May 11, 2026 09:53
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
@n1ck-guo
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Comment thread auto_round/algorithms/quantization/base.py
@n1ck-guo n1ck-guo added the "ready" label (only add when the PR is ready to merge) May 12, 2026
@n1ck-guo n1ck-guo requested a review from wenhuach21 May 12, 2026 00:45
Comment thread test/test_cuda/algorithms/test_alg_ext.py
@n1ck-guo n1ck-guo merged commit f64a0d5 into main May 12, 2026
41 checks passed
@n1ck-guo n1ck-guo deleted the hengguo/fix_gguf_ext_alg branch May 12, 2026 01:55
lvliang-intel pushed a commit that referenced this pull request May 12, 2026
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Labels

ready (only add when the PR is ready to merge)

3 participants