
support GPTQ_FORMAT for "gptqmodel:exllamav2" backend #1434

Merged
xin3he merged 2 commits into main from xinhe/2-10c on Feb 11, 2026

@xin3he xin3he commented Feb 11, 2026

Description

#1433

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

Fixes or relates to #1433

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Signed-off-by: He, Xin3 <xin3.he@intel.com>

Copilot AI left a comment


Pull request overview

Adds support for GPTQ_FORMAT for the gptqmodel:exllamav2 backend, including a compatibility conversion for certain GPTQ qzero representations.

Changes:

  • Introduced a process_gptq_qzero() pass to adjust qzeros for ExllamaV2 GPTQ layers.
  • Invoked qzero processing during gptqmodel post-init.
  • Expanded gptqmodel:exllamav2 backend packing_format to accept additional GPTQ formats.
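
To illustrate what a qzero adjustment of this kind can look like, here is a minimal sketch in plain Python. It assumes the conversion is the usual off-by-one difference between GPTQ checkpoint layouts (zero points stored as `zero - 1` in one layout and as `zero` in the other) and 4-bit values packed eight per 32-bit word. The helpers `pack`, `unpack`, and `shift_qzeros` are hypothetical names for this example; this is not the `process_gptq_qzero()` implementation from this PR, which is not shown in the conversation.

```python
# Illustrative only: hypothetical helpers, not the code merged in this PR.
BITS = 4                    # assumed quantization width
PACK_FACTOR = 32 // BITS    # eight 4-bit values per 32-bit word
MASK = (1 << BITS) - 1      # 0b1111


def pack(values):
    """Pack 4-bit integers into 32-bit words, lowest nibble first."""
    assert len(values) % PACK_FACTOR == 0
    words = []
    for i in range(0, len(values), PACK_FACTOR):
        word = 0
        for j, v in enumerate(values[i:i + PACK_FACTOR]):
            word |= (v & MASK) << (BITS * j)
        words.append(word)
    return words


def unpack(words):
    """Inverse of pack(): split each 32-bit word into eight 4-bit values."""
    return [(w >> (BITS * j)) & MASK for w in words for j in range(PACK_FACTOR)]


def shift_qzeros(words, offset):
    """Add `offset` to every packed zero point, wrapping modulo 2**BITS."""
    return pack([(z + offset) & MASK for z in unpack(words)])


# Zero points as the quantizer produced them.
zero_points = [8, 8, 7, 9, 0, 15, 1, 8]
# Assumed "minus one" storage convention of the source checkpoint.
stored = pack([(z - 1) & MASK for z in zero_points])
# Shifting by +1 recovers the true zero points for a kernel that expects them.
restored = shift_qzeros(stored, +1)
assert unpack(restored) == zero_points
```

Unpacking before adding the offset keeps a carry from one nibble from leaking into its neighbour, which a plain integer add on the packed word would allow when a stored value is 15.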

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| auto_round/inference/convert_model.py | Adds qzero conversion logic and hooks it into gptqmodel post-init. |
| auto_round/inference/backend.py | Updates exllamav2 backend packing_format to include GPTQ_FORMAT. |


@Kaihui-intel Kaihui-intel left a comment


A version check could also be added.

Signed-off-by: He, Xin3 <xin3.he@intel.com>
@chensuyue chensuyue added this to the 0.10.0 milestone Feb 11, 2026
@xin3he xin3he merged commit 8a3d793 into main Feb 11, 2026
29 checks passed
@xin3he xin3he deleted the xinhe/2-10c branch February 11, 2026 08:35
chensuyue pushed a commit that referenced this pull request Feb 11, 2026
Signed-off-by: He, Xin3 <xin3.he@intel.com>
(cherry picked from commit 8a3d793)
lvliang-intel pushed a commit that referenced this pull request Feb 27, 2026
Signed-off-by: He, Xin3 <xin3.he@intel.com>
4 participants