
Fix bnb 4bit/8bit quantization drop chunked tensors bug#46210

Merged
SunMarc merged 1 commit into huggingface:main from kaixuanliu:bnb-quantize
May 27, 2026


@kaixuanliu
Contributor

@kaixuanliu kaixuanliu commented May 26, 2026

What does this PR do?

Bnb4bitQuantize.convert / Bnb8bitQuantize.convert only quantized the first entry of input_dict and returned {full_layer_name: value}, silently dropping any extra tensors produced by an upstream one-to-many WeightConverter (e.g. a Chunk op splitting a fused weight). Those targets were never loaded and kept their random init, which then triggers the assertion error:
assert module.weight.shape[1] == 1

Fix:

Iterate over every (param_name, value) in input_dict and quantize each one against its own module.
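As a rough illustration of the before/after behaviour described above, here is a minimal sketch in plain Python. It is not the transformers code: `fake_quantize`, `convert_first_only`, and `convert_all` are hypothetical stand-ins, and lists of floats stand in for tensors.

```python
# Toy stand-ins only: none of these names come from transformers or bitsandbytes.

def fake_quantize(values):
    """Pretend quantization: round every number to the nearest integer."""
    return [round(v) for v in values]


def convert_first_only(input_dict, full_layer_name):
    """Behaviour described as the bug: only the first entry is quantized and returned."""
    first_value = next(iter(input_dict.values()))
    return {full_layer_name: fake_quantize(first_value)}


def convert_all(input_dict):
    """Behaviour described as the fix: every (param_name, value) pair is quantized."""
    return {name: fake_quantize(value) for name, value in input_dict.items()}


# Output of a hypothetical one-to-many converter (one fused weight, three targets).
chunked = {
    "attn.q_proj.weight": [0.2, 1.7],
    "attn.k_proj.weight": [2.4, 3.6],
    "attn.v_proj.weight": [4.1, 5.9],
}

old = convert_first_only(chunked, "attn.q_proj.weight")
new = convert_all(chunked)

# Targets that the first-entry-only version never returns.
missing = sorted(set(chunked) - set(old))
print(missing)
```

In the toy version the dropped targets are simply absent from the returned dict, which mirrors how the real targets would end up unloaded.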

Repro

RUN_SLOW=1 pytest tests/models/hrm_text/test_modeling_hrm_text.py::HrmTextModelTest::test_flash_attn_2_fp32_ln

hrm_text is affected because it chunks attn.gqkv_proj → gate_proj/q_proj/k_proj/v_proj and mlp.gate_up_proj → gate_proj/up_proj on load. Without the fix, q/k/v_proj and mlp.up_proj show up as MISSING in the load report and the 4-bit forward pass hits the assertion.
The test passes after this fix.
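To make the one-to-many step concrete, the sketch below splits a fused weight into one block per target name. It assumes equal-sized row blocks purely for illustration (the actual split sizes used by hrm_text are not stated here), and `chunk_rows` is a hypothetical helper, not a transformers function.

```python
def chunk_rows(rows, target_names):
    """Split a fused weight (a list of rows) into equal row blocks, one per target.

    Assumption: equal block sizes. The real converter may split differently.
    """
    if len(rows) % len(target_names) != 0:
        raise ValueError("row count must divide evenly across targets")
    size = len(rows) // len(target_names)
    return {
        name: rows[i * size:(i + 1) * size]
        for i, name in enumerate(target_names)
    }


# A tiny stand-in for a fused gate_up_proj weight with four rows.
fused_gate_up = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
parts = chunk_rows(fused_gate_up, ["mlp.gate_proj", "mlp.up_proj"])

for name, block in parts.items():
    print(name, block)
```

One source key yields several target keys here, so any downstream step that returns only a single entry loses the rest.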

Who can review?

@SunMarc pls help review, thx!

… load

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
@kaixuanliu kaixuanliu changed the title Fix bnb 4bit/8bit quantization dropping chunked tensors during weight… Fix bnb 4bit/8bit quantization drop chunked tensors bug May 26, 2026
Member

@SunMarc SunMarc left a comment


Nice, thanks a lot for this !

@SunMarc SunMarc enabled auto-merge May 27, 2026 10:41
@SunMarc SunMarc added this pull request to the merge queue May 27, 2026
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Merged via the queue into huggingface:main with commit 212863b May 27, 2026
31 checks passed
@kaixuanliu kaixuanliu deleted the bnb-quantize branch May 28, 2026 03:27
yuchenxie4645 pushed a commit to yuchenxie4645/transformers that referenced this pull request May 28, 2026
…46210)

Fix bnb 4bit/8bit quantization dropping chunked tensors during weight load

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
kashif pushed a commit to kashif/transformers that referenced this pull request Jun 1, 2026
…46210)

Fix bnb 4bit/8bit quantization dropping chunked tensors during weight load

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>