Fix incorrect attribute mapping relationships in GLM MoE DSA Config by Dovis01 · Pull Request #46338 · huggingface/transformers

Dovis01 · 2026-06-02T09:04:53Z

Fix GlmMoeDsaConfig legacy `head_dim` overwriting `qk_rope_head_dim`

Summary

Loading the same GLM-5 config.json yields different attention dimensions between transformers v5.3.0 and v5.4.0. v5.4.0 silently corrupts qk_rope_head_dim due to a new attribute_map entry.

v5.3.0 vs v5.4.0 and after

Example checkpoint config.json:

"head_dim": 192,
"qk_nope_head_dim": 192,
"qk_rope_head_dim": 64

| | v5.3.0 | v5.4.0 |
| --- | --- | --- |
| Config class | Custom `__init__` | `@strict` dataclass, inherits `Glm4MoeLiteConfig` |
| `attribute_map` | No `head_dim` mapping | `"head_dim": "qk_rope_head_dim"` (inherited) |
| `qk_rope_head_dim` after load | 64 ✓ | 192 ✗ (overwritten by legacy `head_dim`) |
| `head_dim` in config output | 192 (kept as separate legacy field) | absent (aliased into `qk_rope_head_dim`) |
| `qk_head_dim` | 256 (192+64) ✓ | 384 (192+192) ✗ |
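The `qk_head_dim` row follows from adding the non-rotary and rotary widths. A minimal sketch of that arithmetic, assuming `qk_head_dim` is simply the sum shown in the table (the helper name below is illustrative, not a transformers API):

```python
def qk_head_dim(qk_nope_head_dim: int, qk_rope_head_dim: int) -> int:
    """Per-head query/key width: non-rotary part plus rotary part."""
    return qk_nope_head_dim + qk_rope_head_dim


# Values from the checkpoint config.json in this PR description.
expected = qk_head_dim(192, 64)
# Same checkpoint after the rope dim has been overwritten by legacy head_dim=192.
corrupted = qk_head_dim(192, 192)

print(expected, corrupted)
```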

Why v5.3.0 works

Custom __init__ sets qk_rope_head_dim from the explicit JSON field first; legacy head_dim=192 is stored separately and does not overwrite qk_rope_head_dim.

Why v5.4.0 breaks

1. qk_rope_head_dim=64 is set as a dataclass field
2. Legacy head_dim=192 arrives later in __post_init__
3. attribute_map redirects head_dim → qk_rope_head_dim, overwriting 64 with 192
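The overwrite sequence above can be reproduced with a toy class. This is a sketch only: `ToyConfig` is a hypothetical stand-in that mimics alias redirection on attribute access, and it is not the actual `GlmMoeDsaConfig` or `PreTrainedConfig` code.

```python
class ToyConfig:
    """Hypothetical stand-in for a config class with attribute aliasing."""

    # Alias table: writes/reads of the key are redirected to the value.
    attribute_map = {"head_dim": "qk_rope_head_dim"}

    def __init__(self, qk_nope_head_dim, qk_rope_head_dim, **legacy_kwargs):
        # Step 1: explicit fields are set first.
        self.qk_nope_head_dim = qk_nope_head_dim
        self.qk_rope_head_dim = qk_rope_head_dim
        # Step 2: leftover (legacy) keys are applied afterwards.
        for key, value in legacy_kwargs.items():
            setattr(self, key, value)

    def __setattr__(self, key, value):
        # Step 3: the alias redirects the write to the mapped attribute.
        key = type(self).attribute_map.get(key, key)
        super().__setattr__(key, value)

    def __getattr__(self, key):
        # Only called when normal lookup fails; resolve aliases on read.
        amap = type(self).attribute_map
        if key in amap:
            return getattr(self, amap[key])
        raise AttributeError(key)


raw = {"head_dim": 192, "qk_nope_head_dim": 192, "qk_rope_head_dim": 64}
cfg = ToyConfig(**raw)
print(cfg.qk_rope_head_dim, cfg.head_dim, "head_dim" in vars(cfg))
```

In this toy model the explicit 64 is lost, and `head_dim` no longer exists as its own field, which matches the "absent (aliased into `qk_rope_head_dim`)" row of the comparison.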

Downstream impact (v5.4.0 only)

- MLA attention projection shapes use wrong rope dim (192 vs 64)
- Indexer Q/K split along rope dim is wrong
- RoPE inv_freq length is wrong (getattr(config, "head_dim") also resolves to 192)
- Inference engines reading config.qk_rope_head_dim (e.g. SGLang NSA/MLA backends) get incorrect KV cache / RoPE layout
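To make the `inv_freq` point concrete, here is a sketch using the standard RoPE inverse-frequency formula, which yields one frequency per pair of rotary dimensions. The function name and the base of 10000 are assumptions for illustration; the model's actual rotary base and implementation may differ.

```python
def rope_inv_freq(rotary_dim: int, base: float = 10000.0) -> list[float]:
    """Standard RoPE inverse frequencies: one entry per pair of rotary dims."""
    return [1.0 / (base ** (i / rotary_dim)) for i in range(0, rotary_dim, 2)]


correct = rope_inv_freq(64)     # rope dim from the checkpoint
corrupted = rope_inv_freq(192)  # rope dim after the alias overwrite

print(len(correct), len(corrupted))
```

Besides the length mismatch, the individual frequencies also differ, because the exponent is normalized by the rotary dimension.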

Fix

Override attribute_map in modular_glm_moe_dsa.py to drop the "head_dim": "qk_rope_head_dim" entry for GlmMoeDsaConfig.
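The shape of the fix can be sketched with toy classes. The class names and the `"other_alias"` entry below are hypothetical, and the real change in modular_glm_moe_dsa.py may spell out the overridden map differently; the point is only that the subclass no longer inherits the `head_dim` alias.

```python
class ParentToyConfig:
    """Hypothetical parent whose alias table includes the problematic entry."""

    attribute_map = {
        "head_dim": "qk_rope_head_dim",
        "other_alias": "other_field",  # hypothetical unrelated alias, kept as-is
    }

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

    def __setattr__(self, key, value):
        # Redirect aliased names to their target attribute.
        key = type(self).attribute_map.get(key, key)
        super().__setattr__(key, value)


class FixedToyConfig(ParentToyConfig):
    # Override the inherited map, dropping only the head_dim alias.
    attribute_map = {
        k: v for k, v in ParentToyConfig.attribute_map.items() if k != "head_dim"
    }


raw = {"qk_nope_head_dim": 192, "qk_rope_head_dim": 64, "head_dim": 192}
broken = ParentToyConfig(**raw)
fixed = FixedToyConfig(**raw)

print(broken.qk_rope_head_dim, fixed.qk_rope_head_dim, fixed.head_dim)
```

A check along these lines (load a config that carries both `head_dim` and `qk_rope_head_dim`, then assert the rope dim is unchanged) could also serve as a regression test.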

Signed-off-by: Shijin Zhang <75300765+Dovis01@users.noreply.github.com>

ArthurZucker

39f751a is what changed it, we might have to fix a test maybe? otherwise makes sense

HuggingFaceDocBuilderDev · 2026-06-02T10:22:21Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

github-actions · 2026-06-02T11:38:31Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: glm_moe_dsa

ArthurZucker

As it's urgent, fine by me!

Fix error attributes mapping

67a9994

Signed-off-by: Shijin Zhang <75300765+Dovis01@users.noreply.github.com>

ArthurZucker approved these changes Jun 2, 2026

Dovis01 added 3 commits June 2, 2026 10:45

Fix tests

3f02fef

Signed-off-by: Shijin Zhang <75300765+Dovis01@users.noreply.github.com>

Merge branch 'main' into fix-glm-moe-dsa

3841bbc

Merge branch 'main' into fix-glm-moe-dsa

386e43f

ArthurZucker approved these changes Jun 2, 2026

ArthurZucker merged commit 3163718 into huggingface:main Jun 2, 2026
17 checks passed

zRzRzRzRzRzRzR mentioned this pull request Jun 3, 2026

[Bugfix] Restore overridden HF config fields and support index_skip_topk_offset for DSA topk sharing sgl-project/sglang#27114

Open

ArthurZucker merged 4 commits into huggingface:main from Dovis01:fix-glm-moe-dsa
