
Fix GlmMoeDsaConfig default mlp_layer_types in modular conversion #43876

Merged
zucchini-nlp merged 3 commits into huggingface:main from OiPunk:codex/transformers-43864-glm-moe-config-default
Feb 10, 2026

Conversation

@OiPunk (Contributor) commented Feb 10, 2026

Summary

This PR fixes #43864 by preserving the GlmMoeDsaConfig default mlp_layer_types from the modular source.

GlmMoeDsaConfig should default to dense MLP layers for the first 3 layers and sparse MoE layers afterward. During modular conversion, the parent __init__ body was being inlined, which overwrote that default with the parent pattern (["dense"] + ["sparse"] * ...).
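For illustration, a minimal sketch of the two patterns side by side (assuming num_hidden_layers=8, the value used in the regression test below):

```python
# Minimal sketch of the two layer-type patterns, assuming num_hidden_layers=8.
num_hidden_layers = 8

# Intended GlmMoeDsaConfig default: first 3 layers dense, the rest sparse.
intended = ["dense"] * 3 + ["sparse"] * (num_hidden_layers - 3)

# Parent pattern that the inlined init produced: only the first layer dense.
parent = ["dense"] + ["sparse"] * (num_hidden_layers - 1)

print(intended)  # ['dense', 'dense', 'dense', 'sparse', ..., 'sparse']
print(parent)    # ['dense', 'sparse', 'sparse', 'sparse', ..., 'sparse']
```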

Changes

  • In modular_glm_moe_dsa.py, call PreTrainedConfig.__init__(self, **kwargs) instead of super().__init__(**kwargs) so the parent init logic is not inlined (see the sketch after this list).
  • Regenerated configuration_glm_moe_dsa.py via modular converter, which removes the duplicated parent default block.
  • Added a regression test in tests/models/glm_moe_dsa/test_configuration_glm_moe_dsa.py to assert the expected default pattern for num_hidden_layers=8.
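A rough sketch of the shape of the change in modular_glm_moe_dsa.py, not the actual file: the signature and attributes other than mlp_layer_types are simplified, and the base-class import is assumed to match the modular source.

```python
# Rough sketch only: simplified GlmMoeDsaConfig __init__ showing why the base
# __init__ is called explicitly instead of via super().
from transformers import PreTrainedConfig  # spelled PretrainedConfig in older releases


class GlmMoeDsaConfig(PreTrainedConfig):
    model_type = "glm_moe_dsa"

    def __init__(self, num_hidden_layers=8, mlp_layer_types=None, **kwargs):
        self.num_hidden_layers = num_hidden_layers
        if mlp_layer_types is None:
            # Intended default: 3 dense layers, then sparse MoE layers.
            mlp_layer_types = ["dense"] * 3 + ["sparse"] * (num_hidden_layers - 3)
        self.mlp_layer_types = mlp_layer_types
        # Calling the base __init__ directly keeps the modular converter from
        # inlining the parent's __init__ body, which would re-derive
        # mlp_layer_types with the parent's ["dense"] + ["sparse"] * ... pattern.
        PreTrainedConfig.__init__(self, **kwargs)
```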

Validation

  • PYTHONPATH=src python3 utils/modular_model_converter.py glm_moe_dsa
  • PYTHONPATH=src python3 utils/check_modular_conversion.py --files src/transformers/models/glm_moe_dsa/modular_glm_moe_dsa.py
  • PYTHONPATH=src python3 -m pytest tests/models/glm_moe_dsa/test_configuration_glm_moe_dsa.py -q
  • PYTHONPATH=src python3 -m trace --count --summary --module unittest tests.models.glm_moe_dsa.test_configuration_glm_moe_dsa | grep -E "configuration_glm_moe_dsa|test_configuration_glm_moe_dsa"
    • output: configuration_glm_moe_dsa ... 100%

@OiPunk (Contributor, Author) commented Feb 10, 2026

Thanks for the detailed review. I pushed commit a10f430 to address the requested changes.

What I changed:

  • Removed duplicate attribute assignments in modular_glm_moe_dsa.py so each config field is assigned once.
  • Regenerated configuration_glm_moe_dsa.py from the modular source to keep generated output in sync.
  • Moved the default mlp_layer_types regression test into test_modeling_glm_moe_dsa.py::GlmMoeDsaModelTest and removed the standalone configuration test file (a sketch of the test follows this list).
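A hypothetical, standalone sketch of that regression check; in the PR it lives inside GlmMoeDsaModelTest in test_modeling_glm_moe_dsa.py, and the top-level GlmMoeDsaConfig import plus the asserted values are assumptions based on the PR summary.

```python
# Standalone sketch of the regression check described above.
import unittest

from transformers import GlmMoeDsaConfig  # assumed to be exported at the top level


class DefaultMlpLayerTypesTest(unittest.TestCase):
    def test_default_mlp_layer_types(self):
        config = GlmMoeDsaConfig(num_hidden_layers=8)
        # First 3 layers dense, remaining 5 sparse.
        self.assertEqual(config.mlp_layer_types, ["dense"] * 3 + ["sparse"] * 5)


if __name__ == "__main__":
    unittest.main()
```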

Validation run locally:

  • PYTHONPATH=src python3 utils/modular_model_converter.py glm_moe_dsa
  • PYTHONPATH=src python3 utils/check_modular_conversion.py --files src/transformers/models/glm_moe_dsa/modular_glm_moe_dsa.py
  • PYTHONPATH=src python3 -m ruff check src/transformers/models/glm_moe_dsa/modular_glm_moe_dsa.py src/transformers/models/glm_moe_dsa/configuration_glm_moe_dsa.py tests/models/glm_moe_dsa/test_modeling_glm_moe_dsa.py
  • PYTHONPATH=src pytest -q tests/models/glm_moe_dsa/test_modeling_glm_moe_dsa.py -k default_mlp_layer_types

I also verified both mlp_layer_types paths (None and explicit list) execute in the config initializer.
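For reference, the two paths exercised, sketched under the same assumptions about GlmMoeDsaConfig as above:

```python
# Quick check of both initializer paths (illustrative, not the PR's test code).
from transformers import GlmMoeDsaConfig

# Default path: mlp_layer_types=None derives the 3-dense / rest-sparse pattern.
default_cfg = GlmMoeDsaConfig(num_hidden_layers=8)
assert default_cfg.mlp_layer_types == ["dense"] * 3 + ["sparse"] * 5

# Explicit path: a user-supplied list is stored unchanged.
explicit_cfg = GlmMoeDsaConfig(
    num_hidden_layers=4,
    mlp_layer_types=["dense", "sparse", "sparse", "sparse"],
)
assert explicit_cfg.mlp_layer_types == ["dense", "sparse", "sparse", "sparse"]
```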

@zucchini-nlp (Member)

run-slow: glm_moe_dsa

@github-actions (Contributor)

This comment contains run-slow, running the specified jobs:

models: ["models/glm_moe_dsa"]
quantizations: []

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions (Contributor)

CI Results

Workflow Run ⚙️

Commit Info

Context  Commit    Description
RUN      2415753c  merge commit
PR       a10f4303  branch commit
main     884749a1  base commit
✅ No failing test specific to this PR 🎉 👏 !

@zucchini-nlp (Member)

@bot /style

@github-actions (Contributor)

github-actions bot commented Feb 10, 2026

Style fix bot fixed some files and pushed the changes.

@github-actions (Contributor)

[For maintainers] Suggested jobs to run (before merge)

run-slow: glm_moe_dsa

@zucchini-nlp (Member) left a comment

Thanks

@zucchini-nlp zucchini-nlp merged commit 476600a into huggingface:main Feb 10, 2026
19 checks passed
jiosephlee pushed a commit to jiosephlee/transformers_latest that referenced this pull request Feb 11, 2026
…ggingface#43876)

* Fix GlmMoeDsaConfig default mlp layer pattern

* fix(glm-moe-dsa): dedupe config init and colocate test

* Apply repo consistency fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Successfully merging this pull request may close these issues.

GlmMoeDsaConfig: mlp_layer_types default overwritten by inlined parent init
