🚨 Validate config attributes #41250
Merged
zucchini-nlp merged 114 commits into huggingface:main (Mar 16, 2026)
Conversation

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
zucchini-nlp (Member, Author):
Blocked by #41541 (comment) for now
zucchini-nlp (Member, Author):
Time to revive this branch
zucchini-nlp (Member, Author):
Nice, much better and easier to maintain BC with remote code now!
ArthurZucker approved these changes on Feb 5, 2026

Comment on lines +196 to +205:
```python
# Keys are always strings in JSON so convert ids to int here for id2label and pruned_heads
if self.id2label is None:
    self._create_id_label_maps(kwargs.get("num_labels", 2))
else:
    if kwargs.get("num_labels") is not None and len(self.id2label) != kwargs.get("num_labels"):
        logger.warning(
            f"You passed `num_labels={kwargs.get('num_labels')}` which is incompatible to "
            f"the `id2label` map of length `{len(self.id2label)}`."
        )
    self.id2label = {int(key): value for key, value in self.id2label.items()}
```
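The comment in the diff above refers to a plain JSON limitation: object keys are always serialized as strings, so an `id2label` map with integer keys comes back with string keys after a save/load round trip. A minimal self-contained sketch (not part of the PR) of why the `int(key)` cast is needed:

```python
import json

# Integer dict keys become strings after a JSON round trip, which is why
# config deserialization must cast id2label keys back to int.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
round_tripped = json.loads(json.dumps(id2label))
print(round_tripped)  # keys are now strings: {'0': 'NEGATIVE', '1': 'POSITIVE'}

# The cast applied in the diff above restores integer keys.
restored = {int(key): value for key, value in round_tripped.items()}
print(restored == id2label)  # True
```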
Collaborator:
is it a good time to get rid of these general attributes and only have them for models that actually require them?
zucchini-nlp (Member, Author):
@bot /repo
Contributor:
Repo consistency bot fixed some files and pushed the changes.
Contributor:
[For maintainers] Suggested jobs to run (before merge): run-slow: afmoe, aimv2, albert, align, altclip, apertus, arcee, aria, audio_spectrogram_transformer, audioflamingo3, auto, autoformer, aya_vision, bamba, bark, bart
This was referenced on Mar 16, 2026 (Open)
michaelzhang-ai added a commit to michaelzhang-ai/sglang that referenced this pull request on Mar 17, 2026:
…GLM-5 nightly tests

Transformers PR huggingface/transformers#41250 (merged Mar 16) converts PretrainedConfig subclasses to @dataclass via __init_subclass__, which breaks sglang's DeepseekVL2Config (non-default field ordering) and prevents the server from starting at all. Remove `pip install git+https://github.com/huggingface/transformers.git` from all Qwen 3.5 and GLM-5 CI jobs (MI30x, MI35x, ROCm 7.0 and 7.2). Use the stable transformers shipped in the docker image instead, matching all other nightly jobs (Grok2, DeepSeek-V3.2, etc.). Keep mistral-common and lm-eval[api] for Qwen 3.5 tests that need them.
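The "non-default field ordering" failure mentioned in the commit message is a generic `@dataclass` constraint: a subclass cannot declare a field without a default after the parent has declared one with a default. A small illustrative sketch (class names are hypothetical, not sglang's actual classes):

```python
from dataclasses import dataclass

@dataclass
class BaseConfig:
    hidden_size: int = 768  # field with a default

# A subclass adding a field *without* a default after an inherited default
# is rejected at class-definition time.
try:
    @dataclass
    class ChildConfig(BaseConfig):
        n_experts: int  # non-default field after a default field -> TypeError

    print("defined")
except TypeError as err:
    print(f"rejected: {err}")
```

This is why converting `PretrainedConfig` subclasses to dataclasses via `__init_subclass__` can break downstream config subclasses that were written with ordinary `__init__` methods.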
aashay-sarvam added a commit to aashay-sarvam/transformers that referenced this pull request on Mar 17, 2026:
- Remove torch_dtype="auto" from docs (now default)
- Simplify modular_sarvam_mla.py to only override defaults that differ from DeepseekV3Config (no __init__, no workarounds)
- Add @strict(accept_kwargs=True) for config validation (huggingface#41250)
- Regenerate configuration_sarvam_mla.py with dataclass fields and __post_init__ pattern
- Hub config.json changes needed: remove head_dim/q_head_dim, change rope_scaling.type to "yarn", update architectures

Made-with: Cursor
BenjaminBossan added a commit to BenjaminBossan/peft that referenced this pull request on Mar 17, 2026:
Resolves the failing tests on the transformers main branch. After the change in huggingface/transformers#41250, the num_hidden_layers attribute is no longer part of the model config when serialized to a dict. The _prepare_prompt_learning_config function was using this attribute. Therefore, we now pass the config before converting it into a dict and extract the attribute from it.
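The fix described in that commit follows a simple pattern: read the attribute from the config object itself rather than relying on it surviving serialization to a dict. A hedged sketch of the idea (`FakeConfig` is a stand-in, not PEFT's or transformers' real classes):

```python
# Stand-in for a config whose to_dict() no longer includes every attribute
# after the transformers change.
class FakeConfig:
    num_hidden_layers = 12

    def to_dict(self):
        # Some attributes may be dropped during serialization.
        return {"hidden_size": 768}

config = FakeConfig()

# Extract the attribute from the live object *before* converting to a dict.
num_layers = getattr(config, "num_hidden_layers", None)
config_dict = config.to_dict()

print(num_layers)                              # 12
print("num_hidden_layers" in config_dict)      # False
```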
michaelzhang-ai added a commit to sgl-project/sglang that referenced this pull request on Mar 17, 2026:
Transformers PR huggingface/transformers#41250 (merged Mar 16) converts PretrainedConfig subclasses to @dataclass via __init_subclass__, which breaks sglang's DeepseekVL2Config and prevents the server from starting. For Qwen 3.5: remove git+transformers entirely — stable version in the docker image is sufficient (verified passing). For GLM-5: pin to commit 96f807a33b75 (last commit before the breaking change) since GLM-5 needs the glm_moe_dsa model type which is only in the transformers dev branch, not in stable releases yet.
What does this PR do?
As per title. Continues from #40793 and supersedes #36534.
NOTE: config classes can't accept positional args anymore! I don't think anyone would use positional args anyway, but marking the PR as breaking.
Note
High Risk
Refactors `PreTrainedConfig` and many model config classes to `@dataclass` + `huggingface_hub` `@strict` validation, which can change initialization/serialization behavior and reject previously-accepted configs. Also enforces save-time validation and updates defaults/deprecations (e.g., `use_return_dict`), risking backward compatibility across model loading and downstream integrations.

Overview
- Adds strict config validation. `PreTrainedConfig` is converted to a `@dataclass` with `huggingface_hub`'s `@strict`, introduces built-in validators (architecture consistency, special token id ranges, layer type checks, `output_attentions` vs `attn_implementation`), and runs `validate()` automatically on `save_pretrained`.
- Modernizes and standardizes model configs. Many model configuration classes are migrated from custom `__init__` logic to dataclass fields + `__post_init__`, moving compatibility logic (e.g., defaulting sub-configs, key/value casting for JSON) into post-init and adding model-specific `validate_architecture` where needed.
- API/behavior tweaks. Deprecates `use_return_dict` in favor of `return_dict` (and updates multiple model forward paths accordingly), adjusts RoPE validation ignore-key handling, narrows AutoTokenizer fallback exception handling, and bumps the minimum `huggingface-hub` requirement to `>=1.5.0`.

Written by Cursor Bugbot for commit 07095f3. This will update automatically on new commits. Configure here.
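The validation pattern summarized above (dataclass fields, a `validate()` method, and automatic checks at construction time) can be mimicked with a plain dataclass and `__post_init__`. This is a hedged sketch of the *shape* of the pattern only — `ToyConfig` and its range check are illustrative, not the real `@strict` API from `huggingface_hub`:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ToyConfig:
    vocab_size: int = 32000
    pad_token_id: Optional[int] = None

    def __post_init__(self):
        # Validation runs automatically on construction, as the PR does
        # on save_pretrained.
        self.validate()

    def validate(self):
        # Mirrors the "special token id ranges" style of built-in check.
        if self.pad_token_id is not None and not (0 <= self.pad_token_id < self.vocab_size):
            raise ValueError(
                f"pad_token_id={self.pad_token_id} is out of range for vocab_size={self.vocab_size}"
            )

ok = ToyConfig(pad_token_id=0)  # passes validation

try:
    ToyConfig(pad_token_id=32000)  # out of range -> rejected
except ValueError as err:
    print(err)
```

An invalid config is rejected at construction rather than surfacing later as a confusing runtime error during model loading, which is the behavioral change the "High Risk" note warns about.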