Fix init weights in remote code #43768
zucchini-nlp wants to merge 10 commits into huggingface:main
Conversation
```python
if getattr(module, "_is_hf_initialized", False):
    return

if (weight := getattr(module, "weight", None)) is not None and getattr(weight, "_is_hf_initialized", False):
    return
```
Modules never have an `_is_hf_initialized` attr, I guess this is a typo? Otherwise it causes the whole model to be randomly initialized when the remote code defines an old-format `_init_weights`, and that takes ages for big models.
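For context, here is a minimal sketch of the guard being discussed: skip a module if the flag is set on the module itself, or fall back to the flag on its `weight` parameter. The `should_skip_init` and `init_weights` helpers below are hypothetical; only the two `getattr` checks mirror the snippet above, this is not the PR's actual code.

```python
import torch.nn as nn

def should_skip_init(module: nn.Module) -> bool:
    # Skip if the module itself carries the flag.
    if getattr(module, "_is_hf_initialized", False):
        return True
    # Also check the module's weight parameter, which old-format remote code
    # may flag instead of the module; otherwise every module would be
    # re-initialized and the model ends up randomly initialized.
    weight = getattr(module, "weight", None)
    return weight is not None and getattr(weight, "_is_hf_initialized", False)

def init_weights(model: nn.Module, init_fn) -> None:
    # Hypothetical driver: initialize only modules that were not flagged yet.
    for module in model.modules():
        if should_skip_init(module):
            continue
        init_fn(module)
        module._is_hf_initialized = True
```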
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```diff
  # 5. Special tokens mask configuration
  # Patterns: "none", "cls_sep", "eos", "bos", "bos_eos", "cls_double_sep", "prefix_suffix"
- self.special_tokens_pattern = kwargs.pop("special_tokens_pattern", "cls_sep")
+ self.special_tokens_pattern = kwargs.pop("special_tokens_pattern", "bos_eos")
```
cc @itazap @ArthurZucker, I want to clarify this part. Should we default to None, since cls/sep ids aren't always available for all tokenizers? As it is, we are getting `[None, 1, 18001, 468, None]` as token ids for those models.
Ignore the current change to `bos_eos`.
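To illustrate the concern, a rough sketch of how a `special_tokens_pattern` default of `"cls_sep"` can inject `None` ids when a tokenizer defines no CLS/SEP tokens. The `build_input_ids` helper below is hypothetical, not the tokenizer's real API; only the pattern names come from the diff above.

```python
from typing import Optional

def build_input_ids(
    token_ids: list[int],
    pattern: str = "cls_sep",
    cls_id: Optional[int] = None,
    sep_id: Optional[int] = None,
    bos_id: Optional[int] = None,
    eos_id: Optional[int] = None,
) -> list[Optional[int]]:
    # Wrap the sequence according to the configured special-tokens pattern.
    if pattern == "none":
        return list(token_ids)
    if pattern == "cls_sep":
        # With no CLS/SEP ids defined, this yields None placeholders.
        return [cls_id, *token_ids, sep_id]
    if pattern == "bos_eos":
        return [bos_id, *token_ids, eos_id]
    raise ValueError(f"Unsupported special_tokens_pattern: {pattern!r}")

# A tokenizer without CLS/SEP tokens but with the "cls_sep" default:
print(build_input_ids([1, 18001, 468]))  # -> [None, 1, 18001, 468, None]
```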
[For maintainers] Suggested jobs to run (before merge): run-slow: qwen2_5_omni
What does this PR do?
Helps vLLM bump to v5.