Fix T5 v1.1 detection #43681
Merged
ArthurZucker merged 11 commits into huggingface:main on Feb 5, 2026
Conversation
PR huggingface#41541 refactored `tie_word_embeddings` handling (among other things), which subtly broke the detection of T5 v1.1 configs vs. original T5. As a consequence, decoder output scaling was always applied, regardless of T5 version. This is resolved by using the correct value of `tie_word_embeddings`.

**Testing:** This regression was not caught by the tests, because the tests instantiate the config once and then modify attributes on it directly. That is problematic since all the version-detection logic runs in `T5Config.__init__`. This was addressed by adding a dedicated `get_config_v1_1` method that initializes the config as if it came from a v1.1 model (e.g., flan-t5).
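A minimal sketch of the behavior described above, not the actual transformers source (class and function names here are hypothetical): original T5 ties the input and output embeddings and rescales decoder outputs by `d_model ** -0.5` before the LM head, while T5 v1.1 unties the embeddings and must skip that scaling.

```python
# Hypothetical sketch of the version-gated scaling described in the PR.
# The regression meant the tied/untied decision always evaluated as "tied",
# so the rescale was applied even for v1.1-style configs.

class T5ConfigSketch:
    def __init__(self, d_model=512, tie_word_embeddings=True):
        self.d_model = d_model
        self.tie_word_embeddings = tie_word_embeddings

def scale_decoder_output(sequence_output, config):
    """Apply the original-T5 rescale only when embeddings are tied."""
    if config.tie_word_embeddings:
        return [x * config.d_model ** -0.5 for x in sequence_output]
    return sequence_output

v1_0 = T5ConfigSketch(d_model=4, tie_word_embeddings=True)   # original T5
v1_1 = T5ConfigSketch(d_model=4, tie_word_embeddings=False)  # v1.1 / flan-t5

print(scale_decoder_output([2.0], v1_0))  # scaled: [1.0]
print(scale_decoder_output([2.0], v1_1))  # unscaled: [2.0]
```

With the bug, both configs would have taken the scaled branch; the fix restores the untied path for v1.1.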
zucchini-nlp approved these changes — Feb 2, 2026
zucchini-nlp reviewed — Feb 3, 2026
[For maintainers] Suggested jobs to run (before merge): run-slow: mt5, t5
tarekziade pushed a commit that referenced this pull request — Feb 5, 2026
* Fix T5 v1.1 detection

  PR #41541 refactored `tie_word_embeddings` handling (among other things), which subtly broke the detection of T5 v1.1 configs vs. original T5. As a consequence, decoder output scaling was always applied, regardless of T5 version. This is resolved by using the correct value of `tie_word_embeddings`.

  **Testing:** This regression was not caught by the tests, because the tests instantiate the config once and then modify attributes on it directly. That is problematic since all the version-detection logic runs in `T5Config.__init__`. This was addressed by adding a dedicated `get_config_v1_1` method that initializes the config as if it came from a v1.1 model (e.g., flan-t5).

* Make repo consistent
* Make repo consistent
* mt5 isn't copied from t5 anymore

---------

Co-authored-by: nemo <git@ningu.net>
Co-authored-by: raushan <raushan@huggingface.co>
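The testing fix mentioned above can be sketched as follows (hypothetical names, not the actual test suite): instead of mutating attributes on an already-built config, which bypasses decisions made in `T5Config.__init__`, the tester builds a fresh config from v1.1-style constructor arguments.

```python
# Hypothetical sketch of the get_config_v1_1 testing fix; the real
# T5Config and model tester live in the transformers repo.

class T5ConfigSketch:  # redefined here so the sketch is self-contained
    def __init__(self, tie_word_embeddings=True, feed_forward_proj="relu"):
        # All version-detection logic happens here, at construction time,
        # which is why mutating attributes afterwards misses it.
        self.tie_word_embeddings = tie_word_embeddings
        self.is_gated_act = feed_forward_proj.startswith("gated-")

class T5ModelTesterSketch:
    def get_config(self):
        # Original-T5 defaults: tied embeddings, plain ReLU FFN.
        return T5ConfigSketch()

    def get_config_v1_1(self):
        # v1.1-style (flan-t5) arguments passed through __init__,
        # so constructor-time detection actually runs.
        return T5ConfigSketch(tie_word_embeddings=False,
                              feed_forward_proj="gated-gelu")

tester = T5ModelTesterSketch()
assert tester.get_config().tie_word_embeddings is True
cfg = tester.get_config_v1_1()
assert cfg.tie_word_embeddings is False and cfg.is_gated_act
```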
ArthurZucker pushed a commit that referenced this pull request — Feb 5, 2026