
🚨 T5Gemma2 model structure#43633

Merged
zucchini-nlp merged 14 commits into huggingface:main from zucchini-nlp:attn-impl-resursive-setter
Feb 4, 2026

Conversation

@zucchini-nlp
Member

@zucchini-nlp zucchini-nlp commented Jan 30, 2026

What does this PR do?

Makes sure that the attn implementation is set on all sub-configs. The config.encoder.text_config was not getting its attn implementation set because we don't pass it to PreTrainedModel.__init__. We can't change the model structure without breaking backward compatibility, so I manually re-added a call to self._check_and_adjust_attn_implementation in the modeling code.

Also deleted __setattr__; I'm not sure what the reason was for having it. Composite configs usually don't need to magically force-set the same attribute on all sub-configs.
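
As a quick illustration of the user-visible effect, here is a minimal sketch with a placeholder checkpoint id (not taken from this PR); the config.encoder.text_config path is the one described above.

```python
# Minimal sketch of the behavior this PR fixes; the checkpoint id is a placeholder.
from transformers import AutoModel

model = AutoModel.from_pretrained("org/some-t5gemma2-checkpoint", attn_implementation="sdpa")

print(model.config._attn_implementation)                      # "sdpa"
# Before this fix the nested text config silently kept its default implementation;
# now it receives the requested one as well.
print(model.config.encoder.text_config._attn_implementation)  # "sdpa" with this PR
```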

Comment on lines 807 to 811
# Set attn implementation manually because `text_config` is never passed to `super()`
self.text_config._attn_implementation_internal = self._check_and_adjust_attn_implementation(
    self.text_config._attn_implementation, is_init_check=True
)

Member Author

@tomaarsen here is why the attn implementation was not being set

Member

Just making sure: this is not also necessary for the vision_config, right?
Will review in more detail next week.

Member Author

Nope, the vision config is used a few lines above to init a PreTrainedModel, and thus it is passed to PreTrainedModel.__init__.
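
To make the structural difference concrete, here is a toy, plain-Python illustration (stand-in classes only, not the actual T5Gemma2 modeling code):

```python
# Toy stand-ins only: a sub-config gets its attn implementation adjusted only if it is
# passed through an __init__ that performs the adjustment, which is the case for the
# vision config but not for the text config.
class ToyPreTrainedModel:
    def __init__(self, config):
        # stand-in for the adjustment PreTrainedModel.__init__ performs on its config
        config.attn_implementation = "sdpa"


class ToyVisionTower(ToyPreTrainedModel):
    pass


class ToyEncoder(ToyPreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        # vision_config is passed to another ToyPreTrainedModel, so it gets adjusted here
        self.vision_tower = ToyVisionTower(config.vision_config)
        # text_config is only read to build layers; nothing adjusts it automatically
        self.hidden_size = config.text_config.hidden_size


class ToyConfig:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


config = ToyConfig(vision_config=ToyConfig(), text_config=ToyConfig(hidden_size=16))
ToyEncoder(config)
print(getattr(config.vision_config, "attn_implementation", None))  # "sdpa"
print(getattr(config.text_config, "attn_implementation", None))    # None -> the gap fixed above
```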

Contributor

Hmm, could we refactor this with the conversion mapping / checkpoint conversion mapping? IIUC, if we refactor the text-specific things into their own module, then we won't have this issue.

I think it needs the conversion mapping because you also want to use the encoder as a standalone model.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will be a huge change though, let me see

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@tomaarsen tomaarsen left a comment

Just ran some more extensive tests, this seems to work for me now. The attn_implementation is set automatically and my ST training works as expected.
This PR should supersede parts of #43559 now, which can be closed once this is merged.

@zucchini-nlp
Member Author

Will finish up in a while and ask for review

@tomaarsen
Member

Somewhat related: If I train a T5GemmaEncoder, I end up with a repository like: https://huggingface.co/tomaarsen/t5gemma2-270m-gooaq-cmnrl/tree/main
Note that the model_type is t5gemma2_encoder, which I can't load with AutoConfig like other architectures. Perhaps I should still expand on #43559 to turn t5gemma2_encoder into an architecture with AutoConfig and AutoModel support? Otherwise I still can't conveniently load https://huggingface.co/tomaarsen/t5gemma2-270m-gooaq-cmnrl.

  • Tom Aarsen

@zucchini-nlp
Member Author

@tomaarsen yes, it's expected that AutoConfig will not work on the encoder part. If it's needed for ST, we need the other PR you had as well. I was assuming ST would load directly with T5GemmaEncoderConfig, in a similar fashion to the old T5 family.

@tomaarsen
Member

If it's needed for ST, we need the other PR you had as well.

Agreed, will re-add it.

I was assuming ST will load directly with T5GemmaEncoderConfig in a similar fashion to old T5 family

I will, but the old T5 family is also loaded with:
load AutoConfig -> check the config type -> check for edge cases (T5, MT5, T5Gemma, T5Gemma2), otherwise AutoModel (see the sketch after this comment)

And then I need to be able to use

from transformers import AutoConfig

config = AutoConfig.from_pretrained("tomaarsen/t5gemma2-270m-gooaq-cmnrl")
  • Tom Aarsen
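
A rough sketch of that dispatch flow, roughly how an ST-style loader might look (illustrative only; the helper name and the encoder-only handling are assumptions, not Sentence Transformers' actual code):

```python
# Illustrative dispatch sketch, not Sentence Transformers' actual loader.
from transformers import AutoConfig, AutoModel, MT5Config, T5Config


def load_backbone(model_name_or_path: str):
    # Step 1: this call is what currently fails for model_type "t5gemma2_encoder"
    config = AutoConfig.from_pretrained(model_name_or_path)
    # Step 2: check the config type and handle encoder-only edge cases
    if isinstance(config, (T5Config, MT5Config)):
        # T5Gemma / T5Gemma2 would need analogous branches once their encoder
        # configs resolve via AutoConfig
        raise NotImplementedError("load only the encoder stack here")
    # Step 3: everything else goes through AutoModel
    return AutoModel.from_config(config)
```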

@zucchini-nlp
Member Author

Failing test is flaky, ready for review

Contributor

@vasqu vasqu left a comment

The config changes are good!

I just think we could maybe convert the model instead via the conversion mapping or similar? This is a quick-and-dirty workaround, so I'd be in favor of properly converting this into its own encoder text module if possible.


@zucchini-nlp
Member Author

run-slow: t5gemma2

@github-actions
Contributor

github-actions bot commented Feb 3, 2026

This comment contains run-slow, running the specified jobs:

models: ["models/t5gemma2"]
quantizations: []

@zucchini-nlp
Member Author

run-slow: t5gemma2

@zucchini-nlp zucchini-nlp changed the title from T5Gemma2 to 🚨 T5Gemma2 on Feb 3, 2026
@zucchini-nlp zucchini-nlp changed the title from 🚨 T5Gemma2 to 🚨 T5Gemma2 model structure on Feb 3, 2026
@zucchini-nlp
Member Author

run-slow: t5gemma2

@zucchini-nlp zucchini-nlp requested a review from vasqu February 3, 2026 13:01
@github-actions
Contributor

github-actions bot commented Feb 3, 2026

CI Results

Workflow Run ⚙️

Commit Info

Context Commit Description
RUN 705fc028 merge commit
PR e98f4bc0 branch commit
main b6a202f8 base commit

⚠️ No test being reported (jobs are skipped or cancelled)!

@github-actions
Contributor

github-actions bot commented Feb 3, 2026

💔 This comment contains run-slow, but unknown error occurred and the workflow run aborted!

@zucchini-nlp
Member Author

run-slow: t5gemma2

@github-actions
Contributor

github-actions bot commented Feb 3, 2026

This comment contains run-slow, running the specified jobs:

models: ["models/t5gemma2"]
quantizations: []

@github-actions
Contributor

github-actions bot commented Feb 3, 2026

CI Results

Workflow Run ⚙️

Commit Info

Context Commit Description
RUN 3aa9ea02 merge commit
PR 958df6e0 branch commit
main affcf459 base commit

✅ No failing test specific to this PR 🎉 👏 !

Contributor

@vasqu vasqu left a comment

LGTM, thanks a lot. Much better this way! Just one nit on the test: maybe add cleanup in setUp as well.
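
For reference, a minimal sketch of what that nit could look like, assuming the test relies on the cleanup helper from transformers.testing_utils (the class name below is illustrative, not the actual test in this PR):

```python
# Illustrative test skeleton only; the real T5Gemma2 integration test differs.
import unittest

from transformers.testing_utils import cleanup, torch_device


class ToyT5Gemma2IntegrationTest(unittest.TestCase):
    def setUp(self):
        # free accelerator memory before each test, not only after it
        cleanup(torch_device, gc_collect=True)

    def tearDown(self):
        cleanup(torch_device, gc_collect=True)
```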

@zucchini-nlp
Member Author

Checked with rebase, everything is still working and tests are passing. Will merge after CI turns green

@github-actions
Contributor

github-actions bot commented Feb 4, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: t5gemma, t5gemma2

@zucchini-nlp zucchini-nlp enabled auto-merge (squash) February 4, 2026 14:16
@zucchini-nlp zucchini-nlp merged commit d75266f into huggingface:main Feb 4, 2026
25 checks passed