
Timm unification continued #44252

Open
zucchini-nlp wants to merge 23 commits into huggingface:main from zucchini-nlp:timm-cont

Conversation

@zucchini-nlp (Member) commented Feb 24, 2026

What does this PR do?

Deprecate the timm backbone in favor of keeping all models within one timm folder, similar to other vision models. A backbone is just a variation of PreTrainedModel.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp (Member Author)

run-slow: detr, conditional_detr, beit, rt_detr, rt_detr_v2, dpt, depth_anything, prompt_depth_anything, mm_grounding_dino, grounding_dino, table_transformer, maskformer, oneformer, vitmatte, tvp, d_fine

@github-actions (Contributor)

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/beit", "models/conditional_detr", "models/d_fine", "models/depth_anything", "models/detr", "models/dpt", "models/grounding_dino", "models/maskformer", "models/mm_grounding_dino", "models/oneformer", "models/prompt_depth_anything", "models/rt_detr", "models/rt_detr_v2", "models/table_transformer", "models/tvp", "models/vitmatte"]
quantizations: []

Comment on lines +275 to +276
# Early exit for `timm` models, they aren't hosted on the hub usually
use_timm_backbone = kwargs.pop("use_timm_backbone", None)
@zucchini-nlp (Member Author) commented Feb 24, 2026


Let's actually keep it, and use it only in AutoBackbone.from_pretrained(). We don't call from_pretrained anywhere across the repo, so it will only be used by users.
Then we can delete _BaseAutoBackboneClass.

("timesfm", "TimesFmConfig"),
("timesformer", "TimesformerConfig"),
("timm_backbone", "TimmBackboneConfig"),
("timm_backbone", "TimmBackboneConfig"), # for BC
@zucchini-nlp (Member Author) commented Feb 24, 2026


We should map any timm_backbone to the new model class when loading, so we don't log deprecation warnings. The mapping happens in the auto-modeling file.
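As a hedged sketch of that BC aliasing (the key "timm_wrapper" appears elsewhere in this PR, but "TimmWrapperConfig" as the target class name is an assumption, not taken from the diff): the legacy model type simply becomes a second key pointing at the new config class.

```python
# Hypothetical sketch: keep the legacy "timm_backbone" key as an alias so old
# checkpoints resolve to the new class without hitting a deprecation path.
# "TimmWrapperConfig" is an assumed name for the unified config class.
CONFIG_MAPPING = {
    "timm_wrapper": "TimmWrapperConfig",
    "timm_backbone": "TimmWrapperConfig",  # for BC
}

def resolve_config_class(model_type: str) -> str:
    """Look up the config class registered for a model_type string."""
    return CONFIG_MAPPING[model_type]
```

With the alias in place, a config saved with model_type "timm_backbone" resolves to the exact same class as a new-style one, so nothing needs to be logged at load time.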

@github-actions (Contributor)

CI Results

Workflow Run ⚙️

Commit Info

Context | Commit   | Description
RUN     | 7e809adc | workflow commit (merge commit)
PR      | f7a0fece | branch commit (from PR)
main    | 0ff46c90 | base commit (on main)

✅ No failing test specific to this PR 🎉 👏 !

Comment on lines -505 to -508
@classmethod
def from_dict(cls, config_dict: dict[str, Any], **kwargs):
# Create a copy to avoid mutating the original dict
config_dict = config_dict.copy()
Member Author


gemma3n vision is not a classifier and has no ImageClassificationModel, so this is not needed!

Comment on lines +91 to +100
def __setattr__(self, key, value):
if (mapped_key := super().__getattribute__("special_attribute_map").get(key)) is not None:
if isinstance(mapped_key, (tuple, list)):
model_args = super().__getattribute__("__dict__").get(mapped_key[0])
model_args[mapped_key[1]] = value
else:
setattr(self, mapped_key, value)
else:
super().__setattr__(key, value)

Member Author


weird way to keep BC for setter/getter
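A self-contained sketch of the same setter/getter BC pattern (the class and the mapped attribute names here are invented for illustration): a `special_attribute_map` redirects deprecated names either to a renamed attribute or into a nested kwargs dict.

```python
class LegacyAttrConfig:
    # Maps deprecated attribute names to their new home. A plain string means
    # a straight rename; a (container, key) tuple redirects into a dict.
    special_attribute_map = {
        "hidden_size": "embed_dim",        # renamed attribute
        "depth": ("model_args", "depth"),  # moved into a kwargs dict
    }

    def __init__(self):
        self.__dict__["model_args"] = {}  # bypass __setattr__ for the container
        self.embed_dim = 768
        self.depth = 12  # routed through __setattr__ into model_args

    def __setattr__(self, key, value):
        mapped = type(self).special_attribute_map.get(key)
        if isinstance(mapped, tuple):
            self.__dict__[mapped[0]][mapped[1]] = value
        elif mapped is not None:
            super().__setattr__(mapped, value)
        else:
            super().__setattr__(key, value)

    def __getattr__(self, key):
        # Only called when normal lookup fails, i.e. for deprecated names.
        mapped = type(self).special_attribute_map.get(key)
        if isinstance(mapped, tuple):
            return self.__dict__[mapped[0]][mapped[1]]
        if mapped is not None:
            return getattr(self, mapped)
        raise AttributeError(key)
```

Reads and writes through the old names land on the new storage, so user code that predates the rename keeps working without duplicating state.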

Comment on lines +75 to +80
self.architecture = architecture
is_backbone_config = kwargs.get("backbone") is not None
self.architecture = kwargs.pop("backbone") if is_backbone_config else architecture
Member Author


This allows loading old timm_backbone configs. No idea if we should log warnings; we can't deprecate it away from hub checkpoints.
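A minimal, self-contained sketch of that fallback (the class and default values are invented; the two assignment lines mirror the diff): when a legacy config carries a `backbone` key, it takes precedence over the `architecture` argument.

```python
class BackboneConfigSketch:
    """Toy config illustrating BC handling of the legacy `backbone` key."""

    def __init__(self, architecture="resnet50", **kwargs):
        # Old timm_backbone configs store the model name under "backbone";
        # map it onto the new `architecture` field so they still load.
        is_backbone_config = kwargs.get("backbone") is not None
        self.architecture = kwargs.pop("backbone") if is_backbone_config else architecture
```

New-style configs pass `architecture` directly and never hit the fallback, so the legacy key costs nothing on the happy path.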

Comment on lines 255 to 258
output_attentions: bool | None = None,
output_hidden_states: bool | list[int] | None = None,
return_dict: bool | None = None,
do_pooling: bool | None = None,
use_cache: bool | None = None,
Member Author


Wanted to get rid of all of these, but we can't unless we use capture_outputs 🥲

@zucchini-nlp (Member Author)

run-slow: detr, conditional_detr, beit, rt_detr, rt_detr_v2, dpt, depth_anything, prompt_depth_anything, mm_grounding_dino, grounding_dino, table_transformer, maskformer, oneformer, vitmatte, tvp, d_fine

@github-actions (Contributor)

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/beit", "models/conditional_detr", "models/d_fine", "models/depth_anything", "models/detr", "models/dpt", "models/grounding_dino", "models/maskformer", "models/mm_grounding_dino", "models/oneformer", "models/prompt_depth_anything", "models/rt_detr", "models/rt_detr_v2", "models/table_transformer", "models/tvp", "models/vitmatte"]
quantizations: []

@github-actions (Contributor)

CI Results

Workflow Run ⚙️

Commit Info

Context | Commit   | Description
RUN     | b07217a3 | workflow commit (merge commit)
PR      | ba56a16b | branch commit (from PR)
main    | 0133a75c | base commit (on main)

✅ No failing test specific to this PR 🎉 👏 !

@zucchini-nlp changed the title from "[WIP] timm unification continued" to "Timm unification continued" on Feb 25, 2026
self.config = config

backbone = load_backbone(config)
backbone = AutoBackbone.from_config(config=config.backbone_config)
@zucchini-nlp (Member Author) commented Feb 25, 2026


goodbye load_backbone 👋🏻

Now we can replace it with a one-line from_config.

Comment on lines -298 to -300
"timm_wrapper": [
# Simply add the prefix `timm_model`
# TODO: Would be probably much cleaner with a `add_prefix` argument in WeightRenaming
Member Author


base_model_prefix works pretty well here and doesn't produce false-positive matches when reverse-mapping.

for k, v in model._checkpoint_conversion_mapping.items()
]

# TODO: should be checked recursively on submodels!!
Member Author


Needed it for timm, so we can define it once in the mapping above and re-use it in all models where timm is a backbone.

@github-actions (Contributor)

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, colpali, colqwen2, conditional_detr, d_fine, dab_detr, deformable_detr, depth_anything, detr, dpt, gemma3n, grounding_dino
