
🚨 Delete duplicate code in backbone utils #43323

Merged
zucchini-nlp merged 43 commits into huggingface:main from zucchini-nlp:backbone on Feb 4, 2026

Conversation

@zucchini-nlp (Member) commented Jan 16, 2026

What does this PR do?

This PR cleans up the backbone utilities. We currently have five different config attributes that decide which backbone to load; most of them can be merged into one and are redundant.
After this PR, we'll have only one, config.backbone_config, as a single source of truth. Models will build the backbone from_config and load pretrained weights only if the checkpoint has backbone weights saved. The overall idea is the same as in other composite models.

I removed these config attrs (a before/after sketch follows the list):

  • backbone - the backbone model id is now used to create a backbone_config by loading it from the hub or from timm
  • backbone_kwargs - it was used to update backbone_config with user-provided kwargs (i.e. backbone_config = CONFIG_MAPPING[model_type](**backbone_kwargs))
  • use_pretrained_backbone - we don't load a pretrained backbone anymore unless the user calls from_pretrained. The default is to initialize the model with random weights and let users either train it from scratch or load pretrained weights themselves
  • use_timm_backbone - we can infer the model type from the config and the requested backbone type, so this arg was redundant
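
A minimal before/after sketch, using MaskFormer as the example model (as in the review thread below); the removed kwargs match the list above, the specific values are illustrative, and the "before" call only works on versions prior to this PR:

```py
from transformers import MaskFormerConfig, ResNetConfig

# before this PR: several overlapping attributes selected the backbone
# (these kwargs are the ones removed by this PR)
old_config = MaskFormerConfig(
    backbone="microsoft/resnet-50",
    use_pretrained_backbone=True,
    use_timm_backbone=False,
    backbone_kwargs={"out_indices": [1, 2, 3, 4]},
)

# after this PR: config.backbone_config is the single source of truth
new_config = MaskFormerConfig(
    backbone_config=ResNetConfig(out_indices=[1, 2, 3, 4])
)
```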

Along the way, I also updated the tests and docs. Recommended review path: modeling_backbone_utils.py -> auto_factory.py -> timm backbone model files -> a couple of other models of your choice

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions (Contributor)

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43323&sha=754b61

Comment on lines 77 to 78
config = MaskFormerConfig(backbone="microsoft/resnet-50", use_pretrained_backbone=True)
config = MaskFormerConfig(backbone="microsoft/resnet-50")
model = MaskFormerForInstanceSegmentation(config)
zucchini-nlp (Member Author):

imo it doesn't serve much purpose to load a randomly initialized model with a pretrained backbone; the user still has to tune the model before it can be used.

Therefore I deleted this feature. Pretrained weights are loaded with from_pretrained and random weights with from_config, the same way as for any other model.
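
A minimal sketch of the two paths described above (the checkpoint id is a placeholder, not a real repo):

```py
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation

# random weights: build the model (backbone included) from a config
config = MaskFormerConfig()
model = MaskFormerForInstanceSegmentation(config)

# pretrained weights: only via from_pretrained, like any other model
model = MaskFormerForInstanceSegmentation.from_pretrained("some-org/some-maskformer-checkpoint")
```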

```py
from transformers import MaskFormerConfig, MaskFormerForInstanceSegmentation

config = MaskFormerConfig(backbone="resnet50", use_timm_backbone=True, use_pretrained_backbone=True)
```
zucchini-nlp (Member Author):

use_timm_backbone is not really needed imo. We can infer whether the requested checkpoint is from timm or from HF by checking whether a repo with a valid config exists on the hub.

Deleted it as well, as a redundant arg.
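
A hypothetical sketch of that check (the helper name and exact logic are my assumptions, not the PR's actual code):

```py
from huggingface_hub import file_exists

def is_hf_backbone(repo_id: str) -> bool:
    # an HF checkpoint is expected to ship a transformers config.json;
    # otherwise the id can be treated as a timm model name instead
    return file_exists(repo_id, "config.json")
```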

Comment on lines 189 to 191
```diff
-self._out_features, self._out_indices = get_aligned_output_features_output_indices(
-    out_features=out_features, out_indices=out_indices, stage_names=self.stage_names
-)
+out_indices = list(out_indices) if out_indices is not None else None
+self._out_features, self._out_indices = out_features, out_indices
+self.align_output_features_output_indices()
```
zucchini-nlp (Member Author):

The feature-index alignment happens in the mixin when we call align_output_features_output_indices

Member:

Nit: can we set self._out_features and self._out_indices inside align_output_features_output_indices(out_features, out_indices)? Then backbone configs would only call self.align_output_features_output_indices(out_features, out_indices) instead of these three lines. It would also simplify the setters.

zucchini-nlp (Member Author):

sure, I actually added a set_output_features_output_indices as well so we can use that to "set+align" values
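
A hypothetical sketch of what that could look like, following the names used in this thread (the PR's actual implementation may differ):

```py
class BackboneConfigMixin:
    def set_output_features_output_indices(self, out_features, out_indices):
        # "set + align" in one call, so backbone configs don't need to
        # repeat the three lines shown in the diff above
        out_indices = list(out_indices) if out_indices is not None else None
        self._out_features, self._out_indices = out_features, out_indices
        self.align_output_features_output_indices()

    def align_output_features_output_indices(self):
        # placeholder: reconciles self._out_features with self._out_indices
        # against self.stage_names (the actual logic lives in the mixin)
        ...
```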

@zucchini-nlp changed the title from [WIP] Attempt at cleaning backbone utils to Attempt at cleaning backbone utils on Jan 27, 2026
@zucchini-nlp changed the title from Delete duplicate code in backbone utils to 🚨 Delete duplicate code in backbone utils on Feb 2, 2026
@zucchini-nlp (Member Author)

run-slow: auto, beit, bit, conditional_detr, convnext, convnextv2, d_fine, dab_detr, deformable_detr, depth_anything, detr, timm_backbone

@github-actions (Contributor) commented Feb 2, 2026

This comment contains run-slow, running the specified jobs:

models: ["models/auto", "models/beit", "models/bit", "models/conditional_detr", "models/convnext", "models/convnextv2", "models/d_fine", "models/dab_detr", "models/deformable_detr", "models/depth_anything", "models/detr", "models/timm_backbone"]
quantizations: []

@github-actions (Contributor) commented Feb 2, 2026

CI Results

Workflow Run ⚙️

Commit Info

| Context | Commit   | Description   |
|---------|----------|---------------|
| RUN     | 176e9614 | merge commit  |
| PR      | eb88a4c1 | branch commit |
| main    | e5c8b0d1 | base commit   |

✅ No failing test specific to this PR 🎉 👏 !

@Cyrilvallez (Member) commented Feb 3, 2026

Hey! Very nice initiative - indeed backbones have been quite annoying and have no real reason to exist in general (we can simply use the usual composite models)! Especially timm_backbone and its pretrained-weight loading, which does not work correctly with our from_pretrained.
In general, I believe we could completely remove timm_backbone in favor of timm_wrapper if we remove use_pretrained_weights anyway 🤗
And about

> use_pretrained_backbone - we don't load a pretrained backbone anymore unless the user is calling from_pretrained

I believe we need to remove it completely - otherwise the loading happens in __init__, and when from_pretrained is called, if those weights are not in the main HF repo (because they are assumed to live only in the timm repo), they will be considered missing and reinitialized... So we can fully remove it IMO and assume weights need to live in the HF repo, which will solve a long-overdue issue!

Comment on lines 6 to 10
```py
class BackboneConfigMixin(BackboneConfigMixin):
    warnings.warn(
        "Importing `BackboneConfigMixin` from `utils/backbone_utils.py` is deprecated and will be removed in "
        "Transformers v5.10. Import as `from transformers.modeling_backbone_utils import BackboneConfigMixin` instead.",
        FutureWarning,
```
Member:

I don't really mind the renaming in general, but not sure if it's really needed

zucchini-nlp (Member Author):

you mean moving it to a different place? It was hitting a circular import with PreTrainedConfig; atm I import it lazily in the new file so it doesn't get imported at the top

Member:

No I meant you renamed backbone_utils -> modeling_backbone_utils haha

zucchini-nlp (Member Author):

oh, that! No reason behind it, just a matter of taste. Can rename it back for sure

Member:

Yes, I believe it's a little easier since we want to deprecate the backbone API - it avoids maintaining yet another BC entry point!

"""
)
class BitBackbone(BitPreTrainedModel, BackboneMixin):
class BitBackbone(BackboneMixin, BitPreTrainedModel):
Member:

Why do we need to switch the order here? Cause of __init__?

zucchini-nlp (Member Author):

yeah, to init the backbone stuff first and then call BitPreTrainedModel.__init__ from within it
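
A minimal, self-contained sketch of the MRO behavior being described (stub classes, not the actual transformers ones):

```py
class BitPreTrainedModelStub:
    def __init__(self, config):
        print("model init")

class BackboneMixinStub:
    def __init__(self, config):
        print("backbone init first")  # backbone bookkeeping happens here
        super().__init__(config)      # next in MRO: BitPreTrainedModelStub

# mixin listed first, so its __init__ runs before the model base class
class BitBackboneStub(BackboneMixinStub, BitPreTrainedModelStub):
    pass

BitBackboneStub(config=None)  # prints "backbone init first", then "model init"
```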

@Cyrilvallez (Member)

In general we want to move away from the backbone API as much as possible in favor of standard composite models so this will help 🤗

@zucchini-nlp (Member Author)

> In general, we could completely remove timm_backbone in favor of timm_wrapper I believe if we remove the use_pretrained_weights anyway 🤗

100%, that is my goal for a subsequent PR. This one is already hard to manage with GH conflicts across models, so we first need to clean up the extra kwargs and keep it as timm_backbone for now

> I believe we need to remove it completely. So we can fully remove it IMO and assume weights need to live in the HF repo, which will solve a long-overdue issue!

It is already completely removed. We pop the kwarg and never use it; it's hardcoded as False in the timm backbone modeling file. Atm the slow tests are passing, which I believe means the weights are already in the HF repo for official releases

@Cyrilvallez (Member) left a comment

Alright, very nice! Super happy to gradually move away from the backbone API! Feel free to merge once CI is back to green (after the conflict-handling issues)

@zucchini-nlp (Member Author)

run-slow: auto, beit, bit, conditional_detr, convnext, convnextv2, d_fine, dab_detr, deformable_detr, depth_anything

@github-actions (Contributor) commented Feb 4, 2026

This comment contains run-slow, running the specified jobs:

models: ["models/auto", "models/beit", "models/bit", "models/conditional_detr", "models/convnext", "models/convnextv2", "models/d_fine", "models/dab_detr", "models/deformable_detr", "models/depth_anything"]
quantizations: []

@github-actions (Contributor) commented Feb 4, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, beit, bit, conditional_detr, convnext, convnextv2, d_fine, dab_detr, deformable_detr

@zucchini-nlp (Member Author)

run-slow: beit, bit, conditional_detr, convnext, convnextv2, d_fine, dab_detr, deformable_detr, depth_anything, timm_backbone, detr, pp_doclayout_v3, resnet, vitpose

@github-actions (Contributor) commented Feb 4, 2026

CI Results

Workflow Run ⚙️

Commit Info

| Context | Commit   | Description   |
|---------|----------|---------------|
| RUN     | 18774550 | merge commit  |
| PR      | ea9ac12d | branch commit |
| main    | 480ed54e | base commit   |

⚠️ No test being reported (jobs are skipped or cancelled)!

@github-actions (Contributor) commented Feb 4, 2026

This comment contains run-slow, running the specified jobs:

models: ["models/beit", "models/bit", "models/conditional_detr", "models/convnext", "models/convnextv2", "models/d_fine", "models/dab_detr", "models/deformable_detr", "models/depth_anything", "models/detr", "models/pp_doclayout_v3", "models/resnet", "models/timm_backbone", "models/vitpose"]
quantizations: []

@github-actions (Contributor) commented Feb 4, 2026

CI Results

Workflow Run ⚙️

Commit Info

| Context | Commit   | Description   |
|---------|----------|---------------|
| RUN     | 674802ce | merge commit  |
| PR      | 509ffc48 | branch commit |
| main    | 480ed54e | base commit   |

✅ No failing test specific to this PR 🎉 👏 !

@zucchini-nlp (Member Author)

Nice, the slow CI passes and the failing jobs are just flaky

@zucchini-nlp zucchini-nlp enabled auto-merge (squash) February 4, 2026 10:27
@zucchini-nlp zucchini-nlp merged commit 0c4fe5c into huggingface:main Feb 4, 2026
26 checks passed