
Timm unification continued #44252

Open
zucchini-nlp wants to merge 23 commits into huggingface:main from zucchini-nlp:timm-cont

Conversation

@zucchini-nlp (Member) commented Feb 24, 2026

What does this PR do?

Deprecate the timm backbone in favor of keeping all models within one timm folder, similar to other vision models. A backbone is just a variation of PreTrainedModel.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp (Member Author)

run-slow: detr, conditional_detr, beit, rt_detr, rt_detr_v2, dpt, depth_anything, prompt_depth_anything, mm_grounding_dino, grounding_dino, table_transformer, maskformer, oneformer, vitmatte, tvp, d_fine

@github-actions (Contributor)

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/beit", "models/conditional_detr", "models/d_fine", "models/depth_anything", "models/detr", "models/dpt", "models/grounding_dino", "models/maskformer", "models/mm_grounding_dino", "models/oneformer", "models/prompt_depth_anything", "models/rt_detr", "models/rt_detr_v2", "models/table_transformer", "models/tvp", "models/vitmatte"]
quantizations: []

Comment on lines +275 to +276
# Early exit for `timm` models, they aren't hosted on the hub usually
use_timm_backbone = kwargs.pop("use_timm_backbone", None)
@zucchini-nlp (Member Author) commented Feb 24, 2026


Let's actually keep it, and use it only in AutoBackbone.from_pretrained(). We don't call from_pretrained anywhere across the repo, so it will only be used by users.
Then we can delete _BaseAutoBackboneClass.

("timesfm", "TimesFmConfig"),
("timesformer", "TimesformerConfig"),
("timm_backbone", "TimmBackboneConfig"),
("timm_backbone", "TimmBackboneConfig"), # for BC
@zucchini-nlp (Member Author) commented Feb 24, 2026


We should map any timm_backbone to the new model class when loading, so we don't log deprecation warnings. The mapping happens in the auto-modeling file.
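As a hedged sketch of that BC aliasing (the key "timm_wrapper" appears elsewhere in this PR, but "TimmWrapperConfig" as the target class name is an assumption, not taken from the diff): the legacy model type simply becomes a second key pointing at the new config class.

```python
# Hypothetical sketch: keep the legacy "timm_backbone" key as an alias so old
# checkpoints resolve to the new class without hitting a deprecation path.
# "TimmWrapperConfig" is an assumed name for the unified config class.
CONFIG_MAPPING = {
    "timm_wrapper": "TimmWrapperConfig",
    "timm_backbone": "TimmWrapperConfig",  # for BC
}

def resolve_config_class(model_type: str) -> str:
    """Look up the config class registered for a model_type string."""
    return CONFIG_MAPPING[model_type]
```

With the alias in place, a config saved with model_type "timm_backbone" resolves to the exact same class as a new-style one, so nothing needs to be logged at load time.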

@github-actions (Contributor)

CI Results

Workflow Run ⚙️

Commit Info

Context | Commit   | Description
RUN     | 7e809adc | workflow commit (merge commit)
PR      | f7a0fece | branch commit (from PR)
main    | 0ff46c90 | base commit (on main)

✅ No failing test specific to this PR 🎉 👏 !

Comment on lines -505 to -508
@classmethod
def from_dict(cls, config_dict: dict[str, Any], **kwargs):
# Create a copy to avoid mutating the original dict
config_dict = config_dict.copy()
Member Author


gemma3n vision is not a classifier and has no ImageClassificationModel, so this is not needed!

Comment on lines +91 to +100
def __setattr__(self, key, value):
if (mapped_key := super().__getattribute__("special_attribute_map").get(key)) is not None:
if isinstance(mapped_key, (tuple, list)):
model_args = super().__getattribute__("__dict__").get(mapped_key[0])
model_args[mapped_key[1]] = value
else:
setattr(self, mapped_key, value)
else:
super().__setattr__(key, value)

Member Author


weird way to keep BC for setter/getter
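A self-contained sketch of the same setter/getter BC pattern (the class and the mapped attribute names here are invented for illustration): a `special_attribute_map` redirects deprecated names either to a renamed attribute or into a nested kwargs dict.

```python
class LegacyAttrConfig:
    # Maps deprecated attribute names to their new home. A plain string means
    # a straight rename; a (container, key) tuple redirects into a dict.
    special_attribute_map = {
        "hidden_size": "embed_dim",        # renamed attribute
        "depth": ("model_args", "depth"),  # moved into a kwargs dict
    }

    def __init__(self):
        self.__dict__["model_args"] = {}  # bypass __setattr__ for the container
        self.embed_dim = 768
        self.depth = 12  # routed through __setattr__ into model_args

    def __setattr__(self, key, value):
        mapped = type(self).special_attribute_map.get(key)
        if isinstance(mapped, tuple):
            self.__dict__[mapped[0]][mapped[1]] = value
        elif mapped is not None:
            super().__setattr__(mapped, value)
        else:
            super().__setattr__(key, value)

    def __getattr__(self, key):
        # Only called when normal lookup fails, i.e. for deprecated names.
        mapped = type(self).special_attribute_map.get(key)
        if isinstance(mapped, tuple):
            return self.__dict__[mapped[0]][mapped[1]]
        if mapped is not None:
            return getattr(self, mapped)
        raise AttributeError(key)
```

Reads and writes through the old names land on the new storage, so user code that predates the rename keeps working without duplicating state.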

Comment on lines +75 to +80
self.architecture = architecture
is_backbone_config = kwargs.get("backbone") is not None
self.architecture = kwargs.pop("backbone") if is_backbone_config else architecture
Member Author


This allows loading old timm_backbone configs. No idea if we should log warnings; we can't deprecate it away from hub checkpoints.
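A minimal, self-contained sketch of that fallback (the class and default values are invented; the two assignment lines mirror the diff): when a legacy config carries a `backbone` key, it takes precedence over the `architecture` argument.

```python
class BackboneConfigSketch:
    """Toy config illustrating BC handling of the legacy `backbone` key."""

    def __init__(self, architecture="resnet50", **kwargs):
        # Old timm_backbone configs store the model name under "backbone";
        # map it onto the new `architecture` field so they still load.
        is_backbone_config = kwargs.get("backbone") is not None
        self.architecture = kwargs.pop("backbone") if is_backbone_config else architecture
```

New-style configs pass `architecture` directly and never hit the fallback, so the legacy key costs nothing on the happy path.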

Comment on lines 255 to 258
output_attentions: bool | None = None,
output_hidden_states: bool | list[int] | None = None,
return_dict: bool | None = None,
do_pooling: bool | None = None,
use_cache: bool | None = None,
Member Author


Wanted to get rid of all of these, but we can't unless we use capture_outputs 🥲

@zucchini-nlp (Member Author)

run-slow: detr, conditional_detr, beit, rt_detr, rt_detr_v2, dpt, depth_anything, prompt_depth_anything, mm_grounding_dino, grounding_dino, table_transformer, maskformer, oneformer, vitmatte, tvp, d_fine

@github-actions (Contributor)

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/beit", "models/conditional_detr", "models/d_fine", "models/depth_anything", "models/detr", "models/dpt", "models/grounding_dino", "models/maskformer", "models/mm_grounding_dino", "models/oneformer", "models/prompt_depth_anything", "models/rt_detr", "models/rt_detr_v2", "models/table_transformer", "models/tvp", "models/vitmatte"]
quantizations: []

@github-actions (Contributor)

CI Results

Workflow Run ⚙️

Commit Info

Context | Commit   | Description
RUN     | b07217a3 | workflow commit (merge commit)
PR      | ba56a16b | branch commit (from PR)
main    | 0133a75c | base commit (on main)

✅ No failing test specific to this PR 🎉 👏 !

@zucchini-nlp changed the title from "[WIP] timm unification continued" to "Timm unification continued" on Feb 25, 2026
self.config = config

backbone = load_backbone(config)
backbone = AutoBackbone.from_config(config=config.backbone_config)
@zucchini-nlp (Member Author) commented Feb 25, 2026


goodbye load_backbone 👋🏻

Now we can replace it with a one-line from_config.

Comment on lines -298 to -300
"timm_wrapper": [
# Simply add the prefix `timm_model`
# TODO: Would be probably much cleaner with a `add_prefix` argument in WeightRenaming
Member Author


base_model_prefix works pretty well here and doesn't produce false-positive matches when reverse-mapping.

for k, v in model._checkpoint_conversion_mapping.items()
]

# TODO: should be checked recursively on submodels!!
Member Author


Needed it for timm, so we can define it once in the mapping above and re-use it in all models where timm is a backbone.

@github-actions (Contributor)

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, colpali, colqwen2, conditional_detr, d_fine, dab_detr, deformable_detr, depth_anything, detr, dpt, gemma3n, grounding_dino
