Weights not initialized correctly when instantiating model with a pretrained backbone #38061


Open

matteot11 opened this issue May 10, 2025 · 1 comment
matteot11 commented May 10, 2025

System Info

  • transformers version: 4.51.3
  • Platform: macOS-14.4.1-arm64-arm-64bit
  • Python version: 3.9.19
  • Huggingface_hub version: 0.30.2
  • Safetensors version: 0.4.3
  • Accelerate version: 1.2.1
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (GPU?): 2.7.0 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?:

Who can help?

@amyeroberts
@qubvel

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I am trying to load a Mask2Former model with a pretrained backbone following this PR.

However, the backbone weights do not appear to be properly initialized when using use_pretrained_backbone=True in the config. Here's a minimal example:

from transformers import (
    SwinForImageClassification,
    Mask2FormerForUniversalSegmentation,
    Mask2FormerConfig,
)

swin_model_name = "microsoft/swin-tiny-patch4-window7-224"

def params_match(params1, params2):
    # element-wise comparison of corresponding parameter tensors
    return all((p1 == p2).all() for p1, p2 in zip(params1, params2))

# load pretrained swin model
swin_model = SwinForImageClassification.from_pretrained(swin_model_name)

# load Mask2Former with a pretrained swin backbone
config = Mask2FormerConfig(
    backbone=swin_model_name,
    use_pretrained_backbone=True,
)
m2f = Mask2FormerForUniversalSegmentation(config)

# AssertionError: parameters don't match
assert params_match(
    swin_model.base_model.encoder.parameters(),
    m2f.model.pixel_level_module.encoder.encoder.parameters(),
)

The Swin parameters in Mask2Former do not match those from the separately loaded Swin model, suggesting the backbone was not properly initialized.

However, if I explicitly load the backbone via the load_backbone function, the parameters do match:

from transformers.utils.backbone_utils import load_backbone

m2f.model.pixel_level_module.encoder = load_backbone(config)

# Now passes
assert params_match(
    swin_model.base_model.encoder.parameters(),
    m2f.model.pixel_level_module.encoder.encoder.parameters(),
)

Could this be caused by the post_init() method being called during the instantiation of Mask2Former, even if a pretrained backbone is being loaded?
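To illustrate the suspected mechanism, here is a toy sketch in plain Python (no transformers or torch; Backbone and Wrapper are made-up names, not library internals). It models an __init__ that first loads pretrained backbone weights and then unconditionally runs a post_init() that re-initializes every submodule, which would clobber the pretrained weights:

```python
import random

class Backbone:
    def __init__(self, pretrained=False):
        # Pretrained weights are a fixed, known vector; random otherwise.
        self.weights = [1.0, 2.0, 3.0] if pretrained else [random.random() for _ in range(3)]

class Wrapper:
    def __init__(self, use_pretrained_backbone=True):
        # Backbone is loaded pretrained here...
        self.backbone = Backbone(pretrained=use_pretrained_backbone)
        # ...but post_init() runs unconditionally afterwards.
        self.post_init()

    def post_init(self):
        # Re-initializes *all* submodules, including the pretrained backbone.
        self.backbone.weights = [random.random() for _ in range(3)]

model = Wrapper(use_pretrained_backbone=True)
print(model.backbone.weights == [1.0, 2.0, 3.0])  # prints False
```

If this is what happens inside Mask2FormerForUniversalSegmentation, it would explain why re-assigning the encoder via load_backbone after construction (as in the workaround above) restores the pretrained weights.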

Expected behavior

As mentioned before, the backbone should be correctly initialized when specifying use_pretrained_backbone=True in the config.

@matteot11 matteot11 added the bug label May 10, 2025
@Rocketknight1
Member

cc @qubvel and maybe @NielsRogge?
