Closed
Description
System Info
I'm using PyTorch 2.6.0 on Linux.
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Consider the following code:
```python
import transformers

model = transformers.AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")
tokenizer = transformers.AutoTokenizer.from_pretrained("google-bert/bert-base-cased")

text = "Hi! My name is Stinky Bob and I'm a [MASK]."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# Find the position of the [MASK] token and decode the top prediction
masked_index = inputs["input_ids"][0].tolist().index(tokenizer.mask_token_id)
predicted_token_id = outputs["logits"][0, masked_index].argmax(axis=-1)
predicted_token = tokenizer.decode(predicted_token_id)
print(predicted_token)
```
On `v4.49.0-Gemma-3` it produces the following output:

`man`
On `v4.49.0`, `v4.48.0`, `v4.40.0`, and `v4.30.0` it produces the following output:

`friend`
As far as I can tell, it breaks on `v4.49.0-Gemma-3` because checkpoint loading is broken: the weights for `model.cls.predictions.transform.LayerNorm` are not loaded from the checkpoint and are left default-initialized.
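If it helps triage, here is a sketch of how I checked this, using the `output_loading_info=True` flag of `from_pretrained` (which reports missing/unexpected keys) plus the fact that BERT's `_init_weights` sets LayerNorm weights to all ones, so an exactly-all-ones weight suggests the checkpoint value was never applied:

```python
import torch
import transformers

# output_loading_info=True additionally returns a dict with
# "missing_keys", "unexpected_keys", etc. describing what was
# (not) loaded from the checkpoint
model, info = transformers.AutoModelForMaskedLM.from_pretrained(
    "google-bert/bert-base-cased", output_loading_info=True
)
print(info["missing_keys"])

# LayerNorm is default-initialized to all ones; a trained weight
# would not be exactly 1.0 everywhere
w = dict(model.named_parameters())["cls.predictions.transform.LayerNorm.weight"]
print(torch.all(w == 1.0).item())
```

On a working version, `missing_keys` is empty and the all-ones check prints `False`.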
Expected behavior
I expect the BERT weights to be loaded correctly and the output to be consistent with previous versions of transformers.