Conversation
vasqu left a comment:
Thanks for the fix, looks good to me. Makes me wonder how we can detect these misses more easily 😓
@vasqu we could expand that test. This wouldn't catch every failure (because the NaNs only appear sometimes), but I think (?) it should create a flaky failure for any uninitialized model parameter. Not sure how to make that flaky failure more reliable, though!
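One way to make that kind of failure deterministic rather than flaky (a sketch, not an existing transformers test: `find_uninitialized`, `DemoModel`, and the standalone `init_weights` function here are all hypothetical) is to create the model on the `meta` device, poison the materialized storage with `NaN`, run the init logic, and report any parameter still containing `NaN`:

```python
import torch
from torch import nn


class DemoModel(nn.Module):
    """Toy model: `good` is covered by init_weights below, `missed` is not."""

    def __init__(self):
        super().__init__()
        self.good = nn.Parameter(torch.empty(4))
        self.missed = nn.Parameter(torch.empty(4))


def init_weights(model):
    # Deliberately incomplete init, mimicking the bug: `missed` is skipped.
    with torch.no_grad():
        model.good.zero_()


def find_uninitialized(model_cls, init_fn):
    # Build the model on the meta device, so no real init runs in __init__().
    with torch.device("meta"):
        model = model_cls()
    # Materialize with uninitialized storage, then poison every parameter
    # with NaN so "accidentally fine" memory can't hide a miss.
    model = model.to_empty(device="cpu")
    with torch.no_grad():
        for p in model.parameters():
            p.fill_(float("nan"))
    init_fn(model)
    # Anything still NaN was never touched by the init logic.
    return [name for name, p in model.named_parameters() if torch.isnan(p).any()]
```

Note that the real transformers machinery applies `_init_weights(module)` per submodule rather than taking a plain callback, so this is only the shape of the idea.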
Merging this and will think about a follow-up PR to surface this class of bug more reliably and with less flakiness. |
Some parameters in Tapas are initialized in `__init__()` and not reinitialized in `_init_weights()`, which means that if the model is created on the `meta` device, those parameters do not get a weight initialization. This causes a crash later if the uninitialized memory happens to contain `NaN` values! This caused the `test_all_tensors_are_parameter_or_buffer` test to be flaky.

This PR leaves tensor creation in `__init__()` but moves initialization to `_init_weights()`.

cc @vasqu
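The fix pattern, sketched on a toy module rather than the actual Tapas code (`column_logits_bias` is an illustrative name, and this simplified `_init_weights()` takes no arguments, unlike the per-module hook in transformers): keep tensor *creation* in `__init__()` and move value assignment to `_init_weights()`, which can be re-run after `meta` tensors are materialized.

```python
import torch
from torch import nn


class FixedModule(nn.Module):
    """Toy module following the PR's pattern (names are illustrative)."""

    def __init__(self):
        super().__init__()
        # Only *create* the tensor here; on the meta device this allocates
        # no real storage, so any values assigned here would be lost anyway.
        self.column_logits_bias = nn.Parameter(torch.empty(1))

    def _init_weights(self):
        # Assign actual values here, so re-running init after the meta
        # tensors are materialized leaves no uninitialized memory behind.
        with torch.no_grad():
            self.column_logits_bias.zero_()
```

Creating the module under `torch.device("meta")`, materializing it with `to_empty()`, and then calling `_init_weights()` yields a fully initialized parameter; skipping that last step leaves whatever bytes `to_empty()` handed back, which is exactly where the flaky `NaN` crashes came from.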