
Fix TypeError when loading float8 models by falling back to bfloat16 in local_torch_dtype #44596

Closed
Desel72 wants to merge 1 commit into huggingface:main from Desel72:fix/issue-#44589

Desel72 commented Mar 11, 2026

What does this PR do?

When loading FP8 models (e.g. Qwen/Qwen3.5-35B-A3B-FP8) with dtype="auto", the dtype auto-detected from the checkpoint weights can be torch.float8_e4m3fn. This dtype flows to local_torch_dtype(), which calls torch.set_default_dtype(). PyTorch does not support float8 types as the default dtype, so the call fails with:
TypeError: couldn't find storage object Float8_e4m3fnStorage

This happens when:

  • The top-level config has no dtype set (common with composite models where dtype is only in a sub-config)
  • _get_dtype() auto-detects torch.float8_e4m3fn from the checkpoint weights
  • FineGrainedFP8HfQuantizer doesn't override update_dtype(), so it can't intercept this

This PR adds a check in local_torch_dtype() to fall back to torch.bfloat16 when a float8 dtype is encountered. This only affects model skeleton initialization (set_default_dtype); actual float8 weights are still loaded correctly downstream via _load_pretrained_model.

Also adds a unit test to verify the fallback behavior for both float8_e4m3fn and float8_e5m2.

Fixes #44589

Before submitting

Who can review?

@Cyrilvallez (model loading / from_pretrained)
@SunMarc (quantization)

Desel72 (Author) commented Mar 11, 2026

Hi @Rocketknight1
Could you please share the reasons they were closed and what I should update to move toward merging?

Rocketknight1 (Member) commented

We're trying to avoid pure code agent PRs right now and working on formalizing a policy against them. The main reason is simply that we're able to run our own code agents if we need to - users running them on random issues just adds a useless middleman.

Desel72 (Author) commented Mar 11, 2026

Thanks for your reply. Is there any way for this PR to get merged?
