
[Modeling] Load FP8 safetensors such as DeepSeek #36828

Merged 2 commits into huggingface:main on Mar 27, 2025

Conversation

kylesayrs
Contributor

@kylesayrs kylesayrs commented Mar 19, 2025

Purpose

  • Support loading safetensors in FP8

Changes

  • Add F8_E4M3 to the str_to_torch_dtype mapping used by the safetensors loading logic
  • Raise ValueError("Cannot load safetensors of unknown dtype {k_dtype}") if the string lookup fails
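The two changes above follow a simple lookup-plus-error pattern, which can be sketched as below. This is an illustrative sketch, not the actual transformers code: in transformers the mapping values are torch dtypes (e.g. torch.float8_e4m3fn), while plain strings are used here so the sketch runs without torch installed, and resolve_dtype is a hypothetical helper name.

```python
# Illustrative sketch of the str_to_torch_dtype lookup described above.
# In transformers the values are torch dtypes (e.g. torch.float8_e4m3fn);
# strings stand in for them here so the sketch is self-contained.
str_to_torch_dtype = {
    "F32": "float32",
    "F16": "float16",
    "BF16": "bfloat16",
    "F8_E4M3": "float8_e4m3fn",  # the entry this PR adds
}

def resolve_dtype(k_dtype: str) -> str:
    # Fail loudly on unknown dtype strings instead of silently mis-loading.
    if k_dtype not in str_to_torch_dtype:
        raise ValueError(f"Cannot load safetensors of unknown dtype {k_dtype}")
    return str_to_torch_dtype[k_dtype]
```

The explicit ValueError is the second change: before this PR, an unrecognized dtype string would fail further down the stack with a less actionable error.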

Testing

  • Load DeepSeek-V3, whose weights are stored as FP8 safetensors:
from transformers import AutoModelForCausalLM

model_id = "deepseek-ai/DeepSeek-V3"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,
)

Reviewers

@ArthurZucker

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@github-actions github-actions bot marked this pull request as draft March 19, 2025 15:30

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the Ready for review button (at the bottom of the PR page).

@kylesayrs kylesayrs changed the title support loading fp8 [Modeling] Load FP8 safetensors such as DeepSeek Mar 19, 2025
@kylesayrs kylesayrs closed this Mar 19, 2025
@kylesayrs kylesayrs reopened this Mar 19, 2025
@kylesayrs kylesayrs marked this pull request as ready for review March 19, 2025 20:07
Collaborator

@ArthurZucker ArthurZucker left a comment


LGTM, are there no requirements (like CUDA-specific requirements)?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@kylesayrs
Contributor Author

@ArthurZucker Although fp8 operations have always been somewhat limited and hardware-dependent (GPU vs CPU), there is no hardware requirement for loading fp8 tensors beyond the required torch version.
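A minimal way to express that torch-version requirement as a runtime check is sketched below. The helper name is hypothetical (not part of transformers), and the check assumes the relevant constraint is simply whether the installed torch build exposes the FP8 dtype attribute, which was added in relatively recent torch releases.

```python
# Hypothetical helper: loading F8_E4M3 safetensors only requires a torch
# build that exposes the float8_e4m3fn dtype attribute; no specific GPU
# hardware is needed to merely load the tensors.
def supports_fp8_loading(torch_module) -> bool:
    return hasattr(torch_module, "float8_e4m3fn")
```

The helper takes the module as an argument rather than importing torch directly, so it can also be exercised against stand-in objects.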

@ArthurZucker
Collaborator

Will just update the branch to have the CI run!

@ArthurZucker ArthurZucker merged commit d6d930a into huggingface:main Mar 27, 2025
18 checks passed
@ArthurZucker
Collaborator

Thanks @kylesayrs 🤗
