[Devstral 24B] FP8 is currently not working correctly #42746

@patrickvonplaten

Description

System Info

Latest transformers "main".

Who can help?

@SunMarc @MekkCyber

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

If you run this code snippet: https://huggingface.co/mistralai/Devstral-Small-2-24B-Instruct-2512#transformers with dequantize=False, you will notice that generation falls into infinite repetition.

If, however, you run the model from this PR: #42744, everything works fine, which suggests that something is going wrong with the activation scales (maybe they produce inf values somewhere?).

The same activation scales work in vLLM: https://huggingface.co/mistralai/Devstral-Small-2-24B-Instruct-2512#vllm-recommended, so there is probably something we can do inside transformers to fix it.
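For reference, a minimal reproduction sketch along the lines of the model card snippet (the prompt and generation settings here are placeholders, and the exact way dequantize=False is passed should be taken verbatim from the linked model card):

```python
# Minimal sketch, not the exact model-card snippet: load the FP8 checkpoint with
# transformers and generate. Per the model card, the checkpoint is loaded with
# dequantize=False so the stored FP8 weights and activation scales are used directly.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Devstral-Small-2-24B-Instruct-2512"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Placeholder prompt; any chat prompt shows the behavior.
messages = [{"role": "user", "content": "Write a function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
# Observed with dequantize=False: the output degenerates into infinite repetition.
# With the dequantized model from #42744, the same prompt generates normally.
```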

Expected behavior

That FP8 inference works correctly with dequantize=False (no infinite repetition).
