
Fix regression compressed-tensors #36921

Closed · wants to merge 7 commits

Conversation

@SunMarc (Member) commented Mar 24, 2025

What does this PR do?

This PR fixes compressed-tensors model loading, which was broken for some models by this PR. The impacted models are those whose config has the attribute "quantization_status": "frozen" (e.g. this is the case for the FP8 model), because we defaulted run_compressed to True in the config, which now triggers an error immediately.

Fixes #36915
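For context, a minimal sketch of the kind of check that trips here (not the actual transformers source; the names and structure are assumptions based on the error message and the diff discussed below):

from enum import Enum

class QuantizationStatus(str, Enum):
    COMPRESSED = "compressed"
    FROZEN = "frozen"

def validate_run_compressed(run_compressed: bool, status: QuantizationStatus) -> None:
    # Before this PR, only COMPRESSED counted as "quantization compressed", so a
    # checkpoint whose config says "quantization_status": "frozen" raised as soon
    # as run_compressed defaulted to True.
    is_quantization_compressed = status == QuantizationStatus.COMPRESSED
    if run_compressed and not is_quantization_compressed:
        raise ValueError("`run_compressed` is only supported for quantized_compressed models")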

To reproduce:

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("neuralmagic/Meta-Llama-3.1-8B-FP8")
model = AutoModelForCausalLM.from_pretrained("neuralmagic/Meta-Llama-3.1-8B-FP8")
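# Before this fix, the model loading call above failed with:
# ValueError: `run_compressed` is only supported for quantized_compressed models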

Also, maybe we shouldn't run the model with run_compressed when the model is frozen; however, for the FP8 model, the weights are actually stored in FP8 format.

@github-actions github-actions bot marked this pull request as draft March 24, 2025 10:47

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the Ready for review button (at the bottom of the PR page).

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@SunMarc (Member Author) commented Mar 24, 2025

cc @rahul-tuli

@@ -154,7 +152,8 @@ def is_quantization_compressed(self):
         return (
             self.quantization_config.quantization_config is not None
-            and self.quantization_config.quantization_config.quantization_status == QuantizationStatus.COMPRESSED
+            and self.quantization_config.quantization_config.quantization_status
+            in [QuantizationStatus.COMPRESSED, QuantizationStatus.FROZEN]
@dsikka (Contributor) commented Mar 25, 2025


This is not always True.

The model listed is an old model that is failing because the config itself is incorrect; either the config should be updated or run_compressed should be passed in as False.

The FROZEN state is reserved for models that have already been decompressed.
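As a concrete illustration of the second option, a hedged sketch of passing run_compressed as False at load time, assuming CompressedTensorsConfig is importable from transformers and that its run_compressed loading attribute is honored by from_pretrained (as in recent versions):

from transformers import AutoModelForCausalLM, CompressedTensorsConfig

# Work around the outdated "frozen" status in the checkpoint config by explicitly
# disabling run_compressed when loading the quantized model.
model = AutoModelForCausalLM.from_pretrained(
    "neuralmagic/Meta-Llama-3.1-8B-FP8",
    quantization_config=CompressedTensorsConfig(run_compressed=False),
)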

@SunMarc (Member Author) replied:

Sounds good then, this is what I thought. Can you update the config for those models (the most used ones should be enough)?

@SunMarc (Member Author) replied:

LMK when this is done and I will close this PR!

@dsikka (Contributor) replied:

Hi @SunMarc, I have updated our most downloaded models; you should be good to go!

@SunMarc (Member Author) replied:

Thanks!

@SunMarc SunMarc closed this Mar 26, 2025
Successfully merging this pull request may close these issues.

ValueError: run_compressed is only supported for quantized_compressed models
5 participants