ValueError: run_compressed is only supported for quantized_compressed models #36915

Closed

CanYing0913 opened this issue Mar 24, 2025 · 4 comments

CanYing0913 commented Mar 24, 2025

System Info

  • transformers version: 4.50.0
  • Platform: Linux-6.11.0-19-generic-x86_64-with-glibc2.39
  • Python version: 3.10.16
  • Huggingface_hub version: 0.29.3
  • Safetensors version: 0.5.3
  • Accelerate version: 1.5.2
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (GPU?): 2.6.0+cu124 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?:
  • Using GPU in script?:
  • GPU type: NVIDIA GeForce RTX 3090

Who can help?

@SunMarc @MekkCyber

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Install transformers v4.50
  2. Run the following:
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("neuralmagic/Meta-Llama-3.1-8B-FP8")
model = AutoModelForCausalLM.from_pretrained("neuralmagic/Meta-Llama-3.1-8B-FP8")
  3. Got the result:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/canying/miniforge3/envs/apq/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 573, in from_pretrained
    return model_class.from_pretrained(
  File "/home/canying/miniforge3/envs/apq/lib/python3.10/site-packages/transformers/modeling_utils.py", line 272, in _wrapper
    return func(*args, **kwargs)
  File "/home/canying/miniforge3/envs/apq/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4422, in from_pretrained
    hf_quantizer.preprocess_model(
  File "/home/canying/miniforge3/envs/apq/lib/python3.10/site-packages/transformers/quantizers/base.py", line 215, in preprocess_model
    return self._process_model_before_weight_loading(model, **kwargs)
  File "/home/canying/miniforge3/envs/apq/lib/python3.10/site-packages/transformers/quantizers/quantizer_compressed_tensors.py", line 121, in _process_model_before_weight_loading
    raise ValueError("`run_compressed` is only supported for quantized_compressed models")
ValueError: `run_compressed` is only supported for quantized_compressed models
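
For diagnosis, the status that trips this check can be read from the hub config without loading any weights. A minimal sketch (the exact contents of quantization_config are model-dependent, and the hub config may have changed since this report):

# Inspect the quantization status recorded in the model's config.json,
# without downloading the weights.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("neuralmagic/Meta-Llama-3.1-8B-FP8")

# For compressed-tensors models this is a dict; at the time of this report
# its status field read "frozen" rather than "compressed".
print(config.quantization_config)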

Expected behavior

Using transformers==4.47.0 or transformers==4.49.0 resolves the issue. Is this a bug in the new version, or are there changes that the model maintainer needs to make to update the model accordingly?

SunMarc (Member) commented Mar 24, 2025

Thanks for the report! This will be fixed here: #36921

dsikka (Contributor) commented Mar 25, 2025

Hi @CanYing0913

We'll likely have to update the model, as it was saved incorrectly.
A temporary workaround is to edit the model's config.json, changing the quantization status from frozen to compressed (see the sketch after this comment).

Thanks!
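
A minimal sketch of that workaround, assuming a local copy of the model and assuming the status is stored under a quantization_status key inside quantization_config (check the actual config.json for the exact key name):

# Hypothetical sketch: flip the saved quantization status from "frozen"
# to "compressed" in a local copy of the model's config.json.
import json
from pathlib import Path

config_path = Path("Meta-Llama-3.1-8B-FP8/config.json")  # local model dir (assumption)
config = json.loads(config_path.read_text())

quant_cfg = config["quantization_config"]
# The key name "quantization_status" is an assumption; inspect your config.json.
if quant_cfg.get("quantization_status") == "frozen":
    quant_cfg["quantization_status"] = "compressed"

config_path.write_text(json.dumps(config, indent=2))

After editing, point from_pretrained at the local directory so the patched config is picked up.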

CanYing0913 (Author) commented

@dsikka @SunMarc Thanks for the help!

SunMarc (Member) commented Mar 26, 2025

Since the model's config has been updated, I will close this issue!

SunMarc closed this as completed Mar 26, 2025