ValueError: run_compressed is only supported for quantized_compressed models #36915

Closed

CanYing0913 opened this issue Mar 24, 2025 · 4 comments

CanYing0913 commented Mar 24, 2025

System Info

  • transformers version: 4.50.0
  • Platform: Linux-6.11.0-19-generic-x86_64-with-glibc2.39
  • Python version: 3.10.16
  • Huggingface_hub version: 0.29.3
  • Safetensors version: 0.5.3
  • Accelerate version: 1.5.2
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (GPU?): 2.6.0+cu124 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?:
  • Using GPU in script?:
  • GPU type: NVIDIA GeForce RTX 3090

Who can help?

@SunMarc @MekkCyber

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Install transformers v4.50
  2. Run the following:
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("neuralmagic/Meta-Llama-3.1-8B-FP8")
model = AutoModelForCausalLM.from_pretrained("neuralmagic/Meta-Llama-3.1-8B-FP8")
  3. Got the result:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/canying/miniforge3/envs/apq/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 573, in from_pretrained
    return model_class.from_pretrained(
  File "/home/canying/miniforge3/envs/apq/lib/python3.10/site-packages/transformers/modeling_utils.py", line 272, in _wrapper
    return func(*args, **kwargs)
  File "/home/canying/miniforge3/envs/apq/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4422, in from_pretrained
    hf_quantizer.preprocess_model(
  File "/home/canying/miniforge3/envs/apq/lib/python3.10/site-packages/transformers/quantizers/base.py", line 215, in preprocess_model
    return self._process_model_before_weight_loading(model, **kwargs)
  File "/home/canying/miniforge3/envs/apq/lib/python3.10/site-packages/transformers/quantizers/quantizer_compressed_tensors.py", line 121, in _process_model_before_weight_loading
    raise ValueError("`run_compressed` is only supported for quantized_compressed models")
ValueError: `run_compressed` is only supported for quantized_compressed models
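
For diagnosis, the status that trips this check can be read from the hub config without loading any weights. A minimal sketch (the exact contents of quantization_config are model-dependent, and the hub config may have changed since this report):

# Inspect the quantization status recorded in the model's config.json,
# without downloading the weights.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("neuralmagic/Meta-Llama-3.1-8B-FP8")

# For compressed-tensors models this is a dict; at the time of this report
# its status field read "frozen" rather than "compressed".
print(config.quantization_config)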

Expected behavior

Using transformers==4.47.0 or transformers==4.49.0 resolves the issue. Is this a bug in the new version, or are there changes that the model maintainer needs to make to update the model accordingly?

SunMarc (Member) commented Mar 24, 2025

Thanks for the report! This will be fixed here: #36921

dsikka (Contributor) commented Mar 25, 2025

Hi @CanYing0913

We'll likely have to update the model, as it was saved incorrectly.
A temporary workaround is to edit the model's config.json, changing the quantization status from frozen to compressed (see the sketch after this comment).

Thanks!
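
A minimal sketch of that workaround, assuming a local copy of the model and assuming the status is stored under a quantization_status key inside quantization_config (check the actual config.json for the exact key name):

# Hypothetical sketch: flip the saved quantization status from "frozen"
# to "compressed" in a local copy of the model's config.json.
import json
from pathlib import Path

config_path = Path("Meta-Llama-3.1-8B-FP8/config.json")  # local model dir (assumption)
config = json.loads(config_path.read_text())

quant_cfg = config["quantization_config"]
# The key name "quantization_status" is an assumption; inspect your config.json.
if quant_cfg.get("quantization_status") == "frozen":
    quant_cfg["quantization_status"] = "compressed"

config_path.write_text(json.dumps(config, indent=2))

After editing, point from_pretrained at the local directory so the patched config is picked up.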

CanYing0913 (Author) commented

@dsikka @SunMarc Thanks for the help!

SunMarc (Member) commented Mar 26, 2025

Since the model's config has been updated, I will close this issue!

SunMarc closed this as completed Mar 26, 2025