doc: replace a broken example with a working one #595

stas00 · 2024-08-21T04:12:27Z

the quantization example in the README fails to run:

    return model_class.from_pretrained(
  File "/env/lib/conda/stas-inference/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3477, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named model.safetensors found in directory /data/huggingface/hub/models--lmsys--vicuna-7b-v1.5/snapshots/3321f76e3f527bd14065daf69dad9344000a201d.

it looks like transformers is looking for safetensors files (the traceback comes from transformers/modeling_utils.py), and https://huggingface.co/lmsys/vicuna-7b-v1.5/tree/main doesn't have them.
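A quick way to confirm this kind of mismatch before loading is to look at the repo's file listing. The sketch below is only an illustration: `has_safetensors` and `repo_has_safetensors` are hypothetical helper names, the two file listings are made up for the example, and the only library call used is `huggingface_hub.list_repo_files`.

```python
from typing import Iterable


def has_safetensors(filenames: Iterable[str]) -> bool:
    """Return True if any file in the listing is a safetensors weight file."""
    return any(name.endswith(".safetensors") for name in filenames)


def repo_has_safetensors(repo_id: str) -> bool:
    """Check a Hub repo's file listing (needs network access)."""
    # Imported lazily so the pure helper above works without huggingface_hub.
    from huggingface_hub import list_repo_files

    return has_safetensors(list_repo_files(repo_id))


# Illustrative listings (hypothetical, not copied from any real repo).
bin_only = ["config.json", "pytorch_model-00001-of-00002.bin", "tokenizer.model"]
with_st = ["config.json", "model-00001-of-00002.safetensors", "tokenizer.model"]

print(has_safetensors(bin_only))  # False
print(has_safetensors(with_st))   # True
```

Calling `repo_has_safetensors("lmsys/vicuna-7b-v1.5")` would run the same check against the live repo; the result depends on what the repo contains at the time.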

so I replaced it with a working example from https://github.com/casper-hansen/AutoAWQ/blob/6f14fc7436d9a3fb5fc69299e4eb37db4ee9c891/examples/quantize.py

stas00 · 2024-08-21T04:25:47Z

I'd also add device_map="auto" and replace the awkward **{"low_cpu_mem_usage": True, "use_cache": False} with explicit keyword arguments, which are much easier to read/edit:

model = AutoAWQForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, use_cache=False, device_map="auto")

w/o device_map="auto", quantizing other models I tried fails with:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)
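For context, here is a minimal sketch of how that call might sit inside a full quantization script. The `quantize` wrapper and the `QUANT_CONFIG` values (4-bit weights, group size 128, GEMM kernels) are assumptions for illustration, not taken from this PR; the `from_pretrained` arguments are the ones proposed above.

```python
# Assumed AWQ settings for illustration; adjust for your model.
QUANT_CONFIG = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}


def quantize(model_path: str, quant_path: str) -> None:
    """Load a model, quantize it with AWQ, and save the result (needs a GPU)."""
    # Heavy imports are kept inside the function so the module imports cheaply.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    # Explicit keyword arguments instead of **{...}; device_map="auto" lets
    # the loader place the weights rather than leaving them all on the CPU.
    model = AutoAWQForCausalLM.from_pretrained(
        model_path, low_cpu_mem_usage=True, use_cache=False, device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    # Run AWQ calibration and quantize the weights.
    model.quantize(tokenizer, quant_config=QUANT_CONFIG)

    # Save the quantized weights together with the tokenizer.
    model.save_quantized(quant_path)
    tokenizer.save_pretrained(quant_path)
```

Usage would be `quantize(model_path, quant_path)` with a Hub model id or local directory as `model_path` and an output directory as `quant_path`.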

doc: replace a broken example with a working one

3a10625

stas00 mentioned this pull request Aug 21, 2024

[Doc]: AutoAWQ quantization example fails vllm-project/vllm#7717

Closed

casper-hansen merged commit 79258d6 into casper-hansen:main Aug 28, 2024
