KeyError: 'model.layers.0.self_attn.q_proj.qweight' #1528
@LIUKAI0815 Thanks for the feedback. Could you kindly tell me which model you are using? This requires using the official GPTQ-quantized checkpoints from HF.
I have the same issue using a quantized Mistral model: TheBloke/Mistral-7B-v0.1-AWQ
@jershi425 I'm using Qwen1.5-14B-Chat.
Has this problem been solved? I get the same error when using a quantized Mixtral model.
Hi @Mary-Sam, could you please share more details/logs on your issue so we can look into it?
Hi @nv-guomingz, I am using the latest version of. My model has the following quantization configuration, and I am getting the following error:
```
python3 convert_checkpoint.py --model_dir /workspace/lk/model/Qwen/14B --output_dir ./tllm_checkpoint_1gpu_gptq --dtype float16 --use_weight_only --weight_only_precision int4_gptq --per_group
[TensorRT-LLM] TensorRT-LLM version: 0.10.0.dev2024042300
0.10.0.dev2024042300
Loading checkpoint shards: 100%|███████████████████████████████████| 8/8 [00:02<00:00, 3.45it/s]
[04/30/2024-10:16:11] Some parameters are on the meta device because they were offloaded to the cpu.
loading weight in each layer...:   0%|          | 0/40 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/workspace/lk/model/tensorRT/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 365, in
    main()
  File "/workspace/lk/model/tensorRT/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 357, in main
    convert_and_save_hf(args)
  File "/workspace/lk/model/tensorRT/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 319, in convert_and_save_hf
    execute(args.workers, [convert_and_save_rank] * world_size, args)
  File "/workspace/lk/model/tensorRT/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 325, in execute
    f(args, rank)
  File "/workspace/lk/model/tensorRT/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 305, in convert_and_save_rank
    qwen = from_hugging_face(
  File "/opt/conda/envs/tensorRT/lib/python3.10/site-packages/tensorrt_llm/models/qwen/convert.py", line 1081, in from_hugging_face
    weights = load_from_gptq_qwen(
  File "/opt/conda/envs/tensorRT/lib/python3.10/site-packages/tensorrt_llm/models/qwen/weight.py", line 158, in load_from_gptq_qwen
    comp_part = model_params[prefix + key_list[0] + comp + suf]
KeyError: 'model.layers.0.self_attn.q_proj.qweight'
```
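The KeyError means the GPTQ loader looked up a tensor name the checkpoint does not contain: for each attention projection it expects packed GPTQ tensors (`qweight`, `qzeros`, `scales`), which a non-GPTQ checkpoint (e.g. an AWQ export or an unquantized model) will not provide under those names. A minimal sketch for pre-checking a loaded state dict before running `convert_checkpoint.py` — the helper name and the exact suffix list are assumptions inferred from the key in the traceback, not TensorRT-LLM's actual API:

```python
def check_gptq_keys(model_params, num_layers=1):
    """Return the GPTQ-style attention tensor names missing from a state dict.

    model_params: dict mapping tensor names to tensors (e.g. model.state_dict()).
    An empty return list suggests the checkpoint uses GPTQ naming; any missing
    keys would reproduce the KeyError seen in the traceback above.
    """
    comps = ["q_proj", "k_proj", "v_proj", "o_proj"]
    sufs = ["qweight", "qzeros", "scales"]  # packed-weight tensors GPTQ exports
    missing = []
    for layer in range(num_layers):
        for comp in comps:
            for suf in sufs:
                key = f"model.layers.{layer}.self_attn.{comp}.{suf}"
                if key not in model_params:
                    missing.append(key)
    return missing
```

If the very first missing key printed is `model.layers.0.self_attn.q_proj.qweight`, the checkpoint is not in the GPTQ format the converter expects, matching the reports above for AWQ-quantized models.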