

Loading the original 13B model via webui.py with 8-bit quantization raises an error #5

Closed
edisonzf2020 opened this issue Jun 3, 2023 · 3 comments
Labels
solved — This problem has already been solved.

Comments

@edisonzf2020

Loading the original 13B model via webui.py with 8-bit quantization raises an error. Command executed:
python src/web_demo.py --model_name_or_path ../models/Ziya-LLaMA-13B --quantization_bit 8
Error output:

Traceback (most recent call last):
  File "/home/hysz/AI/LLaMA-Efficient-Tuning/src/web_demo.py", line 18, in <module>
    model, tokenizer = load_pretrained(model_args, finetuning_args)
  File "/home/hysz/AI/LLaMA-Efficient-Tuning/src/utils/common.py", line 182, in load_pretrained
    model = model.half() # cast all params to float16 for inference
  File "/home/hysz/anaconda3/envs/qlora/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1896, in half
    raise ValueError(
ValueError: `.half()` is not supported for `4-bit` or `8-bit` models. Please use the model as it is, since the model has already been casted to the correct `dtype`.
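The traceback shows `load_pretrained` unconditionally calling `model.half()`, which `transformers` rejects for bitsandbytes-quantized models because their weights are already in the correct dtype. A minimal sketch of the kind of guard that avoids this — the helper name `maybe_cast_half` is hypothetical and not from the repository; only the `.half()` call and the `quantization_bit` flag come from the report:

```python
# Sketch: skip the float16 cast when the model was loaded in 4-/8-bit.
# `quantization_bit` mirrors the --quantization_bit CLI flag; the helper
# name is illustrative, not the repository's actual API.

def maybe_cast_half(model, quantization_bit=None):
    """Cast parameters to float16 for inference, unless already quantized."""
    if quantization_bit in (4, 8):
        # Quantized weights already have the right dtype; calling
        # .half() on them raises ValueError in transformers.
        return model
    return model.half()
```

With a guard like this, the 8-bit path returns the model untouched while the full-precision path still gets the float16 cast.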
@hiyouga
Owner

hiyouga commented Jun 3, 2023

Please update the repository code and try again.

@hiyouga added the `pending` (This problem is yet to be addressed.) label on Jun 3, 2023
@edisonzf2020
Author

It starts now, but inference raises an error; the details are described in another issue.

