

Loading the original 13B model via webui.py with 8-bit quantization raises an error #5

Closed
edisonzf2020 opened this issue Jun 3, 2023 · 3 comments
Labels
solved — This problem has already been solved.

Comments

@edisonzf2020

Loading the original 13B model via webui.py with 8-bit quantization raises an error. Command executed:
python src/web_demo.py --model_name_or_path ../models/Ziya-LLaMA-13B --quantization_bit 8
Error output:

Traceback (most recent call last):
  File "/home/hysz/AI/LLaMA-Efficient-Tuning/src/web_demo.py", line 18, in <module>
    model, tokenizer = load_pretrained(model_args, finetuning_args)
  File "/home/hysz/AI/LLaMA-Efficient-Tuning/src/utils/common.py", line 182, in load_pretrained
    model = model.half() # cast all params to float16 for inference
  File "/home/hysz/anaconda3/envs/qlora/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1896, in half
    raise ValueError(
ValueError: `.half()` is not supported for `4-bit` or `8-bit` models. Please use the model as it is, since the model has already been casted to the correct `dtype`.
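The traceback shows `load_pretrained` unconditionally calling `model.half()`, which `transformers` rejects for bitsandbytes-quantized models because their weights are already in the correct dtype. A minimal sketch of the kind of guard that avoids this — the helper name `maybe_cast_half` is hypothetical and not from the repository; only the `.half()` call and the `quantization_bit` flag come from the report:

```python
# Sketch: skip the float16 cast when the model was loaded in 4-/8-bit.
# `quantization_bit` mirrors the --quantization_bit CLI flag; the helper
# name is illustrative, not the repository's actual API.

def maybe_cast_half(model, quantization_bit=None):
    """Cast parameters to float16 for inference, unless already quantized."""
    if quantization_bit in (4, 8):
        # Quantized weights already have the right dtype; calling
        # .half() on them raises ValueError in transformers.
        return model
    return model.half()
```

With a guard like this, the 8-bit path returns the model untouched while the full-precision path still gets the float16 cast.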
@hiyouga
Owner

hiyouga commented Jun 3, 2023

Please update the repository code and try again.

@hiyouga added the `pending` (This problem is yet to be addressed.) label on Jun 3, 2023
@edisonzf2020
Author

It starts now, but inference raises an error; the details are described in another issue.

