Loading Qwen-14B-Chat in 8-bit raises RuntimeError: value cannot be converted to type at::Half without overflow #1475
Comments
Try bf16.
I tried bf16 as well, and experimented with quite a few parameter changes, but nothing worked. After some debugging of my own, it may be a data type issue caused by the torch version. I will try switching the torch version later to see whether that resolves it.
I ran into the same problem. Trying bf16 did not help. I hope the project can support fine-tuning of quantized models.
I hit it too: loading Qwen-7B-Chat in 8-bit gives the same error.
@Chen-mingxuan @wrl1224 @HelWireless Give this a try:
I tried pip install bitsandbytes==0.41.1 but it did not help. You can instead load Qwen-7B-Chat-Int8 directly for LoRA training, which works as a substitute.
https://huggingface.co/Qwen/Qwen-7B-Chat-Int8 |
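The workaround above (fine-tuning the pre-quantized Int8 checkpoint instead of 8-bit-loading the fp16 weights) could look roughly like the following sketch. This is a hypothetical illustration, not a command from this issue: it assumes `transformers` and `peft` are installed, and the LoRA hyperparameters (`r`, `lora_alpha`, `lora_dropout`) are arbitrary example values. `c_attn` is the fused QKV projection module name in Qwen's modeling code.

```python
def load_qwen_int8_for_lora():
    """Hypothetical sketch: LoRA fine-tuning on the pre-quantized
    Qwen-7B-Chat-Int8 checkpoint, as suggested in the thread.
    Assumes transformers and peft are installed; hyperparameters
    are illustrative, not taken from the issue."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    model_id = "Qwen/Qwen-7B-Chat-Int8"
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map="auto", trust_remote_code=True
    )
    # "c_attn" is Qwen's fused query/key/value projection; LoRA adapters
    # are attached there while the Int8 base weights stay frozen.
    lora = LoraConfig(
        r=8, lora_alpha=32, lora_dropout=0.05,
        target_modules=["c_attn"], task_type="CAUSAL_LM",
    )
    return get_peft_model(model, lora), tokenizer
```

The key point of the workaround is that no `load_in_8bit` conversion happens at load time, so the fp16 overflow path reported in this issue is never exercised.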
Take a look at https://huggingface.co/Qwen/Qwen-7B-Chat/discussions/10; not sure whether it helps.
I modified line 572 of modeling_qwen.py in the model source: attention_mask.masked_fill(~causal_mask, torch.finfo(query.dtype).min)
Testing shows that -65504.0 is the smallest value that works; going any lower, -65505.0, already raises the error. That corresponds to the minimum of the half-precision float16 format.
Loading Qwen-7B-Chat and Baichuan-13B-Chat in 8-bit both work normally; loading Qwen-14B-Chat currently fails. The error is:
RuntimeError: value cannot be converted to type at::Half without overflow
Hardware is a 4090 GPU, running under WSL2 on Windows 11. The training command is: