success | load in 8-bit, it runs on one 3090 Ti (24G) #38
Comments
Hey, I'm using your code. The model loads without errors, but asking a question throws this error. GPU: M40 24G, CUDA 11.6, PyTorch 1.13.1+cu116
For the generate function I use the approach from the original MOSS code, and so far it runs. Try the FastChat project first and see whether it runs normally; also check whether your transformers or tokenizer library needs upgrading.
Removing do_sample=True gets past the error.
Thanks for your answer. That way it no longer errors, but it hangs for a long time with no reply.
To verify whether only MOSS has this problem, I tried the same approach on ChatGLM-6B, and it also hangs for a long time without replying.
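For reference, a minimal sketch of the two generate calls being discussed, assuming a MOSS checkpoint loaded via transformers; the sampling parameter values are illustrative, not taken from this thread:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint name as mentioned later in this thread; MOSS needs
# trust_remote_code because it ships custom modeling code.
model_name = "fnlp/moss-moon-003-sft"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, trust_remote_code=True, torch_dtype=torch.float16
).cuda()

inputs = tokenizer("Hello, who are you?", return_tensors="pt").to(model.device)

# Sampling variant (MOSS-style); this is the call that raised the error
# for the reporter above. Parameter values are illustrative.
outputs = model.generate(
    **inputs, do_sample=True, temperature=0.7, top_p=0.8, max_new_tokens=256
)

# Greedy variant: dropping do_sample=True gets past the error.
outputs = model.generate(**inputs, max_new_tokens=256)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that even with greedy decoding, generation on an older card like the M40 can take a long time per reply, which may be what reads as a hang here.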
I downloaded the model to my local machine and then reused the FastChat env, so I didn't need to create a separate env for MOSS. It works!
Because 24G is not enough for MOSS (fnlp/moss-moon-003-sft), I loaded the model in 8-bit. It works and responds very quickly.
Here is my code:
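(The original snippet was not captured in this thread; below is a minimal sketch of 8-bit loading with transformers plus bitsandbytes under the setup described above, not the poster's exact code.)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# 8-bit loading requires the bitsandbytes and accelerate packages:
#   pip install bitsandbytes accelerate
model_name = "fnlp/moss-moon-003-sft"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,  # MOSS ships custom modeling code
    load_in_8bit=True,       # quantize weights to int8 so the ~16B-param model fits in 24G
    device_map="auto",       # let accelerate place layers on the available GPU
)

inputs = tokenizer("Hello!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Newer transformers releases express the same thing via `quantization_config=BitsAndBytesConfig(load_in_8bit=True)`.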