After a simple tweak, MOSS can run on a single GPU with 16 GB of VRAM #35
Comments
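Brilliant. So whatever doesn't fit on the GPU gets handed off to the CPU?
I loaded the model with load in 8 bit and it worked. It also runs fast, faster than your method; answers come back almost instantly. I'm on a single machine with a single 3090 (24 GB).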
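For readers wondering what 8-bit loading might look like, here is a minimal sketch, not the commenter's actual code. It assumes the checkpoint loads through transformers with `trust_remote_code=True`, that `bitsandbytes` is installed, and that the model has roughly 16 billion parameters (my assumption for the moss-moon-003 checkpoints; the thread does not state it). `weight_gib` and `load_moss_8bit` are hypothetical helper names.

```python
def weight_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


def load_moss_8bit(model_path: str):
    """Load tokenizer and model with 8-bit weights (needs bitsandbytes and a CUDA GPU)."""
    # Imported lazily so weight_gib() works without the GPU stack installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",  # let accelerate place the layers
        trust_remote_code=True,  # MOSS ships its own modeling code
    )
    return tokenizer, model


if __name__ == "__main__":
    n = 16e9  # assumed parameter count, not taken from the thread
    print(f"fp16: {weight_gib(n, 16):.1f} GiB, int8: {weight_gib(n, 8):.1f} GiB")
    # prints "fp16: 29.8 GiB, int8: 14.9 GiB"
```

Under that assumption, the fp16 weights alone exceed a 24 GB card while the 8-bit weights fit, which is consistent with the report above. Activations and the KV cache need additional memory on top of these figures.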
No idea.
How exactly do I modify the code?
Changing "12GiB" to "8GiB" gets it running on a 4070 Ti with 12 GB of VRAM, but each answer takes 5 minutes.
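I rented an Alibaba Cloud GPU server with 30 GiB of VRAM, and answers are still slow, more than ten seconds each. How do you all put up with that?
Are you on Windows? Could you send me your modified moss_cli_demo.py? Thanks!
I tried loading the int4 model with load_in_8bit. On a single NVIDIA GeForce RTX 3090 (24 GB) it runs very slowly: generating a 600-character article takes 4 minutes.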
It just about runs with 16 GB of VRAM plus 32 GB of RAM. It is fairly slow, but usable.
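All it takes is a simple change to lines 31 to 33 of moss_cli_demo.py. In that change, the maximum GPU memory is capped at 12 GB to leave room for CUDA kernels and avoid OOM errors.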
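The modified lines themselves did not survive in this copy of the thread, so the following is only a sketch of what such a change could look like, assuming the demo loads the model through accelerate's `load_checkpoint_and_dispatch`. The `max_memory` argument is a real accelerate parameter; the helper names, the CPU cap of 24 GiB, and the `"MossBlock"` class name are assumptions on my part, and `model_path` is expected to be a local checkpoint directory.

```python
def build_max_memory(gpu_gib: int, cpu_gib: int, gpu_index: int = 0) -> dict:
    """Per-device memory caps in the format accelerate expects."""
    return {gpu_index: f"{gpu_gib}GiB", "cpu": f"{cpu_gib}GiB"}


def load_moss(model_path: str, max_memory: dict):
    """Dispatch the checkpoint across GPU and CPU within the given caps."""
    # Imported lazily so build_max_memory() works without the GPU stack installed.
    import torch
    from accelerate import init_empty_weights, load_checkpoint_and_dispatch
    from transformers import AutoConfig, AutoModelForCausalLM

    config = AutoConfig.from_pretrained(model_path, trust_remote_code=True)
    with init_empty_weights():
        # Build the model skeleton without allocating real weights.
        raw_model = AutoModelForCausalLM.from_config(
            config, torch_dtype=torch.float16, trust_remote_code=True
        )
    raw_model.tie_weights()
    return load_checkpoint_and_dispatch(
        raw_model,
        model_path,
        device_map="auto",
        max_memory=max_memory,  # layers beyond the GPU cap spill over to CPU RAM
        no_split_module_classes=["MossBlock"],  # assumed block class name
        dtype=torch.float16,
    )


if __name__ == "__main__":
    # 16 GB card: cap the GPU at 12 GiB; the CPU cap is a hypothetical value.
    print(build_max_memory(gpu_gib=12, cpu_gib=24))
```

For the 12 GB card mentioned in the comments, the equivalent call would be `build_max_memory(gpu_gib=8, cpu_gib=24)`. Layers placed on the CPU are much slower than those on the GPU, which likely explains the long response times people report.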
Reference: accelerate usage guides
Hope this helps hobbyists who don't have a lot of GPUs.