How much VRAM does inference with the 6B model need? #20
Comments
@Copilot-X What does your inference code look like? In theory, loading the model in bf16/fp16 should only need about 12 GB of VRAM.
I ran the demo: after loading the model, VRAM usage was about 13 GB, plus roughly another 500 MB during inference.
Do you have your loading and inference code? I'd like to compare.
It's just the one from the repo: https://github.com/01-ai/Yi/blob/main/demo/text_generation.py
The model currently uses the bfloat16 data type, so the 6B model needs at least about 13 GB of VRAM.
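The ~13 GB figure matches a back-of-envelope estimate from the parameter count. A minimal sketch, assuming roughly 6.06B parameters (the exact count is an assumption here, not stated in this thread):

```python
def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """VRAM needed for the model weights alone, in GiB."""
    return n_params * bytes_per_param / 1024**3

N = 6.06e9  # assumed parameter count for Yi-6B

print(f"bf16/fp16: {weight_memory_gib(N, 2):.1f} GiB")    # ~11.3 GiB
print(f"int8:      {weight_memory_gib(N, 1):.1f} GiB")    # ~5.6 GiB
print(f"4-bit:     {weight_memory_gib(N, 0.5):.1f} GiB")  # ~2.8 GiB
```

The CUDA context, activation buffers, and allocator fragmentation typically add another 1-2 GB on top of the weights, which is consistent with the ~13 GB observed above.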
How much VRAM do the 200K-context 6B and 34B models each need?
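At long context lengths the KV cache dominates memory use, on top of the weights. A sketch of the estimate, assuming Yi-6B uses 32 layers, 4 KV heads (GQA), and head dimension 128, and Yi-34B uses 60 layers and 8 KV heads; these config values are assumptions taken from the published model configs, not from this thread, so verify against each model's `config.json`:

```python
def kv_cache_gib(layers: int, kv_heads: int, head_dim: int,
                 seq_len: int, bytes_per_elem: int = 2) -> float:
    """KV cache size: K and V tensors (factor 2) per layer, per token."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem / 1024**3

# Assumed configs, at the full 200K context in bf16:
print(kv_cache_gib(32, 4, 128, 200_000))  # Yi-6B-200K:  ~12.2 GiB
print(kv_cache_gib(60, 8, 128, 200_000))  # Yi-34B-200K: ~45.8 GiB
```

Add the bf16 weight memory (roughly 12 GiB for 6B and 64 GiB for 34B) to get the peak at full context. The cache scales linearly with the actual sequence length, so short prompts need far less.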
I'd also like to ask: what is the inference speed in tokens/s?
That information was specifically added as part of this Chat release.
Following the code in the README, using the 6B Chat model (11 GB) on a card with 8 GB of VRAM (an RTX 3070 Ti).
How much VRAM does 6B model inference need? Loading the model and running inference directly runs out of memory, and 24 GB is not enough. Isn't that different from ChatGLM2/3, which only need about 15 GB?
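An OOM on a 24 GB card usually means the weights ended up in float32 rather than bf16. A sketch of the arithmetic, with the parameter count again assumed to be about 6.06B:

```python
N = 6.06e9  # assumed Yi-6B parameter count

fp32_gib = N * 4 / 1024**3  # 4 bytes per parameter in float32
bf16_gib = N * 2 / 1024**3  # 2 bytes per parameter in bfloat16

print(f"fp32: {fp32_gib:.1f} GiB")  # ~22.6 GiB: weights alone nearly fill 24 GB
print(f"bf16: {bf16_gib:.1f} GiB")  # ~11.3 GiB
```

In fp32 the weights alone nearly fill a 24 GB card, so the first forward pass overflows it. If the model is loaded without an explicit `torch_dtype=torch.bfloat16` argument to `from_pretrained`, some code paths fall back to fp32; forcing bf16 should bring usage back to the ~13 GB others report in this thread.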