What resources does Yi-34B require? #55

Closed
xgysigned opened this issue Nov 7, 2023 · 13 comments
Labels: doc, doc-complete, quantization

Comments

@xgysigned

What resources does Yi-34B require? Can it run on a single 3090 or 4090, or does it need multiple cards?

ZhaoFancy added the doc label Nov 7, 2023
ZhaoFancy changed the title from "yi-34b需要的资源是多少?" to "Yi-34B 需要的资源是多少?" ("What resources does Yi-34B require?") Nov 7, 2023
@crapthings

[image]

Let's find out.

@crapthings

crapthings commented Nov 7, 2023

[image]

Not sure whether three 48 GB cards, or A100 80 GB cards, would be enough.

@crapthings

[image] [image]

@ericjank

ericjank commented Nov 7, 2023

> [image] [image]

That's some serious money.

@ericjank

ericjank commented Nov 7, 2023

> [image] Not sure whether three 48 GB cards, or A100 80 GB cards, would be enough.

A friend from Harbin?

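@wangye01inf

wangye01inf commented Nov 7, 2023

@xgysigned The 4090 and 3090 each have 24 GB of VRAM. Loading 34B parameters in float16/bfloat16 takes 2 bytes per parameter, so about 34B × 2 bytes = 68 GB of VRAM for the weights alone. That means you need multiple cards.

For multi-GPU inference, consider the TP demo in the repository: https://github.com/01-ai/Yi/blob/main/demo/text_generation_tp.py

You can also use features of community open-source inference frameworks such as vLLM or llama.cpp to further reduce VRAM requirements and improve inference performance.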
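
The estimate above can be sketched in a few lines of Python. This is a rough, weights-only calculation: it assumes a nominal 34 billion parameters at 2 bytes each (float16/bfloat16) and ignores the KV cache, activations, and framework overhead, so real usage will be higher. The function names are illustrative and not part of the Yi repository.

```python
import math

def weight_memory_gb(num_params: int, bytes_per_param: float) -> float:
    """Memory for the model weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

def min_cards(required_gb: float, vram_per_card_gb: float) -> int:
    """Smallest number of cards whose combined VRAM covers required_gb."""
    return math.ceil(required_gb / vram_per_card_gb)

# Assumption: nominal parameter count taken from the model name.
YI_34B_PARAMS = 34_000_000_000

# float16 and bfloat16 both use 2 bytes per parameter.
fp16_gb = weight_memory_gb(YI_34B_PARAMS, 2)
print(f"fp16/bf16 weights: {fp16_gb:.0f} GB")
print(f"24 GB cards needed (weights only): {min_cards(fp16_gb, 24)}")
```

The card count here is a lower bound for holding the weights; in practice you would want extra headroom on each card for the KV cache and activations.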

@xihajun

xihajun commented Nov 7, 2023

Consider supporting autotrain-advanced; it appears to support int4 inference.

@xain

xain commented Nov 8, 2023

I hope the quantized version will support 24 GB cards.

@waltcow

waltcow commented Nov 10, 2023

I hope the quantized version will support 24 GB cards.

@m1105550

Holding a spot for a quantized version.

@Samge0

Samge0 commented Nov 14, 2023

Can it run on four 2080 Ti cards modded to 22 GB each (22 GB × 4 = 88 GB)? I currently have two.

@AmeowCAT

A 4-bit quantized version should just fit on a card with 24 GB of VRAM.
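
As a rough check of this claim, the sketch below estimates the weight size at a few nominal bit widths and compares each with a 24 GB card. It assumes 34 billion parameters and exactly N bits per weight; real quantization formats store extra scale metadata, and inference also needs room for the KV cache, so treat the numbers as lower bounds. All names are illustrative.

```python
# Assumption: nominal parameter count taken from the model name.
YI_34B_PARAMS = 34_000_000_000
CARD_VRAM_GB = 24  # e.g. a single 3090 or 4090

def quantized_weight_gb(num_params: int, bits_per_weight: float) -> float:
    """Weights-only size in decimal GB at a nominal bit width."""
    return num_params * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4):
    size_gb = quantized_weight_gb(YI_34B_PARAMS, bits)
    verdict = "fits" if size_gb < CARD_VRAM_GB else "does not fit"
    print(f"{bits:>2}-bit: {size_gb:5.1f} GB -> {verdict} in {CARD_VRAM_GB} GB")
```

Under these assumptions only the 4-bit case comes in under 24 GB, at about 17 GB of weights, which leaves roughly 7 GB for everything else.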

Yimi81 added the doc-complete label Mar 8, 2024
@255doesnotexist

q4_k_s is suitable for deploying on Tesla P40 (24G VRAM).
