
Why is the 4-bit quantized model so large? #31

Open
liuyukid opened this issue Jul 27, 2023 · 2 comments

Comments

liuyukid commented Jul 27, 2023

The chatglm2-6b-int4 model is only about 4 GB, yet the LinkSoul/Chinese-Llama-2-7b-4bit model is around 13 GB. Why is the 4-bit model so large?
[Screenshot 2023-07-27 11:20:06]
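A rough size check makes the discrepancy concrete: a truly packed 4-bit 7B checkpoint should be around 3.5 GB, while 13 GB is much closer to fp16 size, which suggests the file may store weights in 16-bit precision and only quantize to 4-bit at load time. The sketch below is back-of-the-envelope arithmetic only; it ignores embeddings kept in higher precision and quantization scale/zero-point overhead.

```python
# Back-of-the-envelope checkpoint sizes for a 7B-parameter model,
# assuming weights dominate the file size (ignores per-group quantization
# scales/zero-points and any tensors kept in higher precision).
def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    return n_params * bits_per_weight / 8 / 1e9

print(model_size_gb(7e9, 4))   # 3.5  -- what a packed 4-bit file should weigh
print(model_size_gb(7e9, 16))  # 14.0 -- fp16, close to the 13 GB observed
```

The ~13 GB file therefore behaves like an fp16 checkpoint, consistent with the "different quantization method" explanation in the reply below.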


CRGBS commented Jul 27, 2023

The quantization methods are different. You could consider using a GGML version of the model, which can be run with llama.cpp.
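For reference, the GGML route suggested here looked roughly like this with the llama.cpp tooling of mid-2023. This is a command sketch, not a tested recipe: the model paths are placeholders, and the `convert.py` / `quantize` tools of that era have since been renamed upstream.

```shell
# Convert a Hugging Face checkpoint to a GGML fp16 file (llama.cpp, circa mid-2023)
python3 convert.py models/chinese-llama-2-7b/ --outtype f16

# Pack the weights down to 4-bit (q4_0); this is where the ~4x size drop happens
./quantize models/chinese-llama-2-7b/ggml-model-f16.bin \
           models/chinese-llama-2-7b/ggml-model-q4_0.bin q4_0

# Run inference on CPU with the quantized file
./main -m models/chinese-llama-2-7b/ggml-model-q4_0.bin -p "你好"
```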

@wutengcoding

When loading the model locally, how can it be loaded in a distributed way? With multiple GPUs, will the model be sharded across them automatically? It doesn't fit on a single GPU.
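A common approach for this with Hugging Face transformers is to pass `device_map="auto"`, which lets accelerate split the layers across all visible GPUs. This is a minimal sketch, assuming transformers and accelerate are installed and the combined GPU memory is large enough; it is not specific to this repository's loading code.

```python
# Sketch: sharding a large model across multiple GPUs with
# transformers + accelerate (assumption: both libraries installed,
# checkpoint fits in combined GPU memory).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LinkSoul/Chinese-Llama-2-7b-4bit"  # model from the issue above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # accelerate assigns layers to the available GPUs
    torch_dtype="auto",
)
print(model.hf_device_map)  # shows which layers landed on which device
```

This downloads and shards the checkpoint at load time, so it needs network access and real GPUs; no expected output is shown for that reason.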
