NaN outputs after quantization #3

Open
huyiming2018 opened this issue May 16, 2024 · 4 comments

Comments

@huyiming2018

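Hello, and thank you for the great work. I am trying to reproduce the metrics reported in the paper. My model is llama2-7b; after quantizing it with the run_llama.sh script, the model output contains a large number of NaN values. The dataset is c4. How can this be resolved? Thanks!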

@GuoYi0
Collaborator

GuoYi0 commented May 16, 2024

@huyiming2018 Did you run the run_llama.sh script directly?

@chuangzhidan

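Hello, and thank you for the great work. I am trying to reproduce the metrics reported in the paper. My model is llama2-7b; after quantizing it with the run_llama.sh script, the model output contains a large number of NaN values. The dataset is c4. How can this be resolved? Thanks!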

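Out of curiosity, did you notice this while running the script's evaluation step, or after loading an existing quantized model and running inference on it?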

@huyiming2018
Author

run_llama.sh

Yes. Changing group_size to 128 or 64 fixes it; the default is per-channel quantization.
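
For readers unfamiliar with the distinction mentioned above, here is a minimal pure-Python sketch of per-channel versus group-wise weight quantization. It assumes symmetric round-to-nearest quantization at 4 bits; the helper names (`quantize_group`, `quantize_row`, `mean_abs_error`) and the sample weight row are hypothetical. It does not reproduce the NaN problem, and the repository's own quantizer may work differently.

```python
def quantize_group(values, n_bits=4):
    """Symmetric round-to-nearest quantization of values that share one scale."""
    qmax = 2 ** (n_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)  # all-zero group: avoid dividing by a zero scale
    scale = max_abs / qmax
    # Quantize to the integer grid, clamp, then map back to floats.
    return [max(-qmax - 1, min(qmax, round(v / scale))) * scale for v in values]


def quantize_row(row, group_size=None, n_bits=4):
    """group_size=None uses one scale for the whole row (per-channel);
    otherwise each consecutive block of group_size weights gets its own scale."""
    if group_size is None:
        group_size = len(row)
    out = []
    for start in range(0, len(row), group_size):
        out.extend(quantize_group(row[start:start + group_size], n_bits))
    return out


def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical weight row: many small values plus a single outlier.
row = [0.01 * ((i % 7) - 3) for i in range(256)]
row[5] = 4.0

per_channel = quantize_row(row, group_size=None)
grouped = quantize_row(row, group_size=128)

print("per-channel error:", mean_abs_error(row, per_channel))
print("group_size=128 error:", mean_abs_error(row, grouped))
```

In this toy row the outlier sets the scale for the whole row in the per-channel case, so every small weight rounds to zero. With group_size=128 only the group containing the outlier is affected, and the other group keeps a much finer scale.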

@chuangzhidan

Changing group_size to 128 or 64 fixes it; the default is per-channel quantization.

I'd really like to know how to run inference :)


3 participants