FP16 fine-tuning overflow #43

Closed
mynewstart opened this issue Jul 13, 2023 · 1 comment

Comments

@mynewstart

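Has anyone tried fine-tuning with the DeepSpeed Chat code using ZeRO-2 + FP16? I am seeing underflow. After switching to BF16, I likewise find that some parameters of the embedding layer become 0, which makes the subsequent gradients NaN, so the parameters cannot be updated. How can this be resolved?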
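For illustration, here is a minimal sketch of how an FP16 underflow of the kind reported above can arise, and how loss scaling, a common mitigation in mixed-precision training, keeps a small value representable. It uses only the Python standard library. `to_fp16`, `tiny_grad`, and `loss_scale` are hypothetical names and values chosen for the example; they are not taken from the DeepSpeed Chat code or from this project's modeling code.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical gradient magnitude, chosen to lie below the smallest
# positive FP16 subnormal (about 6e-8).
tiny_grad = 1e-8

# Illustrative static loss scale (an assumption, not a recommended setting).
loss_scale = 2.0 ** 16

unscaled = to_fp16(tiny_grad)             # underflows to zero in FP16
scaled = to_fp16(tiny_grad * loss_scale)  # stays nonzero after the FP16 cast
recovered = scaled / loss_scale           # unscale in higher precision

print(unscaled, scaled, recovered)
```

The sketch only shows the numeric effect on a single value. It does not reproduce the embedding-layer zeros seen with BF16, which may have a different cause.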

@GradientGuru
Contributor

The modeling code has been updated; please pull the latest code from Hugging Face. For fine-tuning, refer to the updated README of this project.
