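Has anyone tried fine-tuning with the DeepSpeed Chat code using ZeRO-2 + FP16? I found that it underflows. After switching to BF16, I likewise found that some parameters in the embedding layer become 0, which turns the subsequent gradients into NaN so the parameters can no longer be updated. How can this be solved?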
The modeling code has been updated; please pull the latest code from Hugging Face. For fine-tuning, refer to the updated README of this project.
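For readers unfamiliar with the FP16 underflow mentioned in the question, here is a minimal standard-library sketch of the effect and of how loss scaling works around it. It is not taken from this project or from DeepSpeed Chat; the gradient value and the loss scale are hypothetical numbers chosen for illustration, and it does not explain the zeroed embedding parameters seen under BF16.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical gradient value, below FP16's smallest subnormal (about 6e-8).
tiny_grad = 1e-8
# Hypothetical loss scale (2**16), chosen only for illustration.
loss_scale = 65536.0

unscaled = to_fp16(tiny_grad)             # underflows to zero in FP16
scaled = to_fp16(tiny_grad * loss_scale)  # large enough to be stored in FP16
recovered = scaled / loss_scale           # unscale in higher precision

print(unscaled, scaled, recovered)
```

The unscaled value is lost entirely, while the scaled value survives the round trip and can be divided back down in higher precision. BF16 keeps the same exponent range as FP32, so this particular kind of underflow is much less likely there.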