Hello author, I noticed that line 157 of Adabelief-Optimizer/PyTorch_Experiments/AdaBelief.py reads: `denom = (exp_avg_var.add_(group['eps']).sqrt() / math.sqrt(bias_correction2)).add_(group['eps'])`. Doesn't the in-place `exp_avg_var.add_(eps)` mean that eps gets added into `exp_avg_var` on every step, which differs from the S_t update formula in the paper? Should it be changed to the out-of-place `exp_avg_var.add(group['eps'])`, or does the in-place `add_` actually give better experimental results?
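The accumulation the issue describes can be sketched without PyTorch. This is an illustrative toy (plain floats standing in for the `exp_avg_var` buffer; the step count and eps value are made up, and the EMA update itself is elided), showing that an in-place add of eps mutates the buffer every step so eps accumulates as eps·t, while an out-of-place add leaves the buffer untouched:

```python
eps = 1e-8
steps = 1000

# In-place style: mimics exp_avg_var.add_(group['eps']) inside step().
# The buffer itself is mutated, so eps piles up once per step.
var_inplace = 0.0
for _ in range(steps):
    # ... EMA update of var_inplace would happen here ...
    var_inplace += eps           # in-place add: buffer now carries eps
    denom = var_inplace ** 0.5   # value fed into the denominator

# Out-of-place style: mimics exp_avg_var.add(group['eps']).
# eps only affects the denominator for this step; the buffer is unchanged.
var_outofplace = 0.0
for _ in range(steps):
    denom = (var_outofplace + eps) ** 0.5

print(var_inplace)     # roughly steps * eps has accumulated in the buffer
print(var_outofplace)  # 0.0 -- buffer untouched
```

After 1000 steps the in-place buffer carries roughly `steps * eps`, which inflates the denominator and shrinks the effective step size over time, exactly the concern raised below.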
Thanks for pointing this out; I haven't tested using `add`. When writing the code I didn't notice this and simply copied the trailing `add_(group['eps'])` to the front. The fix may well improve results: as eps·t accumulates, the denominator grows and the step size gradually shrinks, which could make late-stage finetuning ineffective. I'll test it later.