
denom = (exp_avg_var.add_(group['eps']).sqrt() / math.sqrt(bias_correction2)).add_(group['eps']) #18

Closed
yuanwei2019 opened this issue Oct 26, 2020 · 1 comment

Comments

@yuanwei2019

Hello author, I noticed line 157 of Adabelief-Optimizer/PyTorch_Experiments/AdaBelief.py:
`denom = (exp_avg_var.add_(group['eps']).sqrt() / math.sqrt(bias_correction2)).add_(group['eps'])`
With `exp_avg_var.add_(eps)`, every bias-correction step adds eps to exp_avg_var in place, which does not match the S_t update formula in the paper. Should this be changed to `exp_avg_var.add(group['eps'])`, or does the in-place `add_` actually give better experimental results?
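
A minimal sketch of the behavior being reported (the `eps` and `bias_correction2` values below are illustrative placeholders, not the optimizer's actual defaults): the in-place `add_` folds eps into `exp_avg_var` on every call, so the running variance drifts by roughly eps * t, while the out-of-place `add` leaves the state untouched.

```python
import math
import torch

eps = 1e-8                # illustrative value only
bias_correction2 = 0.999  # illustrative placeholder

# Original line 157: the in-place add_ mutates exp_avg_var itself,
# so eps accumulates into the running variance every step.
exp_avg_var = torch.zeros(3)
for t in range(1, 4):
    denom = (exp_avg_var.add_(eps).sqrt() / math.sqrt(bias_correction2)).add_(eps)
    print(t, exp_avg_var)  # grows by eps each iteration

# Proposed change: out-of-place add leaves exp_avg_var unchanged,
# matching the S_t update in the paper.
exp_avg_var = torch.zeros(3)
for t in range(1, 4):
    denom = (exp_avg_var.add(eps).sqrt() / math.sqrt(bias_correction2)).add_(eps)
    print(t, exp_avg_var)  # stays at zero here
```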

@juntang-zhuang
Owner

juntang-zhuang commented Oct 26, 2020

Thanks for pointing this out. I haven't tested with `add`; when writing the code I didn't notice this and simply copied the trailing `add_(group['eps'])` to the front. The corrected version may well perform better, because as eps * t accumulates, the denominator grows and the step size gradually shrinks, which could keep fine-tuning from working in the later stages. I'll test it shortly.
