[Hackathon 6th No.13] Add RAdam / NAdam API to Paddle: NAdam #849
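Consistency of the output values and correctness of the data types, using PyTorch as the reference standard

- **Computation precision**
  The precision of the `forward/backward` computation must be correct, using PyTorch as the reference standard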
Please add concrete test cases.
- $\mu_t$ and $\mu_{t+1}$

  PyTorch computes these from `momentum_decay` combined with a coefficient `0.96` and `step`; TensorFlow's computation depends only on `step`
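For reference, a minimal sketch of the PyTorch-style schedule mentioned above, based on the formula in the `torch.optim.NAdam` documentation. The function name `mu_schedule` is hypothetical, and the default values (`beta1=0.9`, `momentum_decay=0.004`) are assumed to match PyTorch's documented defaults; this is not the proposed Paddle implementation.

```python
def mu_schedule(step, beta1=0.9, momentum_decay=0.004):
    """Return (mu_t, mu_{t+1}) for a 1-based optimizer step.

    Sketch of the PyTorch-style schedule:
        mu_t = beta1 * (1 - 0.5 * 0.96 ** (step * momentum_decay))
    The name and defaults are illustrative assumptions.
    """
    mu_t = beta1 * (1.0 - 0.5 * 0.96 ** (step * momentum_decay))
    mu_next = beta1 * (1.0 - 0.5 * 0.96 ** ((step + 1) * momentum_decay))
    return mu_t, mu_next


if __name__ == "__main__":
    # Print the first few values to see how slowly mu_t grows toward beta1.
    for step in (1, 10, 100, 1000):
        mu_t, mu_next = mu_schedule(step)
        print(f"step={step}: mu_t={mu_t:.6f}, mu_t+1={mu_next:.6f}")
```

The schedule starts just above `0.5 * beta1` and rises monotonically toward `beta1` as `step` grows, so `momentum_decay` controls how quickly the momentum coefficient warms up.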
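Please compare Paddle's optimizer parameters with PyTorch's in detail. In principle, Paddle should cover all the necessary functionality that PyTorch provides, and the signature should stay consistent with the style of Paddle's existing optimizers. I suggest adding a table here that compares, in detail, Paddle's existing Adam optimizer, the PyTorch optimizer, and the proposed new Paddle NAdam optimizer.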
Update 20240409
@cxxly Please review!
LGTM
PR types
Others
PR changes
Docs
Description
[Hackathon 6th No.13] Add RAdam / NAdam API to Paddle
This task covers two algorithms; this PR is for the NAdam algorithm. Please review!