Paddle's gradient clipping apparently cannot be triggered — looking for an explanation #775
Comments
Right now it looks like only the kSgdSparseCpuTraining branch ever triggers gradient clipping?
Swapping SgdLocalUpdater for SgdThreadUpdater in the code does work around the gradient clipping problem. I'd still like a more thorough fix — I suspect the logic in this code path is flawed.
So can SgdThreadUpdater fully replace SgdLocalUpdater? @emailweixu @reyoung
One major difference is that SgdThreadUpdater cannot be used with ConcurrentRemoteParameterUpdater, which is the current default. However, ConcurrentRemoteParameterUpdater hasn't shown a clear advantage over RemoteParameterUpdater yet, so it should be OK to default to RemoteParameterUpdater (--use_old_updater=true). And I agree that we should default to SgdThreadUpdater. There are a few cases not supported by SgdThreadUpdater; as long as we give a clear message about them, it should be fine.
Paddle implements element-wise hard clipping of gradients. My setup is to enable it via a call in the trainer config.
I printed the parsed configuration and confirmed that the parameters' gradient clipping threshold was set successfully.
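For reference, this is roughly the kind of legacy trainer configuration being described (a sketch only — the exact helper and keyword names, e.g. `gradient_clipping_threshold` inside `settings()`, are assumptions about the old trainer-config API, not taken from this issue):

```python
# Hypothetical legacy Paddle trainer config enabling element-wise
# hard gradient clipping; names here are assumptions.
from paddle.trainer_config_helpers import *

settings(
    batch_size=128,
    learning_rate=1e-3,
    learning_method=MomentumOptimizer(),
    # assumed keyword: per-element hard clipping threshold
    gradient_clipping_threshold=10.0,
)
```

The report is that this threshold shows up correctly in the parsed config, yet the clipping code is never reached at training time.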
In the code, OptimizerWithGradientClipping::update in FirstOrderOptimizer.cpp implements gradient clipping, but this function is currently never called.
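Conceptually, the element-wise hard clipping that OptimizerWithGradientClipping::update is supposed to perform can be sketched like this (a plain Python model for illustration, not the actual C++ code):

```python
def clip_gradient(grad, threshold):
    """Element-wise hard clipping: each gradient component is
    limited to the range [-threshold, threshold]."""
    return [max(-threshold, min(threshold, g)) for g in grad]

# Components beyond the threshold are clamped; the rest pass through.
print(clip_gradient([0.5, -12.0, 3.0, 11.0], 10.0))
# → [0.5, -10.0, 3.0, 10.0]
```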
The training algorithm parameters are as follows:
Currently this is local single-GPU training, with do_average_in_cpu not enabled.
In the code, the trainer's init calls createParameterUpdater, which news an SgdLocalUpdater and then resets it into an AverageOptimizer; since no averaging window is configured for average SGD, this optimizer does nothing.
Meanwhile, only OptimizerWithRegularizer ever creates an OptimizerWithGradientClipping, so I cannot figure out how to trigger gradient clipping at all.
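The construction path described above can be modeled roughly as follows (a simplified Python sketch of the C++ wrapper chain — class names mirror the source, but the control flow is paraphrased from this report, not copied from FirstOrderOptimizer.cpp):

```python
# Simplified model of how the updater assembles its optimizer chain.
# Per the report, only the regularizer branch ever adds the
# gradient-clipping wrapper.
def build_optimizer_chain(has_regularizer, clip_threshold):
    chain = ["SgdOptimizer"]
    if has_regularizer:
        if clip_threshold > 0:
            chain.append("OptimizerWithGradientClipping")
        chain.append("OptimizerWithRegularizer")
    # AverageOptimizer is always created, but it is a no-op when
    # no averaging window is configured.
    chain.append("AverageOptimizer")
    return chain

# Without a regularizer, the clipping wrapper never appears,
# even though a clipping threshold was configured:
print(build_optimizer_chain(False, 10.0))
# → ['SgdOptimizer', 'AverageOptimizer']
```

This illustrates the reported symptom: a configured threshold alone is not enough, because the path that wraps the optimizer with clipping is never taken.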
Any explanation would be appreciated, thanks!