question again on gradient_clipping_threshold #3696
Comments
I have the same problem, but it hasn't been solved yet ...
Previously, I just changed
@yu239 The v2 API always uses
@qingqing01 I use the class paddle::Trainer in my C++ code, and it seems that SgdLocalUpdater is always used. Can I just change the Paddle source code (TrainerInternal.cpp) to use SgdThreadUpdater instead? Are there any potential issues with doing so?
@qingqing01 I can hack the source code so that SgdThreadUpdater is used in place of SgdLocalUpdater, but this does not seem like a final solution. If I use paddle::Trainer, is there an official way to specify SgdThreadUpdater without modifying the Paddle source code?
In an old issue, #775, there was some discussion of how gradient clipping is triggered. Currently, on a single machine, Paddle always uses SgdLocalUpdater. However, gradient clipping is only applied in SgdThreadUpdater, so gradient_clipping_threshold appears to have no effect with the default updater. Is there any plan to fix this? Or can we always use SgdThreadUpdater?
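For readers unfamiliar with what the threshold is expected to do, here is a minimal sketch of one common form of gradient clipping: clamping each gradient element to [-threshold, threshold] before the SGD step. This is not Paddle's code. The function names `clip_gradient` and `sgd_step` are hypothetical, and the thread does not confirm whether Paddle clips element-wise or by norm, so treat the element-wise behavior as an assumption.

```python
# Illustrative sketch only; not taken from the Paddle source.
# Assumption: clipping is element-wise and a threshold <= 0 means "disabled".

def clip_gradient(grad, threshold):
    """Clamp every gradient element to the range [-threshold, threshold]."""
    if threshold <= 0:
        # No clipping requested: return an unchanged copy.
        return list(grad)
    return [max(-threshold, min(threshold, g)) for g in grad]


def sgd_step(params, grad, lr, threshold=0.0):
    """Plain SGD update that clips the gradient first (hypothetical helper)."""
    clipped = clip_gradient(grad, threshold)
    return [p - lr * g for p, g in zip(params, clipped)]


if __name__ == "__main__":
    grad = [0.5, -3.0, 10.0]
    print(clip_gradient(grad, 1.0))
    # An updater that skips the clipping step behaves like threshold=0.0,
    # which is the symptom described in this issue.
    print(sgd_step([1.0, 1.0, 1.0], grad, lr=0.5, threshold=0.0))
    print(sgd_step([1.0, 1.0, 1.0], grad, lr=0.5, threshold=1.0))
```

Comparing the last two calls shows why the choice of updater matters: with the clipping step skipped, a single large gradient element dominates the update even though a threshold was configured.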