Bad convergence when using momentum optimizer #696

kuke · 2018-03-08T02:41:43Z

A comparison training experiment (#676) shows that the DeepASR model converges well when the Adam optimizer is used. After switching to the momentum optimizer, however, convergence is poor. This suggests a problem in the implementation of the momentum optimizer in Fluid.

Here is the comparison of training accuracy on 4 GPUs between Fluid and Houyi with the same settings:

Parameters:

batch_size: 128
device: GPU
hidden_dim: 1024
learning_rate: 0.00016
minimum_batch_size: 1
parallel: True
proj_dim: 512
stacked_num: 5
momentum: 0.9
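
For reference, here is a minimal sketch of the update rule a momentum optimizer is generally expected to follow, in both the plain and Nesterov forms. It is one common formulation written in plain Python, not the Fluid or Houyi code; the function names `momentum_step` and `minimize_quadratic` and the toy quadratic objective are hypothetical and only serve as a sanity check to compare an implementation against.

```python
def momentum_step(param, grad, velocity, lr=0.00016, mu=0.9, nesterov=False):
    """One momentum SGD update on scalars; returns (new_param, new_velocity).

    Assumed formulation (not taken from the Fluid source):
        velocity = mu * velocity + grad
        plain:     param = param - lr * velocity
        nesterov:  param = param - lr * (grad + mu * velocity)
    """
    velocity = mu * velocity + grad
    if nesterov:
        param = param - lr * (grad + mu * velocity)
    else:
        param = param - lr * velocity
    return param, velocity


def minimize_quadratic(steps, lr, mu, nesterov=False):
    """Run the update on the toy objective f(x) = 0.5 * x**2 (gradient is x)."""
    x, v = 1.0, 0.0
    for _ in range(steps):
        x, v = momentum_step(x, x, v, lr=lr, mu=mu, nesterov=nesterov)
    return x


if __name__ == "__main__":
    # A larger learning rate than in the issue is used so the toy problem
    # converges within a few hundred steps.
    print(minimize_quadratic(200, lr=0.1, mu=0.9))
    print(minimize_quadratic(200, lr=0.1, mu=0.9, nesterov=True))
```

Under this formulation, a constant gradient drives the velocity toward `grad / (1 - mu)`, so with `momentum: 0.9` the steady-state step is about ten times `learning_rate`. Both variants should shrink `x` toward zero on the toy objective; an implementation that fails this check likely applies the velocity or the Nesterov correction incorrectly.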


shanyi15 · 2018-08-15T09:55:58Z

Hello, this issue has not been updated in the past month, so we will close it today for the sake of other users' experience. If you still need to follow up on this question after it is closed, please feel free to reopen it, and we will get back to you within 24 hours. We apologize for any inconvenience caused by the closure, and thank you for your support of PaddlePaddle!

kuke added the DeepASR label Mar 8, 2018

wangkuiyi assigned kuke Mar 10, 2018

kuke mentioned this issue Jul 18, 2018

Fix serious bug in nesterov momentum optimizer. PaddlePaddle/Paddle#12231

Merged

shanyi15 closed this as completed Aug 15, 2018
