Dear developers and friends,

I am a new user of Keras, and my question is as follows:

I see that `train_on_batch` performs only a single gradient update. But how are the learning rate and momentum used in this setting?

For example, SGD has both a learning rate and a momentum term. When I call `train_on_batch` manually, will the SGD optimizer automatically store its accumulated gradient update (the velocity) and use it as momentum the next time I call `train_on_batch`? A small sketch of what I mean follows below.
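Here is a minimal sketch of the experiment I have in mind (assuming TensorFlow 2.x's `tf.keras`; the toy model and random data are made up purely for illustration, and the `optimizer.variables()` call assumes the legacy optimizers in TF ≤ 2.10, where it is a method rather than a property):

```python
import numpy as np
import tensorflow as tf

# Tiny regression model: a single dense layer.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    loss="mse",
)

x = np.random.rand(8, 4).astype("float32")
y = np.random.rand(8, 1).astype("float32")

# First manual batch update: this is where the momentum slot
# variables get created and filled for the first time.
model.train_on_batch(x, y)

# Snapshot the optimizer's state (iteration counter + momentum slots).
# Note: in TF >= 2.11, `optimizer.variables` is a property, not a method.
before = [v.numpy().copy() for v in model.optimizer.variables()]

# Second call: if momentum is stateful, the velocity stored during the
# first step should influence (and be updated by) this step.
model.train_on_batch(x, y)
after = [v.numpy() for v in model.optimizer.variables()]

changed = any(not np.allclose(b, a) for b, a in zip(before, after))
print("optimizer state changed between calls:", changed)
```

If SGD does keep its velocity across calls, the momentum slot values captured before the second call should differ from those captured after it.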
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.