AdaBound.iterations #4
The implementation is correct. State initialization occurs here: https://github.com/titu1994/keras-adabound/blob/master/adabound.py#L66
Saving of the exponential moving-average weights is done automatically by these lines: https://github.com/titu1994/keras-adabound/blob/master/adabound.py#L76-L82
Lastly, before accusing something of being "wrong", be quite sure that you fully understand how the class works.
Show me where iterations is saved by the base Optimizer: https://github.com/keras-team/keras/blob/master/keras/optimizers.py
There is no code in Keras which saves .iterations: https://github.com/keras-team/keras/search?q=.iterations&unscoped_q=.iterations
Fair point, iterations is not saved, and that's a simple fix. It affects continued retraining, not initial training.
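For reference, a minimal sketch of the Keras 2 optimizer pattern being discussed, with the fix applied. This is illustrative Adam-style code, not the actual adabound.py source; names such as SketchAdam, ms and vs are made up for the example. The key point is that whatever is collected in self.weights is what Keras serializes with the model, so the iteration counter has to be included there explicitly.

```python
from keras import backend as K
from keras.optimizers import Optimizer


class SketchAdam(Optimizer):
    """Illustrative Adam-style optimizer, for exposition only."""

    def __init__(self, lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-8, **kwargs):
        super(SketchAdam, self).__init__(**kwargs)
        with K.name_scope(self.__class__.__name__):
            self.iterations = K.variable(0, dtype='int64', name='iterations')
            self.lr = K.variable(lr, name='lr')
            self.beta_1 = K.variable(beta_1, name='beta_1')
            self.beta_2 = K.variable(beta_2, name='beta_2')
        self.epsilon = epsilon

    def get_updates(self, loss, params):
        grads = self.get_gradients(loss, params)
        self.updates = [K.update_add(self.iterations, 1)]
        t = K.cast(self.iterations, K.floatx()) + 1

        # State initialization: one pair of moment accumulators per parameter.
        ms = [K.zeros(K.int_shape(p), dtype=K.dtype(p)) for p in params]
        vs = [K.zeros(K.int_shape(p), dtype=K.dtype(p)) for p in params]

        # The simple fix: include self.iterations in self.weights so the step
        # count is saved and restored together with the moment accumulators.
        self.weights = [self.iterations] + ms + vs

        lr_t = self.lr * (K.sqrt(1. - K.pow(self.beta_2, t)) /
                          (1. - K.pow(self.beta_1, t)))

        for p, g, m, v in zip(params, grads, ms, vs):
            m_t = (self.beta_1 * m) + (1. - self.beta_1) * g
            v_t = (self.beta_2 * v) + (1. - self.beta_2) * K.square(g)
            p_t = p - lr_t * m_t / (K.sqrt(v_t) + self.epsilon)
            self.updates.append(K.update(m, m_t))
            self.updates.append(K.update(v, v_t))
            self.updates.append(K.update(p, p_t))
        return self.updates
```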
Sure.
and .weights are also not saved, because they consume
so I think you don't know how Keras works.
https://github.com/keras-team/keras/blob/master/keras/engine/network.py#L1090-L1162 That's how Optimizer weights are saved and restored, not by the optimizer class directly. Please read how to save models and their optimizers jointly in the Keras documentation: https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model
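To make that concrete, a minimal sketch of the documented workflow for saving a model together with its optimizer and restoring both. The import path and constructor arguments for AdaBound are assumptions for this example.

```python
from keras.models import Sequential, load_model
from keras.layers import Dense
from adabound import AdaBound  # assumes adabound.py from this repo is importable

model = Sequential([Dense(1, input_shape=(4,))])
model.compile(optimizer=AdaBound(lr=1e-3, final_lr=0.1), loss='mse')
# ... model.fit(x, y, ...) ...

# model.save() stores the architecture, the weights and the optimizer state.
model.save('model_with_optimizer.h5')

# Later, or in another process: restores the model *and* its optimizer state,
# so training can be continued where it left off.
model = load_model('model_with_optimizer.h5',
                   custom_objects={'AdaBound': AdaBound})
```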
ok then, sorry
@titu1994 do you know what will happen to Adam and the model if all moving-average weights are reinitialized to zero every N iterations?
Any optimizer that uses bias correction, such as Adam or RMSprop I believe, requires the iteration count to steadily increase so that the exponential bias-correction factor can decay towards zero. If the iteration count is reset, this will affect the moving averages ws and vs. If these moving averages are affected, then the gradient update will be somewhat incorrect, since the weight update is a moving average between the current gradient gt and the history of gradients, ws. As ws is compromised by being set to zeros, the gradient update is somewhat weaker, and that will affect accuracy. The same applies to vs, which will affect the effective learning rate of the Adam optimizer.
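A small NumPy sketch of that point (the constants and the constant toy gradient are made up purely for illustration): with the same accumulated moving averages, feeding the wrong step count into the bias correction changes the effective update.

```python
import numpy as np

beta1, beta2, eps, lr = 0.9, 0.999, 1e-8, 1e-3

def adam_update(m, v, t):
    # Bias correction divides by (1 - beta ** t), which depends on the step count t.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return lr * m_hat / (np.sqrt(v_hat) + eps)

# Accumulate the moving averages for 1000 steps with a constant toy gradient.
g, m, v = 0.5, 0.0, 0.0
for t in range(1, 1001):
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g

print(adam_update(m, v, t=1000))  # ~1e-3: the intended step size
print(adam_update(m, v, t=1))     # ~4e-4: same m and v, but with the step count
                                  # reset, the corrections inflate v far more
                                  # than m, so the update is distorted
```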
This param is not saved.
I looked at the official PyTorch implementation from the original paper: https://github.com/Luolc/AdaBound/blob/master/adabound/adabound.py
It has state that is saved with the optimizer.
It also has other values that should be saved as well.
So your Keras implementation is wrong.
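For comparison, a short sketch of how the PyTorch version keeps that state through the standard Optimizer.state_dict() mechanism. The import path and constructor arguments below are assumptions; the exact state keys are best checked against the linked file.

```python
import torch
from adabound import AdaBound  # assumed import from the linked repository

param = torch.nn.Parameter(torch.zeros(3))
opt = AdaBound([param], lr=1e-3, final_lr=0.1)

param.grad = torch.ones(3)
opt.step()

sd = opt.state_dict()
# 'state' carries the per-parameter entries (step count, moving averages, ...)
# and 'param_groups' the hyperparameters; both round-trip through
# torch.save(sd, path) and opt.load_state_dict(torch.load(path)).
print(sd.keys())             # dict_keys(['state', 'param_groups'])
print(sd['state'][0].keys())
```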