Hi,

I spent a lot of time with your workbook and made a few adjustments for my multivariate data sets. These are just comments rather than issues, but you may find them helpful:
You use `tf.keras.callbacks.LearningRateScheduler(lambda epoch: 1e-8 * 10**(epoch / 20))`, which I think should have a negative exponent (`-epoch`) or a decimal mantissa (`0.1`) so that it reduces the learning rate rather than increasing it. All the guidance I have read for SGD suggests the LR needs to go down, unless you're implementing the 1Cycle learning-rate scheme. Perhaps increasing is what you want; I'm just highlighting my concern.
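For example, here's a minimal sketch of both fixes side by side (the `1e-2` starting value is just an illustrative assumption, not from your workbook):

```python
import tensorflow as tf

# The original schedule *grows* the LR by 10x every 20 epochs, which is
# useful for an LR-range test but not for ordinary training:
increasing_lr = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch: 1e-8 * 10 ** (epoch / 20))

# Either change makes the LR decay instead:
decaying_lr_neg_exp = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch: 1e-2 * 10 ** (-epoch / 20))  # negative exponent
decaying_lr_mantissa = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch: 1e-2 * 0.1 ** (epoch / 20))  # decimal mantissa
```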
I tried `shuffle=True` and found that it made sense to make `batch_size` two to three times larger than the lookback value, `n_past`; otherwise it can make the results worse.
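Something like this (the `model`, data variables, and the `n_past` value are placeholders from my setup):

```python
n_past = 30  # lookback window length (illustrative)

# Assumed: `model`, `X_train`, `y_train` come from the surrounding notebook.
model.fit(X_train, y_train,
          shuffle=True,
          batch_size=3 * n_past,  # 2-3x the lookback, as noted above
          epochs=100)
```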
I tried another LR scheduler, `tf.keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=25)`, and it works pretty well with the `tf.keras.optimizers.SGD(learning_rate=1e-2, momentum=0.9)` optimizer. Its patience should be less than the `early_stopping_cb` patience value.
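Here's a rough sketch of how I wire them together (the early-stopping patience of 50 and the `mae` loss are my own choices, not from your workbook):

```python
import tensorflow as tf

# Assumed: `model`, `X_train`, `y_train`, `X_valid`, `y_valid` exist already.
reduce_lr_cb = tf.keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=25)
early_stopping_cb = tf.keras.callbacks.EarlyStopping(
    patience=50,  # must exceed ReduceLROnPlateau's patience
    restore_best_weights=True)

model.compile(loss="mae",
              optimizer=tf.keras.optimizers.SGD(learning_rate=1e-2,
                                                momentum=0.9))
model.fit(X_train, y_train,
          validation_data=(X_valid, y_valid),
          epochs=500,
          callbacks=[reduce_lr_cb, early_stopping_cb])
```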
I found that by setting `initial_epoch` on `model.fit` you can fine-tune a previously saved model (with the same architecture). I use `model.load_weights('lstm_regressor_prewarmed.model.h5', by_name=True, skip_mismatch=True)` to initialize the weights in the new model before running `model.fit`. The initial epoch is very important if your optimizer uses `epoch` to adjust the learning rate.
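Roughly like this (the epoch numbers are assumptions; set `initial_epoch` to wherever the pre-warm run stopped):

```python
# Assumed: `model` shares the (named) layer architecture of the saved model,
# and `lr_scheduler_cb` is an epoch-based LearningRateScheduler as above.
model.load_weights('lstm_regressor_prewarmed.model.h5',
                   by_name=True, skip_mismatch=True)

history = model.fit(X_train, y_train,
                    initial_epoch=100,  # assumed end of the pre-warm run
                    epochs=200,         # trains epochs 100..199
                    callbacks=[lr_scheduler_cb])
```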
Did you do any work on WaveNet?
It's been fun. Thanks for sharing your efforts.
Arvindra
Thank you @asehmi for these suggestions. I haven't had time to rerun the notebooks with these changes yet, but I'll definitely keep them in mind next time.