
Did you increase the decoupled weight decay simultaneously when decreasing the learning rate? #2

Closed
xiangning-chen opened this issue Feb 15, 2023 · 4 comments


@xiangning-chen
Contributor

Thanks for implementing and testing our Lion optimizer!
Just wondering, did you also enlarge the decoupled weight decay to maintain the regularization strength?
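
For context, with decoupled weight decay each step shrinks the weights by an amount proportional to lr * weight_decay, so keeping that product constant is what preserves the regularization strength. A minimal sketch with purely illustrative numbers (not values from the paper or this repo):

```python
# Illustrative numbers only -- not hyperparameters prescribed by the paper.
adamw_lr, adamw_wd = 3e-4, 0.1             # hypothetical AdamW baseline
lion_lr = adamw_lr / 3                     # Lion is typically run with a smaller learning rate

# Decoupled weight decay shrinks weights by roughly lr * wd per step,
# so scale wd up by the same factor the learning rate went down.
lion_wd = adamw_wd * (adamw_lr / lion_lr)  # 0.1 * 3 = 0.3

assert abs(adamw_lr * adamw_wd - lion_lr * lion_wd) < 1e-12
```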

best,
--xiangning

@lucidrains
Owner

@xiangning-chen Hi Xiangning! Thank you for this interesting paper

So far I have only been testing with weight decay turned off. There are a lot of networks that are still trained with just plain Adam, and I wanted to see how Lion fares against Adam alone.

@lucidrains
Owner

@xiangning-chen but yes, I have noted the section in the paper where you said the weight decay needs to be higher

Let me add that to the readme to increase the chances people train it correctly
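
As a hedged sketch of what that readme guidance might look like, assuming the Lion class in this repo exposes an AdamW-style constructor with lr and weight_decay arguments (the numbers are illustrative, not recommendations from the paper):

```python
import torch
from torch import nn
from lion_pytorch import Lion  # optimizer implemented in this repo

model = nn.Linear(10, 2)

# If a baseline were tuned with AdamW(lr=3e-4, weight_decay=0.1), a Lion run
# with a 3x smaller learning rate would raise the decoupled weight decay by
# the same factor, keeping lr * weight_decay roughly constant.
optimizer = Lion(model.parameters(), lr=1e-4, weight_decay=0.3)

loss = model(torch.randn(8, 10)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```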

@xiangning-chen
Contributor Author

xiangning-chen commented Feb 15, 2023

Thanks for the update!
Yeah, disabling weight decay for both optimizers is a pretty meaningful and fair comparison, thank you!

@lucidrains
Owner

@xiangning-chen ok, good luck! I hope this technique holds up to scrutiny!
