Update: seems to work for my local enwik8 autoregressive language modeling run
Update 2: [experiments](https://api.wandb.ai/links/lucidrains/d4v6c8sl), seems much worse than Adam if the learning rate is held constant
Update 3: Dividing the learning rate by 3, seeing better early results than Adam. Maybe Adam has been dethroned, after nearly a decade.
Update 4: using the 10x smaller learning rate rule of thumb from the paper resulted in the worst run, so I guess it still takes a bit of tuning
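
For anyone who wants to repeat the comparison, here is a rough sketch of how the three settings mentioned above (same learning rate as Adam, divided by 3, divided by 10) could be generated from one baseline. The function name `candidate_learning_rates` and the baseline value `3e-4` are made up for illustration; the actual Adam learning rate used in these runs is not stated here.

```python
def candidate_learning_rates(adam_lr, divisors=(1, 3, 10)):
    """Map each divisor to adam_lr / divisor, one entry per run to try."""
    return {d: adam_lr / d for d in divisors}


# Hypothetical baseline (assumption): not the value used in the runs above.
adam_lr = 3e-4

for divisor, lr in candidate_learning_rates(adam_lr).items():
    print(f"adam_lr / {divisor}: {lr:.2e}")
```

Each value would then go to the optimizer under test, with everything else held fixed, so that the learning rate is the only thing that differs between runs.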