1e-2 is the recommended value for the ImageNet classification task.
In general, we recommend using the same weight decay value as your baseline optimizer (AdamW).
You can use your AdamW hyperparameters directly with AdamP.
If you don't have any hyperparameter values for AdamW, a weight decay of 0 is a good first try.
Many tasks use zero weight decay, so start there and increase the weight decay slowly while tuning.
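To make the "start at 0, then increase slowly" advice concrete, here is a minimal sketch of a sweep in plain Python. The helper names (`weight_decay_candidates`, `pick_weight_decay`), the exponent range, and the `evaluate` callback are assumptions made for illustration and are not part of AdamP. `evaluate` stands in for whatever train-and-validate routine you already have, where each candidate would be passed as the optimizer's `weight_decay` argument.

```python
def weight_decay_candidates(min_exp=-5, max_exp=-2):
    """Return weight decay values to try: 0 first, then increasing powers of ten.

    The default range (1e-5 up to 1e-2) is an illustrative assumption.
    """
    if min_exp > max_exp:
        raise ValueError("min_exp must not exceed max_exp")
    return [0.0] + [10.0 ** exp for exp in range(min_exp, max_exp + 1)]


def pick_weight_decay(candidates, evaluate):
    """Return the candidate with the highest score.

    `evaluate` is a hypothetical callback: it takes a weight decay value,
    trains a model with it, and returns a validation score (higher is better).
    Ties are resolved in favour of the earlier (smaller) candidate.
    """
    best_value, best_score = None, float("-inf")
    for weight_decay in candidates:
        score = evaluate(weight_decay)
        if score > best_score:
            best_value, best_score = weight_decay, score
    return best_value
```

Because ties go to the earlier candidate, a weight decay of 0 is kept unless a larger value actually improves the validation score.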
In the README, the value is 1e-2. Is that the recommended value, or should I just keep the default value of 0?