AdamW-and-SGDW

Fixing Weight Decay Regularization in Adam
Ilya Loshchilov, Frank Hutter

[WIP Alert]

This repository is still a work in progress.
The functionality of AdamW and SGDW has not been fully verified, so the implementation may contain errors.

Usage

Please have a look at demo_fashion_mnist.ipynb.

from AdamW import AdamW
from SGDW import SGDW

# Suggested weight decay factor from the paper: w = w_norm * (b/B/T)**0.5
# b: batch size
# B: total number of training points per epoch
# T: total number of epochs
# w_norm: normalized weight decay factor (the value you choose); w is the resulting factor to pass as weight_decay.

# weight_decay: float >= 0. The parameter for decoupled weight decay.
AdamW(lr=0.001, beta_1=0.9, beta_2=0.999, weight_decay=1e-4, epsilon=1e-8, decay=0.)
SGDW(lr=0.01, momentum=0., decay=0., weight_decay=1e-4, nesterov=False)
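
The suggested formula above can be computed with a small helper. This is a minimal sketch, not part of the repository: the function name `normalized_weight_decay` and the example values (`w_norm=0.05`, batch size 128, 30 epochs) are assumptions chosen for illustration. The 60,000 figure is the size of the Fashion-MNIST training set.

```python
import math


def normalized_weight_decay(w_norm, batch_size, samples_per_epoch, epochs):
    """Return w = w_norm * sqrt(b / (B * T)).

    Hypothetical helper (not provided by this repository).
    b = batch_size, B = samples_per_epoch, T = epochs.
    """
    if min(batch_size, samples_per_epoch, epochs) <= 0:
        raise ValueError("batch_size, samples_per_epoch and epochs must be positive")
    return w_norm * math.sqrt(batch_size / (samples_per_epoch * epochs))


# Illustrative values only; tune w_norm for your own model.
w = normalized_weight_decay(w_norm=0.05, batch_size=128,
                            samples_per_epoch=60000, epochs=30)

# The result would then be passed to the optimizers from this repository,
# e.g. AdamW(lr=0.001, weight_decay=w) or SGDW(lr=0.01, weight_decay=w).
```

Because the factor depends on the batch size and the total number of epochs, it should be recomputed whenever either of them changes.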

shaoanlu/AdamW-and-SGDW

Folders and files

Latest commit

History

Repository files navigation

AdamW-and-SGDW

[WIP Alert]

Usage

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages