JosephChenHub/pytorch-lars

Large Batch Training of Convolutional Networks with Layer-wise Adaptive Rate Scaling

This is an unofficial implementation of the Layer-wise Adaptive Rate Scaling (LARS) optimizer from Yang You, Igor Gitman, and Boris Ginsburg, "Large Batch Training of Convolutional Networks with Layer-wise Adaptive Rate Scaling" (ICLR'18), and of the follow-up work "Large Batch Optimization for Deep Learning: Training BERT in 76 Minutes" (ICLR'20).
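
The core of LARS is a layer-wise "trust ratio" that rescales the global learning rate per parameter tensor, so layers with very different weight and gradient norms can train stably at the same base LR even with very large batches. Below is a minimal sketch of that rule as a PyTorch optimizer built on momentum SGD; the class name, default hyper-parameters, and the fallback for zero norms are illustrative assumptions, not necessarily identical to the implementation in this repository.

```python
import torch
from torch.optim.optimizer import Optimizer


class LARS(Optimizer):
    """Minimal sketch of Layer-wise Adaptive Rate Scaling on top of momentum SGD."""

    def __init__(self, params, lr=0.1, momentum=0.9, weight_decay=0.0, trust_coef=0.001):
        defaults = dict(lr=lr, momentum=momentum,
                        weight_decay=weight_decay, trust_coef=trust_coef)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                grad = p.grad
                w_norm = torch.norm(p)
                g_norm = torch.norm(grad)

                # Layer-wise trust ratio: eta * ||w|| / (||g|| + wd * ||w||).
                # Falling back to 1.0 for zero norms is an assumption of this sketch.
                if w_norm > 0 and g_norm > 0:
                    local_lr = (group["trust_coef"] * w_norm /
                                (g_norm + group["weight_decay"] * w_norm)).item()
                else:
                    local_lr = 1.0

                # Fold weight decay into the raw update before scaling it.
                update = grad + group["weight_decay"] * p

                # Heavy-ball momentum on the locally rescaled step.
                buf = self.state[p].setdefault("momentum_buffer", torch.zeros_like(p))
                buf.mul_(group["momentum"]).add_(update, alpha=group["lr"] * local_lr)
                p.add_(buf, alpha=-1.0)
```

A class like this drops into a standard training loop the same way as `torch.optim.SGD`.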

Dependencies

  • PyTorch >= 1.0

Experiments of classification

  • on MNIST:

| Batch size | Epochs | Warm-up | LR (LARS) | Acc. % (LARS) | LR (SGD) | Acc. % (SGD) | LR (Adam) | Acc. % (Adam) |
| ---------- | ------ | ------- | --------- | ------------- | -------- | ------------ | --------- | ------------- |
| 32   | 10 | 5 | 0.003 | 99.52 | 0.040 | 99.45 | 0.001  | 99.36 |
| 64   | 10 | 5 | 0.006 | 99.54 | 0.020 | 99.39 | 0.002  | 99.45 |
| 128  | 10 | 5 | 0.006 | 99.54 | 0.150 | 99.39 | 0.005  | 99.26 |
| 512  | 10 | 5 | 0.012 | 99.51 | 0.100 | 99.29 | 0.005  | 99.37 |
| 1024 | 10 | 5 | 0.024 | 99.45 | 0.155 | 99.22 | 0.0155 | 99.30 |
| 2048 | 10 | 5 | 0.030 | 99.42 | 0.050 | 98.71 | 0.0261 | 99.27 |
| 4096 | 10 | 5 | 0.040 | 99.32 | 0.261 | 98.48 | 0.0155 | 99.15 |
  • on CIFAR-100 (a usage sketch for the LARS recipe follows after this list):

| Backbone | Initialization | Augmentation | Optimizer | Hyper-parameters | Training time | Top-1 Acc. (%) |
| -------- | -------------- | ------------ | --------- | ---------------- | ------------- | -------------- |
| ResNet18 | PyTorch default | RandomCrop + HorizontalFlip | SGD  | batch size 128, lr 0.1, warm-up 1, multi-step schedule | - | 76.39 |
| ResNet18 | PyTorch default | RandomCrop + HorizontalFlip | LARS | batch size 4096, max lr 0.04, warm-up 20, one-cycle policy | ~25 min (4× 2080Ti) | 78.05 |
  • on ImageNet (tiny):
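
The CIFAR-100 LARS row above combines a large batch (4096), a long warm-up, and a one-cycle schedule. The snippet below is a hypothetical end-to-end sketch of that recipe using the `LARS` class sketched earlier; the total epoch count, weight decay, data pipeline, cosine ramp-down (standing in for the one-cycle policy), and the plain torchvision ResNet18 are assumptions for illustration, not the repository's actual training script.

```python
import math
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.models import resnet18

# Assumes the `LARS` class from the sketch above is defined in scope.
device = "cuda" if torch.cuda.is_available() else "cpu"

transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),   # RandomCrop augmentation from the table
    transforms.RandomHorizontalFlip(),      # HorizontalFlip augmentation from the table
    transforms.ToTensor(),
])
train_set = datasets.CIFAR100("./data", train=True, download=True, transform=transform)
train_loader = DataLoader(train_set, batch_size=4096, shuffle=True, num_workers=4)

model = resnet18(num_classes=100).to(device)        # torchvision stand-in backbone
criterion = nn.CrossEntropyLoss()
optimizer = LARS(model.parameters(), lr=0.04, momentum=0.9,
                 weight_decay=5e-4)                 # weight decay is an assumed value

epochs, warmup_epochs, max_lr = 100, 20, 0.04       # total epoch count is an assumed value


def lr_at(epoch):
    """Linear warm-up, then a cosine ramp-down (stand-in for the one-cycle policy)."""
    if epoch < warmup_epochs:
        return max_lr * (epoch + 1) / warmup_epochs
    t = (epoch - warmup_epochs) / max(1, epochs - warmup_epochs)
    return 0.5 * max_lr * (1.0 + math.cos(math.pi * t))


for epoch in range(epochs):
    for group in optimizer.param_groups:            # schedule the base LR once per epoch
        group["lr"] = lr_at(epoch)
    for images, targets in train_loader:
        images, targets = images.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
```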

Experiments of segmentation

  • Cityscapes:
