
GradAug: A New Regularization Method for Deep Neural Networks (NeurIPS'20) [arXiv]

This work proposes to utilize randomly transformed training samples to regularize a set of sub-networks. The motivation is that a well-generalized network, and its sub-networks, should recognize transformed images as the same object. The proposed method is simple and general, yet effective. It achieves state-of-the-art performance on ImageNet and CIFAR classification, and can further improve downstream tasks such as object detection and instance segmentation. Its effectiveness is also validated on model robustness and in low-data regimes.
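For intuition, here is a minimal sketch of one GradAug training step in PyTorch. It assumes a slimmable-style backbone whose layer widths can be switched at run time (the `set_width` helper below is hypothetical) and uses random input rescaling as the transformation; the hyperparameter values are illustrative, not the repo's defaults:

```python
import random
import torch.nn.functional as F

def gradaug_step(model, set_width, images, labels, optimizer,
                 num_subnets=3, min_width=0.8, scales=(224, 192, 160, 128)):
    """One GradAug iteration: the full network trains on the original images,
    random-width sub-networks train on randomly rescaled images and are
    distilled from the full network's soft predictions."""
    optimizer.zero_grad()

    # Full network sees the original images and the ground-truth labels.
    set_width(model, 1.0)  # hypothetical helper: switch active layer widths
    logits_full = model(images)
    F.cross_entropy(logits_full, labels).backward()
    soft_target = F.softmax(logits_full.detach(), dim=1)

    # Random-width sub-networks see randomly transformed (here: rescaled)
    # images, supervised by the full network's soft output.
    for _ in range(num_subnets):
        set_width(model, random.uniform(min_width, 1.0))
        sub_images = F.interpolate(images, size=random.choice(scales),
                                   mode='bilinear', align_corners=False)
        log_probs = F.log_softmax(model(sub_images), dim=1)
        F.kl_div(log_probs, soft_target, reduction='batchmean').backward()

    # Gradients from the full network and all sub-networks accumulate
    # into a single parameter update.
    optimizer.step()
```

The single accumulated update is the point: the shared weights receive gradients from both the full network and its transformed sub-network views, which is what regularizes training.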

Install

  • PyTorch 1.0.0+, torchvision, NumPy, pyyaml
  • Follow the PyTorch example to prepare ImageNet dataset.

Run

  1. ImageNet experiments are conducted on 8 GPUs.

To train ResNet-50,

python train.py app:configs/resnet50_randwidth.yml

To test a pre-trained model,

Modify test_only: False to test_only: True in the .yml file to enable testing.

Modify pretrained: /PATH/TO/YOUR/WEIGHTS to assign pre-trained weights.
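For reference, a hypothetical excerpt of the relevant part of the .yml config; only the test_only and pretrained keys are named in this README, and their placement and values here are illustrative:

```yaml
# Illustrative config excerpt (hypothetical layout); only these two keys
# are confirmed by this README.
test_only: True                    # False for training, True for evaluation
pretrained: /PATH/TO/YOUR/WEIGHTS  # path to the checkpoint to evaluate
```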

  2. CIFAR experiments are conducted on 2 GPUs.

To train WideResNet-28-10,

python train_cifar.py app:configs/wideresnet_randwidth.yml

To train PyramidNet-200,

python train_cifar.py app:configs/pyramidnet_randwidth.yml

Results

  1. ImageNet classification accuracy (%). Note that we report the final-epoch results.
| Model | FLOPs | Top-1 (%) | Top-5 (%) |
|---|---|---|---|
| ResNet-50 | 4.1 G | 76.32 | 92.95 |
| +Dropblock | 4.1 G | 78.13 | 94.02 |
| +Mixup | 4.1 G | 77.9 | 93.9 |
| +CutMix | 4.1 G | 78.60 | 94.08 |
| +StochDepth | 4.1 G | 77.53 | 93.73 |
| +ShakeDrop | 4.1 G | 77.5 | - |
| +GradAug (Model) | 4.1 G | 78.78 | 94.43 |
| +bag of tricks | 4.3 G | 79.29 | 94.38 |
| +GradAug+CutMix (Model) | 4.1 G | 79.67 | 94.93 |
  2. CIFAR-100 classification accuracy (%). Note that we report the final-epoch results.
| WideResNet-28-10 | Top-1 (%) | Top-5 (%) |
|---|---|---|
| Baseline | 81.53 | 95.59 |
| +Mixup | 82.5 | - |
| +CutMix | 84.08 | 96.28 |
| +ShakeDrop | 81.65 | 96.19 |
| +GradAug (Model) | 83.98 | 96.17 |
| +GradAug+CutMix (Model) | 85.35 | 96.85 |

| PyramidNet-200 | Top-1 (%) | Top-5 (%) |
|---|---|---|
| Baseline | 83.49 | 94.31 |
| +Mixup | 84.37 | 96.01 |
| +CutMix | 84.83 | 96.73 |
| +ShakeDrop | 84.57 | 97.08 |
| +GradAug | 84.98 | 97.08 |
| +GradAug+CutMix (Model) | 86.24 | 97.33 |

Citation

If you find this useful in your work, please consider citing:

@article{yang2020gradaug,
  title={GradAug: A New Regularization Method for Deep Neural Networks},
  author={Yang, Taojiannan and Zhu, Sijie and Chen, Chen},
  journal={Advances in Neural Information Processing Systems},
  volume={33},
  year={2020}
}
