ImageNetModels

ImageNet training code for ResNet, ShuffleNet, DoReFa-Net, AlexNet, Inception, and VGG, built on tensorpack.

To train any of the models, run ./{model}.py --data /path/to/ilsvrc. Run ./{model}.py --help to see more options. The expected format of the data directory is described in the docs. Some pretrained models can be downloaded from the tensorpack model zoo.

ShuffleNet

This script reproduces the ImageNet results of the ShuffleNetV1 and ShuffleNetV2 papers:

| Model | FLOPs | Top-1 Error | Paper's Error | Flags |
|---|---|---|---|---|
| ShuffleNetV1 0.5x | 40M | 40.8% | 42.3% | -r=0.5 |
| ShuffleNetV1 1x | 140M | 32.6% | 32.4% | -r=1 |
| ShuffleNetV2 0.5x | 41M | 39.5% | 39.7% | -r=0.5 --v2 |
| ShuffleNetV2 1x | 146M | 30.6% | 30.6% | -r=1 --v2 |
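For readers unfamiliar with the metric, the following is a minimal sketch of how a top-1 error like those in the table is typically computed. It is not the evaluation code used by shufflenet.py; the function name `top1_error` and the toy inputs are hypothetical and exist only for illustration.

```python
import numpy as np

def top1_error(logits: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of samples whose highest-scoring class differs from the label.

    logits: array of shape (num_samples, num_classes)
    labels: integer array of shape (num_samples,)
    """
    predictions = np.argmax(logits, axis=1)
    return float(np.mean(predictions != labels))

# Toy batch of 4 samples and 3 classes (made-up numbers, not model output).
logits = np.array([[2.0, 0.1, 0.3],
                   [0.2, 1.5, 0.1],
                   [0.9, 0.8, 0.7],
                   [0.1, 0.2, 3.0]])
labels = np.array([0, 1, 2, 2])

# The third sample is predicted as class 0 but labeled 2, so 1 of 4 is wrong.
print(f"top-1 error: {top1_error(logits, labels):.2%}")
```

On the ImageNet validation set the same computation runs over 50,000 images and 1,000 classes.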

To print flops:

./shufflenet.py --flops [--other-flags]

Download and evaluate a pretrained model:

wget http://models.tensorpack.com/ImageNetModels/ShuffleNetV2-0.5x.npz
./shufflenet.py --eval --data /path/to/ilsvrc --load ShuffleNetV2-0.5x.npz --v2 -r=0.5
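Since the pretrained model is distributed as an .npz archive, it can be inspected with NumPy before it is passed to --load. The sketch below assumes the file was downloaded to the current directory; the helper names `summarize_npz` and `count_values` are hypothetical and not part of tensorpack. The count covers every stored array, which may include non-trainable statistics as well as weights.

```python
import os
import numpy as np

def summarize_npz(path):
    """Return {array_name: shape} for every array stored in an .npz archive."""
    with np.load(path) as archive:
        return {name: archive[name].shape for name in archive.files}

def count_values(shapes):
    """Total number of scalar values across all arrays in the summary."""
    return sum(int(np.prod(shape)) for shape in shapes.values())

# Assumed location: the file fetched by the wget command above.
path = "ShuffleNetV2-0.5x.npz"
if os.path.exists(path):
    shapes = summarize_npz(path)
    for name, shape in sorted(shapes.items()):
        print(name, shape)
    print("total stored values:", count_values(shapes))
else:
    print(f"{path} not found; download it first")
```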

AlexNet

This AlexNet script follows the settings of the original paper closely. Trained on 2 GPUs with a batch size of 64 per GPU, it reaches 58% single-crop validation accuracy after 100 epochs (21h on 2 V100s). It also writes first-layer filter visualizations, similar to those in the paper, to TensorBoard. See ./alexnet.py --help for usage.
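The figures above imply a rough training throughput, which the following back-of-the-envelope sketch works out. It assumes the standard ILSVRC-2012 training set of 1,281,167 images and that one epoch is one full pass over it; neither is stated in this README, and the script may define an epoch differently. The function names are hypothetical.

```python
# Assumption: standard ILSVRC-2012 training set size (not stated in the README).
TRAIN_IMAGES = 1_281_167

def total_batch_size(num_gpus: int, batch_per_gpu: int) -> int:
    """Number of images consumed per training step across all GPUs."""
    return num_gpus * batch_per_gpu

def steps_per_epoch(total_batch: int, dataset_size: int = TRAIN_IMAGES) -> int:
    """Full batches needed for one pass over the dataset."""
    return dataset_size // total_batch

def images_per_second(epochs: int, hours: float,
                      dataset_size: int = TRAIN_IMAGES) -> float:
    """Average throughput implied by a total wall-clock training time."""
    return dataset_size * epochs / (hours * 3600)

# AlexNet settings quoted above: 2 GPUs, 64 images per GPU, 100 epochs, 21 hours.
batch = total_batch_size(num_gpus=2, batch_per_gpu=64)
print("total batch size:", batch)
print("steps per epoch:", steps_per_epoch(batch))
print(f"approx. images/sec: {images_per_second(epochs=100, hours=21):.0f}")
```

The throughput is an average that includes any evaluation and data-loading overhead inside the quoted 21 hours, so it is only a coarse estimate.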

VGG16

When trained on 8 GPUs with a batch size of 32 per GPU, this VGG16 script reaches the validation errors listed below after 100 epochs (30h on 8 P100s). This reproduces the VGG experiments in the paper Group Normalization (more code for this paper can be found at GroupNorm-reproduce).

See ./vgg16.py --help for usage.

| No Normalization | Batch Normalization | Group Normalization |
|---|---|---|
| 29~30% (large variation with random seed) | 28% | 27.6% |

Note that this single experiment does not constitute a valid claim that GroupNorm has better performance than BatchNorm.

Inception-BN

This Inception-BN script reaches 27% single-crop validation error after 300k steps with 6 GPUs. The training recipe differs substantially from that of the original paper, because the paper is vague about these details.

ResNet

See the ResNet examples, which include variants such as pre-activation ResNet and squeeze-and-excitation networks.

DoReFa-Net

See the DoReFa-Net examples, which include other quantization methods such as Binary Weight Network and Trained Ternary Quantization.