
Training code of 4 variants of ResNet on ImageNet:

The training follows the exact standard recipe used by the "Training ImageNet in 1 Hour" paper and reaches the same performance. Distributed training code and results can be found at tensorpack/benchmarks.
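As a rough illustration of what that recipe implies for the learning rate, the sketch below applies the paper's linear scaling rule (0.1 per 256 images) and its gradual warmup over the first 5 epochs. The function names are hypothetical, and the later learning-rate decay steps are not modeled; the script's actual schedule should be checked in its source.

```python
def scaled_learning_rate(total_batch, base_lr=0.1, base_batch=256):
    """Linear scaling rule: the learning rate grows in proportion to the batch size."""
    return base_lr * total_batch / base_batch


def warmup_learning_rate(epoch, total_batch, warmup_epochs=5,
                         base_lr=0.1, base_batch=256):
    """Ramp linearly from base_lr to the scaled rate during the warmup epochs.

    `epoch` may be fractional, e.g. 2.5 for halfway through the third epoch.
    """
    target = scaled_learning_rate(total_batch, base_lr, base_batch)
    if epoch >= warmup_epochs:
        return target
    return base_lr + (target - base_lr) * epoch / warmup_epochs


if __name__ == "__main__":
    # Total batch of 512, as in the training command in this README.
    for epoch in (0, 1, 2.5, 5, 10):
        print(epoch, round(warmup_learning_rate(epoch, total_batch=512), 4))
```

With a total batch of 512 the rule gives a peak learning rate of 0.2, reached at the end of warmup.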

This recipe performs better than most open-source implementations. In fact, many papers that claim to "improve" ResNet by 0.5% only compete against a weaker baseline and cannot actually beat this standard ResNet recipe.

| Model | Top 5 Error | Top 1 Error | Download |
|---|---|---|---|
| ResNet18 | 10.50% | 29.66% | ⬇️ |
| ResNet34 | 8.56% | 26.17% | ⬇️ |
| ResNet50 | 6.85% | 23.61% | ⬇️ |
| ResNet50-SE | 6.24% | 22.64% | ⬇️ |
| ResNet101 | 6.04% | 21.95% | ⬇️ |
| ResNeXt101-32x4d | 5.73% | 21.05% | ⬇️ |
| ResNet152 | 5.78% | 21.51% | ⬇️ |
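For readers unfamiliar with the two metrics in the table, here is a minimal pure-Python sketch of how top-k error is usually computed: a sample counts as an error when its true label is not among the k highest-scoring classes. The helper `topk_error` and the toy scores are illustrative only and are not part of the training script.

```python
def topk_error(scores, labels, k):
    """Fraction of samples whose true label is not among the k highest scores."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in topk:
            misses += 1
    return misses / len(labels)


# Toy example: 4 samples, 3 classes.
scores = [
    [0.1, 0.7, 0.2],  # highest score: class 1
    [0.5, 0.3, 0.2],  # highest score: class 0
    [0.2, 0.3, 0.5],  # highest score: class 2
    [0.6, 0.1, 0.3],  # highest score: class 0
]
labels = [1, 1, 0, 0]

print(topk_error(scores, labels, k=1))  # 0.5
print(topk_error(scores, labels, k=2))  # 0.25
```

Top-5 error is the same computation with k=5 over the 1000 ImageNet classes.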

To reproduce training or evaluation in the above table, first decompress ImageNet data into this structure, then:

./ --data /directory/of/ILSVRC -d 50 --batch 512
./ --data /directory/of/ILSVRC -d 50 --load ResNet50.npz --eval
# See ./ -h for other options.

You should see good GPU utilization (95%~99%) during training if your data pipeline is fast enough. With a total batch size of 512 (64 per GPU × 8 GPUs), ResNet50 training can finish 100 epochs in 16 hours on an AWS p3.16xlarge (8 V100s).
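To judge whether a data pipeline is "fast enough", it helps to turn the 100-epochs-in-16-hours figure into a required throughput. The sketch below does that arithmetic, assuming the standard ILSVRC12 training set of 1,281,167 images; the function name is hypothetical.

```python
IMAGENET_TRAIN_IMAGES = 1_281_167  # ILSVRC12 training set size (assumption stated above)


def required_throughput(epochs, hours, num_gpus,
                        images_per_epoch=IMAGENET_TRAIN_IMAGES):
    """Return (images/s overall, images/s per GPU) needed to hit the time budget."""
    total_images = epochs * images_per_epoch
    images_per_sec = total_images / (hours * 3600)
    return images_per_sec, images_per_sec / num_gpus


total, per_gpu = required_throughput(epochs=100, hours=16, num_gpus=8)
print(f"{total:.0f} images/s overall, {per_gpu:.0f} images/s per GPU")
```

Under these assumptions the data pipeline has to sustain a little over 2,200 decoded and augmented images per second, or about 280 per GPU.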

The default data pipeline is probably OK for machines with an SSD and 20 CPU cores. See the tutorial for other options to speed up data loading.


This script only converts and runs the ImageNet-ResNet{50,101,152} Caffe models released by MSRA. Note that the architecture differs from that of the training script above, so the models are not compatible. ResNets have evolved; in general, you should not cite these old numbers as baselines in your paper.


# download and convert caffe model to npz format
python -m tensorpack.utils.loadcaffe PATH/TO/{ResNet-101-deploy.prototxt,ResNet-101-model.caffemodel} ResNet101.npz
# run on an image
./ --load ResNet101.npz --input cat.jpg --depth 101

The converted models are verified on the ILSVRC12 validation set. The per-pixel mean used here is slightly different from the original, but the difference has a negligible effect.

| Model | Top 5 Error | Top 1 Error |
|---|---|---|
| ResNet 50 | 7.78% | 24.77% |
| ResNet 101 | 7.11% | 23.54% |
| ResNet 152 | 6.71% | 23.21% |

Reproduce pre-activation ResNet on CIFAR10.


Also see a DenseNet implementation of the paper Densely Connected Convolutional Networks.

Reproduce the mixup pre-act ResNet-18 CIFAR10 experiment, in the paper:

This implementation follows the exact settings from the author's code. Note that the architecture is different from the official preact-ResNet18 in the ResNet paper.


./  # train without mixup
./ --mixup   # with mixup

Results of the reference code can be reproduced. In one run I got 5.48% error without mixup and 4.17% error with mixup (alpha=1).
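For context, the core mixup operation blends two training samples and their one-hot labels with a weight drawn from Beta(alpha, alpha). The sketch below shows that operation on plain Python lists using only the standard library. It is an illustration of the technique, not the code used by this script or by the author's reference implementation, and `mixup_pair` is a hypothetical name.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, alpha=1.0, rng=random):
    """Blend two samples and their one-hot labels with weight lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y, lam


# Two toy "images" (2 pixels each) belonging to two different classes.
rng = random.Random(0)  # fixed seed so the example is repeatable
x, y, lam = mixup_pair([1.0, 0.0], [1.0, 0.0],
                       [0.0, 1.0], [0.0, 1.0],
                       alpha=1.0, rng=rng)
print(lam, x, y)
```

With alpha=1, as in the result above, Beta(1, 1) is the uniform distribution on [0, 1], so every mixing ratio is equally likely.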
