NOTE: This is an implementation of SkipResNets for performance evaluation. If you would like to use SkipResNets for your own tasks, see the SkipResNet reference implementation, where SkipResNet models are implemented as models for PyTorch Image Models (timm).
SkipResNet is a Skip-connected Residual convolutional neural Network for image recognition tasks. Although the architecture of a SkipResNet is a stack of residual blocks just like a ResNet, each residual block receives several inbound connections from the previous blocks, in the same manner as DenseNets. To improve performance, a residual block in a SkipResNet includes a Gate Module instead of the element-wise additions used in ResNets or the concatenations used in DenseNets. A Gate Module contains an attention mechanism which dynamically selects useful features from the inbound connections. Experimental results indicate that the SkipResNet architecture improves performance on image classification tasks.
DenseResNet is a Densely connected Residual convolutional neural Network for image recognition tasks. The architecture of a DenseResNet is similar to that of a SkipResNet, but the shortcut design is different.
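As a rough illustration of what such a gate computes, the sketch below combines several inbound feature maps using softmax attention weights derived from globally pooled features. This is a minimal NumPy sketch of the general idea, not the repository's actual Gate Module; the function name, the single linear scoring layer, and its shapes are assumptions made for this example.

```python
import numpy as np

def gate_combine(features, w, b):
    """Combine K inbound feature maps with softmax attention weights.

    features: array of shape (K, C, H, W), one entry per inbound block.
    w, b: parameters of a tiny linear layer mapping pooled features
          (K, C) -> (K,) scores; both are illustrative assumptions.
    """
    pooled = features.mean(axis=(2, 3))        # global average pooling -> (K, C)
    scores = pooled @ w + b                    # one score per inbound -> (K,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                   # softmax over the K inbounds
    # weighted sum replaces ResNet's addition / DenseNet's concatenation
    return np.tensordot(weights, features, axes=1)  # -> (C, H, W)

rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 8, 4, 4))          # 3 inbounds, 8 channels, 4x4 maps
out = gate_combine(feats, rng.normal(size=8), 0.0)
print(out.shape)  # (8, 4, 4)
```

Because the weights sum to 1, the gate interpolates between inbound features rather than stacking them, so the output keeps the same channel count as a single residual branch.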
DenseResNets were published in the following paper:
- Atsushi Takeda. "画像分類のためのDense Residual Networkの提案 (Dense Residual Networks for Image Classification)." The 23rd Meeting on Image Recognition and Understanding (MIRU2020), 2020 (in Japanese).
If you want to use the ImageNet dataset, you need to download the dataset archives and save them to data/imagenet (see readme.txt for details). If you want to train a model with the CIFAR dataset, no preparation is needed because the dataset is downloaded automatically.
Run the training script, which trains a model from scratch:

    python src/train.py [config file] [output path]
For example, the following command trains a ResNet-110 model on the CIFAR-100 dataset using 2 GPUs, and saves the results in an output directory named output_directory:
    python src/train.py \
        configs/train/cifar100/ResNet-110.txt \
        output_directory \
        --gpus 0,1
This implementation also supports training on TPUs. The following command trains a ResNet-50 model on the ImageNet dataset using 8 TPU cores, loading the data from Google Cloud Storage. In this case, you need to make shard files of the ImageNet dataset and store them in Google Cloud Storage before starting the training.
    PYTORCH_JIT=0 python src/train.py \
        configs/train/imagenet/ResNet-50.txt \
        output_directory \
        --tpus 8 \
        --data gs://<your bucket>/data/imagenet
The subscript of each model name is the number of training runs, and each row shows the median of those runs. For example, the row "model(5)" shows the median performance of 5 trained models.
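In other words, each reported number is simply the median over the runs' accuracies. Assuming five hypothetical run results (the values below are made up for illustration), it can be computed with Python's standard library:

```python
from statistics import median

# hypothetical top-1 accuracies from 5 independent training runs
runs = [0.7893, 0.7901, 0.7910, 0.7898, 0.7905]
print(median(runs))  # 0.7901
```

The median is reported rather than the mean, so a single unlucky or lucky run does not skew the table.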
The following models are trained from scratch on the ImageNet-1k dataset. Training images are cropped to 224x224 pixels, and no extra images are used.
| Model | # of params | flops (224x224) | settings |
|---|---|---|---|
| ResNet-34 | 21.80M | 3.681G | ResNet-34.txt |
| Skip-ResNet-34 | 22.72M | 3.694G | Skip-ResNet-34.txt |
| ResNet-50 | 25.56M | 4.138G | ResNet-50.txt |
| SE-ResNet-50 | 28.09M | 4.151G | SE-ResNet-50.txt |
| Skip-ResNet-50 | 40.15M | 4.201G | Skip-ResNet-50.txt |
| ResNet-101 | 44.55M | 7.874G | ResNet-101.txt |
| SE-ResNet-101 | 49.33M | 7.897G | SE-ResNet-101.txt |
| Skip-ResNet-101 | 83.36M | 8.017G | Skip-ResNet-101.txt |
| ResNeXt-50-32x4d | 25.03M | 4.292G | ResNeXt-50-32x4d.txt |
| SE-ResNeXt-50-32x4d | 27.56M | 4.306G | SE-ResNeXt-50-32x4d.txt |
| Skip-ResNeXt-50-32x4d | 39.63M | 4.355G | Skip-ResNeXt-50-32x4d.txt |
| ResNeXt-101-32x4d | 44.18M | 8.063G | ResNeXt-101-32x4d.txt |
| SE-ResNeXt-101-32x4d | 48.96M | 8.085G | SE-ResNeXt-101-32x4d.txt |
| Skip-ResNeXt-101-32x4d | 82.99M | 8.205G | Skip-ResNeXt-101-32x4d.txt |
| RegNetY-1.6 | 11.20M | 1.650G | RegNetY-1.6.txt |
| Skip-RegNetY-1.6 | 14.76M | 1.677G | Skip-RegNetY-1.6.txt |
| RegNetY-3.2 | 19.44M | 3.229G | RegNetY-3.2.txt |
| Skip-RegNetY-3.2 | 25.35M | 3.265G | Skip-RegNetY-3.2.txt |
| ConvNeXt-T | 28.59M | 4.569G | ConvNeXt-T.txt |
| Skip-ConvNeXt-T | 31.14M | 4.591G | Skip-ConvNeXt-T.txt |
| ConvNeXt-S | 50.22M | 8.863G | ConvNeXt-S.txt |
| Skip-ConvNeXt-S | 56.58M | 8.912G | Skip-ConvNeXt-S.txt |
| ConvNeXt-B | 88.59M | 15.58G | ConvNeXt-B.txt |
| Skip-ConvNeXt-B | 99.85M | 15.65G | Skip-ConvNeXt-B.txt |
| SwinTiny-224 | - | - | SwinTinyPatch4-224.txt |
| Skip-SwinTiny-224 | - | - | Skip-SwinTinyPatch4-224.txt |
| SwinSmall-224 | - | - | SwinSmallPatch4-224.txt |
| Skip-SwinSmall-224 | - | - | Skip-SwinSmallPatch4-224.txt |
| Model | top-1 acc. (224x224) | top-1 acc. (256x256) | top-1 acc. (288x288) | top-1 acc. (320x320) |
|---|---|---|---|---|
| ResNet-34(3) | 0.7553 | 0.7622 | 0.7654 | 0.7665 |
| Skip-ResNet-34(3) | 0.7675 | 0.7759 | 0.7778 | 0.7782 |
| ResNet-50(5) | 0.7901 | 0.7953 | 0.7964 | 0.7954 |
| SE-ResNet-50(3) | 0.7991 | 0.8055 | 0.8081 | 0.8072 |
| Skip-ResNet-50(5) | 0.8041 | 0.8103 | 0.8120 | 0.8104 |
| ResNet-101(3) | 0.8036 | 0.8100 | 0.8095 | 0.8086 |
| SE-ResNet-101(3) | 0.8102 | 0.8157 | 0.8184 | 0.8177 |
| Skip-ResNet-101(3) | 0.8139 | 0.8217 | 0.8234 | 0.8208 |
| ResNeXt-50-32x4d(3) | 0.7973 | 0.8015 | 0.8030 | 0.8011 |
| SE-ResNeXt-50-32x4d(3) | 0.8041 | 0.8093 | 0.8117 | 0.8110 |
| Skip-ResNeXt-50-32x4d(3) | 0.8067 | 0.8125 | 0.8131 | 0.8126 |
| ResNeXt-101-32x4d(3) | 0.8066 | 0.8102 | 0.8112 | 0.8101 |
| SE-ResNeXt-101-32x4d(3) | 0.8085 | 0.8137 | 0.8165 | 0.8188 |
| Skip-ResNeXt-101-32x4d(3) | 0.8139 | 0.8203 | 0.8216 | 0.8210 |
| RegNetY-1.6(3) | 0.7736 | 0.7841 | 0.7879 | 0.7904 |
| Skip-RegNetY-1.6(3) | 0.7794 | 0.7887 | 0.7936 | 0.7946 |
| RegNetY-3.2(3) | 0.7849 | 0.7933 | 0.7974 | 0.7981 |
| Skip-RegNetY-3.2(3) | 0.7884 | 0.7960 | 0.7997 | 0.8000 |
| ConvNeXt-T(3) | 0.8157 | 0.8171 | 0.8157 | 0.8094 |
| Skip-ConvNeXt-T(3) | 0.8158 | 0.8205 | 0.8224 | 0.8224 |
| ConvNeXt-S(3) | 0.8314 | 0.8344 | 0.8341 | 0.8307 |
| Skip-ConvNeXt-S(3) | 0.8333 | 0.8367 | 0.8374 | 0.8375 |
| ConvNeXt-B(3) | 0.8355 | 0.8376 | 0.8372 | 0.8350 |
| Skip-ConvNeXt-B(3) | 0.8337 | 0.8391 | 0.8399 | 0.8400 |
| SwinTiny-224(3) | 0.8124 | - | - | - |
| Skip-SwinTiny-224(3) | 0.8150 | - | - | - |
| SwinSmall-224(3) | 0.8288 | - | - | - |
| Skip-SwinSmall-224(3) | 0.8273 | - | - | - |
| Model | top-1 acc. (224x224) | top-1 acc. (256x256) | top-1 acc. (288x288) | top-1 acc. (320x320) |
|---|---|---|---|---|
| ResNet-34(3) | 0.0143 | 0.0204 | 0.0297 | 0.0321 |
| Skip-ResNet-34(3) | 0.0259 | 0.0328 | 0.0416 | 0.0448 |
| ResNet-50(5) | 0.0304 | 0.0477 | 0.0583 | 0.0625 |
| SE-ResNet-50(3) | 0.0560 | 0.0763 | 0.0855 | 0.0868 |
| Skip-ResNet-50(5) | 0.0695 | 0.0889 | 0.0987 | 0.1015 |
| ResNet-101(3) | 0.0635 | 0.0869 | 0.1015 | 0.1023 |
| SE-ResNet-101(3) | 0.0841 | 0.1083 | 0.1203 | 0.1253 |
| Skip-ResNet-101(3) | 0.1157 | 0.1324 | 0.1481 | 0.1455 |
| ResNeXt-50-32x4d(3) | 0.0537 | 0.0743 | 0.0844 | 0.0843 |
| SE-ResNeXt-50-32x4d(3) | 0.0749 | 0.0929 | 0.1028 | 0.1063 |
| Skip-ResNeXt-50-32x4d(3) | 0.0916 | 0.1072 | 0.1179 | 0.1179 |
| ResNeXt-101-32x4d(3) | 0.0889 | 0.1155 | 0.1261 | 0.1288 |
| SE-ResNeXt-101-32x4d(3) | 0.1033 | 0.1235 | 0.1325 | 0.1340 |
| Skip-ResNeXt-101-32x4d(3) | 0.1319 | 0.1528 | 0.1628 | 0.1557 |
| RegNetY-1.6(3) | 0.0376 | 0.0493 | 0.0580 | 0.0621 |
| Skip-RegNetY-1.6(3) | 0.0520 | 0.0649 | 0.0748 | 0.0779 |
| RegNetY-3.2(3) | 0.0504 | 0.0616 | 0.0709 | 0.0752 |
| Skip-RegNetY-3.2(3) | 0.0599 | 0.0720 | 0.0775 | 0.0821 |
| ConvNeXt-T(3) | 0.1032 | 0.1277 | 0.1383 | 0.1300 |
| Skip-ConvNeXt-T(3) | 0.1231 | 0.1432 | 0.1519 | 0.1497 |
| ConvNeXt-S(3) | 0.1455 | 0.1723 | 0.1839 | 0.1780 |
| Skip-ConvNeXt-S(3) | 0.1625 | 0.1839 | 0.1965 | 0.1973 |
| ConvNeXt-B(3) | 0.1653 | 0.1973 | 0.2016 | 0.1949 |
| Skip-ConvNeXt-B(3) | 0.1815 | 0.2056 | 0.2157 | 0.2143 |
| SwinTiny-224(3) | 0.0988 | - | - | - |
| Skip-SwinTiny-224(3) | 0.1145 | - | - | - |
| SwinSmall-224(3) | 0.1503 | - | - | - |
| Skip-SwinSmall-224(3) | 0.1531 | - | - | - |
The following models are trained from scratch on the CIFAR-10/100 datasets. No extra images are used.
| Model | # of params | flops (32x32) | settings |
|---|---|---|---|
| ResNet-110 | 1.737M | 257.9M | ResNet-110.txt |
| Skip-ResNet-110 | 2.189M | 265.4M | Skip-ResNet-110.txt |
| WideResNet-28-k10 | 36.54M | 5.254G | WideResNet-28-k10.txt |
| Skip-WideResNet-28-k10 | 38.18M | 5.266G | Skip-WideResNet-28-k10.txt |
| WideResNet-40-k10 | 55.90M | 8.091G | WideResNet-40-k10.txt |
| Skip-WideResNet-40-k10 | 58.64M | 8.111G | Skip-WideResNet-40-k10.txt |
| Model | top-1 acc. (CIFAR-10) | top-1 acc. (CIFAR-100) |
|---|---|---|
| ResNet-110(5) | 0.9623 | 0.7798 |
| Skip-ResNet-110(5) | 0.9660 | 0.7988 |
| WideResNet-28-k10(5) | 0.9787 | 0.8425 |
| Skip-WideResNet-28-k10(5) | 0.9780 | 0.8508 |
| WideResNet-40-k10(5) | 0.9793 | 0.8439 |
| Skip-WideResNet-40-k10(5) | 0.9792 | 0.8498 |
| Model | Backbone | mAP (box) | mAP (seg) |
|---|---|---|---|
| Cascade Mask R-CNN | ResNet-50 | 0.44806 | 0.38977 |
| Cascade Mask R-CNN | Skip-ResNet-50 | 0.47716 | 0.41421 |
| Cascade Mask R-CNN | ResNet-101 | 0.44813 | 0.38817 |
| Cascade Mask R-CNN | Skip-ResNet-101 | 0.48224 | 0.42012 |
| Cascade Mask R-CNN | ResNeXt-50-32x4d | 0.45450 | 0.39470 |
| Cascade Mask R-CNN | Skip-ResNeXt-50-32x4d | 0.47445 | 0.41304 |
| Cascade Mask R-CNN | ResNeXt-101-32x4d | 0.46085 | 0.39817 |
| Cascade Mask R-CNN | Skip-ResNeXt-101-32x4d | 0.48862 | 0.42241 |
| Model | Backbone | aAcc | mIoU |
|---|---|---|---|
| UPerNet | ResNet-50 | 0.8019 | 0.4194 |
| UPerNet | Skip-ResNet-50 | 0.8016 | 0.4244 |
| UPerNet | ResNet-101 | 0.8101 | 0.4402 |
| UPerNet | Skip-ResNet-101 | 0.8138 | 0.4443 |
| UPerNet | ResNeXt-50-32x4d | 0.8036 | 0.4269 |
| UPerNet | Skip-ResNeXt-50-32x4d | 0.8074 | 0.4339 |
| UPerNet | ResNeXt-101-32x4d | 0.8114 | 0.4407 |
| UPerNet | Skip-ResNeXt-101-32x4d | 0.8185 | 0.4565 |
This work was supported by JSPS KAKENHI Grant Number JP20K11871, and part of the experiments were supported by the TPU Research Cloud program.