This is a code implementation of the NeurIPS 2022 paper (DropCov: A Simple yet Effective Method for Improving Deep Architectures (poster)), created by Qilong Wang, Mingze Gao, and Zhaolin Zhang.
Post-normalization plays a key role in deep global covariance pooling (GCP) networks. In this paper, we show for the first time that effective post-normalization can achieve a good trade-off between representation decorrelation and information preservation for GCP, which are crucial for alleviating over-fitting and increasing the representation ability of deep GCP networks, respectively. Based on this finding, we propose a simple yet effective pre-normalization method for GCP (namely DropCov), which performs an adaptive channel dropout before GCP to achieve a trade-off between representation decorrelation and information preservation. The proposed DropCov improves the performance of both deep CNNs and ViTs.
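As a rough illustration of the idea (not the authors' released implementation), the sketch below applies channel dropout before global covariance pooling in PyTorch. The fixed dropout rate `p` is a simplifying assumption standing in for the adaptive rate computed in the paper:

```python
import torch
import torch.nn as nn


class DropCovSketch(nn.Module):
    """Illustrative sketch: channel dropout followed by covariance pooling.

    A fixed-rate nn.Dropout2d (which zeroes whole channels) stands in for
    the adaptive channel dropout described in the paper.
    """

    def __init__(self, p=0.25):
        super().__init__()
        self.drop = nn.Dropout2d(p)  # zeroes entire feature channels

    def forward(self, x):  # x: (N, C, H, W)
        x = self.drop(x)
        n, c, h, w = x.shape
        feat = x.reshape(n, c, h * w)
        feat = feat - feat.mean(dim=2, keepdim=True)  # center per channel
        # Sample covariance across spatial positions: (N, C, C)
        cov = feat @ feat.transpose(1, 2) / (h * w - 1)
        # The covariance matrix is symmetric, so keep only the upper
        # triangle (including the diagonal) as the pooled representation.
        idx = torch.triu_indices(c, c)
        return cov[:, idx[0], idx[1]]  # (N, C*(C+1)/2)
```

In a full network this module would replace global average pooling after the last convolutional stage; the pooled vector then feeds the classifier head.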
Method | Acc@1(%) | #Params.(M) | FLOPs(G) | Checkpoint |
---|---|---|---|---|
ResNet-34 | 74.19 | 21.8 | 3.66 | |
ResNet-50 | 76.02 | 25.6 | 3.86 | |
ResNet-101 | 77.67 | 44.6 | 7.57 | |
ResNet-34+DropCov(Ours) | 76.81 | 29.6 | 5.56 | Download |
ResNet-50+DropCov(Ours) | 78.19 | 32.0 | 6.19 | Download |
ResNet-101+DropCov(Ours) | 79.51 | 51.0 | 9.90 | Download |
DeiT-S | 79.8 | 22.1 | 4.6 | Link |
Swin-T | 81.2 | 28.3 | 4.5 | Link |
T2T-ViT-14 | 81.5 | 21.5 | 5.2 | Link |
DeiT-S+DropCov(Ours) | 82.4 | 25.6 | 5.5 | Download |
Swin-T+DropCov(Ours) | 82.5 | 31.6 | 6.0 | Download |
T2T-ViT-14+DropCov(Ours) | 82.7 | 24.9 | 5.4 | Download |
● OS: Ubuntu 18.04
● CUDA: 11.0
● Toolkit: PyTorch 1.7/1.8
● GPU: GTX 2080Ti / 3090Ti
First, clone the repo and install the requirements:

```
git clone https://github.com/mingzeG/DropCov.git
pip install -r requirements.txt
```
Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout expected by torchvision's `datasets.ImageFolder`, with the training and validation data in the `train/` and `val/` folders respectively:
```
/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
```
To evaluate a pre-trained model on the ImageNet val set with GPUs, run:

```
CUDA_VISIBLE_DEVICES={device_ids} python -u main.py -e -a {model_name} --resume {checkpoint-path} {imagenet-path}
```

For example, to evaluate the DropCov method, run:

```
CUDA_VISIBLE_DEVICES=0,1,2,3 python -u main.py -e -a resnet18_ACD --resume ./r18_64_acd_best.pth.tar ./dataset/ILSVRC2012
```

which should give:

```
* Acc@1 73.5 Acc@5 91.2
```
You can run `main.py` to train as follows:

```
CUDA_VISIBLE_DEVICES={device_ids} python -u main.py -a {model_name} --epochs {epoch_num} --b {batch_size} --lr_mode {learning-rate schedule} {imagenet-path}
```

For example:

```
CUDA_VISIBLE_DEVICES=0,1,2,3 python -u main.py -a resnet18_ACD --epochs 100 --b 256 --lr_mode LRnorm ./dataset/ILSVRC2012
```
Swin-T:

```
python -m torch.distributed.launch --nproc_per_node 8 --master_port 12345 main.py --cfg configs/swin/swin_tiny_patch4_window7_224.yaml --data-path <imagenet-path> --batch-size 128
```

DeiT-S:

```
sh ./scripts/train_Deit_drop_Small.sh
```
```
@inproceedings{wang2022nips,
  title={DropCov: A Simple yet Effective Method for Improving Deep Architectures},
  author={Qilong Wang and Mingze Gao and Zhaolin Zhang and Jiangtao Xie and Peihua Li and Qinghua Hu},
  booktitle={NeurIPS},
  year={2022}
}
```
Our code is built following GCP_Optimization, DeiT, and Swin Transformer. Thanks for their excellent work.