[CVPR 2021] MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition

BIT-DA/MetaSAug

MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition

Shuang Li, Kaixiong Gong, et al.

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021. [CVPR 2021 PDF]

Paper

This repository contains the code of our CVPR 2021 work "MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition".

Abstract

Real-world training data usually exhibits a long-tailed distribution, where several majority classes have a significantly larger number of samples than the remaining minority classes. This imbalance degrades the performance of typical supervised learning algorithms designed for balanced training sets. In this paper, we address this issue by augmenting minority classes with a recently proposed implicit semantic data augmentation (ISDA) algorithm, which produces diversified augmented samples by translating deep features along many semantically meaningful directions. Importantly, given that ISDA estimates the class-conditional statistics to obtain semantic directions, we find it ineffective to do this on minority classes due to the insufficient training data. To this end, we propose a novel approach to learn transformed semantic directions with meta-learning automatically. Specifically, the augmentation strategy during training is dynamically optimized, aiming to minimize the loss on a small balanced validation set, which is approximated via a meta update step. Extensive empirical results on CIFAR-LT-10/100, ImageNet-LT, and iNaturalist2017/2018 validate the effectiveness of our method.
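
To make the ISDA idea mentioned in the abstract more concrete, here is a minimal pure-Python sketch of the closed-form surrogate loss described in the ISDA paper, evaluated for a single sample. Each logit z_j is shifted by (lam / 2) * (w_j - w_y)^T Sigma_y (w_j - w_y), where Sigma_y is the covariance estimated for the sample's class y, and the usual cross-entropy is then applied. The function name `isda_loss`, the toy weights, and the toy covariance are all hypothetical and chosen for illustration; this is not the repository's PyTorch implementation, and it does not show the meta-learning step that MetaSAug adds on top to adjust the covariance. We also assume that `lam` corresponds to the role of the `--lam` training flag, which this README does not confirm.

```python
import math


def isda_loss(feature, label, weights, biases, cov, lam):
    """ISDA-style surrogate cross-entropy for one sample (illustrative sketch).

    feature: deep feature vector a (list of floats)
    label:   ground-truth class index y
    weights: one weight vector w_j per class (last linear layer)
    biases:  one bias b_j per class
    cov:     covariance matrix Sigma_y of class y (list of rows)
    lam:     augmentation strength (assumed analogue of --lam)
    """
    w_y = weights[label]
    logits = []
    for w_j, b_j in zip(weights, biases):
        z = sum(w * a for w, a in zip(w_j, feature)) + b_j
        diff = [wj - wy for wj, wy in zip(w_j, w_y)]
        # Quadratic form diff^T * cov * diff; it is zero for j == y.
        cov_diff = [sum(c * d for c, d in zip(row, diff)) for row in cov]
        quad = sum(d * cd for d, cd in zip(diff, cov_diff))
        logits.append(z + 0.5 * lam * quad)
    # Numerically stable log-sum-exp followed by the cross-entropy.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


# Toy 3-class, 2-dimensional example (all values are made up).
toy_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
toy_biases = [0.0, 0.0, 0.0]
toy_feature = [0.5, -0.2]
zero_cov = [[0.0, 0.0], [0.0, 0.0]]
unit_cov = [[1.0, 0.0], [0.0, 1.0]]

plain = isda_loss(toy_feature, 0, toy_weights, toy_biases, zero_cov, lam=0.75)
augmented = isda_loss(toy_feature, 0, toy_weights, toy_biases, unit_cov, lam=0.75)
print("plain CE:", plain, "ISDA-style loss:", augmented)
```

With a zero covariance the function reduces to ordinary cross-entropy. A positive semi-definite covariance can only raise the loss, which is how the augmentation acts as an implicit regularizer. A poorly estimated covariance for a minority class distorts the loss in the same way, which is the weakness the abstract describes.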

If you find this idea or code useful for your research, please consider citing our paper:

@inproceedings{li2021metasaug,
  title={Metasaug: Meta semantic augmentation for long-tailed visual recognition},
  author={Li, Shuang and Gong, Kaixiong and Liu, Chi Harold and Wang, Yulin and Qiao, Feng and Cheng, Xinjing},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5212--5221},
  year={2021}
}

Prerequisite

  • PyTorch >= 1.2.0
  • Python3
  • torchvision
  • PIL
  • argparse
  • numpy
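
As a quick way to check the prerequisites above, the following sketch reports which modules are missing and whether an installed PyTorch meets the 1.2.0 minimum. The helper names `version_tuple` and `missing_modules` are hypothetical and not part of this repository. Note that `argparse` ships with the Python 3 standard library and that the `PIL` module is provided by the Pillow package.

```python
import importlib.util
import re


def version_tuple(version):
    """Turn a version string such as '1.13.1+cu117' into (1, 13, 1)."""
    parts = []
    for piece in version.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad '1.2' to (1, 2, 0)
        parts.append(0)
    return tuple(parts)


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


REQUIRED = ["torch", "torchvision", "PIL", "numpy", "argparse"]
missing = missing_modules(REQUIRED)
print("missing modules:", missing if missing else "none")

if "torch" not in missing:
    import torch

    print("PyTorch >= 1.2.0:", version_tuple(torch.__version__) >= (1, 2, 0))
```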

Evaluation

We provide several trained models of MetaSAug for evaluation.

Testing on CIFAR-LT-10/100:

  • sh scripts/MetaSAug_CE_test.sh
  • sh scripts/MetaSAug_LDAM_test.sh

Testing on ImageNet-LT and iNaturalist18:

  • sh ImageNet_iNat/test.sh

The trained models are in Google Drive.

Getting Started

Dataset

Training

Training on CIFAR-LT-10/100:

CIFAR-LT-100, MetaSAug with LDAM loss
python3.6 MetaSAug_LDAM_train.py --gpu 0 --lr 0.1 --lam 0.75 --imb_factor 0.05 --dataset cifar100 --num_classes 100 --save_name MetaSAug_cifar100_LDAM_imb0.05 --idx 1

Or run the script:

sh scripts/MetaSAug_LDAM_train.sh
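
To illustrate what the `--imb_factor` flag in the command above controls, the sketch below builds per-class sample counts with the exponential profile commonly used to construct CIFAR-LT: class i keeps roughly n_max * imb_factor^(i / (C - 1)) samples. This is an assumption about how the long-tailed splits are built, not code taken from this repository, and the function name `long_tailed_counts` is hypothetical. We also assume that `--imb_factor` is the reciprocal of the "Imb." value in the result tables, so 0.05 would correspond to an imbalance of 20.

```python
def long_tailed_counts(num_classes, max_per_class, imb_factor):
    """Per-class sample counts that decay exponentially from head to tail.

    The largest class keeps max_per_class samples and the smallest keeps
    about max_per_class * imb_factor samples.
    """
    counts = []
    denom = max(num_classes - 1, 1)  # avoid division by zero for one class
    for i in range(num_classes):
        counts.append(int(max_per_class * (imb_factor ** (i / denom))))
    return counts


# CIFAR-100 has 500 training images per class in its balanced form.
counts = long_tailed_counts(num_classes=100, max_per_class=500, imb_factor=0.05)
print("head:", counts[0], "tail:", counts[-1], "total:", sum(counts))
```

Under these assumptions the head class keeps all 500 images while the tail class keeps about 25, which is where the 20:1 imbalance comes from.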

Training on ImageNet-LT:

CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m torch.distributed.launch --nproc_per_node=4 --master_port 53212 train.py --lr 0.0003 --meta_lr 0.1 --workers 0 --batch_size 256 --epochs 20 --dataset ImageNet_LT --num_classes 1000 --data_root ../ImageNet

Or run the script:

sh ImageNet_iNat/scripts/train.sh

Note: Training on large-scale datasets such as ImageNet-LT and iNaturalist2017/2018 uses multiple GPUs to speed up training. To learn more generalizable representations, vanilla CE loss is used to train the network in the early training stage. For convenience, training starts from pre-trained models, e.g., ImageNet-LT, iNat18 (both from project cRT).

Results and models

CIFAR-LT-10

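| Model         | Imb. | Top-1 Error (%) | Download | Model       | Imb. | Top-1 Error (%) | Download |
|---------------|------|-----------------|----------|-------------|------|-----------------|----------|
| MetaSAug+LDAM | 200  | 22.65           | ResNet32 | MetaSAug+CE | 200  | 23.11           | ResNet32 |
| MetaSAug+LDAM | 100  | 19.34           | ResNet32 | MetaSAug+CE | 100  | 19.46           | ResNet32 |
| MetaSAug+LDAM | 50   | 15.66           | ResNet32 | MetaSAug+CE | 50   | 15.97           | ResNet32 |
| MetaSAug+LDAM | 20   | 11.90           | ResNet32 | MetaSAug+CE | 20   | 12.36           | ResNet32 |
| MetaSAug+LDAM | 10   | 10.32           | ResNet32 | MetaSAug+CE | 10   | 10.56           | ResNet32 |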

CIFAR-LT-100

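| Model         | Imb. | Top-1 Error (%) | Download | Model       | Imb. | Top-1 Error (%) | Download |
|---------------|------|-----------------|----------|-------------|------|-----------------|----------|
| MetaSAug+LDAM | 200  | 56.91           | ResNet32 | MetaSAug+CE | 200  | 60.06           | ResNet32 |
| MetaSAug+LDAM | 100  | 51.99           | ResNet32 | MetaSAug+CE | 100  | 53.13           | ResNet32 |
| MetaSAug+LDAM | 50   | 47.73           | ResNet32 | MetaSAug+CE | 50   | 48.10           | ResNet32 |
| MetaSAug+LDAM | 20   | 42.47           | ResNet32 | MetaSAug+CE | 20   | 42.15           | ResNet32 |
| MetaSAug+LDAM | 10   | 38.72           | ResNet32 | MetaSAug+CE | 10   | 38.27           | ResNet32 |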

ImageNet-LT

| Model    | Top-1 Error (%) | Download |
|----------|-----------------|----------|
| MetaSAug | 52.33           | ResNet50 |

iNaturalist18

| Model    | Top-1 Error (%) | Download |
|----------|-----------------|----------|
| MetaSAug | 30.50           | ResNet50 |
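
The tables above report top-1 error. As a reminder of how that metric is usually computed, here is a small sketch that takes predicted and true class indices and returns the percentage of misclassified samples. The function name `top1_error` is hypothetical, and this is not the evaluation code used by the test scripts in this repository.

```python
def top1_error(predictions, labels):
    """Percentage of samples whose top-1 prediction differs from the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    wrong = sum(1 for pred, label in zip(predictions, labels) if pred != label)
    return 100.0 * wrong / len(labels)


# Toy example: one of four predictions is wrong.
print(top1_error([0, 1, 2, 2], [0, 1, 1, 2]))
```

Because the test sets are balanced while training is long-tailed, a low top-1 error here requires reasonable accuracy on tail classes as well as head classes.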

Acknowledgements

Some of the code in this project is adapted from Meta-class-weight and cRT. We thank the authors for their excellent projects.
