JianhongBai/BaCon

Towards Distribution-Agnostic Generalized Category Discovery.

NeurIPS 2023: This repository is the official implementation of BaCon.

Introduction

Data imbalance and open-ended distribution are two intrinsic characteristics of the real visual world. Though encouraging progress has been made in tackling each challenge separately, few works are dedicated to combining them for real-world scenarios. While several previous works have focused on classifying close-set samples and detecting open-set samples during testing, it remains essential to classify unknown subjects as human beings do. In this paper, we formally define a more realistic task, distribution-agnostic generalized category discovery (DA-GCD): generating fine-grained predictions for both close- and open-set classes in a long-tailed open-world setting. To tackle this challenging problem, we propose a Self-Balanced Co-Advice contrastive framework (BaCon), which consists of a contrastive-learning branch and a pseudo-labeling branch that work collaboratively, providing interactive supervision to resolve the DA-GCD task. In particular, the contrastive-learning branch provides reliable distribution estimation to regularize the predictions of the pseudo-labeling branch, which in turn guides contrastive learning through self-balanced knowledge transfer and a proposed novel contrastive loss. We compare BaCon with state-of-the-art methods from two closely related fields: imbalanced semi-supervised learning and generalized category discovery. BaCon outperforms all baselines, and its effectiveness is further supported by comprehensive analysis across various datasets.

Method

Overview of the self-balanced co-advice contrastive framework (BaCon).

Environment

Requirements:

loguru
numpy
pandas
scikit_learn
scipy
torch==1.10.0
torchvision==0.11.1
tqdm
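
As a convenience, the requirements above can be checked against the active environment before training. The following is a minimal sketch, not part of the repository: the helper name `check_requirements` is hypothetical, and it relies only on the standard-library `importlib.metadata` module (Python 3.8+). It treats a local build tag such as `+cu113` as matching the pinned version, which is an assumption about how the pins are meant to be read.

```python
from importlib import metadata

# Package names copied from the requirements list above.
# A value of None means the README does not pin a version.
REQUIREMENTS = {
    "loguru": None,
    "numpy": None,
    "pandas": None,
    "scikit_learn": None,
    "scipy": None,
    "torch": "1.10.0",
    "torchvision": "0.11.1",
    "tqdm": None,
}


def check_requirements(requirements=REQUIREMENTS):
    """Return {name: (installed_version_or_None, ok)} for each requirement.

    Hypothetical helper for illustration; it is not shipped with BaCon.
    """
    report = {}
    for name, pinned in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build tags (e.g. "1.10.0+cu113") when comparing pins.
        ok = installed is not None and (
            pinned is None or installed.split("+")[0] == pinned
        )
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_requirements().items():
        status = "ok" if ok else "MISSING OR MISMATCHED"
        print(f"{name:15s} {str(installed):15s} {status}")
```

Running the file directly prints one line per package; any line flagged as missing or mismatched points at a package to install or re-pin.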

Data

We provide the specific train splits of CIFAR-10 and CIFAR-100 at different imbalance ratios; please refer to data_uq_idxs for details. We also provide source code in data/imagenet.py for splitting data according to the proposed DA-GCD setting.
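
To illustrate what an imbalance ratio means for a train split, here is a minimal sketch of one common way to build a long-tailed subset: per-class counts decay exponentially from the largest class to the smallest, and their quotient is the imbalance ratio. This is an assumption for illustration only. The splits used in the paper are the ones shipped in data_uq_idxs, and this sketch does not claim to reproduce them or the logic in data/imagenet.py. The function names `long_tailed_counts` and `subsample_indices` are hypothetical.

```python
import random
from collections import defaultdict


def long_tailed_counts(n_max, num_classes, imbalance_ratio):
    """Per-class sample counts decaying exponentially from n_max
    down to roughly n_max / imbalance_ratio (illustrative profile)."""
    if num_classes == 1:
        return [n_max]
    return [
        max(1, round(n_max * (1.0 / imbalance_ratio) ** (c / (num_classes - 1))))
        for c in range(num_classes)
    ]


def subsample_indices(labels, counts, seed=0):
    """Pick counts[c] random sample indices for each class c.

    `labels` is a sequence of integer class ids, one per sample.
    Returns the kept indices in ascending order.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    keep = []
    for c, n in enumerate(counts):
        pool = by_class[c]
        keep.extend(rng.sample(pool, min(n, len(pool))))
    return sorted(keep)


if __name__ == "__main__":
    # Toy balanced dataset: 10 classes with 100 samples each.
    labels = [c for c in range(10) for _ in range(100)]
    counts = long_tailed_counts(n_max=100, num_classes=10, imbalance_ratio=10)
    idxs = subsample_indices(labels, counts)
    print(counts, len(idxs))
```

Fixing the seed keeps the subset reproducible, which is the same reason a repository would ship precomputed index files rather than resampling on every run.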

Pretrained model downloads

CIFAR-10

CIFAR-100

ImageNet-100

Training

CIFAR-10

bash run_cifar10.sh

CIFAR-100

bash run_cifar100.sh

ImageNet-100

bash run_imagenet100.sh

Acknowledgments

The codebase is largely built on GCD and SimGCD. Thanks for their great work!

Citation

@article{bai2023towards,
  title={Towards Distribution-Agnostic Generalized Category Discovery},
  author={Bai, Jianhong and Liu, Zuozhu and Wang, Hualiang and Chen, Ruizhe and Mu, Lianrui and Li, Xiaomeng and Zhou, Joey Tianyi and Feng, Yang and Wu, Jian and Hu, Haoji},
  journal={arXiv preprint arXiv:2310.01376},
  year={2023}
}

What's More?

Our work on self-supervised long-tail learning: On the Effectiveness of Out-of-Distribution Data on Self-Supervised Long-Tail Learning.
