FCC: Feature Clusters Compression for Long-Tailed Visual Recognition

This repository is the official PyTorch implementation of the paper in CVPR 2023:

FCC: Feature Clusters Compression for Long-Tailed Visual Recognition
Jian Li, Ziyao Meng, Daqian Shi, Rui Song, Xiaolei Diao, Jingwen Wang, Hao Xu
[PDF]

Feature Clusters Compression (FCC)

FCC is a simple and generic method for long-tailed visual recognition. It is easy to implement and can be combined with existing long-tailed methods to further boost their performance. FCC operates on the backbone features produced by the last layer of the backbone network. The core code of FCC is in lib/fcc/fcc_functions.py.

Main requirements

torch >= 1.7.1 
tensorboardX >= 2.1 
tensorflow >= 1.14.0 
Python 3.6
apex

Detailed requirements

pip install -r requirements.txt

Installing apex is recommended to save GPU memory:

pip install -U pip
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Prepare datasets

This part is mainly based on https://github.com/zhangyongshun/BagofTricks-LT and https://github.com/Bazinga699/NCL

Three widely used datasets are provided in this repo: long-tailed CIFAR (CIFAR-LT), long-tailed ImageNet (ImageNet-LT) and iNaturalist 2018 (iNat18).

Details of these datasets are shown below:

Datasets	CIFAR-10-LT-100	CIFAR-10-LT-50	CIFAR-100-LT-100	CIFAR-100-LT-50	ImageNet-LT	iNat18
Training images	12,406	13,996	10,847	12,608	115,846	437,513
Classes	10	10	100	100	1,000	8,142
Max images	5,000	5,000	500	500	1,280	1,000
Min images	50	100	5	10	5	2
Imbalance factor	100	50	100	50	256	500

- "Max images" and "Min images" represent the number of training images in the largest and smallest classes, respectively.

- "CIFAR-10-LT-100" means the long-tailed CIFAR-10 dataset with the imbalance factor beta = 100.

- "Imbalance factor" is defined as: beta = Max images / Min images.
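
As a quick check of the definition above, the following minimal sketch computes the imbalance factor from per-class image counts. The function name `imbalance_factor` is hypothetical and not part of this repo; the counts used are the "Max images" and "Min images" values from the table.

```python
def imbalance_factor(class_counts):
    """Return beta = (largest class size) / (smallest class size)."""
    if not class_counts:
        raise ValueError("class_counts must not be empty")
    smallest = min(class_counts)
    if smallest <= 0:
        raise ValueError("every class needs at least one training image")
    return max(class_counts) / smallest


# Largest and smallest class sizes taken from the table above.
print(imbalance_factor([5000, 50]))  # CIFAR-10-LT-100
print(imbalance_factor([1280, 5]))   # ImageNet-LT
print(imbalance_factor([1000, 2]))   # iNat18
```

In practice the list would hold one count per class, for example as gathered from the annotation files described in the next section.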

Data format

The annotation of a dataset is a dict consisting of two fields: annotations and num_classes. The annotations field is a list of dicts, each with image_id, fpath, im_height, im_width and category_id.

Here is an example.

{
    'annotations': [
                    {
                        'image_id': 1,
                        'fpath': '/data/iNat18/images/train_val2018/Plantae/7477/3b60c9486db1d2ee875f11a669fbde4a.jpg',
                        'im_height': 600,
                        'im_width': 800,
                        'category_id': 7477
                    },
                    ...
                   ],
    'num_classes': 8142
}
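
A minimal sketch of how an annotation in this format could be checked and summarized, assuming the dict has already been loaded into memory (for example with `json.load`). The helper `count_images_per_class` is hypothetical and not part of this repo, and the file paths in the usage example are placeholders.

```python
from collections import Counter

# Fields that every entry of 'annotations' is expected to carry.
REQUIRED_KEYS = ("image_id", "fpath", "im_height", "im_width", "category_id")


def count_images_per_class(dataset):
    """Validate an annotation dict and return {category_id: number of images}."""
    counts = Counter()
    for index, item in enumerate(dataset["annotations"]):
        missing = [key for key in REQUIRED_KEYS if key not in item]
        if missing:
            raise KeyError(f"annotation {index} is missing fields: {missing}")
        counts[item["category_id"]] += 1
    return counts


# Placeholder data in the documented format (paths are made up).
example = {
    "annotations": [
        {"image_id": 1, "fpath": "/data/a.jpg", "im_height": 600,
         "im_width": 800, "category_id": 7477},
        {"image_id": 2, "fpath": "/data/b.jpg", "im_height": 600,
         "im_width": 800, "category_id": 7477},
        {"image_id": 3, "fpath": "/data/c.jpg", "im_height": 480,
         "im_width": 640, "category_id": 12},
    ],
    "num_classes": 8142,
}
per_class = count_images_per_class(example)
print(max(per_class.values()), min(per_class.values()))
```

The largest and smallest values of the returned counts correspond to "Max images" and "Min images" in the dataset table.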

CIFAR-LT

Cui et al. (CVPR 2019) first proposed CIFAR-LT. They provide a download link for CIFAR-LT as well as the TensorFlow code used to generate the data.

You can follow the steps below to get this version of CIFAR-LT:
1. Download Cui's CIFAR-LT from Google Drive or Baidu Netdisk (password: 5rsq). Suppose you download the data and unzip it at /downloaded/data/.
2. Run tools/convert_from_tfrecords.py; the converted CIFAR-LT and the corresponding JSON files will be generated at /downloaded/converted/.
```
# Convert from the original format of CIFAR-LT
python tools/convert_from_tfrecords.py  --input_path /downloaded/data/ --output_path /downloaded/converted/
```
ImageNet-LT

You can use the following steps to convert from the original images of ImageNet-LT.
1. Download the original ILSVRC-2012. Suppose you have downloaded and reorganized it at /downloaded/ImageNet/, which should contain two sub-directories: /downloaded/ImageNet/train and /downloaded/ImageNet/val.
2. Replace the data root directory in the files dataset_json/ImageNet_LT_train.json and dataset_json/ImageNet_LT_val.json. You can do this with any editor, or use the following commands.
```
# replace data root
python tools/replace_path.py --json_file dataset_json/ImageNet_LT_train.json --find_root /media/ssd1/lijun/ImageNet_LT --replaces_to /downloaded/ImageNet

python tools/replace_path.py --json_file dataset_json/ImageNet_LT_val.json --find_root /media/ssd1/lijun/ImageNet_LT --replaces_to /downloaded/ImageNet
```
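
Conceptually, replacing the data root rewrites the leading directory of every `fpath` entry in the annotation. The sketch below illustrates that idea on an in-memory annotation dict; it is not the code in tools/replace_path.py, the function name `replace_data_root` is hypothetical, and the image file name in the usage example is a placeholder.

```python
def replace_data_root(dataset, find_root, replaces_to):
    """Return a copy of `dataset` whose fpath entries start with the new root."""
    updated = []
    for item in dataset["annotations"]:
        fpath = item["fpath"]
        # Only rewrite paths that actually start with the old root.
        if fpath.startswith(find_root):
            fpath = replaces_to + fpath[len(find_root):]
        updated.append({**item, "fpath": fpath})
    return {**dataset, "annotations": updated}


# Placeholder annotation using the old root from the commands above.
old = {
    "annotations": [
        {"image_id": 1, "fpath": "/media/ssd1/lijun/ImageNet_LT/train/example.JPEG",
         "im_height": 500, "im_width": 375, "category_id": 0},
    ],
    "num_classes": 1000,
}
new = replace_data_root(old, "/media/ssd1/lijun/ImageNet_LT", "/downloaded/ImageNet")
print(new["annotations"][0]["fpath"])
```

Returning a copy rather than editing in place keeps the original annotation available, which makes it easy to confirm the paths before overwriting a JSON file.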
iNat18

You can use the following steps to convert from the original format of iNaturalist 2018.
1. First download the images and annotations from iNaturalist 2018. Suppose you have downloaded them at /downloaded/iNat18/.
2. Replace the data root directory in the files dataset_json/iNat18_train.json and dataset_json/iNat18_val.json. You can do this with any editor, or use the following commands.
```
# replace data root
python tools/replace_path.py --json_file dataset_json/iNat18_train.json --find_root /media/ssd1/lijun/inaturalist2018/train_val2018 --replaces_to /downloaded/iNat18

python tools/replace_path.py --json_file dataset_json/iNat18_val.json --find_root /media/ssd1/lijun/inaturalist2018/train_val2018 --replaces_to /downloaded/iNat18
```

Usage

First, prepare the dataset and modify the relevant paths in configs/FCC/xxx.yaml

Parallel training with DataParallel

1. Train
# Train long-tailed CIFAR-100 with an imbalance factor of 100.
# In run.sh, `GPUs` lists the GPUs you want to use, such as `0` or `0,1,2,3`.
bash run.sh

2. Train other methods with FCC
# Just change the config path "configs/xxx.yaml" in run.sh.

Citation

@InProceedings{Li_2023_CVPR,
    author    = {Li, Jian and Meng, Ziyao and Shi, Daqian and Song, Rui and Diao, Xiaolei and Wang, Jingwen and Xu, Hao},
    title     = {FCC: Feature Clusters Compression for Long-Tailed Visual Recognition},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {24080-24089}
}

Acknowledgements

This project is based on Bag of Tricks (https://github.com/zhangyongshun/BagofTricks-LT).
