[CVPR2025] Fractal Calibration for long-tailed object detection


[Figure: overview of the fractal calibration method]

Abstract:

Real-world datasets follow an imbalanced distribution, which poses significant challenges to rare-category object detection. Recent studies tackle this problem with re-weighting and re-sampling methods that utilise the class frequencies of the dataset. However, these techniques focus solely on frequency statistics and ignore the distribution of the classes in image space, missing important information. In contrast, we propose FRActal CALibration (FRACAL): a novel post-calibration method for long-tailed object detection. FRACAL devises a logit-adjustment method that utilises the fractal dimension to estimate how uniformly classes are distributed in image space. During inference, it uses the fractal dimension to inversely down-weight the probabilities of uniformly spaced class predictions, achieving balance along two axes: between frequent and rare categories, and between uniformly spaced and sparsely spaced classes. FRACAL is a post-processing method: it requires no training and can be combined with many off-the-shelf models, such as one-stage sigmoid detectors and two-stage instance segmentation models. FRACAL boosts rare-class performance by up to 8.6% and surpasses all previous methods on the LVIS dataset, while generalising well to other datasets such as COCO, V3Det and OpenImages.
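The calibration rule described in the abstract can be sketched as follows. This is a minimal illustration only: the exact weighting form and the `alpha`/`beta` strengths are assumptions for the sketch, not the paper's formulation.

```python
import numpy as np

def fracal_calibrate(probs, freq, fractal_dim, alpha=1.0, beta=1.0):
    """Sketch of inverse down-weighting: classes that are frequent and
    uniformly spread in image space (high fractal dimension) are
    suppressed, shifting probability mass towards rare, sparsely
    spread classes.

    probs:       (N, C) predicted class probabilities
    freq:        (C,)  per-class frequency (e.g. instance frequency)
    fractal_dim: (C,)  per-class fractal dimension estimate
    alpha, beta: illustrative strengths of the two balancing axes
    """
    # inverse weights along both axes: frequency and spatial uniformity
    w = (freq ** -alpha) * (fractal_dim ** -beta)
    calibrated = probs * w
    # renormalise so each row is a probability distribution again
    return calibrated / calibrated.sum(axis=-1, keepdims=True)
```

Being post-hoc, such a rule touches only the predicted probabilities at inference time, which is why it composes with off-the-shelf detectors without retraining.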

Progress

  • Training code.
  • Evaluation code.
  • Provide instance segmentation checkpoint models.

Getting Started

  1. Create a virtual environment:

conda create --name fracal python=3.11 -y
conda activate fracal

  2. Install dependency packages:

conda install pytorch torchvision -c pytorch

  3. Install MMDetection:

pip install -U openmim
mim install mmengine
mim install "mmcv==2.1.0"
git clone https://github.com/kostas1515/FRACAL.git

  4. Create a data directory, download the COCO 2017 datasets from https://cocodataset.org/#download (2017 Train images [118K/18GB], 2017 Val images [5K/1GB], 2017 Train/Val annotations [241MB]) and extract the zip files:

mkdir data
cd data
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip

# download and unzip the LVIS annotations
wget https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_train.json.zip
wget https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_val.json.zip

  5. Modify mmdetection/configs/_base_/datasets/lvis_v1_instance.py and make sure the data_root variable points to the data directory above, e.g. data_root = '<user_path>'.

Training

Train a baseline model on multiple GPUs using tools/dist_train.sh, e.g.:
./tools/dist_train.sh ./configs/<folder>/<model.py> <#GPUs>

Inference with Baseline Model

To test the MaskRCNN ResNet50 RFS with Normalised Mask and Carafe on 8 GPUs, run:
./tools/dist_test.sh ./experiments/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe.py ./experiments/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe/epoch_24.pth 8

Inference with FRACAL

To test the FRACAL-MaskRCNN ResNet50 RFS with Normalised Mask and Carafe on 8 GPUs, run:

./tools/dist_test.sh ./experiments/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe/fracal_r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe.py ./experiments/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe/epoch_24.pth 8

Optional - Get Dataset statistics

We already provide the frequency and fractal dimension measures for the LVIS v1 train set in the stat_files folder. To reproduce the fractal dimension measures:

  1. Run get_statistics.py inside the ./stat_files/ folder:

python get_statistics.py --dset_name lvis --path ../../../datasets/coco/annotations/lvis_v1_train.json --output ./lvis_v1_train_stats.csv

This creates a CSV containing various bounding-box statistics, such as the class, width, height, location, etc.

  2. Compute the fractal dimension based on those statistics:

python calculate_fractality.py --dset_name lvisv1 --path ./lvis_v1_train_stats.csv --output lvis_v1_train_fractal_dim.csv
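At its core, a fractal dimension over object locations is a box-counting estimate. The sketch below is a simplified illustration, assuming object-centre coordinates normalised to [0, 1]²; calculate_fractality.py may differ in its exact scales and fitting details.

```python
import numpy as np

def box_counting_dim(points, scales=(2, 4, 8, 16)):
    """Estimate the box-counting (fractal) dimension of a 2-D point set.

    points: (N, 2) array of object centres, normalised to [0, 1].
    Counts occupied grid cells N(s) on an s x s grid at several scales
    s, then fits log N(s) = D * log s + c; the slope D is the estimate.
    """
    counts = []
    for s in scales:
        # map each point to its grid cell index at this scale
        cells = np.clip((points * s).astype(int), 0, s - 1)
        counts.append(len({tuple(c) for c in cells}))
    # linear fit in log-log space; the slope is the dimension
    slope, _ = np.polyfit(np.log(scales), np.log(counts), 1)
    return slope
```

Intuitively, a class whose instances fill the image plane uniformly approaches dimension 2, while a class confined to a line-like region (e.g. a horizon band) is closer to 1; this is the uniformity signal FRACAL inverts at inference time.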

To generate the frequency weights, run:

python get_frequency.py --path ../../../datasets/coco/annotations/lvis_v1_train.json --output freq_lvis_v1_train.csv

This creates a CSV containing various frequency weights based on instance frequency or image frequency, using various link functions. The lvis_v1_train_fractal_dim.csv and freq_lvis_v1_train.csv files are used inside the mmdet/models/roi_heads/bbox_heads/fracal_bbox_head.py script.

The statistics scripts support the COCO, LVIS v1, LVIS v0.5, V3Det and OpenImages datasets.
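As a rough illustration of frequency-based weights with different link functions (the inverse and negative-log links here are assumptions for the sketch, not necessarily those implemented by get_frequency.py):

```python
import numpy as np
from collections import Counter

def frequency_weights(labels, link="inv"):
    """Per-class weights from instance frequency.

    labels: iterable of per-instance class ids.
    link:   illustrative link functions applied to the normalised
            instance frequency f: 'inv' -> 1/f, 'log' -> -log f.
    Returns {class_id: weight}; rarer classes get larger weights.
    """
    counts = Counter(labels)
    classes = sorted(counts)
    f = np.array([counts[c] for c in classes], dtype=float)
    f /= f.sum()  # normalise counts to frequencies
    if link == "inv":
        w = 1.0 / f
    elif link == "log":
        w = -np.log(f)
    else:
        raise ValueError(f"unknown link: {link}")
    return dict(zip(classes, w))
```

The choice of link function controls how aggressively rare classes are boosted: an inverse link grows much faster than a logarithmic one as frequency shrinks.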

Pretrained Models on LVIS

| Method                 | AP   | APr  | APc  | APf  | APb  | Model   |
|------------------------|------|------|------|------|------|---------|
| FRACAL-MaskRCNN-R50    | 28.5 | 23.0 | 28.1 | 31.5 | 28.4 | weights |
| FRACAL-MaskRCNN-R101   | 29.9 | 24.6 | 29.3 | 32.8 | 29.8 | weights |
| FRACAL-MaskRCNN-Swin-B | 38.5 | 35.5 | 39.5 | 38.7 | 39.4 | weights |

BibTeX

@article{alexandridis2024fractal,
  title={Fractal Calibration for long-tailed object detection},
  author={Alexandridis, Konstantinos Panagiotis and Elezi, Ismail and Deng, Jiankang and Nguyen, Anh and Luo, Shan},
  journal={arXiv preprint arXiv:2410.11774},
  year={2024}
}

Acknowledgements

This code uses PyTorch and the mmdet framework. Thank you for your wonderful work!
