The paper introduces a novel unsupervised object localization method that leverages self-supervised pre-trained models without any additional finetuning. Traditional methods often rely on class-agnostic activation maps or self-similarity maps from a pre-trained model, but such maps offer limited insight into the model's predictions. This work instead proposes an unsupervised object localization technique based on representer point selection: the model's predictions are expressed as a linear combination of representer values of training points. By selecting these representer points — the training examples most influential for a given prediction — the method can explain its predictions by showcasing the relevant examples and their importance. The proposed method surpasses state-of-the-art unsupervised and self-supervised object localization techniques on various datasets, and even outperforms some recent weakly supervised and few-shot methods.
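To make the core idea concrete, here is a minimal, self-contained sketch of the representer decomposition for a kernel ridge regression model. This is *not* the paper's exact formulation (which operates on features from a frozen self-supervised backbone); all variable names and the choice of ridge regression are illustrative assumptions. The point is that the prediction for a test point decomposes exactly into per-training-point contributions, whose magnitudes identify the most influential examples.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 16
Phi = rng.normal(size=(n, d))   # training features (e.g. from a frozen pre-trained encoder)
y = rng.normal(size=n)          # training targets
lam = 1e-2                      # ridge regularization strength

# Dual (kernel) ridge solution: alpha_i is the representer value of training point i
K = Phi @ Phi.T
alpha = np.linalg.solve(K + lam * np.eye(n), y)

# Equivalent primal weights: w = Phi^T alpha
w = Phi.T @ alpha

x_test = rng.normal(size=d)

# The prediction decomposes into one contribution per training point:
#   f(x_test) = sum_i alpha_i * <phi_i, x_test>
contributions = alpha * (Phi @ x_test)
pred_representer = contributions.sum()
pred_primal = w @ x_test
assert np.isclose(pred_representer, pred_primal)

# Representer point selection: the training examples with the largest
# absolute contribution are the most influential for this prediction.
top_k = np.argsort(-np.abs(contributions))[:5]
```

Because the decomposition is exact (a consequence of the representer theorem for L2-regularized linear models), ranking training points by `|contributions|` gives a faithful, example-based explanation of the prediction rather than a post-hoc approximation.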
If you find this repository useful for your publications, please consider citing our paper.
@InProceedings{Song_2023_ICCV,
    author    = {Song, Yeonghwan and Jang, Seokwoo and Katabi, Dina and Son, Jeany},
    title     = {Unsupervised Object Localization with Representer Point Selection},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {6534-6544}
}
- pytorch >= 1.10.0
- torchvision >= 0.11.0
- efficientnet-pytorch >= 0.7.1
- tqdm >= 4.65.0
To evaluate our model on each dataset, download the images into your data root directory and structure the data directory as described in this repository.
- Stanford Cars: http://ai.stanford.edu/~jkrause/cars/car_dataset.html (currently not accessible)
- FGVC-Aircraft: https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/
- Stanford Dogs: http://vision.stanford.edu/aditya86/ImageNetDogs/
- Download pre-trained weights: Google Drive
- Run one of the following commands to reproduce our results:
python main.py --dataset EVALUATION_DATASET --loc_network EVALUATION_NETWORK --data_dir YOUR_DATAROOT
python main.py --dataset EVALUATION_DATASET --loc_network EVALUATION_NETWORK --data_dir YOUR_DATAROOT --image_size 480 --crop_size 448 --resnet_downscale 32
python main.py --dataset CUBSEG --loc_network EVALUATION_NETWORK --data_dir YOUR_DATAROOT --image_size 480 --crop_size 448 --resnet_downscale 32
Employing class-specific parameters: --classwise
Setting the sampling ratio: e.g. --sampling_ratio 0.1
Zero-shot transfer: e.g. --base_dataset CIFAR10
Evaluating with a classifier: e.g. --cls_network ResNet50