ImaginaryNet: Learning Object Detectors without Real Images and Annotations

This repository is for the ICLR 2023 paper: ImaginaryNet: Learning Object Detectors without Real Images and Annotations

If you use any source code or ideas from this repository in your work, please cite the following paper.

@article{ni2022imaginarynet,
  title={ImaginaryNet: Learning Object Detectors without Real Images and Annotations},
  author={Ni, Minheng and Huang, Zitong and Feng, Kailai and Zuo, Wangmeng},
  journal={arXiv preprint arXiv:2210.06886},
  year={2022}
}

If you have any questions, feel free to email me.

Abstract

Without the demand of training in reality, humans are able to detect a new category of object simply based on a language description of its visual characteristics. Empowering deep learning with this ability undoubtedly enables the neural network to handle complex vision tasks, e.g., object detection, without collecting and annotating real images. To this end, this paper introduces a novel and challenging learning paradigm, Imaginary-Supervised Object Detection (ISOD), where neither real images nor manual annotations are allowed for training object detectors. To resolve this challenge, we propose ImaginaryNet, a framework to synthesize images by combining a pretrained language model and a text-to-image synthesis model. Given a class label, the language model is used to generate a full description of a scene with a target object, and the text-to-image model is deployed to generate a photo-realistic image. With the synthesized images and class labels, weakly supervised object detection can then be leveraged to accomplish ISOD. By gradually introducing real images and manual annotations, ImaginaryNet can collaborate with other supervision settings to further boost detection performance. Experiments show that ImaginaryNet can (i) obtain about 75% performance in ISOD compared with the weakly supervised counterpart of the same backbone trained on real data, and (ii) significantly improve the baseline while achieving state-of-the-art or comparable performance by incorporating ImaginaryNet with other supervision settings.

Illustration of Framework

Preparation

Run the following commands to set up the environment.

conda env create -f environment.yaml

conda activate imaginarynet

pip install --upgrade jax==0.3.25 jaxlib==0.3.25+cuda11.cudnn82 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

conda install -c conda-forge cudatoolkit-dev

Pipeline Usage

This pipeline provides the core function of ImaginaryNet: generating images from class labels.
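The flow described in the abstract (class label, then a language-model scene description, then a text-to-image model, with optional CLIP filtering) can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `read_classes`, `generate_dataset`, and the three callables are hypothetical names, the one-label-per-line class file format is an assumption, and sampling labels uniformly at random is also an assumption.

```python
import random
from typing import Any, Callable, List, Optional, Tuple


def read_classes(text: str) -> List[str]:
    """Parse class labels, assuming one label per line (an assumption)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def generate_dataset(
    classes: List[str],
    num: int,
    extend_prompt: Callable[[str], str],   # stand-in for the language model
    synthesize: Callable[[str], Any],      # stand-in for the text-to-image model
    score: Callable[[Any, str], float],    # stand-in for the CLIP filter
    threshold: float,
    seed: int = 0,
    max_attempts: Optional[int] = None,
) -> List[Tuple[Any, str]]:
    """Collect up to `num` (image, class label) pairs that pass the filter."""
    rng = random.Random(seed)
    # Hypothetical safety cap so a too-strict threshold cannot loop forever.
    limit = max_attempts if max_attempts is not None else 10 * num
    samples: List[Tuple[Any, str]] = []
    attempts = 0
    while len(samples) < num and attempts < limit:
        attempts += 1
        label = rng.choice(classes)          # pick a target class
        prompt = extend_prompt(label)        # full scene description
        image = synthesize(prompt)           # synthesized image
        if score(image, label) >= threshold: # keep only accepted images
            samples.append((image, label))   # image-level label only, no boxes
    return samples


# Toy stand-ins so the sketch runs without any model downloads.
toy_classes = read_classes("cat\n\ndog\n")
toy_samples = generate_dataset(
    toy_classes,
    num=5,
    extend_prompt=lambda label: f"A photo of a {label} in a park.",
    synthesize=lambda prompt: prompt.upper(),  # a string plays the role of an image
    score=lambda image, label: 1.0 if label.upper() in image else 0.0,
    threshold=0.5,
)
```

Each returned pair carries only an image-level class label, which is the supervision a weakly supervised detector expects.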

Quick Start

python imaginarynet.py --num 10000 --classfile VOC.txt --gpt --clip --backend dalle-mini

Parameters Explanation

--seed Random seed.

--num Number of images to generate.

--classfile File listing the initial classes.

--outputdir Output directory.

--gpt Use GPT to extend the prompt.

--clip Use CLIP to filter the generated images.

--backend Image generation backend: dalle-mini or stablediffusion.

--cpu Run the CLIP filter on the CPU.

--threshold Minimum CLIP score for an image to be accepted.
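For readers who want to script around the pipeline, the options above could be mirrored with `argparse` along these lines. This is a sketch only: the flag names come from the list above, but the types, the default values, and the `build_parser` helper are assumptions and may differ from what `imaginarynet.py` actually does. The file name `my_classes.txt` is a placeholder.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags; defaults are assumed."""
    parser = argparse.ArgumentParser(description="Generate images from class labels.")
    parser.add_argument("--seed", type=int, default=0, help="Random seed.")
    parser.add_argument("--num", type=int, default=10000, help="Number of images to generate.")
    parser.add_argument("--classfile", type=str, required=True, help="File listing the initial classes.")
    parser.add_argument("--outputdir", type=str, default="output", help="Output directory.")
    # Boolean switches: present means enabled, absent means disabled.
    parser.add_argument("--gpt", action="store_true", help="Use GPT to extend the prompt.")
    parser.add_argument("--clip", action="store_true", help="Use CLIP to filter the generated images.")
    parser.add_argument("--backend", choices=["dalle-mini", "stablediffusion"], default="dalle-mini",
                        help="Image generation backend.")
    parser.add_argument("--cpu", action="store_true", help="Run the CLIP filter on the CPU.")
    parser.add_argument("--threshold", type=float, default=0.5,
                        help="Minimum CLIP score for an image to be accepted.")
    return parser


# Parse a command line shaped like the Quick Start invocation.
args = build_parser().parse_args(
    ["--num", "10000", "--classfile", "my_classes.txt", "--gpt", "--clip", "--backend", "dalle-mini"]
)
print(args.num, args.gpt, args.clip, args.cpu, args.backend)
```

Treating `--gpt`, `--clip`, and `--cpu` as switches matches how the Quick Start command passes them without values.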

Reproducibility

To support reproducibility, we provide the generated datasets, trained checkpoints, and training logs. Please note that re-running the pipeline may not reproduce exactly the same images, because the backends are updated over time and environments differ. We did not modify the code of the detection backbones; to train them, please refer to their original repositories. To access the original data or experiments, please download our archives.

Generated Images

Name	Download Link
10,000 Imaginary Data	Download

Saved Checkpoints and Logs

Imaginary-Supervised Object Detection (ISOD)

Backbone	Imaginary Data	mAP	Checkpoint	Log
OICR	5K Imaginary	35.43	Download	Download

Weakly-Supervised Object Detection (WSOD)

Backbone	Imaginary Data	mAP	Checkpoint	Log
WSDDN	5K Imaginary	39.90	Download	Download
OICR	5K Imaginary	51.39	Download	Download
W2N	5K Imaginary	65.05	Download	Download

Semi-Supervised Object Detection (SSOD)

Backbone	Real Data	Imaginary Data	mAP	Checkpoint	Log
Unbiased-Teacher	5K VOC2007	5K Imaginary	80.36	Download	Download
Unbiased-Teacher	5K VOC2007	10K Imaginary	80.60	Download	Download
Unbiased-Teacher	5K VOC2007 + 10K VOC2012 (un-labeled)	10K Imaginary	81.60	Download	Download

Acknowledgement

We are grateful to Yeli Shen for his contribution to the public code of ImaginaryNet.
