Open-vocabulary Object Segmentation with Diffusion Models

This repository contains the official PyTorch implementation of grounded diffusion: https://arxiv.org/abs/2301.05221.

Requirements

A suitable conda environment named grounded-diffusion can be created and activated with:

conda env create -f environment.yaml
conda activate grounded-diffusion

Model Zoo

Model checkpoints are available on Google Drive:

https://drive.google.com/drive/folders/1HlagN6jVhmC_UbrOAy133LkN4Qgf2Scv?usp=sharing

Train

Before training, please download the checkpoint of the off-the-shelf detector into a folder called mmdetection/checkpoint/.

python train.py --class_split 1 --train_data random --save_name pascal_1_random
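
Since training depends on the detector checkpoint being in place, a small pre-flight check can catch a missing download before a run starts. The following is a minimal sketch, not part of the repository: the helper name has_detector_checkpoint is hypothetical, and it assumes that any regular file inside mmdetection/checkpoint/ counts as the checkpoint, since the expected file name is not stated here.

```python
from pathlib import Path


def has_detector_checkpoint(checkpoint_dir="mmdetection/checkpoint"):
    """Return True if checkpoint_dir exists and holds at least one file.

    Hypothetical helper: the actual checkpoint file name is not specified
    in this README, so any regular file is accepted.
    """
    folder = Path(checkpoint_dir)
    if not folder.is_dir():
        return False
    return any(entry.is_file() for entry in folder.iterdir())


# Report the status for the default folder used in the instructions above.
if has_detector_checkpoint():
    print("Detector checkpoint found in mmdetection/checkpoint/")
else:
    print("No detector checkpoint found in mmdetection/checkpoint/")
```

Run this from the repository root so that the relative path resolves to the same folder train.py is expected to read from.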

Inference

python test.py --sd_ckpt 'xxx/stable_diffusion.ckpt' \
--grounding_ckpt 'xxx/grounding_module.pth' \
--prompt "a photo of a lion on a mountain top at sunset" \
--category "lion"
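
To run inference for several prompts from a script, the command above can be assembled programmatically. The following is a minimal sketch under stated assumptions: build_test_command is a hypothetical helper that is not part of the repository, it uses only the four flags shown above, and the xxx/ checkpoint paths are the same placeholders as in the command and must be replaced with real locations.

```python
import shlex


def build_test_command(sd_ckpt, grounding_ckpt, prompt, category):
    """Assemble the argument list for test.py from the flags shown above.

    Hypothetical helper: it only builds the command and does not run it.
    """
    return [
        "python", "test.py",
        "--sd_ckpt", sd_ckpt,
        "--grounding_ckpt", grounding_ckpt,
        "--prompt", prompt,
        "--category", category,
    ]


# Placeholder checkpoint paths, copied from the command above.
cmd = build_test_command(
    "xxx/stable_diffusion.ckpt",
    "xxx/grounding_module.pth",
    "a photo of a lion on a mountain top at sunset",
    "lion",
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))

# To execute it from the repository root, one option is:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a prompt contains spaces or quotes.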

Citation

If you use this code for your research or project, please cite:

@inproceedings{li2023grounded,
  title     = {Open-vocabulary Object Segmentation with Diffusion Models},
  author    = {Li, Ziyi and Zhou, Qinye and Zhang, Xiaoyun and Zhang, Ya and Wang, Yanfeng and Xie, Weidi},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year      = {2023}
}

Acknowledgements

Many thanks to the code bases of Stable Diffusion, CLIP, and taming-transformers.
