DatasetDM (NeurIPS2023)

Official code for 'DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models'

Project Website | Paper

🔥 News

[2023.9.24] The weights for P-Decoder on VOC2012 and COCO2017 are released.
[2023.9.22] The paper was accepted by NeurIPS 2023.
[note] We will release the code within three months. Please wait.
[2023.8.11] We initialized the repo.

🖌️ DEMO

ToDo

🎶 Introduction

📑 Supported Task

💡 Demo

To demonstrate the quality of the synthetic data, we visualize samples from two domains: human-centric and urban city scenes:

A large language model, GPT-4, is used to increase the diversity of the generated data:

💡 Todo

Hugging Face Demo
...

🛠️ Getting Started

Installation

conda create -n DatasetDM python=3.8

Install torch==1.9.1 built for your CUDA version (see the PyTorch installation instructions). For example:

pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html

Then install other packages:

python -m pip install -r requirements.txt
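After installation, it can help to confirm that the pinned versions are the ones actually present in the environment. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_pins` are hypothetical, and the pins are simply copied from the pip command above. Local build tags such as `+cu111` are ignored when comparing.

```python
from importlib import metadata

# Pins copied from the pip command above (assumption: these are the versions you want).
EXPECTED = {"torch": "1.9.1", "torchvision": "0.10.1", "torchaudio": "0.9.1"}


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(expected=EXPECTED):
    """Map each package name to (found_version, matches_pin)."""
    report = {}
    for name, pin in expected.items():
        found = installed_version(name)
        # Drop a local build tag such as "+cu111" before comparing.
        base = found.split("+")[0] if found else None
        report[name] = (found, base == pin)
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_pins().items():
        print(f"{name}: found={found} pinned={EXPECTED[name]} ok={ok}")
```

Running the file inside the DatasetDM conda environment prints one line per package; a missing package is reported with `found=None`.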

Download the weights and configuration files of SD 1.4 and place them in the ./dataset/ckpts directory.

Alternatively, the SD weights used in our experiments (around 4.5 GB) are available on Google Drive:

https://drive.google.com/file/d/12lrOexljsyvFB30-ltbYXnIpQ8oP4lrW/view?usp=sharing

Download the diffusers

cd model
git clone https://github.com/huggingface/diffusers.git

Newer diffusers versions may cause errors (such as #11). We recommend diffusers 0.3.0 (https://pypi.org/project/diffusers/0.3.0/#files).

Alternatively, you can use our copy of diffusers directly, since it contains some modifications in ./model/diffusers/models/unet_blocks.py.

Dataset Preparation

Depth Estimation: Please follow MED to prepare the datasets in ./data
Segmentation (VOC, Cityscapes, and COCO): Please follow Mask2former to prepare the datasets in ./data

The final dataset layout should be as follows:

data/
    PascalVOC12/
        JPEGImages
        SegmentationClassAug
        splits/
            train_aug.txt
    COCO2017/
        train2017/
            2011_003261.jpg
            ...
        annotations/
            instances_train2017.json
            person_keypoints_train2017.json
    VirtualKITTI2/
        Depth/
            Scene01
            Scene02
            ...
        Image/
            Scene01
            Scene02
            ...
    nyudepthv2/
        sync/
        official_splits/
            test/
        nyu_class_list.json
        train_list.txt
        test_list.txt
    kitti/
        input/
        gt_depth/
        kitti_eigen_train.txt
    deepfashion-mm/
        images/
        segm/
        captions.json
        train_set.txt
        test_set.txt
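Before launching a training script, a quick check that the expected folders and files exist can save a failed run. The sketch below is an illustration only, not part of the repository: `missing_paths` is a hypothetical helper, and the path list is copied from the layout above, so trim it to the datasets you actually use.

```python
from pathlib import Path

# Relative paths copied from the layout above; remove entries for datasets you do not use.
EXPECTED_PATHS = [
    "PascalVOC12/JPEGImages",
    "PascalVOC12/SegmentationClassAug",
    "PascalVOC12/splits/train_aug.txt",
    "COCO2017/train2017",
    "COCO2017/annotations/instances_train2017.json",
    "COCO2017/annotations/person_keypoints_train2017.json",
    "VirtualKITTI2/Depth",
    "VirtualKITTI2/Image",
    "nyudepthv2/sync",
    "nyudepthv2/train_list.txt",
    "kitti/input",
    "kitti/gt_depth",
    "kitti/kitti_eigen_train.txt",
    "deepfashion-mm/images",
    "deepfashion-mm/segm",
]


def missing_paths(root, expected=EXPECTED_PATHS):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_paths("./data"):
        print("missing:", rel)
```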

You also need to arrange the prompt .txt files as follows:

dataset/
	Prompts_From_GPT/
		deepfashion_mm/
			general.txt
		coco_pose/
			general.txt
		KITTI/
			general.txt
		NYU/
			general.txt
		coco/
			toothbrush.txt
			hair drier.txt
			book.txt
			...
		cityscapes/
			bicycle.txt
			motorcycle.txt
			bus.txt
			...
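The per-class prompt folders (coco/, cityscapes/) use one file per class name, so the class list can be recovered from the file names. The following sketch shows one way to enumerate them. It is not the repository's loader: `prompt_files` and `read_prompts` are hypothetical helpers, and the one-prompt-per-line file format is an assumption.

```python
from pathlib import Path


def prompt_files(prompt_root, dataset):
    """Map each prompt file's stem (e.g. 'hair drier') to its path."""
    folder = Path(prompt_root) / dataset
    return {p.stem: p for p in sorted(folder.glob("*.txt"))}


def read_prompts(path):
    """Read prompts from a file. Assumption: one prompt per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


if __name__ == "__main__":
    files = prompt_files("./dataset/Prompts_From_GPT", "coco")
    for class_name, path in files.items():
        print(class_name, len(read_prompts(path)))
```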

Semantic Segmentation

VOC 2012

Train the P-Decoder, or use the weights we provide, which were trained with 100 real images.

# For Segmentation Tasks
sh ./script/train_semantic_VOC.sh

# Generate synthetic data for VOC
sh ./script/data_generation_VOC_semantic.sh

# Visualization of generative data
python ./DataDiffusion/vis_VOC.py
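The three VOC steps above run in sequence: train, generate, visualize. As a minimal sketch, they could be chained from Python so that a failure stops the pipeline early. The `run_steps` helper is hypothetical and not part of the repository; the commands are copied from the steps above.

```python
import subprocess

# Commands copied from the VOC steps above, in the order they are listed.
VOC_STEPS = [
    ["sh", "./script/train_semantic_VOC.sh"],
    ["sh", "./script/data_generation_VOC_semantic.sh"],
    ["python", "./DataDiffusion/vis_VOC.py"],
]


def run_steps(steps, dry_run=True):
    """Run each command in order and return the command lines.

    With dry_run=True nothing is executed. Otherwise check=True makes
    subprocess raise CalledProcessError at the first failing step.
    """
    lines = []
    for argv in steps:
        lines.append(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
    return lines


if __name__ == "__main__":
    for line in run_steps(VOC_STEPS, dry_run=True):
        print(line)
```

Call `run_steps(VOC_STEPS, dry_run=False)` from the repository root to execute the steps for real.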

Cityscapes

# For Segmentation Tasks
sh ./script/train_semantic_Cityscapes.sh

# Generate synthetic data for Cityscapes
sh ./script/data_generation_Cityscapes_semantic.sh

Before training an existing segmentation model, apply the augmentation:

sh ./script/augmentation_Cityscapes.sh

# Visualization of generative data
python ./DataDiffusion/vis_Cityscapes.py

Instance Segmentation

COCO 2017

Train the P-Decoder, or use the weights we provide, which were trained with 400 real images.

# For Segmentation Tasks
sh ./script/train_COCO.sh

# Generate synthetic data for COCO
sh ./script/data_generation_coco_instance.sh

# Visualization of generative data
python ./DataDiffusion/vis_COCO.py

Data Augmentation with image splicing

# Augmentation of generative data
sh ./script/augmentation_coco.sh

Then train Mask2former on the synthetic data. Enjoy!
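To give a rough idea of what image splicing can look like, the sketch below stitches four samples into a 2x2 mosaic and applies the same layout to their per-pixel annotation maps so that labels stay aligned with pixels. This is an assumption about the general technique, not the logic of augmentation_coco.sh: the function names are hypothetical, and plain nested lists stand in for image arrays to keep the example dependency-free.

```python
def splice_2x2(tiles):
    """Stitch four equally sized 2D grids (lists of rows) into one 2x2 mosaic.

    tiles is ordered [top_left, top_right, bottom_left, bottom_right].
    """
    tl, tr, bl, br = tiles
    heights = {len(t) for t in tiles}
    widths = {len(row) for t in tiles for row in t}
    if len(heights) != 1 or len(widths) != 1:
        raise ValueError("all tiles must share the same height and width")
    # Concatenate rows side by side, then stack the two halves vertically.
    top = [a + b for a, b in zip(tl, tr)]
    bottom = [a + b for a, b in zip(bl, br)]
    return top + bottom


def splice_samples(samples):
    """Splice four (image, annotation) pairs with the same layout for both."""
    image = splice_2x2([img for img, _ in samples])
    annotation = splice_2x2([ann for _, ann in samples])
    return image, annotation
```

Instance annotations would additionally need unique instance IDs per tile, which this sketch does not handle.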

Depth Estimation

KITTI

# Training Depth Estimation Tasks on KITTI
sh ./script/train_depth_KITTI.sh

To train with Virtual KITTI 2 instead, use the script below:

# Training Depth Estimation Tasks on Virtual KITTI 2
sh ./script/train_depth_Virtual_KITTI_2.sh

# Generate synthetic data for KITTI
sh ./script/data_generation_KITTI_depth.sh

Then train any existing depth estimation method on the synthetic data. Enjoy!

In our paper, we adopt Depthformer to validate the quality of the generated data.

NYU-Depth-V2

# For Depth Estimation Tasks
sh ./script/train_depth_NYU.sh

# Generate synthetic data for NYU
sh ./script/data_generation_NYU_depth.sh

Data Augmentation with image splicing

# Augmentation of generative data
sh ./script/augmentation_NYU.sh

Then train any existing depth estimation method on the synthetic data. Enjoy!

In our paper, we adopt Depthformer to validate the quality of the generated data.

Open Pose

COCO 2017

# Training Pose Estimation Tasks on COCO2017
sh ./script/train_pose_coco.sh

# Generate synthetic data for Pose on COCO
sh ./script/data_generation_COCO_Pose.sh

Then convert the data to COCO format and train any existing pose estimation method on it. Here, we adopt SimCC to validate the quality of the generated data.
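For the COCO conversion step, the target is the standard COCO keypoint annotation record, where keypoints are stored as a flat `[x1, y1, v1, x2, y2, v2, ...]` list and `num_keypoints` counts the labeled ones (`v > 0`). The sketch below builds one such record. The input format (a list of `(x, y, v)` triples) and the function name are assumptions, since the layout of the generated pose data is not described here, and the bounding box is simply a tight box around the labeled keypoints.

```python
def to_coco_keypoint_annotation(ann_id, image_id, keypoints, category_id=1):
    """Build one COCO-style keypoint annotation.

    keypoints: list of (x, y, v) triples, where v is the COCO visibility
    flag (0 = not labeled, 1 = labeled but not visible, 2 = labeled and visible).
    category_id=1 is the person category in COCO.
    """
    flat = []
    for x, y, v in keypoints:
        flat.extend([float(x), float(y), int(v)])

    labeled = [(x, y) for x, y, v in keypoints if v > 0]
    if labeled:
        xs = [p[0] for p in labeled]
        ys = [p[1] for p in labeled]
        x0, y0 = min(xs), min(ys)
        # Assumption: tight box around labeled keypoints, in [x, y, width, height] order.
        bbox = [float(x0), float(y0), float(max(xs) - x0), float(max(ys) - y0)]
    else:
        bbox = [0.0, 0.0, 0.0, 0.0]

    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "keypoints": flat,
        "num_keypoints": len(labeled),
        "bbox": bbox,
        "area": bbox[2] * bbox[3],
        "iscrowd": 0,
    }
```

A full COCO file wraps such records in a dictionary with `images`, `annotations`, and `categories` lists before it is written out as JSON.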

Zero-Shot Semantic Segmentation

PASCAL VOC 2012

Download VOC 2012 and organize the dataset.

# For Zero Shot Segmentation Tasks
sh ./script/train_semantic_VOC_zero_shot.sh

# Generate synthetic data for VOC
sh ./script/data_generation_VOC_semantic.sh

Data Augmentation with image splicing

# Augmentation of generative data
sh ./script/augmentation_VOC.sh

Then train Mask2former on the synthetic data. Enjoy!

Fashion-Segmentation

DeepFashion-MM

Download DeepFashion-MM and organize the dataset.

# Train DeepFashion Segmentation Tasks
sh ./script/train_semantic_DeepFashion_MM.sh

# Generate synthetic data for DeepFashion-MM
python ./script/parallel_generate_Semantic_DeepFashion.py

Then train Mask2former or another segmentation method (for example, one from mmsegmentation) on the synthetic data. Enjoy!

Long-tail-Segmentation (VOC)

# For LongTail semantic segmentation
sh ./script/train_semantic_VOC_LongTail.sh

# Generate synthetic data for VOC
sh ./script/data_generation_VOC_semantic.sh

Data Augmentation with image splicing

# Augmentation of generative data
sh ./script/augmentation_VOC.sh

Long-tail-Segmentation (LVIS)

# For long-tail instance segmentation
sh ./script/train_instance_LVIS.sh

# Generate synthetic data for LVIS
sh ./script/data_generation_LVIS_instance.sh

Acknowledgements

This work draws on the following codebases. We thank their authors for these contributions:

Citation

@article{wu2023datasetdm,
  title={DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models},
  author={Wu, Weijia and Zhao, Yuzhong and Chen, Hao and Gu, Yuchao and Zhao, Rui and He, Yefei and Zhou, Hong and Shou, Mike Zheng and Shen, Chunhua},
  journal={Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)},
  year={2023}
}
