Intergrating View Conditions for Image Synthesis

This is the official implementation of the paper "Intergrating View Conditions for Image Synthesis". [Paper]

Pipeline

Installation

First, clone the repository locally:

git clone https://github.com/huggingface/diffusers.git 
git clone https://github.com/luca-medeiros/lang-segment-anything.git 
git clone https://github.com/IDEA-Research/GroundingDINO.git

Then, create a conda environment and install the required packages:

conda create -n view_cond python=3.10
conda activate view_cond

cd diffusers
pip install -e .

cd ../lang-segment-anything
pip install torch torchvision
pip install -e .

cd ../GroundingDINO
pip install -e .
mkdir weights
cd weights
wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
cd ..

cd ..

pip install -r requirements.txt
cd pose_synthesis
python download_ckpt.py

pip install --upgrade torchaudio
cd ..

Training

If you want to train your own pose estimator, you can use the following command:

python train_pose_estimator.py --dataset_path <path_to_dataset> --output_dir <path_to_output_dir>

You may need to adjust the hyperparameters (learning rate, batch size, etc.) in the script to get the best performance.

Your dataset should be organized as follows:

dataset
├── class_1
│   ├── obj_1
│   │   ├── x1_y1.png
│   │   ├── x2_y2.png
│   │   ├── ...
│   │   └── xN_yN.png
│   ├── obj_2
│   │   ├── x1_y1.png
│   │   ├── x2_y2.png
│   │   ├── ...
│   │   └── xN_yN.png
│   ├── ...
│   └── obj_N
│       ├── ...
|── class_N
│   ├── ...

where x1_y1.png is the image of obj_1 at pose (x1, y1), and class_1 is the class name of obj_1. The dataset can be synthetic or real. If you want to synthesize your own dataset, first prepare a set of images of the object for the same class with the same pose, then use the pose_synthesis module to synthesize the images of the object at different poses.

Inference

Our pipeline consists of three steps: pose estimation, pose synthesis and image synthesis.

Pose Estimation

To estimate the pose of a given image, you can use the following command:

python pose_estimation.py --image_path <path_to_image> --output_dir <path_to_output_dir> --model_path <path_to_model>

Pose Synthesis

To synthesize an image of one object from a given pose, you can use the following command:

cd pose_synthesis
python pose_synthesis_batch.py --input_dir input_dir --output_dir output_dir --x x_value --y y_value
cd ..

for example:

cd pose_synthesis
python pose_synthesis_batch.py --input_dir input_dir --output_dir output_dir --x 0 --y 0
cd ..

To synthesis a set of images of one object from a given set of poses, you can use the following command:

cd pose_synthesis
python pose_synthesis_batch.py --input_dir input_dir --output_dir output_dir --x x_values --y y_values
cd ..

for example:

cd pose_synthesis
python pose_synthesis_batch.py --input_dir input_dir --output_dir output_dir --x 0,10 --y 0,-10
cd ..

Image Synthesis

To synthesize an image, you can use the following command:

python image_synthesis.py --path_src_img <path_to_src_img> --path_ref_img <path_to_ref_img> --text_prompt <text_prompt> --save_path <save_path> --mask_obj_name <mask_obj_name> --ref_obj_name <ref_obj_name>

If you need faster inference, you can set the dreambooth option to False or pre train it or change it to another lightweight personalization method like LoRA.

Others

If you need to obtain a more accurate caption or class name from an image, you can use the following command:

python obj_name_synthesis.py --path_src_img <path_to_src_img> --save_path <save_path>

If you need to remove the background of an image, you can use the following command:

python utils.py --input_path <input_path> --prompt <prompt> --output_path <output_path>

Examples

Here are some examples of our method:

Citation

If you find this work useful, please cite our paper:

@misc{bai2023integrating,
      title={Integrating View Conditions for Image Synthesis}, 
      author={Jinbin Bai and Zhen Dong and Aosong Feng and Xiao Zhang and Tian Ye and Kaicheng Zhou and Mike Zheng Shou},
      year={2023},
      eprint={2310.16002},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commits
imgs		imgs
.DS_Store		.DS_Store
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

imgs

imgs

.DS_Store

.DS_Store

README.md

README.md

Repository files navigation

Intergrating View Conditions for Image Synthesis

Pipeline

Installation

Training

Inference

Pose Estimation

Pose Synthesis

Image Synthesis

Others

Examples

Citation

About

Releases

Packages

viiika/ViewControl

Folders and files

Latest commit

History

Repository files navigation

Intergrating View Conditions for Image Synthesis

Pipeline

Installation

Training

Inference

Pose Estimation

Pose Synthesis

Image Synthesis

Others

Examples

Citation

About

Resources

Stars

Watchers

Forks