More Control for Free! Image Synthesis with Semantic Diffusion Guidance

This is the codebase for More Control for Free! Image Synthesis with Semantic Diffusion Guidance.

This repository is based on openai/guided-diffusion, with modifications for semantic guidance.

Installation

git clone https://github.com/xh-liu/SDG_code
cd SDG_code
pip install -r requirements.txt
pip install -e .

Download pre-trained models

The pretrained unconditional diffusion models are from openai/guided-diffusion and jychoi118/ilvr_adm.

LSUN bedroom unconditional diffusion: lsun_bedroom.pt
LSUN cat unconditional diffusion: lsun_cat.pt
LSUN horse unconditional diffusion: lsun_horse.pt
LSUN horse (no dropout): lsun_horse_nodropout.pt
FFHQ unconditional diffusion: ffhq.pt

We fine-tune the CLIP image encoders on noisy images for semantic guidance and provide the checkpoints below:

FFHQ semantic guidance: clip_ffhq.pt
LSUN bedroom semantic guidance: clip_bedroom.pt
LSUN cat semantic guidance: clip_cat.pt
LSUN horse semantic guidance: clip_horse.pt

Sampling with semantic diffusion guidance

To sample from these models, you can use scripts/sample.py. Here, we provide flags for sampling from all of these models. We assume that you have downloaded the relevant model checkpoints into a folder called models/.
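Because every command below reads checkpoints from models/, it can help to confirm the files are in place before sampling. The following is a minimal sketch: check_checkpoints is a hypothetical helper, not part of this repository, and the file names are copied from the flags used in this README, so adjust them if your downloaded files are named differently (file names are case-sensitive on Linux).

```shell
# Hypothetical helper (not part of this repo): report which expected
# checkpoint files are missing from a directory.
# Usage: check_checkpoints DIR FILE [FILE ...]
check_checkpoints() {
  dir="$1"; shift
  missing=0
  for f in "$@"; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  # Returns 0 when all files exist, 1 otherwise.
  return "$missing"
}

# Example: check the bedroom diffusion model and its CLIP guidance checkpoint.
# The names are assumed from the flags in this README.
check_checkpoints models lsun_bedroom.pt CLIP_bedroom.pt \
  || echo "Download the missing files before sampling."
```

The function only tests for file existence; it does not verify that a checkpoint matches the model flags you pass to scripts/sample.py.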

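For LSUN cat, LSUN horse, and LSUN bedroom, the model flags are defined as follows (the example uses the bedroom checkpoint; point --model_path at the cat or horse checkpoint as needed):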

MODEL_FLAGS="--attention_resolutions 32,16,8 --class_cond False --diffusion_steps 1000 --dropout 0.1 --image_size 256 --learn_sigma True --noise_schedule linear --num_channels 256 --num_head_channels 64 --num_res_blocks 2 --resblock_updown True --use_fp16 False --use_scale_shift_norm True --model_path models/lsun_bedroom.pt"

For the FFHQ dataset, the model flags are defined as:

MODEL_FLAGS="--attention_resolutions 16 --class_cond False --diffusion_steps 1000 --dropout 0.0 --image_size 256 --learn_sigma True --noise_schedule linear --num_channels 128 --num_head_channels 64 --num_res_blocks 1 --resblock_updown True --use_fp16 False --use_scale_shift_norm True --model_path models/ffhq_10m.pt"

Sampling flags:

SAMPLE_FLAGS="--batch_size 8 --timestep_respacing 100"

Sampling with image content (semantics) guidance:

GUIDANCE_FLAGS="--data_dir ref/ref_bedroom --text_weight 0 --image_weight 100 --image_loss semantic --clip_path models/CLIP_bedroom.pt"
CUDA_VISIBLE_DEVICES=0 python -u scripts/sample.py --exp_name bedroom_image_guidance --single_gpu $MODEL_FLAGS $SAMPLE_FLAGS $GUIDANCE_FLAGS

Sampling with image style guidance:

GUIDANCE_FLAGS="--data_dir ref/ref_bedroom --text_weight 0 --image_weight 100 --image_loss style --clip_path models/CLIP_bedroom.pt"
CUDA_VISIBLE_DEVICES=0 python -u scripts/sample.py --exp_name bedroom_image_style_guidance --single_gpu $MODEL_FLAGS $SAMPLE_FLAGS $GUIDANCE_FLAGS

Sampling with language guidance:

GUIDANCE_FLAGS="--data_dir ref/ref_bedroom --text_weight 160 --image_weight 0 --text_instruction_file ref/bedroom_instructions.txt --clip_path models/CLIP_bedroom.pt"
CUDA_VISIBLE_DEVICES=0 python -u scripts/sample.py --exp_name bedroom_language_guidance --single_gpu $MODEL_FLAGS $SAMPLE_FLAGS $GUIDANCE_FLAGS

Sampling with both language and image guidance:

GUIDANCE_FLAGS="--data_dir ref/ref_bedroom --text_weight 160 --image_weight 100 --image_loss semantic --text_instruction_file ref/bedroom_instructions.txt --clip_path models/CLIP_bedroom.pt"
CUDA_VISIBLE_DEVICES=0 python -u scripts/sample.py --exp_name bedroom_image_language_guidance --single_gpu $MODEL_FLAGS $SAMPLE_FLAGS $GUIDANCE_FLAGS

You may need to adjust the text_weight and image_weight for better visual quality of generated samples.
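One way to tune these weights is to run the same command over a few candidate values and compare the outputs. The sketch below is a dry run that only prints the commands. sweep_text_weight is a hypothetical helper, the weight values and experiment names are illustrative rather than recommended settings, and MODEL_FLAGS and SAMPLE_FLAGS are assumed to be set as shown above.

```shell
# Hypothetical helper (not part of this repo): print one language-guidance
# sampling command per text_weight value, leaving image_weight at 0.
# Usage: sweep_text_weight WEIGHT [WEIGHT ...]
sweep_text_weight() {
  for tw in "$@"; do
    GUIDANCE_FLAGS="--data_dir ref/ref_bedroom --text_weight $tw --image_weight 0 --text_instruction_file ref/bedroom_instructions.txt --clip_path models/CLIP_bedroom.pt"
    # Dry run: echo the command instead of executing it.
    echo "python -u scripts/sample.py --exp_name bedroom_language_tw${tw} --single_gpu ${MODEL_FLAGS:-} ${SAMPLE_FLAGS:-} $GUIDANCE_FLAGS"
  done
}

# Illustrative values around the default of 160 used above.
sweep_text_weight 80 160 320
```

To run the sweep for real, execute each printed line (for example by piping the output to sh) after checking that the commands look right. The same pattern applies to --image_weight.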

Citation

If you find our work useful for your research, please cite our paper.

@inproceedings{liu2023more,
  title={More control for free! image synthesis with semantic diffusion guidance},
  author={Liu, Xihui and Park, Dong Huk and Azadi, Samaneh and Zhang, Gong and Chopikyan, Arman and Hu, Yuxiao and Shi, Humphrey and Rohrbach, Anna and Darrell, Trevor},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  year={2023}
}
