Skip to content

xh-liu/SDG_code

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

More Control for Free! Image Synthesis with Semantic Diffusion Guidance

This is the codebase for More Control for Free! Image Synthesis with Semantic Diffusion Guidance.

This repository is based on openai/guided-diffusion, with modifications for semantic guidance.

results

Installation

git clone https://github.com/xh-liu/SDG_code
cd SDG
pip install -r requirements.txt
pip install -e .

Download pre-trained models

The pretrained unconditional diffusion models are from openai/guided-diffusion and jychoi118/ilvr_adm.

We finetune the CLIP image encoders on noisy images for the semantic guidance. We provide the checkpoint as follows:

Sampling with semantic diffusion guidance

To sample from these models, you can use scripts/sample.py. Here, we provide flags for sampling from all of these models. We assume that you have downloaded the relevant model checkpoints into a folder called models/.

For LSUN cat, LSUN horse, and LSUN bedroom, the model flags are defined as:

MODEL_FLAGS="--attention_resolutions 32,16,8 --class_cond False --diffusion_steps 1000 --dropout 0.1 --image_size 256 --learn_sigma True --noise_schedule linear --num_channels 256 --num_head_channels 64 --num_res_blocks 2 --resblock_updown True --use_fp16 False --use_scale_shift_norm True --model_path models/lsun_bedroom.pt"

For FFHQ dataset, the model flags are defined as:

MODEL_FLAGS="--attention_resolutions 16 --class_cond False --diffusion_steps 1000 --dropout 0.0 --image_size 256 --learn_sigma True --noise_schedule linear --num_channels 128 --num_head_channels 64 --num_res_blocks 1 --resblock_updown True --use_fp16 False --use_scale_shift_norm True --model_path models/ffhq_10m.pt"

Sampling flags:

SAMPLE_FLAGS="--batch_size 8 --timestep_respacing 100"

Sampling with image content(semantics) guidance:

GUIDANCE_FLAGS="--data_dir ref/ref_bedroom --text_weight 0 --image_weight 100 --image_loss semantic --clip_path models/CLIP_bedroom.pt"
CUDA_VISIBLE_DEVICES=0 python -u scripts/sample.py --exp_name bedroom_image_guidance --single_gpu $MODEL_FLAGS $SAMPLE_FLAGS $GUIDANCE_FLAGS

Sampling with image style guidance:

GUIDANCE_FLAGS="--data_dir ref/ref_bedroom --text_weight 0 --image_weight 100 --image_loss style --clip_path models/CLIP_bedroom.pt"
CUDA_VISIBLE_DEVICES=0 python -u scripts/sample.py --exp_name bedroom_image_style_guidance --single_gpu $MODEL_FLAGS $SAMPLE_FLAGS $GUIDANCE_FLAGS

Sampling with language guidance:

GUIDANCE_FLAGS="--data_dir ref/ref_bedroom --text_weight 160 --image_weight 0 --text_instruction_file ref/bedroom_instructions.txt --clip_path models/CLIP_bedroom.pt"
CUDA_VISIBLE_DEVICES=0 python -u scripts/sample.py --exp_name bedroom_language_guidance --single_gpu $MODEL_FLAGS $SAMPLE_FLAGS $GUIDANCE_FLAGS

Sampling with both language and image guidance:

GUIDANCE_FLAGS="--data_dir ref/ref_bedroom --text_weight 160 --image_weight 100 --image_loss semantic --text_instruction_file ref/bedroom_instructions.txt --clip_path models/CLIP_bedroom.pt"
CUDA_VISIBLE_DEVICES=0 python -u scripts/sample.py --exp_name bedroom_image_language_guidance --single_gpu $MODEL_FLAGS $SAMPLE_FLAGS $GUIDANCE_FLAGS

You may need to adjust the text_weight and image_weight for better visual quality of generated samples.

Citation

If you find our work useful for your research, please cite our papers.

@inproceedings{liu2023more,
  title={More control for free! image synthesis with semantic diffusion guidance},
  author={Liu, Xihui and Park, Dong Huk and Azadi, Samaneh and Zhang, Gong and Chopikyan, Arman and Hu, Yuxiao and Shi, Humphrey and Rohrbach, Anna and Darrell, Trevor},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  year={2023}
}

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages