This is the repository for Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion Models, published at SIGGRAPH 2024.
[Project Page] [Paper]
Build the conda environment by running:

```
conda create -n sepen python=3.10
conda activate sepen
pip install -r requirement.txt
```
See src/run_individual.sh for a sample training script, and src/run_large.sh for a sample large-scale training script.
See src/sample.py for reference.
Install clean-fid via pip install clean-fid, then refer to src/eval/fid/eval_fid.py for FID evaluation.
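Under the hood, FID is the Fréchet distance between two Gaussians fitted to Inception features of the real and generated image sets; clean-fid handles the feature extraction, but the distance itself reduces to a short formula. A minimal numpy sketch (illustrative, not the repo's implementation):

```python
import numpy as np

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between Gaussians N(mu1, sigma1) and N(mu2, sigma2).

    FID = ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrtm(sigma1 @ sigma2))
    """
    diff = mu1 - mu2
    # Matrix square root of sigma1 @ sigma2 via eigendecomposition
    # (assumes the product is diagonalizable with non-negative eigenvalues,
    # which holds for the covariance products seen in practice).
    prod = sigma1 @ sigma2
    eigvals, eigvecs = np.linalg.eig(prod)
    covmean = (eigvecs * np.sqrt(np.maximum(eigvals.real, 0))) @ np.linalg.inv(eigvecs)
    return float((diff @ diff + np.trace(sigma1 + sigma2 - 2 * covmean)).real)
```

For example, two identical Gaussians give a distance of 0, and shifting the mean by a unit vector under identity covariances gives exactly 1.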
We adopt the implementation from Attend-and-Excite (A&E). See src/eval/blip/eval_blip.py for BLIP similarity score evaluation.
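The BLIP similarity score boils down to the cosine similarity between the BLIP image embedding and the text embedding of the prompt. A minimal sketch of that final step (the embeddings come from the BLIP model; the function name here is illustrative):

```python
import numpy as np

def blip_similarity(image_emb, text_emb):
    """Cosine similarity between an image embedding and a text embedding."""
    # L2-normalize both embeddings, then take their dot product.
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_emb = text_emb / np.linalg.norm(text_emb)
    return float(image_emb @ text_emb)
```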
Clone and build Detic from its official repo, then move the Python files under src/eval/detic into the cloned folder. See src/eval/detic/eval_detic.py for details.
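Detic is used to check whether the prompted objects actually appear in each generated image. A hypothetical sketch of how per-image detections could be aggregated into a compositional success rate (the function and data layout are assumptions for illustration, not the repo's API):

```python
def detection_success_rate(required_pairs, detections):
    """Fraction of images in which both required concepts were detected.

    required_pairs: list of (concept_a, concept_b) tuples, one per image.
    detections: list of sets of class names detected in each image.
    """
    hits = sum(
        1 for (a, b), found in zip(required_pairs, detections)
        if a in found and b in found
    )
    return hits / len(required_pairs)
```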
The 220 concepts used for the large-scale experiment are listed in src/concepts/large_scale.py, and the 200 evaluation prompts are in src/concepts/large_test.txt.
Part of our code is inspired by Custom Diffusion and Attend-and-Excite. We leverage Detic and clean-fid for our evaluation.
```
@inproceedings{bao2024sepen,
  Author = {Bao, Zhipeng and Li, Yijun and Singh, Krishna Kumar and Wang, Yu-Xiong and Hebert, Martial},
  Title = {Separate-and-Enhance: Compositional Finetuning for Text2Image Diffusion Models},
  Booktitle = {SIGGRAPH},
  Year = {2024},
}
```