SPDiffusion: Semantic Protection Diffusion Models for Multi-concept Text-to-image Generation
Yang Zhang, Rui Zhang, Xuecheng Nie, Haochen Li, Jikun Chen, Yifan Hao, Xin Zhang, Luoqi Liu, Ling Li
This paper proposes a unified approach to address the challenges of improper attribute binding and concept entanglement. We introduce a novel method, SPDiffusion, which detects concept regions from both cross- and self-attention maps, while safeguarding these regions from interference by irrelevant tokens.
For technical details, please refer to our paper.
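As a rough illustration of the region-detection idea described above (a hypothetical toy sketch, not the actual implementation, which operates on attention maps inside the diffusion U-Net): given a concept token's attention map, locations whose attention weight exceeds a threshold can be marked as that concept's region.

```python
def detect_region(attn_map, threshold):
    # Hypothetical sketch: attn_map is a 2D grid of attention weights in [0, 1]
    # for one concept token; keep locations above the threshold as its region.
    return [[v >= threshold for v in row] for row in attn_map]

# A high threshold keeps only the locations most strongly tied to the concept.
region = detect_region([[0.95, 0.2], [0.4, 0.92]], 0.9)
# region == [[True, False], [False, True]]
```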
torch>=2.0.1
diffusers>=0.29.0
stanza==1.8.2
nltk==3.8.1
from SPD_Pipeline import SPDiffusionPipeline
import torch
pipe = SPDiffusionPipeline.from_pretrained("SG161222/RealVisXL_V4.0").to("cuda")
generator = torch.Generator(device="cuda").manual_seed(2048)
image = pipe(
    "A red book and a yellow vase",
    run_sdxl=True,
    generator=generator,
    cross_threshold=0.9,
    self_threshold=0.1,
).images[0]
image.save("result.png")

The parameters for the SPDiffusion pipeline are as follows:
- `prompt`: text prompt for generation
- `cross_threshold`: threshold value for the cross-attention map
- `self_threshold`: threshold value for the self-attention maps
- `t_step`: layout-keeping and SP-Extraction steps
- `filter_loc`: layers used for SP-Extraction
- `run_sdxl`: also generate the original SDXL image
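To give intuition for the "protection" part of the method (a hypothetical toy sketch under the assumption that protection amounts to suppressing attention from irrelevant tokens inside a detected concept region; the real method works inside the attention layers of the diffusion model):

```python
def protect(scores, region_mask, irrelevant_tokens):
    # Hypothetical sketch: scores[pixel][token] are attention weights.
    # Inside a protected region (region_mask[pixel] is True), zero out the
    # weights of tokens belonging to other concepts so they cannot interfere.
    out = [row[:] for row in scores]
    for p, inside in enumerate(region_mask):
        if inside:
            for t in irrelevant_tokens:
                out[p][t] = 0.0
    return out

# Pixel 0 lies in the protected region, so token 1's influence is removed there;
# pixel 1 is outside the region and keeps its original scores.
masked = protect([[0.5, 0.5], [0.3, 0.7]], [True, False], [1])
# masked == [[0.5, 0.0], [0.3, 0.7]]
```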
This project builds upon valuable work and resources from the following repositories:
We extend our sincere thanks to the creators of these projects for their contributions to the field and for making their code publicly available. 🙌

