PAID: (Prompt-guided) Attention Interpolation of Text-to-Image Diffusion


He Qiyuan¹, Wang Jinghao², Liu Ziwei², Angela Yao¹,✉
¹ Computer Vision & Machine Learning Group, National University of Singapore
² S-Lab, Nanyang Technological University
✉ Corresponding Author

📌 Release

[03/2024] Code and paper are publicly available.

📑 Abstract

TL;DR: AID (Attention Interpolation via Diffusion) is a training-free method that enables text-to-image diffusion models to generate interpolations between different conditions with high consistency, smoothness, and fidelity. Its variant, PAID, adds further control over the interpolation through prompt guidance.

▶️ PAID Results

🏍️ Google Colab

Try PAID directly with Stable Diffusion 2.1 or SDXL on Google's free GPU!

🚗 Local Setup using Jupyter Notebook

  1. Clone the repository and install the requirements:
git clone https://github.com/QY-H00/attention-interpolation-diffusion.git
cd attention-interpolation-diffusion
pip install -r requirements.txt
  2. Open play.ipynb or play_sdxl.ipynb for fun!

🛳️ Local Setup using Gradio

  1. Install Gradio:
pip install gradio
  2. Launch the Gradio interface:
gradio gradio_src/app.py

🎲 Customized Interpolation

Our method exposes several settings that users can adjust freely to obtain a wide range of interpolation results. Here are some examples:

Prompt guidance

1. "A dog driving car"

2. "A car with dog furry texture"

3. "A toy named dog-car"

4. "A painting of car and dog drawn by Vincent van Gogh"

$\alpha$ and $\beta$ of the Beta prior

1. $\alpha=1, \beta=1$

2. $\alpha=1, \beta=8$

3. $\alpha=8, \beta=1$
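
To build intuition for how $\alpha$ and $\beta$ change the result, here is a minimal sketch. It assumes the Beta prior controls how the interpolation coefficients are spaced between 0 and 1, and it picks them as evenly spaced quantiles of a Beta($\alpha$, $\beta$) distribution. The function names (`beta_cdf`, `beta_quantile`, `interpolation_coefficients`) are hypothetical and are not taken from this repository. Its actual selection procedure may differ.

```python
import math


def beta_cdf(x, alpha, beta, steps=2000):
    """Approximate the Beta(alpha, beta) CDF at x with the midpoint rule."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    # Log of the normalizing constant 1 / B(alpha, beta).
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoint of the i-th sub-interval, strictly inside (0, 1)
        total += math.exp(
            log_norm + (alpha - 1) * math.log(t) + (beta - 1) * math.log1p(-t)
        )
    return total * h


def beta_quantile(p, alpha, beta, tol=1e-6):
    """Invert the CDF by bisection: find x such that beta_cdf(x) is about p."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, alpha, beta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def interpolation_coefficients(n, alpha, beta):
    """Return n coefficients in [0, 1] (n >= 2), spaced by Beta quantiles."""
    return [beta_quantile(i / (n - 1), alpha, beta) for i in range(n)]


# The three settings listed above (illustrative only).
for alpha, beta in [(1, 1), (1, 8), (8, 1)]:
    coeffs = interpolation_coefficients(5, alpha, beta)
    print(alpha, beta, [round(c, 3) for c in coeffs])
```

Under this assumption, $\alpha=1, \beta=1$ gives roughly evenly spaced coefficients, $\alpha=1, \beta=8$ clusters them near 0, and $\alpha=8, \beta=1$ clusters them near 1. Which endpoint prompt each end corresponds to depends on the implementation.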

📝 Supported Models

| Model Name | Link |
| --- | --- |
| Stable Diffusion 1.4-512 | CompVis/stable-diffusion-v1-4 |
| Stable Diffusion 1.5-512 | runwayml/stable-diffusion-v1-5 |
| Stable Diffusion 2.1-768 | stabilityai/stable-diffusion-2-1 |
| Stable Diffusion XL-1024 | stabilityai/stable-diffusion-xl-base-1.0 |
| Animagine XL 3.1 | cagliostrolab/animagine-xl-3.1 |
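
For scripted experiments, it can help to keep the listed models in a small lookup table. The sketch below copies the model identifiers from the list above. The names `SUPPORTED_MODELS` and `resolve_model_id` are hypothetical helpers, not part of this repository.

```python
# Model identifiers copied from the list above; the dictionary and helper
# are illustrative and do not exist in the repository.
SUPPORTED_MODELS = {
    "Stable Diffusion 1.4-512": "CompVis/stable-diffusion-v1-4",
    "Stable Diffusion 1.5-512": "runwayml/stable-diffusion-v1-5",
    "Stable Diffusion 2.1-768": "stabilityai/stable-diffusion-2-1",
    "Stable Diffusion XL-1024": "stabilityai/stable-diffusion-xl-base-1.0",
    "Animagine XL 3.1": "cagliostrolab/animagine-xl-3.1",
}


def resolve_model_id(name):
    """Return the model identifier for a listed model name, or raise ValueError."""
    try:
        return SUPPORTED_MODELS[name]
    except KeyError:
        known = ", ".join(sorted(SUPPORTED_MODELS))
        raise ValueError(f"Unknown model {name!r}; expected one of: {known}") from None


print(resolve_model_id("Stable Diffusion 2.1-768"))
```

The returned identifier is the kind of string that loaders such as `DiffusionPipeline.from_pretrained` in diffusers accept. Whether the notebooks in this repository take it in the same way is an assumption to check against play.ipynb.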

✒️ Citation

If you found this repository/our paper useful, please consider citing:

@misc{he2024aid,
      title={AID: Attention Interpolation of Text-to-Image Diffusion}, 
      author={Qiyuan He and Jinghao Wang and Ziwei Liu and Angela Yao},
      year={2024},
      eprint={2403.17924},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

❤️ Acknowledgement

We thank the following repositories for their great work: diffusers, transformers.

➕️ More Results with SD1.5

Realistic Style

Pikachu -> Gundam

Computer -> Phone

Anime Style

Ninja -> Cat

Ninja -> Dog

Oil-Painting Style

Starry Night -> Mona Lisa

Skyscraper -> Town