DAAM with mu #40

Closed
andreemic opened this issue May 20, 2023 · 3 comments

@andreemic

andreemic commented May 20, 2023

Hey! Great job on this repo! Very clean documentation and a useful idea.

@daemon
Member

daemon commented May 20, 2023

Hey, thanks. I may be wrong, as I'm not too familiar with the InstructPix2Pix architecture, but I think focusing on the cross-attention heads between the text embeddings used as keys and the usual latent embeddings could work. If the attention key vectors are instead a concatenation of text embeddings and, say, image embeddings, then you could look at the cross-attention restricted to the text portion of the keys. If the text and image embeddings are inseparable (e.g., multimodal fusion), then that would likely be outside the scope of DAAM/cross-attention and require a separate set of techniques.
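To make the "restrict to the text portion" idea concrete, here is a minimal pure-Python sketch. It is not DAAM's or InstructPix2Pix's actual code. It assumes the keys are laid out as text tokens followed by image tokens, and every name in it (`cross_attention`, `text_heat_maps`, the toy vectors) is hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys):
    """Scaled dot-product attention weights.

    Returns one row per query (latent pixel) and one column per key.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    return [
        softmax([scale * sum(q_i * k_i for q_i, k_i in zip(q, k)) for k in keys])
        for q in queries
    ]


def text_heat_maps(attn, num_text_tokens):
    """Keep only the columns that belong to text keys.

    Assumes the first `num_text_tokens` keys are the text tokens.
    Returns one map (a list over latent pixels) per text token.
    """
    return [[row[t] for row in attn] for t in range(num_text_tokens)]


# Toy data (hypothetical): 4 latent pixels, 2 text keys, then 1 image key.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
text_keys = [[2.0, 0.0], [0.0, 2.0]]
image_keys = [[1.0, 1.0]]
keys = text_keys + image_keys  # concatenated along the token axis

attn = cross_attention(queries, keys)
maps = text_heat_maps(attn, len(text_keys))
```

Here the softmax is taken over all keys before slicing, so the text maps no longer sum to one per pixel; whether to renormalize over the text columns afterwards is a design choice this sketch leaves open.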

@nityanandmathur
Contributor

@andreemic Please let me know if you were able to generate cross-attention maps for IP2P or ControlNet.

I am trying to visualize cross-attention maps for the Stable Diffusion image-to-image pipeline and am facing the same errors.

nityanandmathur added a commit to nityanandmathur/daam that referenced this issue Apr 1, 2024
@nityanandmathur
Contributor

@daemon Opened a pull request which fixes this. Please have a look.

#60

@andreemic andreemic changed the title from "DAAM with multi-conditioned SD models (ControlNet, IP2P, etc.)" to "DAAM with mu" Apr 2, 2024
@daemon daemon closed this as completed in c30493e Apr 5, 2024