
Alternative implementation in Refiners #92

Open
deltheil opened this issue Sep 30, 2023 · 1 comment

@deltheil

We are building Refiners, an open-source, PyTorch-based framework designed to easily train and run adapters on top of foundation models. Just wanted to let you know that IP-Adapter is now fully supported in Refiners! (Congrats on the great work, by the way!!)

For example, an equivalent to the "IP-Adapter with fine-grained features" demo would look like this:

  1. Follow these install steps
  2. Run the code snippet below, which gives the output image:

[output image]

import torch

from PIL import Image

from refiners.foundationals.latent_diffusion import StableDiffusion_1, SD1IPAdapter
from refiners.foundationals.latent_diffusion.schedulers import DDIM
from refiners.fluxion.utils import load_from_safetensors, manual_seed


device = "cuda"
image = Image.open("statue.png")

# Set up Stable Diffusion 1.5 and load its weights (converted to safetensors)
ddim_scheduler = DDIM(num_inference_steps=50)
sd15 = StableDiffusion_1(scheduler=ddim_scheduler, device=device, dtype=torch.float16)
sd15.clip_text_encoder.load_from_safetensors("clip_text.safetensors")
sd15.lda.load_from_safetensors("lda.safetensors")
sd15.unet.load_from_safetensors("unet.safetensors")

with torch.no_grad():
    prompt = "best quality, high quality, wearing a hat on the beach"
    negative_prompt = "monochrome, lowres, bad anatomy, worst quality, low quality"

    # Load the fine-grained ("plus") IP-Adapter weights, then patch the UNet:
    # inject() replaces its cross-attentions with decoupled cross-attentions
    ip_adapter = SD1IPAdapter(
        target=sd15.unet,
        weights=load_from_safetensors("ip-adapter-plus_sd15.safetensors"),
        fine_grained=True,
        scale=0.6,
    )
    ip_adapter.clip_image_encoder.load_from_safetensors("clip_image.safetensors")
    ip_adapter.inject()

    # Compute text and image embeddings; each tensor stacks the negative and conditional halves
    clip_text_embedding = sd15.compute_clip_text_embedding(text=prompt, negative_text=negative_prompt)
    clip_image_embedding = ip_adapter.compute_clip_image_embedding(ip_adapter.preprocess_image(image))

    # Concatenate image tokens after text tokens, separately for each half of the CFG batch
    negative_text_embedding, conditional_text_embedding = clip_text_embedding.chunk(2)
    negative_image_embedding, conditional_image_embedding = clip_image_embedding.chunk(2)

    clip_text_embedding = torch.cat(
        (
            torch.cat([negative_text_embedding, negative_image_embedding], dim=1),
            torch.cat([conditional_text_embedding, conditional_image_embedding], dim=1),
        )
    )

    # Denoise random latents with classifier-free guidance, then decode to an image
    manual_seed(42)
    x = torch.randn(1, 4, 64, 64, device=device, dtype=torch.float16)

    for step in sd15.steps:
        x = sd15(
            x,
            step=step,
            clip_text_embedding=clip_text_embedding,
            condition_scale=7.5,
        )
    predicted_image = sd15.lda.decode_latents(x)

predicted_image.save("output.png")
print("done: see output.png")

Note: other variants of IP-Adapter are supported too (SDXL, with or without fine-grained features); see the sketch below.
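
For instance, a minimal sketch of the SDXL variant might look like the following. Treat it as an untested outline: the class names (StableDiffusion_XL, SDXLIPAdapter) and the weight filenames are assumptions to adapt to your own converted checkpoints.

import torch

from refiners.foundationals.latent_diffusion import StableDiffusion_XL, SDXLIPAdapter
from refiners.foundationals.latent_diffusion.schedulers import DDIM
from refiners.fluxion.utils import load_from_safetensors

# Assumed setup: load SDXL weights converted to safetensors, as in the SD 1.5 snippet above
sdxl = StableDiffusion_XL(scheduler=DDIM(num_inference_steps=30), device="cuda", dtype=torch.float16)
sdxl.clip_text_encoder.load_from_safetensors("sdxl_clip_text.safetensors")
sdxl.lda.load_from_safetensors("sdxl_lda.safetensors")
sdxl.unet.load_from_safetensors("sdxl_unet.safetensors")

# Assumed adapter class and weight file for SDXL; fine_grained=False selects the non-"plus" variant
ip_adapter = SDXLIPAdapter(
    target=sdxl.unet,
    weights=load_from_safetensors("ip-adapter_sdxl.safetensors"),
    fine_grained=False,
    scale=1.0,
)
ip_adapter.clip_image_encoder.load_from_safetensors("clip_image.safetensors")
ip_adapter.inject()

# The rest (text/image embeddings, denoising loop, decoding) mirrors the SD 1.5 example above.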

A few more things:

  • SD1IPAdapter implements the IP-Adapter logic: it “targets” the UNet, on which it can be injected (= all cross-attentions are replaced with the decoupled cross-attentions) or ejected (= the original UNet is restored); see the sketch right after this list
  • It builds upon Refiners’ Adapter API
  • Adapters can be combined, e.g. the tests showcase how to combine IP-Adapter with ControlNet.
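
As an illustration, here is a minimal sketch of that inject/eject cycle, reusing the sd15 object and the weight files from the snippet above (the exact call sites here are an assumption layered on top of it):

# inject() patches the UNet: its cross-attentions become decoupled cross-attentions
ip_adapter = SD1IPAdapter(
    target=sd15.unet,
    weights=load_from_safetensors("ip-adapter-plus_sd15.safetensors"),
    fine_grained=True,
    scale=0.6,
)
ip_adapter.inject()
# ... run image-prompted generations, as in the demo above ...

# eject() restores the original UNet, i.e. plain text-only cross-attention
ip_adapter.eject()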

Feedback welcome!

@deltheil (Author)

FYI, we have also written a blog post with additional details about this implementation: https://blog.finegrain.ai/posts/supercharge-stable-diffusion-ip-adapter/
