
Weighted Prompts for Diffusers stable diffusion pipeline #1506

Closed
UglyStupidHonest opened this issue Dec 1, 2022 · 36 comments
Labels
stale Issues that haven't received updates

Comments

@UglyStupidHonest

I could not find anything for diffusers, and unfortunately I'm not at the level yet where I can implement it myself. :)

It would be amazing to be able to weight prompts like "a dog with a hat:0.5"

Thank you for this amazing library !!

@WASasquatch

WASasquatch commented Dec 1, 2022

This has unfortunately only been added as a community pipeline, which, imo, is a very broken system that just adds tons of work for the end user managing all these pipes, and is not very API friendly.

https://github.com/huggingface/diffusers/blob/main/examples/community/lpw_stable_diffusion.py

With community pipelines, you get only what is advertised, and nothing else. It's not like the many other repos out there, like AUTOMATIC1111's, where these things are packaged together for use with all available features, creating a robust and feature-rich system.

@github-actions

github-actions bot commented Jan 1, 2023

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Jan 1, 2023
@patrickvonplaten
Contributor

For future readers:

For a direct use case, we have the following community pipeline: https://github.com/huggingface/diffusers/blob/main/examples/community/lpw_stable_diffusion.py

You can also define your own attention processor that weighs certain prompts differently by making use of this API:
#1639

@Ephil012

Ephil012 commented Jan 8, 2023

@patrickvonplaten Are there any plans to integrate this into the main pipeline? As @WASasquatch said, the community pipeline implementation is not very user friendly. It seems like it would be pretty useful to have it built in as a feature, given how often prompt weighting is used in the community.

@alexisrolland
Contributor

Upvoting this, as I think prompt weighting is indeed an important feature that should be added to diffusers to compete with alternative solutions. All the major alternatives support it (Stable Diffusion WebUI, DreamStudio, Midjourney...).

Thanks for your hard work! <3

@patrickvonplaten
Contributor

cc @patil-suraj what do you think?

@patrickvonplaten
Contributor

My opinion here is that diffusers doesn't aim at being a full-fledged UI, but rather a backend for UIs.

Nevertheless, we could/should try to more actively maintain: https://github.com/huggingface/diffusers/blob/main/examples/community/lpw_stable_diffusion.py and potentially write a documentation page about it.

Also @SkyTNT what do you think maybe :-)

@alexisrolland
Contributor

alexisrolland commented Jan 13, 2023

How does supporting prompt weighting turn diffusers into a UI? I think the kind of usage expected here is to be able to use weights in a way similar to this and let the backend do its magic ;) :

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-5")
pipe = pipe.to("cuda")

prompt = "a photo of an ((astronaut)) riding a horse on mars"
# or
prompt = "a photo of an (astronaut:0.5) riding a horse on mars"
image = pipe(prompt).images[0]
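A minimal sketch of what parsing such a syntax could look like (a hypothetical helper, not part of diffusers; it assumes the A1111-style convention where each pair of parentheses multiplies the weight by 1.1 and `(text:w)` sets it explicitly, and it does not handle arbitrary nesting):

```python
import re

def parse_weighted_prompt(prompt):
    """Split an A1111-style prompt into (text, weight) segments.

    Each enclosing "(...)" multiplies the weight by 1.1;
    "(text:w)" sets the weight explicitly; plain text gets weight 1.0.
    This is a simplified, mostly non-nested sketch.
    """
    segments = []
    # match "(text:weight)", "((text))", "(text)", or a run of plain text
    pattern = re.compile(
        r"\(([^():]+):([\d.]+)\)|\(\(([^()]+)\)\)|\(([^()]+)\)|([^()]+)"
    )
    for m in pattern.finditer(prompt):
        if m.group(1) is not None:        # (text:weight) -> explicit weight
            segments.append((m.group(1), float(m.group(2))))
        elif m.group(3) is not None:      # ((text)) -> 1.1 * 1.1
            segments.append((m.group(3), 1.1 * 1.1))
        elif m.group(4) is not None:      # (text) -> 1.1
            segments.append((m.group(4), 1.1))
        else:                             # plain text -> 1.0
            segments.append((m.group(5), 1.0))
    return segments
```

Segments like these could then be used to scale the corresponding token embeddings before they are passed to the UNet.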

@SkyTNT
Contributor

SkyTNT commented Jan 14, 2023

I agree with @Ephil012. But I've been busy recently, so I may not be able to contribute.

@WASasquatch

WASasquatch commented Jan 14, 2023

What does a user interface have to do with back-end functionality?

@Ephil012

Ephil012 commented Jan 15, 2023

@patrickvonplaten I'd argue that adding this feature does not lead to diffusers becoming a full-fledged UI. This would simply be a feature on the backend when inputting prompts (as @alexisrolland mentioned).

You mentioned that the goal of diffusers is to act as a backend for projects providing a SD UI. However, not implementing this feature arguably makes it harder to use diffusers as a backend. When building a UI, most users expect prompt weighting to be built in. By not having it in diffusers, each project has to build its own implementation. This causes duplicated work between projects and in general makes using diffusers harder. Personally, I started looking for alternatives to diffusers to build my side project on top of simply because it was missing essential features like prompt weighting. I'd also argue other common features should be built in, such as long prompts (this may have already been added, not sure), but that's a discussion for another thread. Yes, there are community pipelines that can be used, but it would make sense to have it in the main pipeline too for maintainability and reliability.

As far as implementation goes, I do think that some projects might not want to follow the A1111 syntax. I think there could be a default syntax that you could customize via code. Or you could take the approach imaginAIry does, where they allow you to create a list of prompts and set weights in code (example below). Either approach would allow for using your own syntax.

ImaginePrompt([
    WeightedPrompt("cat", weight=1),
    WeightedPrompt("dog", weight=1),
])
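For illustration, one plausible way a list of weighted prompts could be reduced to a single conditioning vector is a weight-normalized average of the per-prompt embeddings. This is a toy sketch with plain Python lists standing in for embedding tensors, an assumption about the approach rather than imaginAIry's actual implementation:

```python
def blend_embeddings(embeddings, weights):
    """Combine per-prompt embedding vectors into one conditioning vector
    via a weight-normalized average.

    `embeddings` is a list of equal-length float lists (stand-ins for
    real embedding tensors); `weights` holds the corresponding scalars.
    """
    total = sum(weights)
    dim = len(embeddings[0])
    blended = [0.0] * dim
    for emb, w in zip(embeddings, weights):
        for i, x in enumerate(emb):
            # each prompt contributes in proportion to its weight
            blended[i] += (w / total) * x
    return blended
```

With equal weights this reduces to a plain average; a heavier weight pulls the blended vector toward that prompt's embedding.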

@keturn
Contributor

keturn commented Jan 15, 2023

My opinion here is that diffusers doesn't aim at being a full-fledged UI, but rather a backend for UIs.

If you are going to refer people to the current InvokeAI code as an example of how to use diffusers as a backend, be warned that there are parts that are not pretty. 😆

This is definitely a place where we had to work around the StableDiffusionPipeline rather than with it. I see that _encode_prompt is its own method now, which at least allows the possibility of overriding it, but there are still a couple of reasons why Invoke had to work around it:

  • Under its current architecture, Invoke has already prepared the text embeddings by the time it's ready to do inference, and the pipeline doesn't have any method that takes that form of input.
  • The _encode_prompt method has the tokenization and encoding too entangled with the structure of the batch and the conditioned/unconditioned data.

You've already identified other use cases for exposing an API that takes text embeddings directly, such as #205 and #1869. It's also always easier to pass values to things than it is to subclass and override template methods, so factoring such a method out of the existing StableDiffusionPipeline.__call__ sounds like the way to go.

@damian0815
Contributor

damian0815 commented Jan 15, 2023

I have a work-in-progress project to turn the prompt weighting code I built for InvokeAI into a library called Incite, which would theoretically be able to plug into any transformers-based system that takes a text string, tokenizes it, and then produces an embedding vector.

A simple way of providing painless weighting support would be for the stable diffusion pipeline to support conditioning vectors as alternative input to prompt strings. The process of doing weighted prompting would then look something like this:

pipeline = StableDiffusionPipeline.from_pretrained(...)
incite = Incite(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder)

# weight of 'fluffy' is increased, weight of 'dark' is decreased
positive_conditioning_tensor = incite.build_conditioning_tensor(
    "a fluffy+++ cat playing with a ball in a dark-- forest"
) 
negative_conditioning_tensor = incite.build_conditioning_tensor(
    "ugly, poorly drawn, etc."
)

images = pipeline(positive_conditioning=positive_conditioning_tensor,
    negative_conditioning=negative_conditioning_tensor).images

This in itself is just a first step, however, because being able to alter prompts on the fly unlocks all sorts of other possibilities. Here's a more advanced design:

pipeline = StableDiffusionPipeline.from_pretrained(...)
incite = Incite(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder)

# at 50% of the way through the diffusion process, replace the word "cat" with "dog"
prompt="a cat.swap(dog, start=0.5) playing with a ball in the forest" 
conditioning_scheduler = incite.build_conditioning_scheduler(
    positive_prompt=prompt, 
    negative_prompt=""
)

images = pipeline(conditioning_scheduler=conditioning_scheduler).images
# at the start of every diffusion step the pipeline queries the conditioning_scheduler 
# for positive and negative conditioning tensors to apply for that step

This unlocks the capability for, as one early reviewer, @raefu, put it, "a generalized macro language that ultimately creates conditioning vectors for every step of the image generation".

With such a flexible model it would be possible to do wild things like performing image comparison operations with the latent image vector part-way through the diffusion process and then programmatically altering the conditioning/prompt based on what has been partially diffused already. The possibilities are endless, and really quite exciting.
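The step-dependent swap described above can be sketched as a tiny scheduler that the pipeline queries at each step (hypothetical names; this is not Compel's or Incite's actual API):

```python
class SwapConditioningScheduler:
    """Toy conditioning scheduler: returns the `before` conditioning
    until `start` (a fraction of total steps) has elapsed, then the
    `after` conditioning. Conditioning objects are opaque here; in
    practice they would be text-embedding tensors."""

    def __init__(self, before, after, start=0.5):
        self.before = before
        self.after = after
        self.start = start

    def get_conditioning(self, step_index, total_steps):
        # fraction of the diffusion process completed before this step
        fraction = step_index / total_steps
        return self.before if fraction < self.start else self.after


# e.g. swap the "cat" conditioning for "dog" halfway through 20 steps
sched = SwapConditioningScheduler(before="cat_embeds", after="dog_embeds", start=0.5)
```

The pipeline's denoising loop would call `get_conditioning` once per step instead of holding a single fixed embedding for the whole run.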

@patrickvonplaten
Contributor

Opening a PR that allows text_embeddings to be passed via the __call__ method. This makes a lot of sense to me and is in line with #1869 .

@damian0815
Contributor

damian0815 commented Jan 26, 2023

thanks @patrickvonplaten - with 0.12 and my prompt weighting library Compel (based on the InvokeAI weighting code) I can now do this to apply weights to different parts of the prompt:

from compel import Compel
from diffusers import StableDiffusionPipeline

pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
compel = Compel(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder)

# upweight "ball"
prompt = "a cat playing with a ball++ in the forest"
embeds = compel.build_conditioning_tensor(prompt)
image = pipeline(prompt_embeds=embeds).images[0]

works great - thank you!

@patil-suraj
Contributor

Very cool @damian0815 !

@UglyStupidHonest
Author

So cool, I need to try this!! Thank you!!

@alexisrolland
Contributor

@damian0815 very cool!

What would be the syntax if we want to add weight to a group of words rather than just a single word?

Thanks!

@damian0815
Contributor

damian0815 commented Jan 28, 2023

@damian0815 very cool!

What would be the syntax if we want to add weight to a group of words rather than just a single word?

Thanks!

you can put the (words you want to weight)++ in parentheses

this (also (supports)-- nesting)+

speech marks "also work"+ like this

@alexisrolland
Contributor

Thanks @damian0815! Do you have a link to documentation describing the different syntaxes? I am also wondering how to apply different levels of weight to different groups of words... is it just something like:

(this bag is heavy)+++ while (this bag is medium)+ and (this one is really light)---

?

@damian0815
Contributor

That's right @alexisrolland. Docs are linked in the readme, but it's basically adapted from what I wrote for InvokeAI: https://invoke-ai.github.io/InvokeAI/features/PROMPTS/#prompt-syntax-features

@alexisrolland
Contributor

@damian0815 If I may, I think it would be nice if your compel library supported the same syntax as SD WebUI, since it is hugely popular. For example, it could accept () to increase weight and [] to decrease weight. See the doc here: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#attentionemphasis

@damian0815
Contributor

nope, not happening. the Auto1111 syntax is rubbish

@alexisrolland
Contributor

nope, not happening. the Auto1111 syntax is rubbish

Ha ha ha, as much as I agree with you, it's becoming the de facto standard 😀

I prefer your syntax too...

@damian0815
Contributor

what i might consider adding is a converter that can convert auto syntax to invoke syntax. pull requests welcome :)
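For the simplest, un-nested cases such a converter could be little more than two regex substitutions, mapping A1111 `(text)` emphasis to `(text)+` and `[text]` de-emphasis to `(text)-` (a hypothetical sketch; explicit `:weight` values and nested groups are not handled):

```python
import re

def a1111_to_compel(prompt):
    """Convert simple A1111 emphasis syntax to Compel-style syntax:
    "(text)" -> "(text)+" and "[text]" -> "(text)-".
    Nested groups and explicit ":weight" values are not handled."""
    # parenthesized emphasis becomes a "+"-suffixed group
    prompt = re.sub(r"\(([^()]+)\)", r"(\1)+", prompt)
    # bracketed de-emphasis becomes a "-"-suffixed group
    prompt = re.sub(r"\[([^\[\]]+)\]", r"(\1)-", prompt)
    return prompt
```

A real converter would also need to handle repeated parentheses, `(text:1.2)` weights, and escaping, which is where most of the actual work would be.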

@alexisrolland
Contributor

That would be fantastic... the best of both worlds ^^

@patrickvonplaten
Contributor

BTW, another use case that should be somewhat easily enabled by this is long-weight prompting: #2136 (comment)

@Ephil012

Ephil012 commented Feb 5, 2023

@patrickvonplaten I saw that the PR added the ability to pass embeddings in now. From my understanding, you still need to either write the prompt weighting code yourself or use a third-party library (like compel). Do you know if there are any plans to add built-in prompt weighting (similar to the LPW community pipeline) to one of the main Stable Diffusion pipelines? That way people don't have to use third-party code for this functionality.

@patil-suraj
Contributor

Prompt weighting won't be included in the main pipeline, in order to keep the pipeline simple so that users can easily follow and modify it on their own. The philosophy behind this is explained in this doc; we encourage users to give it a read :)

@WASasquatch

WASasquatch commented Feb 12, 2023 via email

@alexisrolland
Contributor

alexisrolland commented Feb 15, 2023

Hello @damian0815

I am trying to use your compel library to convert prompts / negative prompts into embeddings. It works like a charm with StableDiffusionPipeline but with StableDiffusionImg2ImgPipeline I get the error message below when calling the pipeline:

[...]
compel = Compel(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder)
prompt_embeds = compel.build_conditioning_tensor(payload.prompt) if payload.prompt else None
negative_prompt_embeds = compel.build_conditioning_tensor(payload.negative_prompt) if payload.negative_prompt else None
[...]
pipeline(
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_prompt_embeds,
    image=init_images,
    strength=payload.init_image_noise,
    num_inference_steps=payload.steps,
    guidance_scale=payload.guidance,
    num_images_per_prompt=payload.num_images,
    generator=generators
)

Returns

ValueError("prompt has to be of type str or list but is <class 'NoneType'>")

I checked my prompt_embeds and it does contain data. Am I doing anything wrong?
Thanks

@damian0815
Contributor

hi @alexisrolland , check that you're on at least diffusers v0.12 . if that doesn't fix it, please post the full stack trace (on the compel github issues rather than here)

@alexisrolland
Contributor

@damian0815 yes, I'm on v0.12.1... actually I think that's more of a problem with StableDiffusionImg2ImgPipeline than Compel... but based on your answer I assume it should work. I will file a separate bug report on diffusers instead of continuing in this thread. Thanks for the prompt answer.

@Ephil012

Ephil012 commented Feb 16, 2023

@patil-suraj I read the doc you sent. It helped clarify a lot of things for me. Thanks!

However, the one concern I have is about community pipeline support. Some of these pipelines provide essential features to devs, but seem to be less well maintained than the main pipeline. As a result, it makes devs hesitant to build on top of them or diffusers in general. The same goes for third party libs.

Would it make sense to keep a simple main pipeline and then make some of the community pipelines part of the official pipelines list, more actively maintained by huggingface? That way the philosophy of keeping things simple is adhered to, but devs also get the features they need without worrying about whether a community pipeline will be abandoned in the future. I know it involves a lot of commitment to support a new pipeline, but I figured I might as well ask in case. I feel like officially supporting this will attract more people to diffusers vs other libraries.

On an unrelated note, should some of the stuff for compel be moved to another thread on that repo? It seems a lot of this thread has become a troubleshooting thread for a separate library. It might make sense to move the discussion to compel's repo so that it's easier for people to find in the future, while also keeping this thread on topic.

@damian0815
Contributor

damian0815 commented Feb 16, 2023

@Ephil012

should some of the stuff for compel be moved to another thread on that repo?

yeah that's probably my bad for not immediately redirecting people there. i'll be sure to do so in the future.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
