
Differential Diffusion: Giving Each Pixel Its Strength #2851

Closed
exx8 opened this issue Feb 20, 2024 · 4 comments · Fixed by #2876

Comments


exx8 commented Feb 20, 2024

Hello,
I would like to suggest implementing my paper: Differential Diffusion: Giving Each Pixel Its Strength.
The paper lets a user edit a picture with a change map that describes how much each region should change.
The editing process is typically guided by textual instructions, although it can also be applied without guidance.
We support both continuous and discrete editing.
Our framework is training- and fine-tuning-free, and it has a negligible inference-time penalty.
Our implementation is diffusers-based.
We have already tested it on 4 different diffusion models (Kandinsky, DeepFloyd IF, SD, SD XL).
We are confident that the framework can also be ported to other diffusion models, such as SD Turbo, Stable Cascade, and aMUSEd.
I notice that you usually stick to the white == change convention, which is the opposite of the convention we used in the paper.
The paper can be thought of as a generalization of some of the existing techniques.
A black map is just regular txt2img ("0"),
a map of a single color (which isn't black) can be thought of as img2img,
and a map of two colors, one of which is white, can be thought of as inpainting.
And the rest? It's completely new!
In the paper, we suggest some further applications, such as soft inpainting and strength visualization.
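
To make the mechanism concrete, here is a rough diffusers-style sketch (an illustration only, not our reference implementation; the function and variable names and the linear threshold schedule are simplified assumptions). It uses the white == change convention mentioned above, so the roles of black and white are flipped relative to the list in the previous paragraph:

```python
import torch

def diff_diff_sample(unet, scheduler, timesteps, prompt_embeds,
                     original_latents, change_map):
    """Illustrative differential-diffusion loop (white == change convention).

    change_map: values in [0, 1], broadcastable to the latents;
    1.0 = fully regenerate this pixel, 0.0 = keep the original.
    """
    # Start from pure noise, as in txt2img.
    latents = torch.randn_like(original_latents)

    for i, t in enumerate(timesteps):
        # Threshold sweeps from 1.0 (first, noisiest step) down towards 0.0.
        threshold = 1.0 - i / len(timesteps)

        # Pixels whose requested change is below the threshold are not ready to
        # be edited yet: overwrite them with the original image re-noised to the
        # current noise level. A pixel with map value m therefore only joins the
        # denoising for the last m fraction of the steps, i.e. per-pixel
        # img2img strength.
        noise = torch.randn_like(original_latents)
        noised_original = scheduler.add_noise(original_latents, noise, t)
        latents = torch.where(change_map < threshold, noised_original, latents)

        # Ordinary denoising step.
        noise_pred = unet(latents, t, encoder_hidden_states=prompt_embeds).sample
        latents = scheduler.step(noise_pred, t, latents).prev_sample

    return latents
```

The per-step overhead is just the comparison and the blend, which is why the inference-time penalty is negligible.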

Site:
https://differential-diffusion.github.io/
Paper:
https://differential-diffusion.github.io/paper.pdf
Repo:
https://github.com/exx8/differential-diffusion

@NeedsMoar

So is the primary difference between this and something like the QRCodeMonster ControlNet in speed? The ControlNets can slow things down quite a bit, but it has the same effect in terms of masking out which areas will change when applied to a latent.

Here's a fun thought for you... have you considered allowing the use of normal maps as a way of describing directionality in an area with two channels, and the overall change with the intensity of the third? I don't know how hard this would be, but it would be interesting if normals could be painted such that the direction of things like water flows, trees, grass, and hair would generally try to follow the specified direction. It could probably be used to help fix things that tend to generate wrong on a regular basis (e.g. try the prompt "smoking a cigarette" on a character or human and it'll nearly always be floating, but on the off chance it's in their mouth it's almost always backwards). This would allow a bit of control.
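
Just to sketch what I mean by the encoding (an arbitrary, purely illustrative channel layout, with numpy):

```python
import numpy as np

# Hypothetical "direction + change strength" map: R and G hold a unit direction
# vector remapped from [-1, 1] to [0, 255] (as in a normal map), and B holds how
# strongly each region should change (0 = keep, 255 = fully regenerate).
h, w = 512, 512
ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
norm = np.sqrt(xs**2 + ys**2) + 1e-8          # avoid division by zero at the centre

flow_map = np.zeros((h, w, 3), dtype=np.uint8)
flow_map[..., 0] = ((xs / norm * 0.5 + 0.5) * 255).astype(np.uint8)  # x direction
flow_map[..., 1] = ((ys / norm * 0.5 + 0.5) * 255).astype(np.uint8)  # y direction
flow_map[..., 2] = 128                         # uniform change strength everywhere
```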


exx8 commented Feb 23, 2024

> So is the primary difference between this and something like the QRCodeMonster ControlNet in speed? The ControlNets can slow things down quite a bit, but it has the same effect in terms of masking out which areas will change when applied to a latent.
>
> Here's a fun thought for you... have you considered allowing the use of normal maps as a way of describing directionality in an area with two channels, and the overall change with the intensity of the third? I don't know how hard this would be, but it would be interesting if normals could be painted such that the direction of things like water flows, trees, grass, and hair would generally try to follow the specified direction. It could probably be used to help fix things that tend to generate wrong on a regular basis (e.g. try the prompt "smoking a cigarette" on a character or human and it'll nearly always be floating, but on the off chance it's in their mouth it's almost always backwards). This would allow a bit of control.

I don't know this specific model; I assume it is one of the ControlNet models that creates QR codes?
The key difference between ControlNet and differential diffusion is that diff diff edits pictures in a new way: you keep some of the original picture, not only position-wise but strength-wise, and you decide how much each part changes. With ControlNet you create something new. Both of these methods can be used together! ControlNet is, in some sense, more related to the prompt than to the change map: those inputs guide the model in its decisions, while the change map specifies how much each region changes.
Regarding inference cost, the measured impact of diff diff is about 0.25%, which is effectively zero for any practical usage.
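
To make the "can be used together" point concrete: in a loop like the sketch earlier in this thread, ControlNet only changes how the noise prediction is computed, while the change-map masking of the latents stays exactly the same. A rough, illustrative fragment (the residual-passing arguments follow the public diffusers ControlNetModel / UNet2DConditionModel APIs; everything else is assumed context from that sketch, not from any existing pipeline):

```python
def controlnet_noise_pred(unet, controlnet, latents, t, prompt_embeds, control_image):
    # Drop-in replacement for the plain unet(...) call in the earlier sketch.
    # The diff-diff masking of `latents` happens before this call and is
    # completely independent of it.
    down_res, mid_res = controlnet(
        latents, t,
        encoder_hidden_states=prompt_embeds,
        controlnet_cond=control_image,  # e.g. a normal, depth, or QR-pattern image
        return_dict=False,
    )
    return unet(
        latents, t,
        encoder_hidden_states=prompt_embeds,
        down_block_additional_residuals=down_res,
        mid_block_additional_residual=mid_res,
    ).sample
```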

You suggested an interesting idea. I believe it is more related to the ControlNet domain than to diff diff's, and I see it has already been implemented there:
https://huggingface.co/lllyasviel/sd-controlnet-normal

@CognitiveDiffusion

I would so love to have Differential Diffusion in Comfy. It's absolutely powerful and I think inpainting will look so much better...

That being said - any chance you could try to implement it as a Custom Node, @exx8?


exx8 commented Feb 26, 2024

> I would so love to have Differential Diffusion in Comfy. It's absolutely powerful and I think inpainting will look so much better...
>
> That being said - any chance you could try to implement it as a Custom Node, @exx8?

It seems that @shiimizu is working on this:
#2876
I hope it will be merged soon.
