Support runwayML custom inpainting model #1243

Merged: 20 commits merged into development on Oct 27, 2022
Conversation

@lstein (Collaborator) commented Oct 25, 2022

Inpaint using the runwayML custom inpainting model

This is still a work in progress but seems functional. It supports inpainting, txt2img and img2img on the ddim, plms and k* samplers.

Installation

To test this, get the file sd-v1-5-inpainting.ckpt from https://huggingface.co/runwayml/stable-diffusion-inpainting and place it at models/ldm/stable-diffusion-v1/sd-v1-5-inpainting.ckpt

Usage

Launch invoke.py with --model inpainting-1.5 and proceed as usual. All the usual arguments and settings should work (but they haven't been systematically tested).
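
For example (a minimal sketch; the scripts/invoke.py path and the image/mask file names are hypothetical, while the switches are the same ones used elsewhere in this thread):

python scripts/invoke.py --model inpainting-1.5
"man with a crow on shoulder" -I photo-of-man.png -M shoulder_mask.png -s 50 -C 7.5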

Caveats

  1. The inpainting model takes about 800 MB more memory than the standard 1.5 model, so it will not work on 4 GB cards.
  2. I think performance is a bit slower as well, but I have not benchmarked it.
  3. The inpainting model is temperamental. It wants you to describe the entire scene, not just the masked area to replace. So if you want to replace the parrot on a man's shoulder with a crow, the prompt "crow" may fail. Try "man with a crow on shoulder" instead. The symptom of a failed inpainting is that the masked area is erased and replaced with background.
  4. In img2img mode, the inpainting model really does not like to change the image much compared to standard 1.4 or 1.5. High guidance (CFG) scales, strengths, and step counts are needed. This seems to be a feature of the model, but I can't be sure.

@lstein marked this pull request as draft October 25, 2022 14:56
- The plms sampler now works with custom inpainting model
- Quashed bug that was causing generation on normal models to fail (oops!)
- Can now generate non-square images with custom inpainting model

Credits for advice and assistance during porting:

@Any-Winter-4079 (http://github.com/any-winter-4079)
@db3000 (Danny Beer http://github.com/db3000)
@Any-Winter-4079 (Contributor) commented:

I'll test after class. Thanks!

@Any-Winter-4079 (Contributor) commented Oct 25, 2022

There's a lot to test, but starting with basic txt2img.
The 1.5 inpainting model seems to be able to generate coherent images with txt2img. Nice! The output is different, but it's a different model trained for more steps, so that's somewhat expected.

Inpainting 1.5

"an anime girl" -s 50 -S 3031912972 -W 512 -H 512 -C 7.5 --fnformat {prefix}.{seed}.png -A plms
Screenshot 2022-10-25 at 19 50 06

1.4

Now, for regular 1.4, I see images have changed.
!switch stable-diffusion-1.4
"an anime girl" -s 50 -S 3031912972 -W 512 -H 512 -C 7.5 --fnformat {prefix}.{seed}.png -A plms
image

DDIM
"an anime girl" -s 50 -S 3031912972 -W 512 -H 512 -C 7.5 --fnformat {prefix}.{seed}.png -A ddim
image

So this will be the first thing I investigate.

@lstein (Collaborator, Author) commented Oct 25, 2022

@Any-Winter-4079 The changes that you are seeing in the 1.4 model might actually be due to an unrelated PR I worked on a couple of days ago and merged last night (I'll have to look it up). Appallingly enough, it turned out that when you surround the prompt with quotation marks ("), the quotes were being passed to the generation engine. So "an anime girl" with quotes and an anime girl without them could give different results!

You might want to try the comparison without quotation marks in the prompts.
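
A minimal illustration (not the actual InvokeAI parser) of how surrounding quotes can leak into the prompt when a command line is split naively, versus shlex-style splitting, which strips them:

```python
import shlex

line = '"an anime girl" -s 50 -C 7.5'

# Naive splitting keeps the surrounding quotes, so the text conditioning
# literally contains the '"' characters:
naive_prompt = line.split(' -')[0]   # -> '"an anime girl"'

# shlex-style splitting strips the quotes, matching the unquoted prompt:
prompt = shlex.split(line)[0]        # -> 'an anime girl'
```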

@lstein (Collaborator, Author) commented Oct 25, 2022

This model is very intriguing. Even in straight img2img mode it is great at making targeted changes. For example, I can change the pattern of a subject's clothing from leopard to zebra print without changing their posture, face, or background. On the other hand, I can't make big changes, such as changing their posture or the overall style of the image, even with high step counts and CFG values.

Outpainting, which I tested with the outcrop restoration module, works quite well with this model. Better than 1.4 by far.

@Any-Winter-4079 (Contributor) commented:

It seems to produce the same result with and without "" for me.

@lstein (Collaborator, Author) commented Oct 25, 2022

So you're seeing differences using the 1.4 model between the PR and the pre-PR code base? I'll check it out on my own end. There were a bunch of fiddly changes, but I hope I didn't inadvertently change the noise-generation part of the code, which would most likely cause this.

@Any-Winter-4079 (Contributor) commented Oct 25, 2022

So you're seeing differences using the 1.4 model between the PR and the pre-PR code base?

Yes, but I pulled other changes (not just this PR).
You can check using mps_noise, so you can recreate my own images.
Old prompts and results on Mac: https://github.com/invoke-ai/InvokeAI/blob/development/docs/help/SAMPLER_CONVERGENCE.md


Let me know if you can reproduce it, so we know it's not something on my end.

@Any-Winter-4079 (Contributor) commented Oct 25, 2022

About 1.5-inpainting:
txt2img seems to work. From the limited experiments I've done, I'd say regular 1.5 is better than 1.5-inpainting (also, 1.4 and 1.5 are related)
image

img2img works. I'm still not sure what my conclusions are. 1.5 seems to do pretty poorly, so 1.5-inpainting is an improvement, but they all seem to have issues.
image

img2img with clipseg
image

@lstein (Collaborator, Author) commented Oct 25, 2022

There are so many variations of parameters that it is hard to do an apples-to-apples comparison.

One conclusion that I've reached is that the strength option (-f) only makes things worse for the inpainting model. I am thinking of ignoring its value and using 1.0. I think it made sense to have this for the non-inpainting models because they are blindly drawing on top of an image encoded in latent space, and strength controls how much modification is allowed. The inpainting model, by contrast, "understands" that it is replacing part of the image.
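
A minimal sketch of the behaviour described above (a hypothetical helper, not the actual InvokeAI code):

```python
def effective_strength(requested: float, is_inpainting_model: bool) -> float:
    """Pick the img2img strength to apply for a generation request.

    Standard models draw blindly on top of the latent-encoded image, so a
    partial strength limits how much they may change it.  The runwayML
    inpainting model conditions on the unmasked pixels directly, so a
    partial strength only degrades the result; use the full 1.0 instead.
    """
    return 1.0 if is_inpainting_model else requested
```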

@Any-Winter-4079 (Contributor) commented Oct 25, 2022

There are so many variations of parameters that it is hard to do an apples-to-apples comparison.

The above results are using the same parameters, including seeds, strength, source images, masks, etc. Everything is the same except for the model.

I'll try with strength 0.99 and keep testing.


Results with 1.5-inpainting:

img2img with clipseg:

Strength (-f) is ignored with clipseg. Using -f0.99 and -f0.01.
mirkerr.png
mirkerr

"blonde hair" -W512 -H512 -C7.5 -S3031912972 -I mirkerr.png -tm "hair" -f0.01
Screenshot 2022-10-26 at 00 26 54

"blonde hair" -W512 -H512 -C7.5 -S3031912972 -I mirkerr.png -tm "hair" -f0.99
Screenshot 2022-10-26 at 00 28 33

img2img without clipseg:
Mask:
mirkerr_mask

For starters, removing the hair seems to work much better than using clipseg (the hair is more blond!). But other than that, the strength value is ignored, as the results are the same.

One conclusion that I've reached is that the strength option (-f) only makes things worse for the inpainting model. I am thinking of ignoring its value and using 1.0.

Does -f affect the final result then, @lstein ? I would've thought we ignore -f given the results.
"miranda kerr with blonde hair" -W512 -H512 -C7.5 -S3031912972 -I mirkerr.png -M mirkerr_mask.png -f0.01 (but also -f0.99 and -f0.75)
image

@Any-Winter-4079 (Contributor) commented Oct 25, 2022

Oh, by the way: when using inpaint_st.py the other day, I could load a Dreambooth .ckpt as the main model (in my case, to define the style) and use 1.5 inpainting on top. So it's like using 2 models at the same time.

I'm not sure how to do this now that 1.5 inpainting is the main model.
I mean, I would've sworn I was using both simultaneously. I'll check again just to be sure.
...

I'll check again just to be sure.

Yeah, I'm not sure how I could have done it. It was probably a Dreambooth .ckpt + regular img2img on that model.

@lstein (Collaborator, Author) commented Oct 26, 2022

I've done img2img slightly wrong. I'm going to remove some code that over-constrains the image. This should give us more variability.

The strength parameter is inappropriate for this model and will be disabled. Sorry for the confusion, but it’s taken me some time to realize how the pieces fit together.

@Any-Winter-4079 (Contributor) commented Oct 26, 2022

1.5-inpaint after 906dafe
Original:
Screenshot 2022-10-26 at 12 44 24

Using mask:
"miranda kerr with blonde hair" -W512 -H512 -C7.5 -S3031912972 -I mirkerr.png -M mirkerr_mask.png
Screenshot 2022-10-26 at 12 36 22
vs. yesterday
Screenshot 2022-10-26 at 12 39 22

Using clipseg:
"blonde hair" -W512 -H512 -C7.5 -S3031912972 -I mirkerr.png -tm "hair"
Screenshot 2022-10-26 at 12 39 52
vs. yesterday
Screenshot 2022-10-26 at 12 40 40

By the way, I don't see the clipseg (-tm) argument in the output:
"blonde hair" -s 50 -S 3031912972 -W 512 -H 512 -C 7.5 --fnformat {prefix}.{seed}.png -I mirkerr.png -A k_lms -f 0.75

@lstein (Collaborator, Author) commented Oct 26, 2022

This PR will break both --hires and --embiggen, as they reimplement some low-level image generation steps that don't work with the new model. If you try to use these switches they will be ignored.

I will fix these before marking the PR as ready for merging.

@lstein (Collaborator, Author) commented Oct 26, 2022

Not much of a difference between yesterday and today.

What happens to your test image when you raise -C modestly to anything between 10.0 and 15.0?

@Any-Winter-4079 (Contributor) commented Oct 26, 2022

Using 1.5 inpaint and 906dafe
-C 7.5
Screenshot 2022-10-26 at 12 36 22
-C 15
Screenshot 2022-10-26 at 13 06 36

@Any-Winter-4079 (Contributor) commented Oct 26, 2022

I barely see any change between yesterday and today (left side of image, hair is a bit different)
image
For the Miranda Kerr example, I prefer yesterday's (it stays closer to the original image in hair length, etc.), but it's hard to draw meaningful conclusions. I'll test a bit more.

@Any-Winter-4079 (Contributor) commented Oct 26, 2022

With the latest version, single word prompts seem to work.
"macaw" -s 50 -S 3096140878 -W 512 -H 512 -C 7.5 --fnformat {prefix}.{seed}.png -I 197632255-563dc05b-58cf-498b-88c2-8e2ae274b2a4.png -A k_lms -M 197632285-8d7f0f15-0c2d-4adb-8bb5-b9e4914a1de3.png -f 0.1
Screenshot 2022-10-26 at 13 18 26
Also using the same seed we used in inpaint_st.py (3).
Screenshot 2022-10-26 at 13 23 25

vs. what happened when we ran inpaint_st.py (painted background)


With yesterday's version
"macaw" -s 50 -S 3096140878 -W 512 -H 512 -C 7.5 --fnformat {prefix}.{seed}.png -I 197632255-563dc05b-58cf-498b-88c2-8e2ae274b2a4.png -A k_lms -M 197632285-8d7f0f15-0c2d-4adb-8bb5-b9e4914a1de3.png -f 0.1
Screenshot 2022-10-26 at 13 30 21
And using the same seed we used in inpaint_st.py (3).
Screenshot 2022-10-26 at 13 31 50

I'm confused now. With yesterday's version, sometimes it works, sometimes it doesn't. So I guess with today's version it did work here, but of course we can't guarantee it always works.

@Any-Winter-4079 (Contributor) commented Oct 26, 2022

I'm doing a small experiment with 20 images comparing yesterday's vs. today's code version, using single-word prompts and img2img.

Update:
Results
"macaw" -s 10 -W 512 -H 512 -C 7.5 --fnformat {prefix}.{seed}.png -I 197632255-563dc05b-58cf-498b-88c2-8e2ae274b2a4.png -A k_lms -M 197632285-8d7f0f15-0c2d-4adb-8bb5-b9e4914a1de3.png -n 20
Used 10 steps to speed it up, but we can see the macaws forming.

Yesterday's

image

Today's

image

It looks like it happens to both code versions (about 10% of the time).

This PR will break both --hires and --embiggen, as they reimplement some low-level image generation steps that don't work with the new model. If you try to use these switches they will be ignored.

About this, I've never even set up embiggen so I couldn't tell. This might be a good excuse to do so.

- change default model back to 1.4
- remove --fnformat from canonicalized dream prompt arguments
  (not needed for image reproducibility)
- add -tm to canonicalized dream prompt arguments
  (definitely needed for image reproducibility)
lstein added a commit that referenced this pull request Oct 27, 2022
This was a difficult merge because both PR #1108 and #1243 made
changes to obscure parts of the diffusion code.

- prompt weighting, merging and cross-attention working
  - cross-attention does not work with runwayML inpainting
    model, but weighting and merging are tested and working
- CLI command parsing code rewritten in order to get embedded
  quotes right
- --hires now works with runwayML inpainting
- --embiggen does not work with runwayML and will give an error
- Added an --invert option to invert masks applied to inpainting
- Updated documentation
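
A hedged usage sketch of the new --invert option (the image and mask file names here are hypothetical); it inverts the supplied mask before inpainting:

"man with a crow on shoulder" -I photo-of-man.png -M shoulder_mask.png --invert
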
@lstein merged commit 9b71597 into development Oct 27, 2022
@lstein deleted the inpaint-model branch October 27, 2022 06:06