
ControlNet implementation suggestion #139

Open · wants to merge 70 commits into main
Conversation


@JasonS09 JasonS09 commented Apr 5, 2023

Hello! I couldn't wait any longer for a ControlNet implementation for this plugin (much needed for me, and the constant swapping between the webui and Krita was driving me crazy), so I worked on implementing ControlNet for the plugin on my end, and I managed to make it work.

This ControlNet implementation uses the official API endpoints for txt2img and img2img (which I thought would be convenient, since I read you're planning to switch to the official API in the future), but I still didn't touch much of the existing code. The logic I implemented only takes effect when at least one ControlNet unit is enabled. Here is a list of features:

  • Allows different sources of input images for annotators: users can import an image from disk or paste an image from the clipboard. If neither input is used, the plugin will automatically use the selected image as input for the annotator.

  • Allows annotator preview.

  • Allows the annotator input to be swapped from RGB to BGR and/or color-inverted (requires more testing).

  • Parameters change dynamically depending on the chosen annotator.

  • Allows using the selection directly as input without preprocessing (choose the "none" preprocessor).

  • txt2img, img2img, and inpainting with ControlNet.

  • Uses the official API when ControlNet is activated, but the extension backend when it's not (a request sketch follows this list).
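For reference, here's a minimal sketch of what a txt2img request with a single ControlNet unit looks like through the official API (the endpoint and the `alwayson_scripts` payload shape follow the A1111 and sd-webui-controlnet APIs; the prompt, model name, and image path are placeholders, not the plugin's actual values):

```python
# Hypothetical example of a txt2img call with one ControlNet unit via the
# official A1111 API. Concrete values are placeholders.
import base64
import requests

with open("input.png", "rb") as f:
    input_image = base64.b64encode(f.read()).decode()

payload = {
    "prompt": "a castle on a hill",
    "steps": 20,
    "alwayson_scripts": {
        "controlnet": {
            "args": [{
                "input_image": input_image,
                "module": "canny",            # preprocessor; "none" skips it
                "model": "control_sd15_canny",  # placeholder model name
                "weight": 1.0,
            }]
        }
    },
}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
images = r.json()["images"]  # base64-encoded result images
```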

Limitations:

  • Fixed annotator config: the config of each ControlNet unit is stored individually between sessions, but every time the user switches preprocessor, the current preprocessor's config is discarded (this happens per unit).

  • For inpainting, the mask should be drawn in black (recommended). This is because I found that the official API, for some reason, erases all content of the mask if there is transparency in the image (it expects a white mask on a black background). So the current implementation converts transparency to white and then inverts the colors (see the sketch after this list). If the user draws a white mask for inpainting, it may be ignored, because it will be inverted to black.

  • The current approach for removing unmasked content for inpainting is very slow. Switching to a new one involving transparency masks is suggested.
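For illustration, the transparency-to-white-then-invert conversion described above could look something like this with Pillow (a sketch of the idea, not the PR's actual code; file names are placeholders):

```python
# Sketch: flatten transparency to white, then invert, so the API receives
# the white-on-black mask it expects.
from PIL import Image, ImageOps

mask = Image.open("user_mask.png").convert("RGBA")
white_bg = Image.new("RGBA", mask.size, (255, 255, 255, 255))
flattened = Image.alpha_composite(white_bg, mask).convert("L")
api_mask = ImageOps.invert(flattened)  # black strokes become a white mask
api_mask.save("api_mask.png")
```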

I have done some testing on my end and everything seems in order, but I'd suggest testing it further, and either informing me about a specific bug or fixing it yourself ;)


Rogal80 commented Apr 6, 2023

Hi, can you write a basic breakdown tutorial on how to use it, with a step-by-step example?


JasonS09 commented Apr 6, 2023

Sure! I'll work on it when I finish my current project.


rexelbartolome commented Apr 8, 2023

Hello, I managed to make it work, but found a bug: after changing the Canny high and low thresholds, they reset to 200 and 100 after a few seconds.

[video attachment: krita_r85ymXdohZ.mp4]

This might be difficult to implement, but hopefully in the future we can paste the annotated preview into Krita, so we can erase the lines that shouldn't be followed, etc. That also paves the way for not annotating the ControlNet input every time, to save resources (afaik that's how it works? correct me if I'm wrong): you could annotate once, reimport the annotated image as the ControlNet input, then remove the preprocessor once it's done.

Buttons like "use as input and remove preprocessor" and "paste to Krita" (similar to how an img2img generation is fitted to the selected region) would be great :)

And of course, thanks for implementing ControlNet itself! I was also looking for an implementation but only found some for Photoshop 😭 I might be able to help with documentation too :)

Edit: I just found out that img2img ControlNet isn't working:

```
Error running process: D:\stable-diffusion\empire-install2\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\controlnet.py
Traceback (most recent call last):
  File "D:\stable-diffusion\empire-install2\stable-diffusion-webui\modules\scripts.py", line 417, in process
    script.process(p, *script_args)
  File "D:\stable-diffusion\empire-install2\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\controlnet.py", line 628, in process
    unit = self.parse_remote_call(p, unit, idx)
  File "D:\stable-diffusion\empire-install2\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\controlnet.py", line 540, in parse_remote_call
    unit.enabled = selector(p, "control_net_enabled", unit.enabled, idx, strict=True)
AttributeError: 'str' object has no attribute 'enabled'
```

txt2img ControlNet works fine though 🤔

@Interpause (Owner)

> Images are not upscaled in the backend, they're just scaled by the plugin once received (unless you check hires fix for txt2img). I have yet to find a workaround for this (there is an insinuation of upscaling in the frontend code, so I decided to wait for your opinion on this instead. I admit I'm not really sure how this upscaling thing works in this plugin).

The webUI's highres fix used to be bad, so I kept the upscaling system from the original plugin. But recently (as in a few months ago), the webUI highres fix was improved. The upscaling system is done completely by the custom backend; the frontend "scaling" code handles downscaling the image or increasing the canvas size, depending on whether there is a canvas selection. I don't think it is necessary to try and get the upscaling system working with the official endpoints, since it would be complicated (it would probably need a second API call to upscale).

```python
qmainwindow.tabifyDockWidget(dockers[TAB_SDCOMMON], dockers[TAB_PREVIEW])
qmainwindow.tabifyDockWidget(dockers[TAB_TXT2IMG], dockers[TAB_IMG2IMG])
qmainwindow.tabifyDockWidget(dockers[TAB_TXT2IMG], dockers[TAB_INPAINT])
qmainwindow.tabifyDockWidget(dockers[TAB_TXT2IMG], dockers[TAB_UPSCALE])
dockers[TAB_SDCOMMON].raise_()
dockers[TAB_INPAINT].raise_()
```

```python
def remove_unmasked_content_for_inpaint(img, mask):
```
@Interpause (Owner)

This is very slow for large images; a better approach is to insert the mask as a transparency mask: https://api.kde.org/krita/html/classTransparencyMask.html.

JasonS09 (Author) commented Apr 18, 2023

Currently working on this, but I can't find a way to import this class into the script. It doesn't seem to work from the krita module; no error is shown, but it messes up the whole plugin.

Edit: nvm, found the way.

@JasonS09 (Author)

Hello. It seems Google Colab has changed their ToS to prohibit the use of remote UIs. This means I'm unable to perform testing and use the plugin, so unfortunately I'll have to stop development on this implementation. Anyone interested can rescue it and continue development.


That's so unfortunate :(

Is it possible for you to use RunPod instead of Colab, @JasonS09? It's generally affordable to use, and I can even sponsor your GPU hours if you want.

JasonS09 (Author) commented Apr 21, 2023

Hey, thank you for the offer. Honestly, I don't want to pay for this since I'm broke (I've got 17 dollars in my bank account and no income). If you can sponsor it, that could be an option, but I'm not sure if that service provider allows remote connections to the UI. Last time I tried with Paperspace, I couldn't use it as a backend (it simply didn't work, even with the --api flag), and after a while I got kicked from my session and was unable to log in again. I suspect they banned me for trying to use the UI as a backend.

EDIT: I reinstalled the webui on my local machine, and it somehow works better now. It's still really slow, but I think it will do the job. I'm going to be away this weekend, then I'll continue the work.

@drhead

> * Creating a transparency mask with `self.doc.createNode(name, "transparencyMask")` in `script.py` is not working properly. It creates a node, but you can't do anything with it (it's not really a transparency mask).
>
> * `self.doc.createTransparencyMask(name)` doesn't seem to be a thing, even though you can [find it in the documentation](https://api.kde.org/krita/html/classDocument.html#abbd8e5ca62dd2952623c2d5cbc6faf5f).
>
> * Alternatively, it is possible to create one by calling the action `self.app.action("add_new_transparency_mask")`. However, you seemingly can't use `setPixelData()` [to draw the mask into the mask layer](https://api.kde.org/krita/html/classNode.html#a4e0b624db748aa8cf63ba84131dfc1a7). Or at least all my attempts to do so have failed.
>
> * So this only leaves me with one option I can think of: create a paint layer first, set the mask pixel data, then convert it into a transparency mask. However, there is another issue with this approach. Setting the active node with `self.doc.setActiveNode(layer)` will work, but not for actions like `self.app.action("convert_to_transparency_mask")`. They completely ignore the current active node even if it's explicitly set, making it really difficult to work with.

Well... I've independently verified most of these.

On my most successful attempt so far, where I tried to retrofit `transparency_mask_inserter(self)` to handle these masks, I managed to get four functioning transparency masks on a batch and wrote data to one of them, though it did not appear correct (possibly because it needs to be converted to grayscale first?). I did that by passing the mask all the way into that function and writing it right after `add_mask_action.trigger()`, which is far less than ideal... Based on this experience, though, I suspect that the issues you're encountering with setting pixel data on a layer created through `app.action` are more related to the race conditions that the inserter function is meant to work around.

I'm going to see if I can set pixel data on masks without completely butchering the mask inserter function, I'll let you know if I find anything out.

drhead commented Jun 6, 2023

I've made a pull request on your branch for a partially working implementation... It is not very clean, mostly due to being shoehorned into old API code.

`self.app.action("add_new_transparency_mask")` does work; it produces a correct transparency mask layer. But it does this on another thread, so if you try to set the pixel data immediately, you get a race condition. You also have to convert the data to grayscale for it to write correctly, which isn't really documented explicitly anywhere. So my provisional solution is: add the transparency mask to the layer, wait for it to appear, then write the data (sketched below).

If you inpaint from a selection with "Add transparency mask" enabled, it will work. However, the second you touch one of the layers, the inpainting mask gets overwritten with solid white. You can prove that it actually works by converting one of the layers to a paint layer; you will see the proper mask data that way. Why it gets erased when you unhide the layer or its parents, though, I have no idea; perhaps it has something to do with the method of trying to work around the race condition by recursion. Once that is solved through whatever means (most likely by moving this into its own function and simply waiting for there to be a new child layer), that'll be a working implementation and one merge blocker down. I will try to work on it more later to see if I can make this into a fully working implementation.

This API is going to give me nightmares. I see references to the issues with the other methods of adding transparency masks from years ago, and we're stuck with the one that causes a race condition.
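A rough sketch of that provisional workaround, for anyone following along (the polling scheme and all names here are illustrative rather than the PR's actual code, and as described below this approach had issues and was later superseded):

```python
# Illustrative only: trigger the action, poll until the new mask child
# appears (it is created on another thread), then write grayscale data.
from krita import Krita
from PyQt5.QtCore import QTimer

app = Krita.instance()
doc = app.activeDocument()
layer = doc.activeNode()
n_before = len(layer.childNodes())

# Placeholder mask data: 8-bit grayscale, one byte per pixel (fully opaque).
gray_bytes = b"\xff" * (doc.width() * doc.height())

app.action("add_new_transparency_mask").trigger()

def write_mask_when_ready():
    children = layer.childNodes()
    if len(children) <= n_before:
        QTimer.singleShot(50, write_mask_when_ready)  # mask not there yet
        return
    mask = children[-1]  # assumes the new mask is appended last
    mask.setPixelData(gray_bytes, 0, 0, doc.width(), doc.height())
    doc.refreshProjection()

write_mask_when_ready()
```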

@JasonS09 (Author)

> I've made a pull request on your branch for a partially working implementation...

Can you point me to it? I can't find it.

@drhead

> > I've made a pull request on your branch for a partially working implementation...
>
> Can you point me to it? I can't find it.

I seem to have opened the pull request on my own fork... Should be fixed now.

@drhead

I figured it out. Fully working implementation.

`setPixelData` on a node is broken for some inexplicable reason. But you can `setPixelData` on a `Selection`, and then call the action to create a transparency mask, because the selection will become the content of that mask. This also seems to handily make every race condition irrelevant (see the sketch below).

I am going to clean it up a bit, and then I will PR what should be a complete implementation.
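For reference, a minimal sketch of that selection-based approach (names are illustrative; `Selection.setPixelData` takes 8-bit grayscale data, one byte per pixel):

```python
# Sketch: write the mask into a Selection, activate it, then trigger the
# action. The new transparency mask is initialized from the active
# selection, so no pixel data has to be written to the node afterwards.
from krita import Krita, Selection

app = Krita.instance()
doc = app.activeDocument()

w, h = doc.width(), doc.height()
mask_bytes = b"\xff" * (w * h)  # placeholder: fully selected / fully visible

sel = Selection()
sel.setPixelData(mask_bytes, 0, 0, w, h)
doc.setSelection(sel)

app.action("add_new_transparency_mask").trigger()
doc.refreshProjection()
```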

@JasonS09 (Author)

Hello! I'm aware of the changes and bug fixes to do; it's just that I have been working on another commercial project, so I haven't had time to fix them. I'll probably be back in a couple of days, though. If anyone wants to do the work themselves, I've got no problem with it.


Miraihi commented Apr 13, 2023

Hello @JasonS09, thank you for your work on the ControlNet implementation.
Unfortunately, I get the script error `AssertionError: Raw data size:1441792, Expected size:7680000` right after the processing is done. I'm using the DirectML fork of the SD A1111 interface and the memory monitor is disabled, though, so that may be the problem.

EDIT: Okay, it seems like this is a problem with the extension's upscaling algorithm, just as you mentioned. I've unchecked "Disable base/max size" and now it works fine, though I have to constrain the image size to what I usually work with in the SD web UI. Still a massive improvement for the workflow.

@JasonS09 (Author)

> Hello, I managed to make it work, but found a bug: after changing the Canny high and low thresholds, they reset to 200 and 100 after a few seconds.

I think this bug has been fixed.


drhead commented Jun 7, 2023

@Interpause With the changes I'm working on internally to implement post-img2img upscaling, I'm somewhat close to bringing the new API to full feature parity with the old API. The old API code isn't getting in the way too much, but it would be nice to be able to tidy things up without it. Do you want me to delete it in this PR, or handle that in a separate PR afterwards to allow for more testing? There are a few race conditions, possibly down to slower hardware, that I'm not sure are completely dealt with.


JasonS09 commented Jun 7, 2023

> @Interpause With the changes I'm working on internally to implement post-img2img upscaling...

It's nice you're bringing this up. I'm working on a new branch, implementing tiled diffusion and tiled VAE as well; I'm planning to use them as an upscaler. I don't know if this idea is interesting enough to be in the core repo (it's more of a personal whim), but I just wanted to inform you.

@Dekker3D

As the last reply was a month and a half ago, I'd just like to note that I'm looking for a GIMP or Krita plugin that will let me use ControlNets seamlessly in a workflow combining 2D sketches and generated imagery with inpainting. This PR looks very promising to me, and I hope it gets added soon.

@JasonS09 (Author)

> As the last reply was a month and a half ago, I'd just like to note that I'm looking for a GIMP or Krita plugin that will let me use ControlNets seamlessly in a workflow combining 2D sketches and generated imagery with inpainting. This PR looks very promising to me, and I hope it gets added soon.

I'm currently working on an adaptation of this plugin incorporating ComfyUI. It will have ControlNet and other new features (still a work in progress, though). I'm going to continue its development as soon as I finish implementing the reference-only preprocessor for Comfy (it's taking a while, not gonna lie).

In the meantime, you can work with my fork for Automatic1111. It's functional right now; we're just waiting for the merge.

Labels: enhancement (New feature or request)
Successfully merging this pull request may close: ControlNet support
8 participants