Add support for AUTOMATIC1111's extras - image upscaling #1692

Merged
merged 1 commit into from
Apr 20, 2023

Conversation

mateusz
Contributor

@mateusz mateusz commented Apr 2, 2023

Found this bit of AUTOMATIC1111's webui quite useful, and wanted to use it in the interface. "Single Image" is the only bit that's really necessary - your tool can deal with batching via iterators.

Note that I tried implementing dynamic size and asserts, but the logic is very convoluted and I struggled with the syntax of ExpressionJson. The output shape depends on multiple inputs: if upscaler_1 is None, the output has the same shape as the input; if absolute rescaling is on, the shape is different; if clip is on, the output is asymmetric; and there is some rounding to deal with on top. I tried multiple versions but couldn't figure it out :(

View of Webui's interface for extras

extras

View in ChaiNNer

chainner-extras

@joeyballentine
Member

Sorry for the late reply, but thanks for submitting this. @adodge is going to look it over when he has time.

As for the ExpressionJson stuff, those are Navi types. It's a custom language for our type system. You can read more about it here. Though, IMO the best way to learn is usually just by looking at a bunch of other nodes that perform a similar function and seeing how they implement it.

@RunDevelopment
Member

Sorry for the delay here too!

I think we should slightly restructure our inputs before we tackle the output type. In the webui, there are 2 modes: "Scale by" and "Scale to". So let's do that too. Let's define another enum for those 2 states.

from enum import Enum


class ScaleMode(Enum):
    SCALE_BY = 0
    SCALE_TO = 1

We can then use an if_enum_group to hide unnecessary inputs depending on the currently selected mode. So e.g. width and height will only be shown when "Scale to" is selected. A good example for how this works is Create Gradient. The first argument of if_enum_group is the ID of the enum input we want to check. All inputs get assigned an ID implicitly, but we typically use with_id(X) on the enum input to make this more explicit.

        self.inputs = [
            ImageInput(),
            EnumInput(ScaleMode, default_value=ScaleMode.SCALE_BY).with_id(1),
            if_enum_group(1, ScaleMode.SCALE_BY)(
                SliderInput(
                    "Relative resize multiplier",
                    minimum=0,
                    default=4,
                    maximum=8,
                    slider_step=0.1,
                    controls_step=0.1,
                    precision=1,
                ),
            ),
            if_enum_group(1, ScaleMode.SCALE_TO)(
                NumberInput("Width", controls_step=1, default=512),
                NumberInput("Height", controls_step=1, default=512),
                BoolInput("Crop to fit", default=True),
            ),
            ...
        ]
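
With the inputs split into these two modes, the output size can be reasoned about one mode at a time. Below is a rough Python sketch of that reasoning, not code from this PR. `expected_output_size` is a hypothetical helper, and both the rounding and the "scale uniformly until the target box is covered" rule for the uncropped case are assumptions about webui's behavior that would need checking against it.

```python
from enum import Enum


class ScaleMode(Enum):
    SCALE_BY = 0
    SCALE_TO = 1


def expected_output_size(width, height, mode, scale=1.0,
                         target_w=0, target_h=0, crop=True):
    """Return the (width, height) this sketch expects after upscaling."""
    if mode is ScaleMode.SCALE_BY:
        factor = scale
    elif crop:
        # Cropping trims any overshoot, so the result matches the target exactly.
        return target_w, target_h
    else:
        # Assumption: scale uniformly until the image covers the target box.
        factor = max(target_w / width, target_h / height)
    # Assumption: sizes are rounded to the nearest integer.
    return round(width * factor), round(height * factor)
```

The cropped "Scale to" case is the only one whose size is independent of the input image, which is probably why it is the easiest to express as a type.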

@adodge
Contributor

adodge commented Apr 5, 2023

Very cool! Thank you for this!

I have a few notes:

  1. With the default settings (both upscalers set to None) it doesn't actually do any upscaling. So there's a third "scale mode" for "no scaling".
  2. The name "extras" doesn't immediately mean anything to me. It would if I was used to using it in webui, but I've never really explored that tab (partly because of how vague the name is.) Maybe we could use a different name for the node, and we can mention that it uses the extras tab in the description, or in a parenthetical.
  3. When I picked LDSR, it downloaded a 2GB file before running the model. (I know this is just what webui does, but it was surprising to me. It could be undesired by some users.)
  4. I get a server error if I select ScuNET, which I don't have a model for in webui. Maybe we could catch this and show a better message. Alternatively, (or additionally) there's this endpoint, which we could use to get a list of the models they actually have installed. This is a similar problem to picking an SD model in the text-to-image node. We punted there because we could just not specify and it would use whatever they have loaded already, but I don't think we can do that here. If we find a good solution to this, we could also probably use it in the other node. Or we could just restrict this to models that are available by default or can be automatically downloaded.

It looks like what it does is some number (maybe zero) of the following operations (I believe in this order):

A. Upscales with one upscaler
B. Upscales with another upscaler and blends it with the output
C. Applies GFPGAN and blends it with the output.
D. Applies CodeFormer and blends it with the output

Maybe we could have separate nodes that each do one of these steps: Upscale, GFPGAN, and CodeFormer. That might make it easier to understand what they're doing. It's also more modular and node-ish.

Another point that I don't really have a strong opinion on but is probably worth noting: I believe all of these things are already supported natively by ChaiNNer. I can think of a few scenarios where someone would want to use this instead of the built-in chainner nodes:

  1. They're running webui on a different machine
  2. They're already accustomed to the webui interface and they just want to use it from chainner (in which case we wouldn't want to break it up into separate nodes like I suggest above, as the whole point is to give the same UI they're used to)
  3. They've already got the models they want installed in webui, and they don't want to bother with having to find those files again in the "Load Model" node. Or they're taking advantage of the automatic download in webui.
  4. There's some model where webui is faster or better than chainner, or a model that webui supports that chainner doesn't

Once we have internal SD, assuming we keep the webui integration, we'll have sort of retroactively established a precedent for webui nodes that duplicate built-in behavior, so it's not automatically a deal-breaker for me. The scenarios above are compelling reasons to have this node, IMO.

@mateusz
Contributor Author

mateusz commented Apr 5, 2023

Thank you for an in-depth review and tons of references to look up.

I appreciate this PR might not make sense. As @adodge mentioned: "all of these things are already supported natively by ChaiNNer."

To elaborate on the use case: I wanted to use LDSR for upscaling in the middle of a pipeline, and LDSR specifically because it gives me the best results.

The context is that webui/LDSR works quite well, I tried SDv2 superresolution and it gave mediocre results (for my use case), so I have a feeling someone tuned that LDSR model pretty hard. Webui implementation looks pretty hacked though, and I have a feeling it'd be hard to actually run in chaiNNer :)

Let me know if I should continue a bit further. I'm not advocating either way, happy for you to close this PR - I've done my thing by having it hacked in :)

@adodge
Contributor

adodge commented Apr 5, 2023

Actually, I don't see LDSR on the list of things ChaiNNer supports, so I take it back that it's all stuff we already have.

For me the only sticking point is figuring out how we want to translate it into nodes (one or several) and how to handle picking from the models the user has installed. I'm certainly not arguing for closing the PR. I think this would be useful. (As it was useful for you, it would probably be useful for others too.)

@mateusz
Contributor Author

mateusz commented Apr 5, 2023

the only sticking point is figuring out how we want to translate it into nodes

Ok I'll take this PR one step further, I think I'll just reduce it to Upscaling via webui, because GFPGAN and CodeFormer are already supported via pytorch.

@mateusz mateusz changed the title Add support for AUTOMATIC1111's extras - image upscaling WIP Add support for AUTOMATIC1111's extras - image upscaling Apr 6, 2023
@joeyballentine
Member

and how to handle picking from the models the user has installed.

I think querying that endpoint is the best option

@mateusz
Contributor Author

mateusz commented Apr 7, 2023

I've followed your suggestions and done the following (see the animated gif too):

  • tidied up naming, e.g. renamed the block to "Upscale" and file to upscaling.py
  • added dynamic fetching of upscalers from sdapi/v1/upscalers endpoint at import time
  • removed GFPGAN and CodeFormer on the premise this is not upscaling, and it's supported elsewhere
  • removed "None" upscaler option on the assumption it was only needed for GFPGAN and CodeFormer - it unnecessarily confuses the interface
  • hid Upscaler 2 behind a checkbox, to simplify the UI
  • hid the "scale to" section (width, height, crop) unless selected from the dropdown
  • implemented the Navi spec for all possibilities
  • added asserts for all possibilities
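
The dynamic fetching step in the list above could look roughly like the following. This is a minimal sketch rather than the code in this PR: `parse_upscaler_names` and `fetch_upscaler_names` are hypothetical names, the `base_url` parameter is an assumption, and the response is assumed to be a JSON list of objects that each carry a "name" key.

```python
import json
from urllib.request import urlopen


def parse_upscaler_names(payload: str) -> list[str]:
    """Extract upscaler names from the JSON body returned by the endpoint."""
    entries = json.loads(payload)
    # Assumption: each entry is an object with a "name" key.
    # The "None" entry is skipped, mirroring its removal from the UI.
    return [e["name"] for e in entries if e.get("name") not in (None, "None")]


def fetch_upscaler_names(base_url: str, timeout: float = 5.0) -> list[str]:
    """Query a running webui; return an empty list if it is unreachable."""
    try:
        with urlopen(f"{base_url}/sdapi/v1/upscalers", timeout=timeout) as resp:
            return parse_upscaler_names(resp.read().decode("utf-8"))
    except OSError:
        # URLError subclasses OSError, so connection failures land here.
        return []
```

Keeping the parsing separate from the network call makes it possible to test the name handling without a running webui instance.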

chainner-upscale

@mateusz mateusz changed the title WIP Add support for AUTOMATIC1111's extras - image upscaling Add support for AUTOMATIC1111's extras - image upscaling Apr 7, 2023
@adodge
Contributor

adodge commented Apr 7, 2023

Nice!

I get a Navi error on load:

12:32:14.658 › Error: Unable to add type definitions of chainner:external_stable_diffusion:upscaling > Upscaler 1 (id: 6):
Error: SyntaxError: At 1:77: no viable alternative at input 'DynamicEnum::4'
Type definitions: let DynamicEnum = DynamicEnum::Lanczos | DynamicEnum::Nearest | DynamicEnum::4X-ultrasharp | DynamicEnum::4XFoolhardyRemacri | DynamicEnum::Esrgan4X | DynamicEnum::Ldsr | DynamicEnum::R-esrgan4X+ | DynamicEnum::R-esrgan4X+Anime6B | DynamicEnum::Swinir4X

Probably need more aggressive sanitization of the upscaler name. I'm not sure what the set of valid names is. Maybe we could just hash the names to hex, prepend an "x" so it doesn't start with a number, and use that as the ID in Navi. That seems safe.

I was going to test what happens when we load a chain that references an upscaler that doesn't exist. I don't know what it should do, but we should make sure it does something reasonable.

@adodge
Contributor

adodge commented Apr 7, 2023

Also we should call "DynamicEnum" something more specific, in case we want to use this strategy in other places. (like to specify models for the text-to-image node)

@RunDevelopment
Member

RunDevelopment commented Apr 7, 2023

Awesome work @mateusz!

I get a Navi error on load:

Oh god. I didn't know that it was possible to dynamically create enums like this. EnumInput assumes that all enum variants have SNAKE_CASE names, so names that don't follow this will cause it to produce invalid Navi code.

(I should probably add a check to EnumInput that verifies that variant names are actually snake case.)

@RunDevelopment
Member

I was going to test what happens when we load a chain that references an upscaler that doesn't exist. I don't know what it should do, but we should make sure it does something reasonable.

In that case, Dropdown will reset the input to its default value (typically the first variant).
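
The fallback described here can be pictured with a few lines of Python. This is purely illustrative and is not chaiNNer's Dropdown code; `resolve_dropdown_value` is a hypothetical helper, and it assumes the default is the first option and that at least one option exists.

```python
def resolve_dropdown_value(saved_value, options):
    """Return saved_value if it is still offered, else the first option."""
    if saved_value in options:
        return saved_value
    # Fall back to the default, taken here to be the first variant.
    return options[0]
```

So a chain saved with an upscaler that is no longer installed would load with the first available upscaler selected instead.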

@mateusz
Contributor Author

mateusz commented Apr 9, 2023

@adodge I haven't run into this error while testing and I'm just wondering how I missed it - is there anything specific I need to do before npm run make? Do I need to wipe rm -fr out?

call "DynamicEnum" something more specific

Oh true, missed that, fixed.

all enum variants have SNAKE_CASE names

Converted to SNAKE_CASE.

@joeyballentine
Member

This will need to be updated/rebased with the new node importing system btw

@adodge
Contributor

adodge commented Apr 9, 2023

I haven't run into this error while testing and I'm just wondering how I missed it - is there anything specific I need to do before npm run make? Do I need to wipe rm -fr out?

I probably have a different set of upscalers from you installed in webui. I think the one it was complaining about was called 4x_foolhardy_Remacri, which turned into an invalid token name in Navi. I suspect it doesn't like names that start with numbers. We should also make sure it doesn't choke on names like R-ESRGAN 4x+ Anime6B, with special characters in the middle.

The name of the model is based on whatever the filename is on the user's machine, so we should try to be resilient to all sorts of possible strings. אַפּסקייל.pt, etc

@RunDevelopment
Member

I suspect it doesn't like names that start with numbers.

Yes, it doesn't like them.

I think the easiest solution here would probably be to use a regex to replace all [^a-zA-Z0-9_] characters with _. To make sure that models start with a letter, we could use a common prefix (e.g. MODEL_).
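
A minimal sketch of that regex idea, assuming a hypothetical helper named `sanitize_model_name`:

```python
import re


def sanitize_model_name(name: str, prefix: str = "MODEL_") -> str:
    """Replace every character outside [a-zA-Z0-9_] with "_" and add a prefix."""
    return prefix + re.sub(r"[^a-zA-Z0-9_]", "_", name)
```

With this, R-ESRGAN 4x+ Anime6B becomes MODEL_R_ESRGAN_4x__Anime6B. One caveat: distinct names can collapse to the same identifier, for example two names of equal length that consist entirely of non-ASCII characters.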

@mateusz mateusz force-pushed the sd-webui-extras branch 2 times, most recently from ccf0f96 to 16c4785 Compare April 12, 2023 08:01
@mateusz
Contributor Author

mateusz commented Apr 12, 2023

Another push:

  • rebased onto new main (new node importing system)
  • switched to UPSCALER_ + md5 enum ids to support 4x_foolhardy_Remacri and Hebrew model names 😛
  • added a second return value to get_upscalers, to provide explicit labels (previously, the id would be used instead of the human-friendly name)
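
The id scheme and the separate labels can be sketched as follows. This is an approximation under stated assumptions, not the merged code: `upscaler_id` and `build_upscaler_enum` are hypothetical names, the enum is built with Python's functional `Enum` API, and the label mapping simply reuses the original upscaler name.

```python
import hashlib
from enum import Enum


def upscaler_id(name: str) -> str:
    """Build an ASCII identifier that always starts with a letter."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return f"UPSCALER_{digest}"


def build_upscaler_enum(names: list[str]) -> tuple[type[Enum], dict]:
    """Return a dynamically created enum plus a variant -> label mapping."""
    # Member names are the hashed ids; member values keep the original name.
    members = {upscaler_id(name): name for name in names}
    enum_type = Enum("UpscalerName", members)
    labels = {variant: variant.value for variant in enum_type}
    return enum_type, labels
```

Because the hash only serves as an identifier, md5 is used here for its fixed-length hex output, not for security. The original name survives as the label, so the dropdown can still show something readable.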

Member

@RunDevelopment RunDevelopment left a comment


I can't test this, since I don't have AUTOMATIC1111's webui installed. So someone that has it and has used it before, please test the new node.
@joeyballentine @adodge

The code itself looks good.

@mateusz
Contributor Author

mateusz commented Apr 18, 2023

Hey folks - just wanted to mention, don't feel bad if you want to reject and close the PR, if it doesn't suit the philosophy of chaiNNer. I don't mind, really. It was fun to hack something together :)

@joeyballentine
Member

No, you're good. I fully intend to merge this; I just forgot to test it last night.

@joeyballentine
Member

I'm getting errors when trying to use any of the models. It's always the same error, saying it can't import the model:
image

@joeyballentine
Member

I'm guessing this probably doesn't have to do with your code here, and is probably due to me using a somewhat older version of the webui, but if it's at all fixable we should probably try to do something about it.

@mateusz
Contributor Author

mateusz commented Apr 19, 2023

Similar problem here, although only with the SwinIR model. Possibly something to do with dependencies for webui, maybe update pip deps? Do you get the same error when you try to use the UI directly?

Unless you mean we should handle webui errors better in chaiNNer?

@joeyballentine
Member

I just wasn't sure if there was maybe some config thing wrong. But yeah it just looks like a webui issue. So I'm good with calling this good

@joeyballentine joeyballentine merged commit 0181d4c into chaiNNer-org:main Apr 20, 2023
8 checks passed
@mateusz mateusz deleted the sd-webui-extras branch April 20, 2023 07:35