added an API layer #194

Merged
merged 5 commits into from
Feb 20, 2023

Conversation

sangww
Contributor

@sangww sangww commented Feb 18, 2023

This API layer replicates SD-WebUI's txt2img and img2img pipeline, with added script parameters for ControlNet.

It has two caveats that need to be addressed, but it is currently in a working state with full support for SD-WebUI + ControlNet features.

  • I still don't know what the first arg of p.script_args is; I observed it to be 0. It would be great if someone could validate this!
  • This API layer will soon need to automatically pull in other AlwaysVisible scripts' parameters and update only those for ControlNet. Any advice is appreciated!

@kft334
Contributor

kft334 commented Feb 19, 2023

Can anyone confirm that Text-To-Image works? I'm getting bad outputs that seem like normal Text-To-Image. Everything loads and there are no errors though. And could you also return the generated control image at either the first or last image index if the pre-processor was used?

@sangww
Contributor Author

sangww commented Feb 19, 2023

"controlnet_input_image": [img],
"controlnet_module": 'openpose',
"controlnet_model": 'control_sd15_openpose [fef5e48e]',

This is how the JSON fields are formatted, and these are the ones that are required. My setup works fine this way with depth, pose and scribble. img is a base64 string for the image that goes into the ControlNet UI on the SD WebUI.

EDIT: the control image can be added to the output, but I am procrastinating on it. If you are interested in doing so, you just need to encode the image at the second index of the output, which is in NumPy array format.
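
A minimal sketch of how the three fields above might be assembled in Python. The field names are taken from the comment; the helper names, the placeholder image bytes and the prompt are hypothetical and only illustrate the shape of the request body.

```python
import base64
import json


def encode_image_bytes(raw: bytes) -> str:
    """Return the base64 text form of raw image bytes (no data-URL prefix)."""
    return base64.b64encode(raw).decode("utf-8")


def build_controlnet_fields(raw_image: bytes, module: str, model: str) -> dict:
    """Build the three ControlNet fields listed in the comment above."""
    return {
        "controlnet_input_image": [encode_image_bytes(raw_image)],
        "controlnet_module": module,
        "controlnet_model": model,
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real file read with open(path, "rb").
    fields = build_controlnet_fields(
        b"placeholder image bytes", "openpose", "control_sd15_openpose [fef5e48e]"
    )
    body = {"prompt": "a person dancing", **fields}  # hypothetical prompt
    print(json.dumps(body, indent=2))
```

The resulting dictionary would be merged into the usual txt2img or img2img request body before posting.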

@Mikubill
Owner

Good. Small advice:

  1. Move API stuff to another file, like api.py
  2. Maybe we should add some examples in the readme.

@kft334
Contributor

kft334 commented Feb 19, 2023

I see the issue now. Doesn't work with Euler apparently? Switched to DDIM and I'm getting proper results now.

Edit: Just tried with DPM++ 2M and the output was bad as well. Using canny btw

Edit: Just tested with depth and the same thing happens. It works with DDIM and PLMS but with Euler, LMS, Heun or DPM++ 2M the results become random. I've tried most of the samplers now and ControlNet appears to only be applying to DDIM and PLMS. They all work in the webui though. It could just be an issue with my installation.

@chrisbward

I've dropped this change in, but seems to break when visiting the docs; http://127.0.0.1:7860/docs#/

@sangww
Contributor Author

sangww commented Feb 19, 2023

For those for whom this doesn't work: I forgot to highlight that it only works when ControlNet is installed as a script in img2img and txt2img. Try it without other scripts installed in the Extensions tab.

@sangww
Contributor Author

sangww commented Feb 19, 2023

Good. Small advice:

  1. Move API stuff to another file, like api.py
  2. Maybe we should add some examples in the readme.

Will do when I have time. Quick Q:

I see in your script that you arrange script args based on script index. Any tips on populating script args for multiple alwayson_scripts properly? Thanks!

@sangww
Contributor Author

sangww commented Feb 19, 2023

For those for whom this doesn't work: I forgot to highlight that it only works when ControlNet is installed as a script in img2img and txt2img. Try it without other scripts installed in the Extensions tab.

@kft334 this might be the issue

This was linked to issues Feb 19, 2023
@kiriri

kiriri commented Feb 19, 2023

Creating a new API was a wonderful idea, even if it's 'makeshift'. It makes things so much easier to edit.

One thing though, I pass in the controlnet_input_image as an array of base64 strings, and it throws an error ("Incorrect Padding").

Replacing

cn_image = Image.open(io.BytesIO(base64.b64decode(controlnet_input_image[0])))

with

cn_image = decode_base64_to_image(controlnet_input_image[0])   

Fixed that error.
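
The thread does not say why the plain b64decode call failed. One plausible cause, offered here only as an assumption, is that the string carried a "data:image/...;base64," prefix or had lost its trailing "=" padding. A small sketch of a tolerant decoder follows; b64_to_bytes is a hypothetical helper, not part of the webui.

```python
import base64


def b64_to_bytes(encoded: str) -> bytes:
    """Decode a base64 string that may carry a data-URL prefix or lack padding."""
    # Drop a leading "data:image/...;base64," header if one is present.
    if encoded.startswith("data:") and "," in encoded:
        encoded = encoded.split(",", 1)[1]
    encoded = encoded.strip()
    # base64 text must be a multiple of 4 characters long; restore missing "=".
    encoded += "=" * (-len(encoded) % 4)
    return base64.b64decode(encoded)


if __name__ == "__main__":
    sample = base64.b64encode(b"hello").decode("utf-8")
    print(b64_to_bytes("data:image/png;base64," + sample))
    print(b64_to_bytes(sample.rstrip("=")))
```

The returned bytes could then be wrapped in io.BytesIO and passed to Image.open, as in the original line.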

@stassius
Contributor

Hello! I'm the author of the post on Reddit. I have a couple of suggestions to make your api layer better.

  1. You should add GET methods to retrieve the available Preprocessors and Models, just like in the standard A1111 api.
  2. It's better to make it in a different file. It will be easier to maintain.

I still don't know what would be the best solution from an architectural standpoint. You cloned the whole A1111 API for txt2img and img2img. I don't think it's a good idea, as it's hard to maintain. You'll have to track changes both in the A1111 API and in the ControlNet extension. My solution is not great either. I made the API only to retrieve the models, and created an A1111 script to control the ControlNet extension, so a user only has to add the script name and arguments to the existing API call. It's hacky too (and adds an extra script with UI), but at least you don't have to worry about maintaining the cloned API. I don't know if there is another way to do this apart from adding ControlNet to the core of A1111. Any suggestions would be appreciated.

@sangww
Contributor Author

sangww commented Feb 19, 2023

Hello! I'm the author of the post on Reddit. I have a couple of suggestions to make your api layer better.

  1. You should add GET methods to retrieve the available Preprocessors and Models, just like in the standard A1111 api.
  2. It's better to make it in a different file. It will be easier to maintain.

I still don't know what would be the best solution from an architectural standpoint. You cloned the whole A1111 API for txt2img and img2img. I don't think it's a good idea, as it's hard to maintain. You'll have to track changes both in the A1111 API and in the ControlNet extension. My solution is not great either. I made the API only to retrieve the models, and created an A1111 script to control the ControlNet extension, so a user only has to add the script name and arguments to the existing API call. It's hacky too (and adds an extra script with UI), but at least you don't have to worry about maintaining the cloned API. I don't know if there is another way to do this apart from adding ControlNet to the core of A1111. Any suggestions would be appreciated.

Hey, I really loved your work! And great suggestions : )

Yeah, this is definitely a super makeshift effort : ) I just needed something quick and dirty for the moment, but as you said, maintaining it will potentially be a headache. I think 1) adding the GET methods, 2) moving to a separate script (done on my system locally), and 3) returning control results, as you and others suggested, would be good for the moment. I'm also a fan of exploring the right implementation model going forward, which is why I wanted to start this thread.

@fmac2000

@sangww - having problems running this, the fallback case line 350 always is True. therefore control net is not used? Am I missing something?

@sangww
Contributor Author

sangww commented Feb 19, 2023

@sangww - having problems running this, the fallback case line 350 always is True. therefore control net is not used? Am I missing something?

Shouldn't be true. I have updated the commit, which now also includes an example API call from a Python notebook. This could be helpful for comparing with your setup. There are a few things to check, given how makeshift this is:

  • Do you have other scripts installed on txt2img and img2img? Should have none for this to correctly work in the current implementation. You could simply disable those (such as additional network)
  • Do you see any error message? I wonder if the input image for controlnet is correctly encoded, which could lead to a failed processing.

@fmac2000

fmac2000 commented Feb 19, 2023

No error message, no extra modules, installed your branch on a fresh install - I use the same encoding for the input for ControlNet as the initImage (without the "data:image/png;base64," substring) and still the flag is checked True.

Could you double check if this is occurring for you too? -> There is no exception clause in the chain of execution afterward - hence there is no error displayed.

@sangww
Contributor Author

sangww commented Feb 19, 2023

No error message, no extra modules, installed your branch on a fresh install - I use the same encoding for the input for ControlNet as the initImage (without the "data:image/png;base64," substring) and still the flag is checked True.

Could you double check if this is occurring for you too? -> There is no exception clause in the chain of execution afterward - hence there is no error displayed.

That is strange indeed. I suspect it could be the extension's own setting.
Could you try: go to settings tab -> ControlNet -> ensure true is set for "Allow other script to control this extension"?
Edit: tested with my config and it worked as expected.

@Mikubill Mikubill merged commit 5fed282 into Mikubill:main Feb 20, 2023
@Ericxgao

Ericxgao commented Feb 20, 2023

I have the same issue still, with the settings changes.

        # todo: extend to include wither alwaysvisible scripts
        processed = scripts.scripts_img2img.run(p, *(p.script_args))

        if processed is None:  # fall back
            processed = process_images(p)

Processed is always none here

@KotoriKoi
Contributor

image
image
image

The new version of ControlNet adds Guidance strength, but there is no such parameter in the API. When generating through the API, Guidance strength defaults to zero, which produces seriously wrong results. Could you add this parameter to the API, or tell me how to add it? Please refer to the three pictures above for the difference in results.

@Mikubill
Owner

Adding more entries to script_args should fix it. (PR welcome)

"threshold_a": controlnet_threshold_a,
"threshold_b": controlnet_threshold_b,
}
p.scripts = scripts.scripts_txt2img
p.script_args = (
0, # todo: why
cn_args["enabled"],
cn_args["module"],
cn_args["model"],
cn_args["weight"],

@KotoriKoi
Contributor

@Mikubill I'm confused: when I add the following code, following the pattern of the other parameters, it doesn't work.

controlnet_guidance: float = Body(1.0, title='ControlNet Guidance Strength'),
cn_args = {..., "guidance": controlnet_guidance,}
p.script_args = (..., cn_args["guidance"],)

@ljleb
Collaborator

ljleb commented Feb 26, 2023

is there a way to apply multi-controlnet?

Working on it in #384

@chrisbward

Just started getting this error;

Error running process_batch: /home/user/stable-diffusion-webui/extensions/sd-webui-additional-networks/scripts/additional_networks.py
Traceback (most recent call last):
  File "/home/user/stable-diffusion-webui/modules/scripts.py", line 395, in process_batch
    script.process_batch(p, *script_args, **kwargs)
  File "/home/user/stable-diffusion-webui/extensions/sd-webui-additional-networks/scripts/additional_networks.py", line 158, in process_batch
    if not args[0]:
IndexError: tuple index out of range

I've installed the LoRA network today?

@chrisbward

Hey, so I had to disable "additional networks" extension and it started working again.

Another issue, GET request on "controlnet/model_list" returns an empty array?

@marcsyp

marcsyp commented Mar 3, 2023

Is there any documentation of how to use the ControlNet API other than what is in the A1111 API documentation? It would be great to see an example of successfully running a txt2img and an img2img using ControlNet with POST requests. I've used POST for regular img2img from a 3D software environment and would like to add the capability to use CN as well. Thanks!

@ljleb
Collaborator

ljleb commented Mar 3, 2023

We definitely need to describe what each field of controlnet_units objects expects as input somewhere.

In the meantime, tl;dr: /controlnet/txt2img is the same route as /sdapi/v1/txt2img, and /controlnet/img2img is the same route as /sdapi/v1/img2img, except for an extra property that has been added at the root of the JSON object in the body: "controlnet_units": [], which expects a list of ControlNetUnitRequest.

Each field of the ControlNetUnitRequest object has its own value and maybe we should look into clarifying what the expected value is for each. You can find each field at http://localhost/docs -> model-ControlNetUnitRequest at the bottom of the page.

You can start by specifying only the "model": "..." property IIRC, which expects one of the values returned by GET /controlnet/model_list. The defaults of all values are dumped at http://localhost/docs. (although you seem to already be aware of this)
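
As a rough sketch of the flow just described, the snippet below queries GET /controlnet/model_list and builds the smallest body mentioned above: a prompt plus one unit that names a model. The base URL and function names are assumptions, the key "controlnet_units" follows the example object in this comment, and the shape of the model_list response is not documented in this thread, so it is returned unparsed beyond JSON decoding.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7860"  # assumed local webui address


def fetch_model_list(base_url: str = BASE_URL):
    """GET /controlnet/model_list and return the decoded JSON response as-is."""
    with urllib.request.urlopen(f"{base_url}/controlnet/model_list") as resp:
        return json.load(resp)


def minimal_body(prompt: str, model: str) -> dict:
    """Smallest body described above: a prompt plus one unit naming a model."""
    return {"prompt": prompt, "controlnet_units": [{"model": model}]}


if __name__ == "__main__":
    # fetch_model_list() needs a running server, so only the body is printed here.
    body = minimal_body("1girl", "diff_control_sd15_depth_fp16 [978ef0a1]")
    print(json.dumps(body, indent=2))
```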

Here's an example img2img object that I used when testing the multi-controlnet api pr: (ellipsis ... is for omitted base64 image values)

json object

{
  "init_images": [...],
  "resize_mode": 0,
  "denoising_strength": 0.5,
  "image_cfg_scale": 0,
  "mask_blur": 4,
  "inpainting_fill": 0,
  "inpaint_full_res": true,
  "inpaint_full_res_padding": 0,
  "inpainting_mask_invert": 0,
  "initial_noise_multiplier": 0,
  "prompt": "1girl",
  "styles": [],
  "seed": -1,
  "subseed": -1,
  "subseed_strength": 0,
  "seed_resize_from_h": -1,
  "seed_resize_from_w": -1,
  "sampler_name": "Euler",
  "batch_size": 1,
  "n_iter": 1,
  "steps": 50,
  "cfg_scale": 7,
  "width": 512,
  "height": 512,
  "restore_faces": false,
  "tiling": false,
  "negative_prompt": "",
  "eta": 0,
  "s_churn": 0,
  "s_tmax": 0,
  "s_tmin": 0,
  "s_noise": 1,
  "override_settings": {},
  "override_settings_restore_afterwards": true,
  "sampler_index": "Euler",
  "include_init_images": false,
  "controlnet_units": [
    {
      "model": "diff_control_sd15_depth_fp16 [978ef0a1]"
    }
  ]
}

In this case, the controlnet unit specified inherits the image specified in the "init_images" property. To specify a custom image, use the controlnet property "input_image": "base64...".

You can probably omit most of the root values. In my case I used a diff model.
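
To illustrate the "input_image" override, here is a hedged sketch that builds a unit carrying its own image and a trimmed img2img body. The function names are hypothetical, and the assumption that omitted root fields fall back to server defaults comes from the remark above rather than from any verified behaviour.

```python
import base64


def unit_with_image(model: str, image_bytes: bytes) -> dict:
    """Return a ControlNet unit that carries its own base64-encoded input image."""
    return {
        "model": model,
        "input_image": base64.b64encode(image_bytes).decode("utf-8"),
    }


def img2img_body(init_image_b64: str, prompt: str, units: list) -> dict:
    """Trimmed img2img body; omitted root fields are assumed to use defaults."""
    return {
        "init_images": [init_image_b64],
        "prompt": prompt,
        "denoising_strength": 0.5,
        "controlnet_units": units,
    }
```

Without "input_image", the unit would inherit the image from "init_images" as described above.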

@marcsyp
Copy link

marcsyp commented Mar 3, 2023

This is great info, I think I understand better already -- only thing is that I'm not seeing the ControlNetUnitRequest documented in the localhost docs, so I can't see the fields available. I only see 3 things documented related to controlnet:

And in Schemas:

Body_detect_controlnet_detect_post{
controlnet_module Controlnet Module[...]
controlnet_input_images Controlnet Input Images[...]
controlnet_processor_res Controlnet Processor Resolution[...]
controlnet_threshold_a Controlnet Threshold a[...]
controlnet_threshold_b Controlnet Threshold b[...]
 

}


I may be missing something about how the docs are used.

@ljleb
Collaborator

ljleb commented Mar 3, 2023

@marcsyp Do you see any /sdapi/v1 routes? There should also be these routes in the docs:

  • POST /controlnet/txt2img
  • POST /controlnet/img2img

Plus the ControlNetUnitRequest schema.

If you don't see any /sdapi/v1 routes, please refer to the discussion in #421. Might be related.

@marcsyp

marcsyp commented Mar 3, 2023

I don't, which is weird -- I don't see any txt2img or img2img routes at all.... in the past I did see those routes, that's how I learned how to do my first successful POSTs on txt2img and img2img...

@marcsyp

marcsyp commented Mar 3, 2023

fastapiDocs.pdf

This is what I'm seeing @ljleb

@ljleb
Collaborator

ljleb commented Mar 3, 2023

Yeah I'm not sure why it happens to be honest. One fix could be to hijack the webui *2img routes differently, so that they still appear even when the webui does not show basic api routes.

@ljleb
Collaborator

ljleb commented Mar 3, 2023

Note that even with a fix, soon the webui will start supporting always on scripts in api calls: AUTOMATIC1111/stable-diffusion-webui#8253

When/if this PR is eventually merged, we will probably deprecate/remove our own /txt2img and /img2img routes.

@marcsyp

marcsyp commented Mar 3, 2023

so is this a bug in CN or in A1111? How do I get back to seeing all the routes in the api docs, even just for basic a1111? wait and keep trying? :)

@ljleb
Collaborator

ljleb commented Mar 3, 2023

If you try uninstalling the controlnet extension you should still not see any /sdapi/v1 route. It is probably not a controlnet issue. Maybe it is because you have encryption/middleware enabled, or something like that? Environment variables maybe? I am not sure.

@paulo-coronado

Guys, I am experiencing the same issue as @kft334:

"Can anyone confirm that Text-To-Image works? I'm getting bad outputs that seem like normal Text-To-Image. Everything loads and there are no errors though."

I am using the route /sdapi/v1/txt2img and the following payload:

Click to see the payload
{
  "enable_hr": false,
  "denoising_strength": 0,
  "prompt": "...",
  "styles": [
    "string"
  ],
  "seed": -1,
  "subseed": -1,
  "subseed_strength": 0,
  "seed_resize_from_h": -1,
  "seed_resize_from_w": -1,
  "sampler_name": "DDIM",
  "batch_size": 1,
  "n_iter": 1,
  "steps": 20,
  "cfg_scale": 7,
  "width": 512,
  "height": 512,
  "restore_faces": false,
  "tiling": false,
  "negative_prompt": "...",
  "eta": 0,
  "s_churn": 0,
  "s_tmax": 0,
  "s_tmin": 0,
  "s_noise": 1,
  "override_settings": {},
  "override_settings_restore_afterwards": true,
  "script_args": [],
  "sampler_index": "DDIM",
  "controlnet_units": [
    {
      "module": "canny",
      "model": "control_canny-fp16 [e3fe7712]",
      "weight": 1,
      "resize_mode": "Envelope (Outer Fit)",
      "lowvram": false,
      "guessmode": false,
      "input_image": "..."
    }
  ]
}

Reading the thread, I tried the following options (without success):

  • Tried DDIM and other samplers;
  • Disabled installed scripts. I just kept the built-in ones (LDSR, Lora, ScuNET, SwinIR and prompt-bracket-checker);
  • Enabled "Allow other script to control this extension".

No error message is shown.

Could you guys please help me? @sangww @ljleb @Mikubill @chrisbward

@kiriri

kiriri commented Mar 5, 2023

@paulo-coronado
The route is /controlnet/txt2img

@alexcode0

Hi all, thanks for the development and discussion in this thread, it's very helpful.
I've managed to get the controlnet working through the API, and I'm successfully generating a canny preprocess output. How can I take this canny preprocess and use it to drive the canny model with txt2img via the API?
Thanks again

@ljleb
Collaborator

ljleb commented Mar 6, 2023

@alexcode0 you can pass preprocessed images as a base64 string to the "input_image" property of the controlnet processing unit you want to use. Then you put "module": "none" or just leave out this property, and use the "model" property to select the right canny model. Check the wiki for more info: https://github.com/Mikubill/sd-webui-controlnet/wiki/API#controlnetunitrequest-json-object

You will also find an example of the json structure via localhost:7869/docs.

@alexcode0

alexcode0 commented Mar 6, 2023

@alexcode0 you can pass preprocessed images as a base64 string to the "input_image" property of the controlnet processing unit you want to use. Then you put "module": "none" or just leave out this property, and use the "model" property to select the right canny model. Check the wiki for more info: https://github.com/Mikubill/sd-webui-controlnet/wiki/API#controlnetunitrequest-json-object

You will also find an example of the json structure via localhost:7869/docs.

Hm. I tried exactly that before posting my Q. Seemed like the logical way to implement it, but no luck.

def run_sd():
    img = cv2.imread('canny.png')
    png_img = cv2.imencode('.png', img)
    preproc_64 = base64.b64encode(png_img[1]).decode('utf-8')
    #choose model for each layer
    option_payload = {
        "sd_model_checkpoint": "galaxytimemachinesGTM_v3.safetensors [f8ad2aafb5]",
        "CLIP_stop_at_last_layers": 2
    }
    response = requests.post(url=f'{url}/sdapi/v1/options', json=option_payload)
    payload = {
        "prompt": "A PURPLE VEST",
        "negative_prompt": "",
        "width": 512,
        "height": 512,
        "steps": 100,
        "cfg": 10,
        "sampler_index": "DPM++ 2S a Karras",
        "controlnet_units": [
            {
                "input_image": preproc_64,
                "mask": '',
                "module": "none",
                "model": "control_sd15_canny [fef5e48e]",
                "weight": 1.6,
                "resize_mode": "Scale to Fit (Inner Fit)",
                "lowvram": False,
                "processor_res": 512,
                "threshold_a": 64,
                "threshold_b": 64,
                "guidance": 1,
                "guidance_start": 0,
                "guidance_end": 1,
                "guessmode": True
            }
        ]
    }
    generate_image(payload,"output")
def generate_image(payload, filename):
    response = requests.post(url=f'{url}/controlnet/txt2img', json=payload)
    r = response.json()
    #print(r)
    for i in r['images']:
        image = Image.open(io.BytesIO(base64.b64decode(i.split(",", 1)[0])))

        png_payload = {
            "image": "data:image/png;base64," + i
        }
        response2 = requests.post(url=f'{url}/sdapi/v1/png-info', json=png_payload)
        pnginfo = PngImagePlugin.PngInfo()
        pnginfo.add_text("parameters", response2.json().get("info"))
        image.save(f'{filename}.png', pnginfo=pnginfo)
    return

@ljleb
Collaborator

ljleb commented Mar 6, 2023

Hm. I tried exactly that before posting my Q. Seemed like the logical way to implement it, but no luck.

Can you elaborate on what you expect to see, and then what you see instead happening? I don't have enough info at the moment to help.

@alexcode0

Hm. I tried exactly that before posting my Q. Seemed like the logical way to implement it, but no luck.

Can you elaborate on what you expect to see, and then what you see instead happening? I don't have enough info at the moment to help.

Essentially when I use the webui, I can upload an image to the control net, and get a preprocessed image, which is then ported to the txt2img workflow and is generated using the prompt and the preprocessed image as a guide.

When I try to do this via the API, I get the canny preprocess successfully. When I pass that preprocessed image to the control net again and try to apply the model (control_sd15_canny [fef5e48e]), the output image is unchanged from the preprocessed input. See image below:

Untitled

@ljleb
Collaborator

ljleb commented Mar 6, 2023

Thanks for the info. The api should return 2 images in your case, 1 for the preprocessor (unchanged when "module": "none") and 1 for the txt2img output. Are you sure only 1 image is returned?

As you save both images under the same name, I suspect the txt2img output is just overwritten by the preprocessor image. Consider adding i to your filename somewhere, i.e. f'{filename}-{i}' and use for i, img in enumerate(r['images']) to iterate over your images
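
The suggestion above could look roughly like this. It is only a sketch: save_images is a hypothetical helper, and it assumes every entry of r['images'] is base64-encoded PNG data, optionally with a "data:...;base64," prefix.

```python
import base64
from pathlib import Path


def save_images(images_b64, filename: str, out_dir: str = ".") -> list:
    """Write each returned image to its own file: <filename>-0.png, <filename>-1.png, ..."""
    paths = []
    for i, img in enumerate(images_b64):
        # Keep only the payload if the string carries a data-URL prefix.
        payload = img.split(",", 1)[-1]
        path = Path(out_dir) / f"{filename}-{i}.png"
        path.write_bytes(base64.b64decode(payload))
        paths.append(path)
    return paths
```

Called as save_images(r['images'], "output"), this keeps the txt2img result and the detect map in separate files instead of overwriting one with the other.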

@ljleb
Collaborator

ljleb commented Mar 6, 2023

You can check "don't save detectmap to output" in the webui settings if you don't want the ControlNet maps as output (this should probably be an API parameter).

@alexcode0

Thanks for the info. The api should return 2 images in your case, 1 for the preprocessor (unchanged when "module": "none") and 1 for the txt2img output. Are you sure only 1 image is returned?

As you save both images under the same name, I suspect the txt2img output is just overwritten by the preprocessor image. Consider adding i to your filename somewhere, i.e. f'{filename}-{i}' and use for i, img in enumerate(r['images']) to iterate over your images

Bingo! I thought that may have been the case and checked the response json but didn't find 2. Went back and checked and yup, it's in there alright.

Thank you for taking the time to help

@ImranBug

import requests
import base64

# Open the control image file in binary mode
with open("poses\poses_base.png", "rb") as f:
    # Read the image data
    image_data = f.read()

    # Encode the image data as base64
    image_base64 = base64.b64encode(image_data)

    # Convert the base64 bytes to string
    image_string = image_base64.decode("utf-8")

# Define the payload with the prompt and other parameters
payload = {
    "prompt": "A PURPLE VEST",
    "negative_prompt": "",
    "width": 512,
    "height": 512,
    "steps": 100,
    "cfg": 10,
    "sampler_index": "DPM++ 2S a Karras",
    "controlnet_units": [
        {
            "input_image": image_string,
            "mask": '',
            "module": "none",
            "model": "control_sd15_canny [fef5e48e]",
            "weight": 1.6,
            "resize_mode": "Scale to Fit (Inner Fit)",
            "lowvram": False,
            "processor_res": 512,
            "threshold_a": 64,
            "threshold_b": 64,
            "guidance": 1,
            "guidance_start": 0,
            "guidance_end": 1,
            "guessmode": True
        }
    ]
}

# Send the request to the API endpoint
response = requests.post(url=f'http://127.0.0.1:7860/controlnet/txt2img', json=payload)

# Print the response
print(response)

The response is giving 500 server error, can any one help me what am I doing wrong?

@ljleb
Collaborator

ljleb commented Mar 13, 2023

With latest version of webui it is possible that the controlnet API does not work anymore. There are efforts currently to put it back on its feet in #527

@dbokser

dbokser commented Mar 15, 2023

# Send the request to the API endpoint
response = requests.post(url=f'http://127.0.0.1:7860/controlnet/txt2img', json=payload)

# Print the response
print(response)

The response is giving 500 server error, can any one help me what am I doing wrong?

Have you tried decoding the response?

decoded = response.read().decode()

I also had a 500 Server Error but when decoding it gave me more information, telling me the input image wasn't encoded correctly.
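
For readers using the requests library shown earlier in the thread, the same idea might be sketched as below. describe_failure is a hypothetical helper; the "detail" key is FastAPI's usual convention for error bodies, and whether this server's 500 responses use it is an assumption.

```python
import json


def describe_failure(status_code: int, body_text: str) -> str:
    """Summarise a failed response, surfacing the JSON "detail" field when present."""
    try:
        parsed = json.loads(body_text)
    except ValueError:
        # Not JSON: fall back to the raw text, truncated to keep logs readable.
        return f"HTTP {status_code}: {body_text[:500]}"
    if isinstance(parsed, dict) and "detail" in parsed:
        return f"HTTP {status_code}: {parsed['detail']}"
    return f"HTTP {status_code}: {parsed}"


# With a requests.Response object this might be called as:
#     if not response.ok:
#         print(describe_failure(response.status_code, response.text))
```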

But is there any update on whether controlnet API is working in the latest release? I tried both the /controlnet/txt2img and /sdapi/v1/txt2img endpoints and they both aren't going through controlnet

@ljleb
Collaborator

ljleb commented Mar 15, 2023

Waiting on approval for #587. The API will work again once it is merged.

@Bruce-shuai

I don't, which is weird -- I don't see any txt2img or img2img routes at all.... in the past I did see those routes, that's how I learned how to do my first successful POSTs on txt2img and img2img...

Hi, I also ran into this problem. Did you solve it?

Development

Successfully merging this pull request may close these issues.

Adding this once available Feature Request: Add API Support