-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecate /controlnet/*2img
web API in favor of new alwayson_scripts support
#527
Conversation
There are numerous bugs and it does not work correctly on the WebUI.
|
Thanks for testing. Yes of course, this is a work in progress. (hence still a draft) I haven't finished updating the code, just a rough idea of the change at the moment. There is probably a lot more than this to change. Hopefully, sharing my progress so far helps a bit communicate my intentions. Although, as there are no tests, most of the bugs I will catch/fix probably will be through debugging the scenarios I am aware of. |
Maybe we should add tests to the repo before this? I have no prior professional experience with python, so I don't know how to set them up properly. I can open a PR for tests but it will probably be the wrong way to do them. I attempted to insert the tests into the testing setup of the webui somehow, but haven't had any luck so far. When I tried setting up our own local tests, independent of webui host, the include paths were wrong and im not sure what a clean fix would be. |
Quick test of 1 controlnet unit seems to work atm. Remains to be tested:
|
For clarity, this PR makes it so that it's possible to call the web API {
...
"alwayson_scripts": {
"ControlNet": {
"args": [false, true,
{ // 1
"image": "base64...",
"model": "model of choice..."
}, {// 2
"image": "base64...",
"model": "second model of choice..."
}]
}
}
} It changes as little as possible the existing web API arguments structure. |
Hi @ljleb, this is about the adding tests to the repo. This repo is actually quite big right now to start writing tests for each method. But we can add test cases for common end to end cases. But for that a lot of efforts will be required. I haven't looked at entire code flow but if we can agree on the several test cases then I can create one PR with them to get things started. As you mentioned that paths from webui does not get resolved for independent test cases. For that If you can somehow share the code you have worked on then I might be able to help as I have good knowledge in testing with python. If you want to create PR for test cases then it would be fine too. I might jump into that PR for suggestions if you think that's okay. |
Since we are refactoring the structure, quick suggestion from my side. Is there any way to pass is_img2im and is_ui values rather then args to process method? Because that will make structure very easy as it will be just List[ControlNetUnit] to be passed on. Ignore if you think this is not required. |
(apologies in advance for long comment...) @PhoenixCreation Thanks for the help! Yes I think it is unrealistic to aim to test everything. Starting with something anywhere between 1 to 20 tests would really help, and then we could consider adding more test cases over time whenever we fix a bug, add a new feature or find a regression. At the moment I am relying a lot on manually testing this PR. Having some tests would increase my confidence that this does not break anything. I think I would prefer for you to start an effort in this direction, if that's okay with you of course. If you'd prefer that someone else start the base testing PR, then I could try putting something together real quick. I would really appreciate if you could provide some guidance into writing the tests and setting up the automatic discovery mechanism correctly. If you want, another option could be to co-author the PR? I'm not sure if you wanted to add tests on this repo before or after this PR, so in case it's after, this is what this PR does:
I am not totally certain of whether we really need to backwards support the flattened process args. I would think we should be able to just drop support for them, as developers are expected to use the external code API or web API to interface with the extension. Backwards compatibility is something we have to pay in code complexity, so I'm open to removing this feature. I'd say tests for this have a lower priority than the public facing API code + gradio interface. (if there's any way to test the gradio interface 😅) For the web API, my intention with this PR is to make it as easy as possible for web API users to upgrade their calls. For this to work, I tried to make it so that the name of the properties do not have to be changed from the existing So in other words, I believe we ideally need to verify that: (this is only for suggestive purposes, feel free to break it down into a different set of test cases)
Again, it may be unrealistic to add all of these cases in the initial test PR. I tried to put them in order of my subjective perception of their priority. For your suggestion on |
/controlnet/*2img
web API in favor of new alwayson_scripts support
…into array-of-struct
I manually tested the main use cases: txt2img/img2img in deprecated/alwayson web API routes, txt2img/img2img in gradio interface. Marked as ready for review, but maybe we should consider waiting to have a couple of automated tests for some of the code this PR touches before merging. Also maybe there is still something to do for |
I was asking to just get rid of them from API schema. It is not good design that you will have two booleans in array then object of actual information. So in simple words, just try to remove them. |
@ljleb, And about testing, I will create separate PR for that and we can take discussion about test cases there. Keep this PR for API changes only. As for list of test cases provided by you seems okay although difficulty to implement them will be the deciding factor to weather implement them or not. Also I would like others(specially @Mikubill) to also provide more insights about which things needs to be tested and which not. Also I am planning to use pytest instead of unit_test(builtin) as it is more convenient for developers. Do let me know if we want to keep it to inbuilt modules. |
Looks good. I've added a basic txt2img api test (using unit_test tho), we can add/discuss more test cases later. |
I guess the wiki will be updated soon to reflect the recent changes but, just to be sure, can you confirm that these are the proper key strings to send with the ControlNetUnit dictionary?
I'm trying to map the new implementation in a C# codebase and sending a list of arbitrary typed arguments to alwayson_scripts is proving to be quite the challenge, using the dictionary method might be useful. Edit: I guess this is the relevant block for the proper request fields. I noticed that some are missing ( Edit: Regarding the args booleans, from a strongly typed language like C#, I think that using a dictionary would be a better approach for modeling the request objects. Something like: {
...
"alwayson_scripts": {
"ControlNet": {
"args": {
"is_img2img": false,
"is_ui": false,
"units": [
{
"enabled": true,
....
}, { ... } It's not perfect and there is still the problem of mixing types but the intention is more explicit I think. |
I'm not sure if this is the right place but I'm now getting the following error when trying to call controlnet via API:
I'm using /sdapi/v1/img2img/ with this payload:
I get back a normal img2img response but ControlNet isn't used. Am I doing something wrong? |
|
@PhoenixCreation I see, was afraid that might be the case. It's a very unfriendly approach if you need to pass the information around in object models. I'm still unsure about the proper key strings for the CN unit, I was under the impression that |
@Hugo-Matias It is already merged but seems it has some issues as pointed out by @aiton-sd, I think we need to revert the changes of this MR and need to test it little bit more before we apply this changes. |
Just as an amateur user I would really appreciate if changes to the API were documented before being merged. I'm reading this thread trying to understand how to change my API call but I just keep getting different errors. |
Indeed, sorry that's what I meant with before merging and as @physis123 stated we should also focus on documentation parity. |
To pass images, you just pass the base64 string to
I agree. Not sure how to do that though? the webui just forwards the values to process and postprocess, and atm this information is needed inside process and postprocess. We need to find a good way to deliver it without needing to pass them as argument. |
I apologize for the inconvenience. |
You need to call it like this:
Otherwise the webui will not pass any argument to the extension. |
@Mikubill Thank you👍and I'm sorry.🙇 |
Hey @ljleb, It has been really confusing switching API schema. Would you be able to put one sample payload in PR description then it would help lot of people coming here unless we change the documentation. |
@ljleb Isn't the is_ui bool already set as False by default? sd-webui-controlnet/scripts/api.py Line 120 in 6a5580c
And, perhaps it's due to a poor implementation on Automatic's scripting system but in my opinion, is_img2img shouldnt be inferred from the endpoint call? In the end with good call flow both booleans could be excluded. Unfortunately you're right, as @PhoenixCreation stated earlier it seems that the |
I can update the wiki in fact. I'll add an additional section to show how to call |
I really want to know how to do this. Maybe there is already this information available somewhere, but as the webui isn't documented I couldn't find a better way. Script classes have By default |
Unfortunately I can't be much help either, I've played a bit with the script example but never really explored it in depth and never applied an api to it. I'm just trying to think why that boolean is really needed when the endpoint itself is already conditioned to the value.
I see, but as it is right now, passing |
If you pass At the moment the main purpose for |
Ah ok, I think I was getting confused with the controlnet_any2img method of the ApiHijack class, I see that's for routing the deprecated endpoints to standard relevant webui enpoint using the Regarding the ControlNetUnit object fields, are the ones included in the wiki still the valid ones apart from the |
A bit of context from what I understand on the auto1111 api changes for those who haven't gone into the weeds. Feel free to correct me if I am mistaken. Before there wasn't anyway to pass args to always on scripts (scripts that have always_visible == true like cnet) through the auto1111 *2img API. This repo worked around that limitation by cleverly hijacking the api and making their own routes. Normally the scriptrunner, which runs all scripts, has a giant list called script_args which, from what I could tell, apart from index 0, is fed by the gradio elements of the webui who just dump their values into that list at an assigned index. By hijacking the api and making their own scriptrunner, this repo was able to make a nice and friendly api schema which abstracted/fixed/bypassed this process. The new alwayson_scripts param in the API makes it possible to pass the args to those scripts the same way the webui does, which is why it's a list and is a bit different from the current cnet api schema as it is the same args that are feed from the webui. It basically emulates what the webui would pass. Unfortunately, a description of what args to pass is not available unless the extension provides it through some means or you dive in the code, add a print and deduce it yourself. A change of the design of the scriptrunner in auto1111 could be done but you run the risk of breaking every extension that currently exist. |
@Vespinian Do you know if it's possible to get the inference mode from the endpoint, ie. txt2img or img2img? Having a way to know that would help simplify the call arguments. Although if something needs to change to accommodate that functionality it would most likely be on webui's side. |
For clarity, I updated the wiki. Added a section on how to migrate, what the expected structure is, etc. Also added a banner at the top of the web API section that redirects to the migration section. The migrating section is subject to changes, it's there but I'll update it as we find ways to simplify calling the new web API. |
@Hugo-Matias They should be created in the scripts.py here There is a failsafe in the api request to create a scriptrunner if it doesn't exist though. So you might not be able to completely rely on it |
That's interesting, so we could potentially use that to initialize the |
One potential issue we may encounter by doing this is if users replace controlnet script with a copy in a script runner. I don't think this is a blocking issue as it will probably not happen, but it could lead to surprising behavior. |
Well thinking about it a bit more, maybe we could type check the p object itself. It should be either a StableDiffusionProcessingTxt2Img or a StableDiffusionProcessingImg2Img. The details of the implementation is here-ish and you can see the p object being created and passed in the api.py of auto1111 at L244 and L298. The UI must be doing something similiar. I can't test it right now. |
Upon further testing, it seems that |
I tried this approach before, but unfortunately we are swapping the pipeline to txt2img in some cases despite actually being in the img2img tab. |
Oh yeah, that makes sense, it's passed to initialize_scripts and that passes it down to the scripts in the scriptrunner. You see it in action in the api.py when no scripts in the scriptrunner are found L221 and L273 |
Great news! Hopefully it will be enough to run things smoothly and without edge cases.
Would that situation happen with a regular api call on an unaltered SD installation? I don't entirely understand the issue and how it would affect the script's behavior but I can help with testing things once I'm done remapping objects and migrating my calls. |
Regular call paths would not encounter this, but extensions or external code interfacing with scripts could do this. Not very likely to happen as you say, so it could be a valid solution IMO. I think we should try to use the existing facilities the webui provides. If that does not work, we can always go back to this alternative solution. |
Closes #571, closes #567
This combines the 2 argument passing strategies we use in
processing
andpostprocessing
into a single strategy. (using an array of structures) I find that code readability increases with this.As I am not sure whether this really is a good idea, I created a companion discussion: #528. This changes a lot of code, risks breaking something as there are no tests. Feel free to close if it is not welcome.