
[Discussion] Diffuser Framework vs WebUI Extension #10

Open
paulo-coronado opened this issue Feb 24, 2023 · 18 comments

Comments

@paulo-coronado

paulo-coronado commented Feb 24, 2023

Hello @haofanwang,

I am trying to replicate Mikubill's Transfer Control. Initially I followed this guide, and after comparing the output of the AnyV3_Canny_Model.pth it creates against the WebUI (using the same prompt, seed, etc.), I realized they are not the same... is that normal? Please check the differences below:

[Image: Blank 2 Grids Collage comparing the two outputs]

In addition, I am trying to save the merged weights when using the Mikubill repo. I tried adding the following line at the end of PlugableControlModel's init() in cldm.py:

torch.save(final_state_dict, './control_any3_openpose.pth')

But I don't think it is correct... do you have any thoughts on this?

@haofanwang
Owner

(1) How was AnyV3_Canny_Model.pth created in the WebUI? Isn't it using the method described in ControlNet?

(2) What do you mean by "not the same"? Do the weights have different keys, different values, or both?

(3) Can you provide more details about what you want to do? What is your base model and ControlNet (which condition)? If possible, could you attach the input and control image (pose, depth, or else), so that I can have a try?

@plcpinho

@paulo-coronado
Author

paulo-coronado commented Feb 24, 2023

(1) I didn't create AnyV3_Canny_Model.pth via the WebUI. That is one of my goals, but I don't think it is possible, because as I understand it, Mikubill is not merging the models like you are; he is doing it "on the fly" (I didn't understand exactly how). The comparison above is AnyV3_Canny_Model.pth created via this guide vs. Mikubill WebUI image generation;

(2) Since both methods (tool_transfer_control.py and the Mikubill WebUI) try to achieve the same result (CustomSD15+ControlNet), I thought about comparing them and checking whether they generate the same or similar images (using the same prompt, seed, etc.). The result was the above (different images).

(3) To be clearer, my goal is to create the best possible CustomSD15+ControlNet model (e.g. AnyV3_Canny_Model.pth). I found that the Mikubill WebUI generates better images than the ControlNet base-model replacement via tool_transfer_control.py. Does that make sense, or is it a completely nonsense statement? 🤔

PS:
- By CustomSD15 models I mean any SD15 model, such as AnythingV3, RealisticVision, etc.
- By ControlNet models I mean SD15Canny, SD15OpenPose, etc.

@haofanwang Thank you very much for replying to the thread!
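The checkpoint comparison described in (2) could be sketched as a key-by-key diff of the two state_dicts. This is a hypothetical illustration with plain floats standing in for tensors; real checkpoints would be loaded with torch.load() and compared with torch.allclose():

```python
# Sketch: diff two checkpoint state_dicts key-by-key to see whether the
# "merge beforehand" and "merge on the fly" weights agree. Plain floats
# stand in for tensors; key names are illustrative.

def compare_state_dicts(a, b, tol=1e-6):
    """Return keys only in a, keys only in b, and shared keys whose values differ."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differing = sorted(k for k in set(a) & set(b) if abs(a[k] - b[k]) > tol)
    return only_a, only_b, differing

merged_beforehand = {"input_blocks.0.weight": 0.50, "middle_block.weight": 1.00}
merged_on_the_fly = {"input_blocks.0.weight": 0.50, "middle_block.weight": 1.25}

print(compare_state_dicts(merged_beforehand, merged_on_the_fly))
# → ([], [], ['middle_block.weight'])
```

If the two methods really merged identically, the third list would be empty for every layer.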

@paulo-coronado
Author

paulo-coronado commented Feb 24, 2023

I am trying to analyze Mikubill's code in order to create an AnyV3_Canny_Model.pth file and then compare the weights of both methods (Mikubill vs tool_transfer_control.py), but I am not succeeding, because, as I said, I suspect there is no model merge. 😓
What do you think, @haofanwang?

@haofanwang
Owner

I totally agree with you. I also failed to figure out how the WebUI works on the fly. That is why I made this project.

One possible reason is that tool_transfer_control.py may miss something, so the model is not fully converted. I'm not sure whether the released ControlNet weights are completely independent of the UNet or, as mentioned in the ControlNet repo, whether some layers were also trained into the UNet. In that case, we would only be loading part of the weights.
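For reference, the beforehand merge that tool_transfer_control.py aims at can be sketched as a per-key delta transfer, as described in the ControlNet guide. This is a sketch under that assumption, with plain floats standing in for tensors and illustrative key names:

```python
# Sketch of the "transfer control" merge (assumption based on the ControlNet
# guide): for every key the ControlNet shares with the base UNet, shift the
# weight by the delta between the custom model and base SD1.5.
# Floats stand in for tensors; key names are illustrative.

def transfer_control(controlnet, base_sd15, custom_sd15):
    merged = {}
    for k, w in controlnet.items():
        if k in base_sd15 and k in custom_sd15:
            merged[k] = w + (custom_sd15[k] - base_sd15[k])  # transplant the custom delta
        else:
            merged[k] = w  # ControlNet-only layers (e.g. zero convs) copied unchanged
    return merged

controlnet = {"input_blocks.0.weight": 0.75, "zero_convs.0.weight": 0.0}
base_sd15 = {"input_blocks.0.weight": 0.5}
anything_v3 = {"input_blocks.0.weight": 1.0}

print(transfer_control(controlnet, base_sd15, anything_v3))
# → {'input_blocks.0.weight': 1.25, 'zero_convs.0.weight': 0.0}
```

If the script skipped any shared key (or some ControlNet weights were actually trained into the UNet), the merged checkpoint would silently diverge from what the WebUI computes.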

Anyway, thanks for your finding. I'm also interested in converting on the fly; we can keep each other updated. @plcpinho

@paulo-coronado
Author

paulo-coronado commented Feb 24, 2023

@haofanwang I just found something very interesting! This guide was written by Illyasviel two weeks ago; last week this thread was created, in which Illyasviel, Mikubill, and Kohya-ss discussed Transfer Control implementations. At one point in the conversation, Kohya-ss made this pull request for a Transfer Control method (which is, by the way, the method currently used in Mikubill's repo). In Kohya-ss' words: "extract_controlnet_diff.py makes the difference and save the state_dict with key difference as a marker, and cldm.py handles it on the fly."

I believe the answer to our questions is in the thread and PR mentioned above!
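Kohya-ss' description can be illustrated roughly like this. Note this is a sketch of the idea, not the actual extract_controlnet_diff.py code; the real marker and key format are in the PR, and floats stand in for tensors:

```python
# Rough sketch of the diff-based approach Kohya-ss describes (not the actual
# extract_controlnet_diff.py code). Extraction stores only controlnet - base
# plus a marker entry; at load time the diff is added to whatever model is
# active, so no merged checkpoint ever hits the disk.

def extract_diff(controlnet, base_sd15):
    diff = {"difference": True}  # marker entry (illustrative, not the real format)
    for k, w in controlnet.items():
        diff[k] = w - base_sd15[k] if k in base_sd15 else w
    return diff

def apply_diff_on_the_fly(diff, active_model):
    assert diff.get("difference"), "not a difference checkpoint"
    return {k: (w + active_model[k] if k in active_model else w)
            for k, w in diff.items() if k != "difference"}

controlnet = {"input_blocks.0.weight": 0.75, "zero_convs.0.weight": 0.0}
base_sd15 = {"input_blocks.0.weight": 0.5}
anything_v3 = {"input_blocks.0.weight": 1.0}

diff = extract_diff(controlnet, base_sd15)
print(apply_diff_on_the_fly(diff, anything_v3))
# → {'input_blocks.0.weight': 1.25, 'zero_convs.0.weight': 0.0}
```

This would explain why no merged .pth file exists to save: the WebUI only ever holds the reconstructed weights in memory.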

@ghpkishore

@plcpinho Can we, for testing, use the Mikubill repo to get the canny model and then use that? Should that work? I didn't fully understand how to solve this, except that I too ran into an error when trying to create canny-edge-based inpainting output. I would appreciate specific steps for using the new control models in our inpainting pipeline.

@haofanwang
Owner

@plcpinho I'm glad to know this! I can work on it based on your info. If you are interested, a PR is very welcome.

@haofanwang
Owner

Will dive into Mikubill/sd-webui-controlnet#80 and Mikubill/sd-webui-controlnet#73. If anyone is willing to help, please let me know.

@haofanwang
Owner

I don't find any difference between merging beforehand and merging on the fly. They actually do the same thing. It is still not clear what leads to such a difference; I will spend more time investigating this.

For now, I have no plan to add merging on the fly to this repo, as we care more about usage in diffusers. Our goal is to load the new model via the diffusers from_pretrained() function.
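Algebraically, the two strategies should indeed coincide: merging beforehand computes custom + (controlnet - base) per key up front, while the diff approach stores controlnet - base and adds it to the active model at load time. A minimal sketch, using exactly representable binary floats so the equality is exact:

```python
# Why beforehand vs on-the-fly merging should be identical: both compute
# custom + (controlnet - base) per shared key, just at different times.
# Exact binary floats keep the equality exact here; real tensors could show
# tiny fp differences depending on dtype and operation order.

controlnet, base, custom = 0.75, 0.5, 1.0   # one shared key, for brevity

merged_beforehand = controlnet + (custom - base)   # done once, saved to disk
diff = controlnet - base                           # saved as a "difference" file
merged_on_the_fly = custom + diff                  # recomputed at load time

print(merged_beforehand, merged_on_the_fly)  # → 1.25 1.25
assert merged_beforehand == merged_on_the_fly
```

So if the generated images differ, the cause is more likely elsewhere (e.g. which keys get matched, dtype casts, or preprocessing) than in the merge arithmetic itself.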

@paulo-coronado
Author

Thank you for your reply, @haofanwang!

Technically, merging beforehand and on the fly do the same thing. However, I am not sure the models generate the same/similar images; I am going to run some more tests today! By the way, you said in this thread that the model is actually merged in both methods. Do you know how to save the merged model in sd-webui-controlnet? If we have both models saved, it will be easy to compare merging beforehand vs. on the fly. I tried adding the following line to cldm.py:

# Does not work... saves a ~700KB file
torch.save(state_dict, './merged_model.pth')

About cldm.py, I am also not sure this code actually runs, because if you look at the "if" statement (line 66), there is never a "k" item starting with "control_model.". So, during my tests, the operations you mentioned above never get called. You can see that by adding some print() calls and running ControlNet via the WebUI... 🤔
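The prefix suspicion can be reproduced in isolation. These key names are hypothetical; the real ones come from the state_dict the WebUI passes in:

```python
# Minimal reproduction of the suspicion above (hypothetical key names): if the
# state_dict reaching cldm.py has its "control_model." prefix already stripped
# upstream, a branch guarded by k.startswith("control_model.") never executes.

state_dict = {
    "input_blocks.0.0.weight": 0.1,   # prefix already stripped upstream
    "zero_convs.0.0.weight": 0.0,
}

control_keys = [k for k in state_dict if k.startswith("control_model.")]
print(control_keys)  # → [] : the guarded merge/remap code is skipped entirely
```

Printing the first few keys of the incoming state_dict just before the "if" would confirm which form the keys arrive in.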

@haofanwang
Owner

It's kind of weird, let's follow up.

@ghpkishore

@paulo-coronado Any updates on how the canny model is differing? Also, I would appreciate it if you could share whether it is only the canny model that differs from the Mikubill WebUI, or whether others differ as well.

@paulo-coronado
Author

Hey, @ghpkishore! I don't know why it differs, but I can tell you that merging using this guide works! Although it generates different results, the images are still great! I might have done something wrong to produce the example above...

@ghpkishore

ghpkishore commented Mar 1, 2023

Thanks @paulo-coronado. Do you know how I can convert safetensors into a normal PyTorch bin and use that? I am running into an error when I try to use the already-made ControlNet canny edge model, saying that the weights are not initialized.

Tried for a couple of hours and gave up.

@paulo-coronado
Author

Do you have Discord, @ghpkishore? Send me your profile so we can chat there :)

@ghpkishore

@paulo-coronado It is the same. @ghpkishore

@paulo-coronado
Author

@ghpkishore The Discord username is something like @ghpkishore#0000

@ghpkishore

@paulo-coronado ghpkishore#4438
