[New Feature] Integrated Openpose Editor #1454

Merged
merged 1 commit into from
Jun 4, 2023

huchenlei
Collaborator

@huchenlei huchenlei commented May 28, 2023

#1012 proposed an integrated openpose editor. Months have passed, and we are finally close to having one.

Screen Capture 001 - Stable Diffusion - localhost
Screen Capture 003 - Stable Diffusion - localhost

Workflow of this feature:
screen-capture.webm

There are still a few things that need to be worked out though:

How should the edited image be used by controlnet?

#1211 discussed why controlnet cannot directly use the preview result. TL;DR: many preprocessors use values from WebUI that are only available when the generate button is clicked.

Option1: Let controlnet use the pre-rendered openpose image if there is an up-to-date one. (Make an exception for openpose only, as the openpose preprocessor does not depend on any WebUI runtime values.)
Option2: Add a button to push the preprocessor result to the input image and select none for preprocessor. This option is not ideal because once sent, the pose becomes uneditable.

How should we release this feature to the users?

Behind the scenes, the feature embeds an openpose editor web page in an iframe. The controlnet JS code communicates with the iframe web page via window.postMessage. Currently this requires the user to install https://github.com/huchenlei/sd-webui-openpose-editor/tree/main separately.

If the extra extension is not installed, how should we handle the prompting?
Option1: Install the extension in the background. (Maybe too aggressive?)
Option2: Display a line of text to let the user know the feature needs another extension to function.

@huchenlei huchenlei marked this pull request as draft May 28, 2023 04:18
@huchenlei huchenlei requested a review from lllyasviel May 28, 2023 04:19
@huchenlei huchenlei added the enhancement New feature or request label May 28, 2023
@lllyasviel
Collaborator

wow great. how can js send data to gradio? what is the currently used method?

@huchenlei
Collaborator Author

> wow great. how can js send data to gradio? what is the currently used method?

There is a hidden gr.Textbox that accepts pose JSON from JS code.
gradio-app/gradio#2981 explains how to formulate an event so that gradio backend can recognize data update on gr.Textbox. (Not surprising that the issue was raised by AUTOMATIC1111 lol)
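
As a rough illustration of the backend side of this, the sketch below parses a pose JSON string such as the one the hidden gr.Textbox would receive. The payload layout (a `people` list with flat `pose_keypoints_2d` triplets, as in OpenPose's own JSON output) and the function name `parse_pose_json` are assumptions for illustration, not the editor's confirmed schema.

```python
import json
from typing import List, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, confidence)


def parse_pose_json(text: str) -> List[List[Keypoint]]:
    """Parse a pose JSON string into one keypoint list per person.

    Assumed (hypothetical) layout:
    {"people": [{"pose_keypoints_2d": [x0, y0, c0, x1, y1, c1, ...]}]}
    """
    if not text.strip():
        # An empty textbox means no pose has been sent yet.
        return []

    payload = json.loads(text)
    people = []
    for person in payload.get("people", []):
        flat = person.get("pose_keypoints_2d", [])
        if len(flat) % 3 != 0:
            raise ValueError("keypoint list length must be a multiple of 3")
        # Group the flat list into (x, y, confidence) triplets.
        people.append(
            [
                (float(flat[i]), float(flat[i + 1]), float(flat[i + 2]))
                for i in range(0, len(flat), 3)
            ]
        )
    return people
```

A callback attached to the textbox could call a function like this before re-rendering the pose image.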

@lllyasviel
Collaborator

wow sounds very a1111. will take a look soon

@2blackbar

This is pretty great, is there a way to have a button to make entire hand and bones appear instead of enabling bones one by one ?

@huchenlei
Collaborator Author

> This is pretty great, is there a way to have a button to make entire hand and bones appear instead of enabling bones one by one ?

Yes, you can toggle visibility of the entire hand, body, or face. But invalid/missing keypoints are marked invisible by default, so if there are invalid keypoints, they are likely to be jammed at the same location.

I am working on the documentation in the editor repo. Hopefully that will clarify things.

@lllyasviel
Collaborator

image

@huchenlei
Collaborator Author

image

You need to install https://github.com/huchenlei/sd-webui-openpose-editor/tree/main first. (No need to build now, as it automatically pulls the latest release binary from GitHub.)

@huchenlei
Collaborator Author

Screen Capture 004 - Stable Diffusion - localhost
Screen Capture 005 - Stable Diffusion - localhost

Now we display an error screen to guide the user if no editor is detected.

@huchenlei huchenlei marked this pull request as ready for review May 29, 2023 18:45
@huchenlei huchenlei changed the title from "[New Feature][Discussion] Integrated Openpose Editor" to "[New Feature] Integrated Openpose Editor" May 29, 2023
@huchenlei
Collaborator Author

I think I now have answers to the 2 questions originally asked:

How should the edited image be used by controlnet?

Option2: Add a button to push the preprocessor result to the input image and select none for preprocessor.
Note: Option1 takes too much effort to implement and might cause confusion to users.

How should we release this feature to the users?

Option2: Display a line of text to let the user know the feature needs another extension to function.

This PR is now ready to be merged.

@lllyasviel
Collaborator

or perhaps we just add a dirty flag to preview image. if sha256 of a input image match some marked sha256 then replace it with another image before processing. maintain a sha to ndarray dict

@huchenlei
Collaborator Author

huchenlei commented May 29, 2023

> or perhaps we just add a dirty flag to preview image. if sha256 of a input image match some marked sha256 then replace it with another image before processing. maintain a sha to ndarray dict

We would probably need to hash all the input settings (resolution, etc.) used to generate the preview image, not just the input image. I think for now a "Send back" button is the most cost-efficient solution.
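
A minimal sketch of the idea discussed here, assuming the preview is cached under a key that covers both the input image and the settings used to render it. `preview_key`, `PreviewCache`, and the `resolution` setting name are hypothetical; a real implementation would likely store an ndarray as the cached value.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def preview_key(image_bytes: bytes, settings: Dict[str, Any]) -> str:
    """Return a sha256 hex digest over the image bytes and the settings.

    Settings must be JSON-serializable; keys are sorted so that the
    same settings always produce the same digest.
    """
    digest = hashlib.sha256()
    digest.update(image_bytes)
    digest.update(json.dumps(settings, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


class PreviewCache:
    """Hypothetical map from (image, settings) hash to an edited preview."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def put(self, image_bytes: bytes, settings: Dict[str, Any], preview: Any) -> None:
        self._store[preview_key(image_bytes, settings)] = preview

    def get(self, image_bytes: bytes, settings: Dict[str, Any]) -> Optional[Any]:
        # Returns None when the image or any setting has changed.
        return self._store.get(preview_key(image_bytes, settings))
```

With a key like this, a preview rendered at one resolution would not be reused after the resolution changes, which is the concern raised above about hashing the input image alone.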

Note:
I find it really hard to navigate the controlnet.py Script.process code. We probably want some sort of refactoring there soon. (Break down the logic of the big process function and add unit tests for the pieces.)

@lllyasviel
Collaborator

then user click two buttons,first a button from extension and then send back?

@lllyasviel
Collaborator

i think i can try it a bit tomorrow or later soon. how big is that extension? is it possible to directly move into cn, or hide the button if extension not installed?

@huchenlei
Collaborator Author

huchenlei commented May 29, 2023

> i think i can try it a bit tomorrow or later soon. how big is that extension? is it possible to directly move into cn, or hide the button if extension not installed?

The extension is not too big: 2~3k lines of TypeScript code. The problem is that the extension needs to be built with Node.js installed. My current workaround is to let the user pull the compiled app from the GitHub release. If we include the editor in this repository, it is likely to mess up the release page.

I can add an option in settings to hide the edit button, and prompt the user to go to settings and toggle that option if they don't want to see an edit button with an alert sign.

@huchenlei
Collaborator Author

> then user click two buttons,first a button from extension and then send back?

When the user is satisfied with the image in the preview, the user can click the Send back button to use the preview image as the input image.

Overall workflow:

  1. Click edit to pop up the editor modal
  2. Do the necessary edits
  3. Click Send to ControlNet in the editor modal to close the editor
  4. Repeat step 1 if there are other changes to be made, or click Send back to use the edited image as the controlnet input image

@lllyasviel
Collaborator

it seems when I click a button titled "Send to ControlNet", I will expect that this image will be used by CN. Clicking another button again sounds a bit ...
Perhaps I can try later and see if we can find a solution.

@huchenlei
Collaborator Author

huchenlei commented May 30, 2023

> it seems when I click a button titled "Send to ControlNet", I will expect that this image will be used by CN. Clicking another button again sounds a bit ... Perhaps I can try later and see if we can find a solution.

We can directly set the input image when clicking send to ControlNet but that would overwrite the input image. If the user decides to adjust the pose again, the original input image won't be available as the background image in the editor.

The ideal way is to let ControlNet use the edited pose in the generated image section, but I found it not so easy to implement.

@huchenlei
Collaborator Author

@lllyasviel Gentle ping on the review. I can try to implement the logic to use the up-to-date preview image if you think that is the way to go.

@lllyasviel
Collaborator

I am relatively busy this week. I think we can use a branch in this repo or what to test the features, and then perhaps the weekend I can test it more and try to merge hopefully.
Since the difference looks big and many users are actively updating this recently, I do not recommend to directly work on main branch for this.

@huchenlei
Collaborator Author

> I am relatively busy this week. I think we can use a branch in this repo or what to test the features, and then perhaps the weekend I can test it more and try to merge hopefully. Since the difference looks big and many users are actively updating this recently, I do not recommend to directly work on main branch for this.

A new branch, embed_pose_edit, has been created.

@huchenlei huchenlei force-pushed the embed_pose_edit branch 2 times, most recently from 4f521b8 to 8d32cf8 Compare June 2, 2023 14:51
@huchenlei
Collaborator Author

Hi @lllyasviel,

Can you take a look again? I added a new checkbox to allow direct usage of the preview image as controlnet input.

Now the workflow is reduced to:

  1. Click edit to pop up the editor modal
  2. Do the necessary edits
  3. Click Send to ControlNet in the editor modal to close the editor
  4. Click generate to run SD, or go back to step 1 to make more edits
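
The reduced workflow could be backed by a selection step like the following sketch. The function name, its parameters, and the choice to fall back to the "none" preprocessor when the preview is used are assumptions for illustration, not the code in this PR.

```python
from typing import Any, Optional, Tuple


def select_controlnet_input(
    input_image: Any,
    preview_image: Optional[Any],
    preprocessor: str,
    use_preview_as_input: bool,
) -> Tuple[Any, str]:
    """Pick the image and preprocessor that controlnet should run with.

    When the (hypothetical) checkbox is ticked and an edited preview
    exists, the preview is already a rendered pose, so no preprocessor
    is applied to it. Otherwise the original input image goes through
    the selected preprocessor as usual.
    """
    if use_preview_as_input and preview_image is not None:
        return preview_image, "none"
    return input_image, preprocessor
```

Because the original input image is left untouched, it could still serve as the background in the editor on the next round of edits.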

🎨 nits

🎨 Load alert screen when no editor is detected.

🔧 add option to hide edit button

🐛 Resolve conflicts during rebase

🚧 Set use preview as input after edit

🐛 Fix merge conflict
@huchenlei
Collaborator Author

I am going to merge this if there are no further objections or feedback.

Some testing:

We can either make the feature disabled by default or revert this PR if the user feedback is really negative.

@continue-revolution
Collaborator

continue-revolution commented Jun 4, 2023

  1. It would be better if you could add a feature where people could directly add and edit pose, without any reference image. (like posex, but posex cannot edit face/hand. Also editing face/hand should be optional)
  2. It would be better if people do not have to build front end stuff. This is especially painful for non-experts like me.

@huchenlei
Collaborator Author

> 1. It would be better if you could add a feature where people could directly add and edit pose, without any reference image.
> 2. It would be better if people do not have to build front end stuff. This is especially painful for non-experts like me.

  1. The feature can be accessed directly at localhost:7860/openpose_edit_index. The default will not include hand/face.
  2. People don't need to build front-end stuff, as the extension downloads the compiled app zip from the GitHub release.

@huchenlei
Collaborator Author

Also from the demo there is an issue with the openpose detection: The same hand got detected twice on the right person. Going to open an issue for this.

@huchenlei huchenlei merged commit 5b681b0 into Mikubill:main Jun 4, 2023
1 check passed