[Feature Request] Integrated openpose editor #1012

Closed
huchenlei opened this issue Apr 24, 2023 · 7 comments
Assignees
Labels
enhancement New feature or request

Comments

@huchenlei
Collaborator

Oftentimes the openpose preprocessor cannot produce the exact openpose we want. Here are some cases I have run into:

  • 2 hands in the image, only 1 hand correctly detected
  • The major pose is correctly detected, but some anchor points are in weird positions (this often happens in non-fullbody images)
  • Anime images are not supported for preprocessing.

I propose integrating an openpose editor into the controlnet extension.
Here are some existing works:

The first 2 extensions copy controlnet's annotator/openpose directory to detect the pose from an image. None of them has updated the code to support hand/face detection yet. Both extensions support sending the image to be used by controlnet.

The 3D editor feels too complicated to use, as moving an anchor point in 3D is way more complicated than in 2D.

Proposed workflow:

  • An Edit Pose button is shown below the generated image when any openpose model is selected as preprocessor.
  • Clicking the Edit Pose button sends the JSON openpose data (stored somewhere on the server side when running the preprocessor?) to the openpose editor (a modal?).
  • The user makes the necessary edits (adding missing hands, skeletons, torsos, etc.) and closes the modal; the JSON openpose data is then sent back to replace the original JSON openpose data on the server side.
  • The server side renders the processed image again.
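
The round trip in the workflow above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: `PoseStore`, its method names, and the `{"keypoints": [[x, y], ...]}` layout are hypothetical and are not taken from controlnet's code; the JSON actually produced by the preprocessor may look different.

```python
import json
import uuid
from typing import Dict, List

# Hypothetical pose shape: {"keypoints": [[x, y], ...]}.
# The real openpose JSON layout may differ.
Pose = Dict[str, List[List[float]]]


class PoseStore:
    """In-memory store mapping a pose id to its JSON-serialisable pose data."""

    def __init__(self) -> None:
        self._poses: Dict[str, Pose] = {}

    def save(self, pose: Pose) -> str:
        """Keep the preprocessor output and return an id the editor can use."""
        pose_id = uuid.uuid4().hex
        self._poses[pose_id] = pose
        return pose_id

    def export_json(self, pose_id: str) -> str:
        """Serialise the stored pose so it can be sent to the editor."""
        return json.dumps(self._poses[pose_id])

    def replace_from_json(self, pose_id: str, edited_json: str) -> Pose:
        """Replace the stored pose with the edited JSON sent back by the editor."""
        if pose_id not in self._poses:
            raise KeyError(f"unknown pose id: {pose_id}")
        edited = json.loads(edited_json)
        if not isinstance(edited, dict) or not isinstance(edited.get("keypoints"), list):
            raise ValueError("edited pose must be an object with a 'keypoints' list")
        self._poses[pose_id] = edited
        return edited


store = PoseStore()
pose_id = store.save({"keypoints": [[10.0, 20.0], [30.0, 40.0]]})

# Simulate the editor: load the JSON, add a missing point, send it back.
payload = json.loads(store.export_json(pose_id))
payload["keypoints"].append([50.0, 60.0])
updated = store.replace_from_json(pose_id, json.dumps(payload))
```

After `replace_from_json` returns, the server would re-render the image from `updated`; the rendering step itself is left out here.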

I am not sure what the best way is to hijack the openpose JSON data on the server side. Please share any ideas, thanks!

@CharlesTHN

I couldn't agree with you more.
I had the same idea and have been looking at the openpose code, but I found it was not easy for me :(

@huchenlei
Collaborator Author

huchenlei commented Apr 24, 2023

There is also an openpose editor specifically for hands:
https://github.com/zackhxn/openpose-hand-editor
It also ports controlnet's code, but for hand detection.

Currently we are in a very chaotic space where multiple extensions are available, but none does the task very well.

@continue-revolution
Collaborator

An openpose editor requires a JavaScript expert, which I unfortunately am not. I believe we need a JS expert to do this for us.

@huchenlei
Collaborator Author

huchenlei commented Apr 24, 2023

I can help with the JS code. The JS implementations in the above extensions except the 3D extension are pretty minimal (<1k lines of JS).

@huchenlei
Collaborator Author

I have roughly sorted out the data flow needed to achieve the functionality:

Build a separate sd-webui extension

The extension will use the sd-webui api to expose a FastAPI path:

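import gradio as gr
from fastapi import FastAPI, Request
from fastapi.responses import HTMLResponse
from modules import script_callbacks

# `templates` is assumed to be a Jinja2Templates instance defined elsewhere.
def mount_openpose_api(_: gr.Blocks, app: FastAPI):
    @app.post('/openpose_editor', response_class=HTMLResponse)
    async def index(request: Request):
        return templates.TemplateResponse('index.html', {"request": request})

script_callbacks.on_app_started(mount_openpose_api)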

Embed the openpose extension's page into controlnet

  • Let controlnet display an iframe pointing to /openpose_editor when the edit button is clicked. Both the original image and the openpose JSON data are sent to the iframe as POST request parameters.
  • The user edits the pose in the iframe, which sends the processed openpose JSON data back through window.parent.postMessage.
window.parent.postMessage("Data from the child iframe!", "*");
  • Controlnet receives the new openpose JSON data by listening for the message event.
window.addEventListener("message", function (event) {
    console.log("Message received from the child iframe:", event.data);
    // Simulate a click on a button so that ControlNet's Python backend receives the event.
    // The data is passed through gr.State.
    state.textContent = event.data;
    button.click();
});
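
Because the message listener above accepts anything posted to the window, the Python backend would probably want to validate the relayed string before using it. A minimal sketch, assuming a hypothetical message envelope of the form `{"type": "openpose_edit", "pose": {...}}`; neither the envelope nor the name `parse_editor_message` comes from controlnet or any existing editor:

```python
import json
from typing import Any, Dict

# Hypothetical envelope: {"type": "openpose_edit", "pose": {...}}.
EXPECTED_TYPE = "openpose_edit"


def parse_editor_message(raw: str) -> Dict[str, Any]:
    """Validate the JSON string relayed from the iframe and return the pose."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"message is not valid JSON: {exc}") from exc
    if not isinstance(message, dict):
        raise ValueError("message must be a JSON object")
    if message.get("type") != EXPECTED_TYPE:
        raise ValueError(f"unexpected message type: {message.get('type')!r}")
    pose = message.get("pose")
    if not isinstance(pose, dict):
        raise ValueError("message must carry a 'pose' object")
    return pose


pose = parse_editor_message('{"type": "openpose_edit", "pose": {"keypoints": []}}')
```

Any editor that emits messages in the agreed shape would then work with the same backend handler, whatever that shape ends up being.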

Conclusion

By doing this, we do not need to host the openpose editor's code within controlnet. Users can also choose any openpose editor that supports controlnet's message protocol.

Alternative

The Web UI uses a rather clumsy pure-Gradio way of sending data between components (link). I suspect this would take more effort and limit the UI interaction to Gradio's scope. The currently proposed approach allows us to use a JavaScript front-end library (Vue.js, etc.) to simplify pure front-end interactions.

@huchenlei huchenlei self-assigned this Apr 30, 2023
@huchenlei huchenlei added the enhancement New feature or request label Apr 30, 2023
@huchenlei huchenlei pinned this issue Apr 30, 2023
@huchenlei huchenlei unpinned this issue Apr 30, 2023
@huchenlei
Collaborator Author

Building a pure front-end openpose editor here:
https://github.com/huchenlei/sd-webui-openpose-editor

@huchenlei
Collaborator Author

Progress Update

The UI layout is mostly done. I can probably start the controlnet-side work soon.
Screenshot (111)
