Add MediaPipe Face Control #688
Conversation
…lueError: At least one stride in the given numpy array is negative'. Will have to ask why this happens.
… to see if we can alleviate the negative stride without changing the codebase.
…ll rather than empty.
… vestigial annotation stuff.
Important reminder to set the config adapter to models/cldm_v21.yaml! EDIT: This is no longer necessary with the updates we've done to the model.
We are also considering MediaPipe. Perhaps we need to take a look at whether installation of the MediaPipe dependency is robust across platforms. It seems that MediaPipe does not require compiling, but we need to make sure of that.
Thanks for raising the point. We haven't seen issues on Windows or Linux yet, but we haven't tested extensively on multiple systems. Let us know what we can do to help!
How about using 768x768 resolution for SD v2.1_768?
This webui plugin can read overriding configs. Please consider copying cldm_v21.yaml to "controlnet_sd21_laion_face_v2_prund.yaml" and putting it in the same folder as the model, so that users do not need to change settings manually. Also, please consider a name like "mediapipe_face", e.g. controlnet_sd21_mediapipe_face_X.
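A minimal sketch of that copy step, assuming a default webui layout (both paths are illustrative):

```python
import shutil

# Hypothetical paths; adjust to your actual install. The point is that the
# yaml must sit next to the model file and share its base name.
shutil.copy(
    "models/cldm_v21.yaml",
    "extensions/sd-webui-controlnet/models/controlnet_sd21_laion_face_v2_prund.yaml",
)
```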
SD v2.1_768 has a hard time making 512x512 images. It may be better to try SD v2.1 BASE; we use https://huggingface.co/stabilityai/stable-diffusion-2-1-base .
Noted! I think we can make that change. Thank you.
Why is there no annotator to use?
I've updated our model repo (https://huggingface.co/CrucibleAI/ControlNetMediaPipeFace) to be consistent with ControlNet's naming scheme. The model is now called "control_mediapipe_face_sd21_v2.safetensor|pt|yaml". We've also added the yaml file to the remote side. Are there additional recommended changes on this branch?
Can this be used on custom models? An SD 1.5 version? A model we trained on? @josephcatrambone-crucible I am going to test it now.
We have an SD1.5 version training. It's a few hundred hours in, but it's not ready yet. When that's ready we'll push it up to HuggingFace. The SD2.1 model works pretty well. It's based on SD2.1 512-base. Some folks have reported success with using custom SD2.1-based models; I've not tried them.
Looking forward to SD 1.5. So once SD 1.5 is released, will we be able to use it on our custom-trained models? By the way, SD 2.1 works perfectly.
Yep, any custom models should work as long as they match the base model we trained on. So custom 2.1 models should work with the one we released already, and custom 1.5 models should work with our 1.5 model when we release it.
I have two questions.
Yes, that's correct. As long as each model has a corresponding yaml with the same name, it will use that rather than the default in the settings. You can leave the defaults in the settings alone.
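For illustration, the on-disk layout that triggers this behavior looks something like the following (filenames are examples, not the exact ones from this PR):

```
extensions/sd-webui-controlnet/models/
├── control_mediapipe_face_sd21_v2.safetensors
├── control_mediapipe_face_sd21_v2.yaml   # same base name: used automatically
└── some_other_model.safetensors          # no matching yaml: settings default applies
```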
We just pushed the SD 1.5 version of this model to HuggingFace Hub. 🥳 Let us know if there's anything we can do to help this PR get merged.
I think the official ControlNet 1.1 will also come out soon, hopefully next week.
Thanks for the amazing feature! Just successfully tested it. During install, pip initially reported on stderr:

ERROR: Could not find a version that satisfies the requirement mediapipe==0.9.1.0 (from versions: none)
ERROR: No matching distribution found for mediapipe==0.9.1.0

This was solved by manual replacement of the requirement. PS: there are also these log messages after starting generation (with ControlNet enabled and ...):
ControlNet model control_mediapipe_face_sd15_v2 [9c7784a9] loaded.
Loading preprocessor: mediapipe_face
objc[26962]: Class CaptureDelegate is implemented in both /Users/nemilya/stable-diffusion-webui/venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x2877d25a0) and /Users/nemilya/stable-diffusion-webui/venv/lib/python3.10/site-packages/mediapipe/.dylibs/libopencv_videoio.3.4.16.dylib (0x2944e0860). One of the two will be used. Which one is undefined.
objc[26962]: Class CVWindow is implemented in both /Users/nemilya/stable-diffusion-webui/venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x2877d25f0) and /Users/nemilya/stable-diffusion-webui/venv/lib/python3.10/site-packages/mediapipe/.dylibs/libopencv_highgui.3.4.16.dylib (0x2906fca68). One of the two will be used. Which one is undefined.
objc[26962]: Class CVView is implemented in both /Users/nemilya/stable-diffusion-webui/venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x2877d2618) and /Users/nemilya/stable-diffusion-webui/venv/lib/python3.10/site-packages/mediapipe/.dylibs/libopencv_highgui.3.4.16.dylib (0x2906fca90). One of the two will be used. Which one is undefined.
objc[26962]: Class CVSlider is implemented in both /Users/nemilya/stable-diffusion-webui/venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x2877d2640) and /Users/nemilya/stable-diffusion-webui/venv/lib/python3.10/site-packages/mediapipe/.dylibs/libopencv_highgui.3.4.16.dylib (0x2906fcab8). One of the two will be used. Which one is undefined.
...but this does not cause any issues.
It looks like mediapipe has a completely separate branch for Apple Silicon because it was set up by a third-party maintainer. My reading is that they're looking to merge the automatic build into the normal mediapipe flow. We can either change the requirement to this:
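(My best guess at the lines in question; the Apple Silicon package name and environment markers are assumptions, not the exact proposal:)

```
mediapipe==0.9.1.0; platform_machine != 'arm64'
mediapipe-silicon==0.9.1.0; platform_machine == 'arm64' and sys_platform == 'darwin'
```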
which might break, because the setup script just iterates over the lines and does a split, or we can wait for Google to update their build pipeline. Here's the tracking issue: google-ai-edge/mediapipe#3277
Hello @josephcatrambone-crucible, for some reason mediapipe only works with output sizes of 512x768 or 768x512. I tried other sizes but always get RuntimeError: The expanded size of the tensor (2560) must match the existing size (4608) at non-singleton dimension 1. Target sizes: [2, 2560, 320]. Tensor sizes: [1, 4608, 1], or ZeroDivisionError: division by zero.
Thanks for the report. 🤔 We tried some intermediate sizes and didn't see this. Did you change the annotator resolution or the input resolution? Perhaps we should force mediapipe to use a 512x512 image. Quick questions: are you on Apple Silicon? Are you using the SD1.5 or the SD2.1-base model?
@josephcatrambone-crucible, so the size issue is due to the resize function only resizing along a single dimension: it takes a single target resolution and derives one scale factor from it, so the target height and width cannot be honored independently. You can replace the resize function with one that takes both the target height and width into account. This should handle different height and width combos while maintaining aspect ratio and not stretching the mask. For example:
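A minimal sketch of that idea (untested; the helper name, zero-padding choice, and interpolation selection are illustrative, not the exact code from the original comment):

```python
import cv2
import numpy as np

def resize_image_hw(input_image: np.ndarray, target_h: int, target_w: int) -> np.ndarray:
    """Fit the image inside (target_h, target_w) keeping aspect ratio, then
    zero-pad so the output is exactly (target_h, target_w), with no stretching."""
    h, w = input_image.shape[:2]
    k = min(target_h / h, target_w / w)  # scale chosen from BOTH target dimensions
    new_h, new_w = int(round(h * k)), int(round(w * k))
    interp = cv2.INTER_LANCZOS4 if k > 1 else cv2.INTER_AREA
    resized = cv2.resize(input_image, (new_w, new_h), interpolation=interp)
    # Center the resized image on a zero canvas instead of stretching it.
    canvas = np.zeros((target_h, target_w) + input_image.shape[2:], dtype=input_image.dtype)
    top, left = (target_h - new_h) // 2, (target_w - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas
```

E.g. resize_image_hw(mask, 768, 512) keeps a square face mask centered and undistorted rather than stretching it to the new aspect ratio.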
Nice find. There are a LOT of other processors using resize_image (canny, simple_scribble, hed, mlsd, midas, leres, openpose, uniformer, pidinet, clip, and binary). I feel like we should perhaps do a separate PR to resolve the scaling issue, because this will hit more than just us. Offhand, I'm surprised to hear that resize_image only does one axis. From the implementation, it looks like it handles aspect ratio and all that:
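(Paraphrasing the helper from memory; exact rounding and interpolation details may differ:)

```python
import cv2
import numpy as np

def resize_image(input_image, resolution):
    # Scale so the SHORTER side matches `resolution`, then snap both sides to
    # multiples of 64. Note there is only ONE target number, so a non-square
    # output size can never be expressed through this helper.
    H, W = input_image.shape[:2]
    k = float(resolution) / min(H, W)
    H = int(np.round(H * k / 64.0)) * 64
    W = int(np.round(W * k / 64.0)) * 64
    return cv2.resize(
        input_image, (W, H),
        interpolation=cv2.INTER_LANCZOS4 if k > 1 else cv2.INTER_AREA,
    )
```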
Hmm, we could use a separate resize function for mediapipe, but I believe that wouldn't be good practice. A separate PR does make more sense here. I found the scaling issue while implementing a pre-processor class to use with diffusers. Hopefully this gets fixed. And thanks for the nice work on expressions.
Hello, we recommend following the naming standard of ControlNet 1.1. We will begin to merge models in the coming days.
Congratulations! That's a big release. We'll work on getting those model names updated. There are a few applications that are already using the model as named in the repo, but I think we can find a solution. EDIT: Do we need to make changes to this PR aside from the name update to be compatible with the ControlNet 1.1 release?
I've renamed our models to be consistent with the ControlNet 1.1 scheme: https://huggingface.co/CrucibleAI/ControlNetMediaPipeFace/commit/6948da26359817bd4f366a9549fb094091560623
Hello, for models marked as [p] production-ready, we will test them with a few cases, e.g., a few non-cherry-picked random batches with seed 12345. [e/u] models do not need this. This may take one or two working days.
Hello, we have confirmed that this can be merged.
On it!
Changes are made and sanity-checked on the local system. The SD 1.5 and SD 2.1 models are both behaving okay from what I can tell.
We've trained a ControlNet model on Stable Diffusion 2.1 Base with MediaPipe Face constraints. We added keypoints for the pupils, which allows for better gaze control than existing alternatives.
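For context, here is a minimal sketch of how iris ("pupil") keypoints can be read out of MediaPipe Face Mesh; this is illustrative only, not this PR's exact annotator code:

```python
import cv2
import mediapipe as mp

# Indices 468-477 are the iris landmarks added when refine_landmarks=True.
IRIS_LANDMARKS = slice(468, 478)

with mp.solutions.face_mesh.FaceMesh(
    static_image_mode=True,
    refine_landmarks=True,  # enables the 10 iris keypoints
    max_num_faces=1,
) as face_mesh:
    image = cv2.imread("face.jpg")  # hypothetical input path
    results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.multi_face_landmarks:
    iris = results.multi_face_landmarks[0].landmark[IRIS_LANDMARKS]
    print([(round(p.x, 3), round(p.y, 3)) for p in iris])  # normalized coordinates
```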
The full changes, dataset, and process are described here: https://github.com/crucible-ai/ControlNet/blob/laion_dataset/README_laion_face.md
And the models (ckpt / safetensor) are on huggingface here: https://huggingface.co/CrucibleAI/ControlNetMediaPipeFace
One advantage over other approaches is that there are significantly fewer dependencies and less code to add. A disadvantage is that this currently only works with SD2.1. (We have an SD1.5 model brewing right now and will update when it's finished training.)
Notable concern: we added mediapipe to requirements.txt -- the file seems oddly empty, and we don't want to make this a necessity for others, but there isn't another obvious place.
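Concretely, the addition is a single pinned line (the version pin is taken from the install error quoted earlier in this thread):

```
mediapipe==0.9.1.0
```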
The readme above has some cherry-picked results, but here's an output from the UI: