Other background replacement code bases - a round up #58
Volcomix virtual background seems to give great results even on my bad laptop camera, while also running at 60 fps. Would it be possible to do something similar here?
@MartinKlevs glad to hear it! On the surface, both Volcomix and Deepbacksub operate in a similar fashion, using the same Google models to detect a person, but there are a few differences. In particular, the Volcomix solution uses async rendering (which will arrive here with #59) and the browser's 2D or WebGL canvas in place of OpenCV for other image processing. That will use a GPU where available and likely reduce CPU loading a little, quite possibly making up for the use of WASM and Emscripten-compiled C++ TensorFlow :)
Volcomix seems to do a much better job overall. It also segments the whole image instead of a cropped area.
Found this just now: https://developers.google.com/ml-kit/vision/selfie-segmentation |
Update: you can get the AAR file (which is just a zip) via https://mvnrepository.com/artifact/com.google.mlkit/segmentation-selfie/16.0.0-beta1 - and there is indeed a .tflite file in there, which should be worth a try (it also has a square input shape, so it should fit a landscape camera image somewhat better than the portrait-shaped Meet model).
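Since the AAR is just a zip, pulling the model out can be scripted in a few lines. A sketch (the exact path of the `.tflite` inside the archive isn't documented here, so the code simply searches for any `.tflite` entry rather than assuming a location):

```python
import zipfile


def extract_tflite(aar, out_dir="."):
    """List and extract any .tflite models bundled in an AAR (a plain zip).

    `aar` may be a filename or a file-like object; returns the archive
    paths of the extracted models.
    """
    extracted = []
    with zipfile.ZipFile(aar) as archive:
        for name in archive.namelist():
            if name.endswith(".tflite"):
                archive.extract(name, out_dir)
                extracted.append(name)
    return extracted
```

Usage would be something like `extract_tflite("segmentation-selfie-16.0.0-beta1.aar")` after downloading the artifact.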
Nice! I was able to get it working with minimal adjustments. The new model outputs a [0, 1] confidence value per pixel.
Yes, quick-and-dirty implementation in 24dc33f - seems like a candidate for the new default model?
I agree. Personally, I get better results with the threshold set to 0.75.
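For reference, binarising the model's [0, 1] confidence output at a threshold is a one-liner; a sketch with NumPy (0.75 is the value suggested above, not a backscrub default, and the 0/255 output convention is just what OpenCV-style masks usually expect):

```python
import numpy as np


def binarize_mask(confidence, threshold=0.75):
    """Turn a [0, 1] per-pixel confidence map into a hard 0/255 uint8 mask."""
    return np.where(confidence >= threshold, 255, 0).astype(np.uint8)
```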
Can you provide example screencaps? |
This repo contains several models: |
@insad thanks for those - the list of papers is excellent! In my search I came across the @SwatiModi (Android-targeted, using MediaPipe) and @fangfufu (Python+Node.js, derived from Ben Elder's original work) projects, but I hadn't found @PapaEcureuil's, where Streamlit is used to put a nice GUI on the Python+TensorFlow (full fat) engine.
Did you see this one: https://github.com/ZHKKKe/MODNet ? There seems to be a lot of active development going on, though sadly much of the communication is in Chinese only...
The new selfie model sure seems promising. What I think is still missing is overlapping the masks from multiple NN runs to fill the whole image area.
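A minimal sketch of what that could look like: run the NN over several crops, paste each crop's mask back at its offset in the full frame, and keep the per-pixel maximum where crops overlap. The tiling scheme here is purely illustrative, not how backscrub crops today:

```python
import numpy as np


def composite_masks(frame_shape, tiles):
    """Merge [0, 1] masks from multiple NN runs over crops into one full frame.

    tiles: iterable of (y, x, mask), where mask is a 2-D array and (y, x)
    is the crop's top-left corner in the full frame. Overlapping regions
    keep the per-pixel maximum confidence.
    """
    full = np.zeros(frame_shape, dtype=np.float32)
    for y, x, mask in tiles:
        h, w = mask.shape
        region = full[y:y + h, x:x + w]
        np.maximum(region, mask, out=region)  # in-place merge into the frame
    return full
```

Max-blending is the simplest policy; averaging the overlap instead would smooth seams at the cost of diluting a confident detection from one crop.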
It is used in this plugin for OBS: https://github.com/royshil/obs-backgroundremoval |
@progandy - thanks, I note from Roy's README.md and a quick look at the code that his filter uses Microsoft's ONNX Runtime C++ wrapper for multiple possible ML frameworks (https://github.com/microsoft/onnxruntime), then borrows the pretrained ONNX model from https://github.com/ZHKKKe/MODNet (actually their Google Drive), but not their Python 😉
@phlash Please build backscrub as a free plugin for OBS Studio
@elkhalafy I have an experimental OBS plugin that uses backscrub here: https://github.com/phlash/obs-backscrub This builds against the |
I hope you complete the project and release it as an actual plugin - we need it so much.
@floe the new model looks great. I think there's a place for larger models as well, like the one from https://github.com/PeterL1n/BackgroundMattingV2, although I'm not sure what the status of GPU acceleration is in backscrub since I haven't personally used XNNPACK.
@dsingal0 No GPU acceleration in backscrub as yet; XNNPACK provides CPU-optimised kernels for TFLite. That said, GPU works[citation needed] via the TFLite GPU delegate and OpenCL in my hacked-up branch here: https://github.com/phlash/backscrub/tree/xnnpack-test according to one tester 😄 I would be interested to try the larger models from Peter Lin's paper; it looks like the ONNX ones are where we should start, which then need converting to TFLite through TF (apparently): https://stackoverflow.com/questions/53182177/how-do-you-convert-a-onnx-to-tflite
@phlash if going for GPU acceleration, TensorRT would be great for NVIDIA GPUs, since they provide ONNX->TensorRT converters. I tried out the current models in the repo, and all except DeepLabv3 and MLKit Segmentation were quite unusable. https://github.com/ZHKKKe/MODNet looks very promising based on their Colab. It's heavier than the TFLite models, but much lighter than BackgroundMattingV2, so it can feasibly run on Intel non-U-series CPUs or a dGPU.
Just wanted to mention that Zoom now has some kind of ML segmentation in their Linux client (Version 5.7.6 - 31792.0820), too, and it's quite performant. Curious if someone is up for reverse engineering it. |
Having recently discovered that the open source Jitsi video conferencing solution offers ML-driven background replacement, I thought it would be interesting to round up who else is doing this here on GitHub and what tech is used..