
Other background replacement code bases - a round up #58

Open
phlash opened this issue Mar 31, 2021 · 25 comments
Labels
documentation Improvements or additions to documentation

Comments

@phlash
Collaborator

phlash commented Mar 31, 2021

Having recently discovered that the open source Jitsi video conferencing solution offers ML-driven background replacement, I thought it would be interesting to round up who else is doing this here on GitHub and what tech is used.

  • Search: https://github.com/search?p=13&q=virtual+background&type=Repositories
  • 187 results! Which, with a bit of duplicate removal and filtering by star rating, boils down to:
    • Jitsi-Meet, 100% client-side, tflite.js (optionally compiled to WASM, optionally with SIMD support), using BodyPix models.
    • ViBa, a tidy 100% python re-working of the original mixed-tech solution by Ben Elder, using python3-tensorflow, python3-opencv & BodyPix models.
    • Volcomix virtual background, the inspiration for Jitsi team, 100% client-side, using tflite.js (compiled to WASM & SIMD required) and either BodyPix or MediaPipe Meet models. Really well documented and tested.
    • EasyJitsi, 100% client-side React app, using tf.js and BodyPix. Small, nice demo site but slow (3FPS on my laptop)
    • VirtBG, 100% client-side, single file implementation, using tf.js, BodyPix. Similar performance to EasyJitsi above as expected. Great example of minimal bloat though!
@phlash phlash added the documentation Improvements or additions to documentation label Mar 31, 2021
@MartinKlevs

Volcomix virtual background seems to give great results even with my bad laptop camera, while also running at 60 FPS. Would it be possible to do something similar here?

@phlash
Collaborator Author

phlash commented Apr 13, 2021

@MartinKlevs glad to hear it! On the surface, both Volcomix and Deepbacksub operate in a similar fashion, using the same Google models to detect a person. There are a few differences though: in particular, the async rendering in the Volcomix solution (which will arrive here with #59), and its use of the browser's 2D or WebGL canvas in place of OpenCV for other image processing. The canvas path will use a GPU where available and likely reduce CPU load a little, quite possibly making up for the use of WASM and Emscripten-compiled C++ TensorFlow :)

@MartinKlevs

Volcomix seems to do a much better job overall. It also segments the whole image instead of a cropped area.

@floe
Owner

floe commented Apr 13, 2021

Found this just now: https://developers.google.com/ml-kit/vision/selfie-segmentation
Seems to be Apache-licensed (for now); the problem is actually getting the model file, which is buried inside the MLKit runtime.

@floe
Owner

floe commented Apr 13, 2021

Update: you can get the AAR file (which is just a zip) via https://mvnrepository.com/artifact/com.google.mlkit/segmentation-selfie/16.0.0-beta1 - and there is indeed a .tflite file in there that should be worth a try (it also has a square input shape, so it should fit a landscape camera image somewhat better than the portrait-shaped Meet model).

@MartinKlevs

Nice! I was able to get it working with minimal adjustments. The new model outputs a [0, 1] float32 mask. It seems to do a good job.

@floe
Owner

floe commented Apr 13, 2021

Yes, quick-and-dirty implementation in 24dc33f - seems to be a candidate for new default model?

@MartinKlevs

MartinKlevs commented Apr 13, 2021

I agree. Personally I experience better results with the threshold set to 0.75.
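Since the model emits a float32 confidence mask in [0, 1], binarizing it is a one-liner; raising the threshold from 0.5 to 0.75 trades false positives (background bleeding into the foreground) for false negatives. A hedged sketch of that step, assuming a NumPy mask and the 0/255 uint8 convention that OpenCV compositing code typically expects:

```python
import numpy as np

def binarize_mask(mask, threshold=0.75):
    """Convert a float32 [0, 1] confidence mask into a uint8 0/255 mask.
    Pixels at or above `threshold` count as person (255), the rest as
    background (0)."""
    return np.where(mask >= threshold, 255, 0).astype(np.uint8)
```

Alternatively, the raw float mask can be used directly as an alpha channel for softer edges, skipping thresholding entirely.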

@insad

insad commented Apr 13, 2021

@BenBE
Collaborator

BenBE commented Apr 13, 2021

Yes, quick-and-dirty implementation in 24dc33f - seems to be a candidate for new default model?

Can you provide example screencaps?

@floe
Owner

floe commented Apr 14, 2021

Three really quick-and-dirty (again) screenshots, in the order: new selfie model, Meet model, deeplabv3+.

Screenshot from 2021-04-14 08-31-53
Screenshot from 2021-04-14 08-32-06
Screenshot from 2021-04-14 08-32-28

@MartinKlevs

This repo contains several models:

https://github.com/anilsathyan7/Portrait-Segmentation

@phlash
Collaborator Author

phlash commented Apr 14, 2021

@insad thanks for those - the list of papers is excellent! I came across the @SwatiModi (Android-targeted, using MediaPipe) and @fangfufu (Python+Node.js, derived from Ben Elder's original work) projects in my search, but I hadn't found @PapaEcureuil's, where Streamlit is being used to put a nice GUI on the Python+TensorFlow (full-fat) engine.

@insad

insad commented Apr 14, 2021

Did you see this one: https://github.com/ZHKKKe/MODNet ?

There seems to be a lot of active development going on; sadly, a lot of the communication is in Chinese only...

@BenBE
Collaborator

BenBE commented Apr 15, 2021

The new selfie model sure seems promising. What I think is still missing is the overlapping of masks to fill the whole image area from multiple NN runs.
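One way to read the suggestion above is tiling: run the fixed-input-size network over overlapping crops of the frame and merge the per-tile masks back into a full-frame mask. A hypothetical sketch of that merge (function names and the max-over-overlap merge rule are my assumptions, not how backscrub currently works):

```python
import numpy as np

def _starts(size, tile, step):
    """Tile start offsets covering [0, size), with the last tile
    snapped to the edge so nothing is missed."""
    if size <= tile:
        return [0]
    starts = list(range(0, size - tile, step))
    starts.append(size - tile)
    return starts

def full_frame_mask(frame, model, tile=256, overlap=32):
    """Cover the frame with overlapping square tiles of the model's
    input size, run the model on each, and merge the per-tile masks
    into one full-frame mask (taking the max where tiles overlap).
    `model` maps a (tile, tile) array to a (tile, tile) float mask."""
    h, w = frame.shape[:2]
    step = tile - overlap
    out = np.zeros((h, w), dtype=np.float32)
    for y in _starts(h, tile, step):
        for x in _starts(w, tile, step):
            m = model(frame[y:y + tile, x:x + tile])
            region = out[y:y + tile, x:x + tile]
            np.maximum(region, m, out=region)
    return out
```

The obvious cost is one inference per tile, so in practice this would only pay off for frames much larger than the network input, or when the subject can sit anywhere in the frame.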

@progandy
Contributor

Did you see this one: https://github.com/ZHKKKe/MODNet ?

It is used in this plugin for OBS: https://github.com/royshil/obs-backgroundremoval

@phlash
Collaborator Author

phlash commented May 11, 2021

@progandy - thanks, I note from Roy's README.md and a quick look at the code that his filter uses the Microsoft ONNX C++ wrapper for multiple possible ML frameworks (https://github.com/microsoft/onnxruntime), then borrows the ONNX pretrained ML model from https://github.com/ZHKKKe/MODNet (actually their google drive), but not their Python 😉

@elkhalafy

@phlash Please build backscrub as free plugin for obs studio

@phlash
Collaborator Author

phlash commented May 17, 2021

@elkhalafy I have an experimental OBS plugin that uses backscrub here: https://github.com/phlash/obs-backscrub

This builds against the experimental branch of backscrub where the core functionality is separated into a library and deepseg is a wrapper around it (as is the obs plugin).

@elkhalafy

I hope you complete the project and release it as an actual plugin, @phlash - we need it so much.

@ghost

ghost commented May 21, 2021

@floe the new model looks great. I think there's a place for larger models as well, like the one from https://github.com/PeterL1n/BackgroundMattingV2, although I'm not sure what the status of GPU acceleration is in backscrub since I haven't personally used XNNPACK.

@phlash
Collaborator Author

phlash commented May 22, 2021

@dsingal0 No GPU acceleration in backscrub as yet; XNNPACK provides CPU-optimised kernels for TFLite. That said, GPU acceleration works[citation needed] via the TFLite GPU delegate and OpenCL in my hacked-up branch here: https://github.com/phlash/backscrub/tree/xnnpack-test according to one tester 😄

I would be interested to try the larger models from Peter Lin's paper, it looks like the ONNX ones are where we should start, which then need converting to TFLite through TF (apparently): https://stackoverflow.com/questions/53182177/how-do-you-convert-a-onnx-to-tflite

@ghost

ghost commented May 23, 2021

@phlash if going for GPU acceleration TensorRT would be great for NVIDIA GPUs since they have ONNX->TensorRT converters. I tried out the current models in the repo and all except DeepLabv3 and MLKit Segmentation were quite unusable. https://github.com/ZHKKKe/MODNet looks very promising based on their colab. It's heavier than the tflite models, but much lighter than BackgroundMattingV2 so it can feasibly run on Intel non-U series CPUs or a dGPU

@rdreyer-godaddy

Just wanted to mention that Zoom now has some kind of ML segmentation in their Linux client (Version 5.7.6 - 31792.0820), too, and it's quite performant. Curious if someone is up for reverse engineering it.
