Add background segmentation mask #142
Thanks @eehakkin. In many cases it is important to have access to the original camera feed, so background segmentation mask (BG MASK) keeps the original frames intact: it performs segmentation and provides mask frames in addition to the original video frames. Web applications therefore receive both the original frames and the mask frames in the same video frame stream. This PR follows up our presentation of BG Segmentation MASK in the monthly WebRTC WG call [Minutes]
I think the general thrust of this effort is very useful for Web applications.
<p>A background segmentation mask with
white denoting certainly foreground,
black denoting certainly background and
grey denoting uncertainty.</p>
Is it really only "uncertainty" that's represented? Is it perhaps sometimes partial transparency, and sometimes ambiguity?
Could anything be said here to clarify that shades of grey tend more towards the foreground/background based on being lighter/darker?
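To make the question about grey values concrete, here is a minimal sketch of one way an application might consume such a mask, assuming lighter values lean towards foreground and darker values towards background. The function name `compositeWithMask`, the RGBA buffer layout, and the one-byte-per-pixel mask layout are assumptions for illustration only; none of them come from the PR.

```typescript
/**
 * Blend a camera frame over a replacement background using a greyscale mask.
 * Assumed layout (hypothetical, not defined by the PR):
 *   frame, background: RGBA, 4 bytes per pixel
 *   mask: 1 byte per pixel, 255 = certainly foreground, 0 = certainly background
 */
function compositeWithMask(
  frame: Uint8ClampedArray,
  background: Uint8ClampedArray,
  mask: Uint8ClampedArray,
): Uint8ClampedArray {
  if (frame.length !== background.length || frame.length !== mask.length * 4) {
    throw new RangeError("frame and background must be RGBA; mask must have one byte per pixel");
  }
  const out = new Uint8ClampedArray(frame.length);
  for (let p = 0; p < mask.length; p++) {
    // Treat the grey level as a foreground weight in [0, 1].
    const weight = mask[p] / 255;
    for (let c = 0; c < 3; c++) {
      const i = p * 4 + c;
      out[i] = frame[i] * weight + background[i] * (1 - weight);
    }
    out[p * 4 + 3] = 255; // output is fully opaque
  }
  return out;
}
```

Under this reading a grey pixel is simply a partial blend, which treats "uncertainty" and "partial transparency" identically; whether that is the intended interpretation is exactly what the comment above asks the spec text to clarify.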
<h3>VideoFrame interface extensions</h3>
<pre class="idl">
partial interface VideoFrame {
  readonly attribute VideoFrame? backgroundSegmentationMask;
I imagine this isn't going to suffer infinite recursion because the second layer deep will be guaranteed nullable. But it still strikes me as a bit odd to expose a full VideoFrame here, with all its present and future fields, when what we really wish to get is a matrix of integer values of a limited range.
};

partial dictionary MediaTrackConstraintSet {
  ConstrainBoolean backgroundSegmentationMask;
Would it ever be interesting and feasible to tweak the parameters by which segmentation is done?
At least on Windows, the platform model does not allow tweaking segmentation parameters today. Using tensorflow.js with the BodyPix model for blur, I see there's at least a segmentationThreshold parameter. Maybe it's the same as foregroundThresholdProbability with the MediaPipeSelfieSegmentation model?
Did you have some other parameters in mind?
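For illustration, a threshold parameter of this kind can also be applied by the application itself once it has a soft mask, without any platform support. The sketch below is an assumption-laden example: `binarizeMask` and its `threshold` argument are hypothetical names, and it does not claim to reproduce what BodyPix or MediaPipeSelfieSegmentation do internally.

```typescript
/**
 * Turn a soft mask (0..255 per pixel) into a hard foreground/background mask.
 * `threshold` is a probability in [0, 1]; pixels at or above it become
 * foreground (255), all others become background (0).
 * Hypothetical helper, not part of the PR or of any library.
 */
function binarizeMask(mask: Uint8ClampedArray, threshold = 0.5): Uint8ClampedArray {
  if (threshold < 0 || threshold > 1) {
    throw new RangeError("threshold must be between 0 and 1");
  }
  const cutoff = threshold * 255;
  return mask.map((value) => (value >= cutoff ? 255 : 0));
}
```

If the platform only ever delivers a soft mask, a parameter like this could stay in application code, which is one argument for keeping the constraint a plain boolean.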
Did you have some other parameters in mind?
I am not knowledgeable enough on what parameters would be best to include. I was mostly wondering if this is something we foresee extending from a boolean to a set of parameters, and if so, whether there was a viable path for such future extensions given the current API shape.
Hi!
This adds capabilities, constraints and settings for background segmentation mask. Those are fairly obvious.
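As a rough sketch of how an application might use the capability, constraint and setting together: the `getCapabilities()`, `applyConstraints()` and `getSettings()` methods are existing `MediaStreamTrack` methods, while the `backgroundSegmentationMask` member is the one proposed in this PR. The structural type `MaskCapableTrack`, the helper names, and the assumption that the capability is reported as a list of booleans are illustrative only.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings that
// lack the proposed member. Only the three methods used here are modelled.
interface MaskCapableTrack {
  // Assumption: the capability is a list of supported boolean values.
  getCapabilities(): { backgroundSegmentationMask?: boolean[] };
  applyConstraints(constraints: { backgroundSegmentationMask?: boolean }): Promise<void>;
  getSettings(): { backgroundSegmentationMask?: boolean };
}

/** True if the track reports that mask generation can be switched on. */
function supportsMask(track: MaskCapableTrack): boolean {
  return (track.getCapabilities().backgroundSegmentationMask ?? []).includes(true);
}

/** Try to enable the mask; resolves to whether the setting is now on. */
async function enableMask(track: MaskCapableTrack): Promise<boolean> {
  if (!supportsMask(track)) {
    return false;
  }
  await track.applyConstraints({ backgroundSegmentationMask: true });
  return track.getSettings().backgroundSegmentationMask === true;
}
```

In a browser that implemented the proposal, the object passed in would be a video track obtained from `getUserMedia()`; here the structural type just keeps the example self-contained.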
For the feature to be useful, the actual background segmentation mask must be provided to web apps. There are various ways to do that:
However, that makes it awkward to process such streams and very unclear how to encode them.
/cc @riju