Commit

Merge pull request #9 from jan-ivar/doc
Initial version w/user-chooses semantics from w3c/mediacapture-main#667
jan-ivar committed Nov 12, 2020
2 parents 31c8d9e + 55bff73 commit 83c2805
Showing 1 changed file with 351 additions and 0 deletions.
351 changes: 351 additions & 0 deletions index.html
@@ -0,0 +1,351 @@
<!doctype html>
<html lang="en-us">
<head>
<title>Media Capture and Streams Extensions</title>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
<script src="https://www.w3.org/Tools/respec/respec-w3c" class="remove"></script>
<script class='remove'>
"use strict";
// See https://github.com/w3c/respec/wiki/ for how to configure ReSpec
var respecConfig = {
group: "webrtc",
xref: ["html", "infra", "permissions", "dom", "mediacapture-streams", "webidl"],
edDraftURI: "https://w3c.github.io/mediacapture-extensions/",
editors: [
{name: "Jan-Ivar Bruaroey", company: "Mozilla Corporation", w3cid: 79152},
],
shortName: "mediacapture-extensions",
specStatus: "UD",
subjectPrefix: "[mediacapture-extensions]",
github: "https://github.com/w3c/mediacapture-extensions",
};
</script>
</head>

<body>
<section id="abstract">
<p>This document refines behaviors of the {{MediaDevices}} functions
{{MediaDevices/enumerateDevices()}} and {{MediaDevices/getUserMedia()}} to
reduce their fingerprinting surface and enable user selection of camera and
microphone through device pickers implemented in the user agent.
</p>
</section>
<section id="sotd">
<p>This is an unofficial proposal.</p>
</section>
<section id="introduction">
<h2>Introduction</h2>
<p>This document contains proposed extensions and modifications to the
[[GETUSERMEDIA]] specification.</p>
<p>New features and modifications to existing features proposed here may be
considered for addition into the main specification post Recommendation.
Deciding factors will include maturity of the extension or modification,
consensus on adding it, and implementation experience.</p>
<p>A concrete long-term goal is reducing the fingerprinting surface of
{{MediaDevices/enumerateDevices()}} by deprecating exposure of the device
{{MediaDeviceInfo/label}} in its results. This requires relieving
applications of the burden of building user interfaces to select cameras and
microphones in-content, by offering this in user agents as part of
{{MediaDevices/getUserMedia()}} instead.</p>
<p>Miscellaneous other smaller features are under consideration as well,
such as constraints to control multi-channel audio beyond stereo.</p>
</section>
<section>
<h2>Terminology</h2>
<p>
This document uses the definitions {{MediaDevices}}, {{MediaStreamTrack}},
{{MediaStreamConstraints}} and {{ConstrainablePattern}}
from [[!GETUSERMEDIA]].</p>
<p>The terms [=permission state=], [=request permission to use=], and
<a data-cite="permissions">prompt the user to choose</a> are defined in
[[!permissions]].</p>
</section>
<section id="conformance">
</section>
<section id="camera and microphone picker">
<h2>In-browser camera and microphone picker</h2>
<p>The existing {{MediaDevices/enumerateDevices()}} function exposes camera
and microphone {{MediaDeviceInfo/label}}s to let applications build
in-content user interfaces for camera and microphone selection. Applications
have had to do this because {{MediaDevices/getUserMedia()}} did not offer a
web compatible in-agent device picker. This specification aims to rectify
that.</p>
<p>Because device {{MediaDeviceInfo/label}}s are a significant
fingerprinting vector, and because the existing APIs are well established,
the scope of this particular effort is limited to removing
{{MediaDeviceInfo/label}} while leaving the overall constraints-based model
intact. This keeps migration more viable than a move to a new,
less-powerful API would be.</p>
<p>For the same reason, this specification augments the existing
{{MediaDevices/getUserMedia()}} function instead of introducing a new,
less-powerful API to compete with it.</p>
<section id="new getusermedia semantics">
<h3>getUserMedia "user-chooses" semantics</h3>
<p>This specification introduces
slightly altered semantics to the {{MediaDevices/getUserMedia()}}
function called <code>"user-chooses"</code> that guarantee a picker will
be shown to the user in cases where the user agent would otherwise choose
for the user (that is: when application constraints do not narrow down
the choices to a single device). This is orthogonal to permission, and
offers a better and more consistent user experience across applications
and user agents.
</p>
<p>Unfortunately, since the <code>"user-chooses"</code> semantics may
produce user agent prompts at different times and in different situations
compared to the old semantics, they are somewhat incompatible with
expectations in some existing web applications that tend to call
{{MediaDevices/getUserMedia()}} repeatedly and lazily instead of using
e.g. <code>stream.clone()</code>.</p>
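<p>As a rough illustration of the alternative mentioned above, an
application might request the camera once and hand out clones of the
resulting stream, so that only the first request can trigger a prompt. This
is only a sketch: <code>createCameraSource</code> is a hypothetical helper
name, and <code>mediaDevices</code> is passed in (it would be
<code>navigator.mediaDevices</code> in a page) so the helper can be tried
against a stub.</p>

```javascript
// Sketch: obtain a camera stream once, then hand out clones.
// `createCameraSource` is a hypothetical name, not part of any API.
function createCameraSource(mediaDevices, constraints) {
  let pending = null; // promise for the single underlying stream

  return {
    // Returns a clone of the shared stream. Only the first call reaches
    // getUserMedia() and can therefore cause a user agent prompt.
    async getStream() {
      if (pending === null) {
        pending = mediaDevices.getUserMedia(constraints);
        // Allow a later retry if the request fails,
        // e.g. because the user dismissed the prompt.
        pending.catch(() => { pending = null; });
      }
      const stream = await pending;
      return stream.clone();
    },
  };
}
```

<p>Each caller receives its own clone, which it can stop independently of
the others.</p>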
</section>
<section id="web compatibility">
<h3>Web compatibility and migration</h3>
<p>User agents are encouraged to provide the new semantics as opt-in
initially for web compatibility. User agents MUST deprecate (remove)
{{MediaDeviceInfo/label}} from {{MediaDeviceInfo}} over time, though specific migration strategies
are left to user agents. User agents SHOULD migrate to offering the new
semantics by default (opt-out) over time.</p>
<p>Since the constraints-model remains intact, web compatibility problems
are expected to be limited to:</p>
<ul>
<li>
Sites that never migrated show e.g. "Camera 1", "Camera 2" etc.
instead of descriptive device labels
</li>
<li>
Sites with no device management strategy provoke a picker in the
user agent on every visit for users with more than one camera or
microphone (a feature of sorts)
</li>
</ul>
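<p>To illustrate the first case, a site that still lists devices itself
could number them per kind whenever <code>label</code> is empty. The
following is a minimal sketch; the helper name <code>displayNames</code>
and the exact fallback wording are assumptions made for this example.</p>

```javascript
// Sketch: build display names for entries returned by enumerateDevices(),
// numbering each kind separately when `label` is empty or absent.
// `displayNames` and the prefix strings are illustrative only.
function displayNames(devices) {
  const prefixes = {
    videoinput: "Camera",
    audioinput: "Microphone",
    audiooutput: "Speaker",
  };
  const counters = {};
  return devices.map((device) => {
    counters[device.kind] = (counters[device.kind] || 0) + 1;
    const prefix = prefixes[device.kind] || "Device";
    const fallback = `${prefix} ${counters[device.kind]}`;
    return { deviceId: device.deviceId, name: device.label || fallback };
  });
}
```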
</section>
</section>
<section id="mediadevices-interface">
<h3>MediaDevices Interface Extensions</h3>
<div>
<pre class="idl"
>partial interface MediaDevices {
readonly attribute GetUserMediaSemantics defaultSemantics;
};</pre>
</div>
<section>
<h2>Attributes</h2>
<dl data-link-for="MediaDevices" data-dfn-for="MediaDevices"
class="attributes">
<dt id="def-mediadevices-defaultSemantics"><dfn><code>defaultSemantics</code></dfn>
of type <span class="idlAttrType"><a>GetUserMediaSemantics</a></span>, readonly</dt>
<dd>
<p>The default semantics of {{MediaDevices/getUserMedia()}} in this
user agent.</p>
<p>User agents SHOULD default to <code>"browser-chooses"</code>
for backwards compatibility, until a transition plan has been
enacted where a majority of user agents collectively switch their
defaults to <code>"user-chooses"</code> for improved user privacy,
and usage metrics suggest this transition is feasible without
major breakage.</p>
</dd>
</dl>
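<p>An application could use the presence of this attribute as a feature
test. The sketch below assumes that a user agent exposing
<code>defaultSemantics</code> also honors the <code>semantics</code>
constraints member; <code>planCameraRequest</code> is a hypothetical helper
name, and <code>mediaDevices</code> stands in for
<code>navigator.mediaDevices</code>.</p>

```javascript
// Sketch: decide whether the page can rely on the user agent's picker.
// Assumption: exposing `defaultSemantics` implies support for the
// `semantics` member of the constraints dictionary.
function planCameraRequest(mediaDevices) {
  const agentPicker = "defaultSemantics" in mediaDevices;
  return {
    // Opt in to the picker only where it is expected to be understood.
    constraints: agentPicker
      ? { video: true, semantics: "user-chooses" }
      : { video: true },
    // Without an agent picker, the page keeps its own selection UI.
    needsInContentPicker: !agentPicker,
  };
}
```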
</section>
</section>
<section id="mediastreamconstraints-dictionary-extensions">
<h3>MediaStreamConstraints dictionary extensions</h3>
<div>
<pre class="idl"
>partial dictionary MediaStreamConstraints {
GetUserMediaSemantics semantics;
};</pre>
<section>
<h2>Dictionary {{MediaStreamConstraints}} Members</h2>
<dl data-link-for="MediaStreamConstraints" data-dfn-for=
"MediaStreamConstraints" class="dictionary-members">
<dt><dfn><code>semantics</code></dfn> of type <span class=
"idlMemberType">{{GetUserMediaSemantics}}</span></dt>
<dd>
<p>In cases where the specified constraints do not narrow
multiple choices between devices down to one per kind, specifies
how the final determination of which devices to pick from the
remaining choices MUST be made. If not specified, then the
<a data-link-for="MediaDevices">defaultSemantics</a> are used.
</p>
</dd>
</dl>
</section>
</div>
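<p>The fallback described above can be modelled in a few lines. This is an
illustrative sketch rather than normative text;
<code>resolveSemantics</code> is a hypothetical name, and a plain object
stands in for the converted dictionary.</p>

```javascript
// Sketch: resolve the effective semantics for one getUserMedia() call.
// A member that is absent from the dictionary reads as undefined here.
function resolveSemantics(constraints, defaultSemantics) {
  const semanticsPresent = constraints.semantics !== undefined;
  const semantics = semanticsPresent ? constraints.semantics : defaultSemantics;
  return { semanticsPresent, semantics };
}
```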
</section>
<section id="getusermediasemantics-enum">
<h3>GetUserMediaSemantics enum</h3>
<div>
<pre class="idl"
>enum GetUserMediaSemantics {
"browser-chooses",
"user-chooses"
};</pre>
<table data-link-for="GetUserMediaSemantics" data-dfn-for=
"GetUserMediaSemantics" class="simple">
<tbody>
<tr>
<th colspan="2"><dfn>GetUserMediaSemantics</dfn> Enumeration
description</th>
</tr>
<tr>
<td><dfn><code id=
"idl-def-GetUserMediaSemantics.browser-chooses">browser-chooses</code></dfn></td>
<td>
<p>When application-specified constraints do not narrow multiple
choices between devices down to one per kind, the user agent is
allowed to make the final determination between the remaining
choices.
</p>
</td>
</tr>
<tr>
<td><dfn><code id=
"idl-def-GetUserMediaSemantics.user-chooses">user-chooses</code></dfn></td>
<td>
<p>When application-specified constraints do not narrow
multiple choices between devices down to one per kind, the user
agent MUST
<a>prompt the user to choose</a>
between the remaining choices, even if the application already
has permission to some or all of them.</p>
</td>
</tr>
</tbody>
</table>
</div>
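<p>The difference between the two values can be sketched as a predicate
over the devices that remain after constraint filtering. The function name
<code>mustPromptUserToChoose</code> and the
<code>{deviceId, kind}</code> record shape are assumptions made for
illustration; they are not defined by this specification.</p>

```javascript
// Sketch: does this request require the user agent to show a picker?
// `candidates` is a stand-in for the devices left after the application's
// constraints have been applied, as an array of { deviceId, kind }.
function mustPromptUserToChoose(candidates, kind, semantics) {
  // Several candidates may share one device, so count unique deviceIds.
  const deviceIds = new Set(
    candidates.filter((c) => c.kind === kind).map((c) => c.deviceId)
  );
  return semantics === "user-chooses" && deviceIds.size > 1;
}
```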
</section>
<section>
<h2>Algorithms</h2>
<p>When the {{MediaDevices/getUserMedia()}} method is invoked, run the
following steps before invoking the {{MediaDevices/getUserMedia()}}
algorithm:</p>
<ol>
<li>
<p>Let <var>mediaDevices</var> be the object on which this method was
invoked.</p>
</li>
<li>
<p>Let <var>constraints</var> be the method's first argument.</p>
</li>
<li>
<p>Let <var>semanticsPresent</var> be <code>true</code> if
<var>constraints</var><code>.semantics</code> [= map/exists =],
otherwise <code>false</code>.</p>
</li>
<li>
<p>Let <var>semantics</var> be
<var>constraints</var><code>.semantics</code>
if <a href="https://heycam.github.io/webidl/#dfn-present">present</a>,
or the value of <var>mediaDevices</var><code>.<a data-link-for="MediaDevices">defaultSemantics</a></code>
otherwise.</p>
</li>
<li>
<p>Replace step 6.5.1. of the {{MediaDevices/getUserMedia()}}
algorithm in its entirety with the following two steps:</p>
<ol>
<li>
<p>Let <var>descriptor</var> be a {{PermissionDescriptor}}
with its {{PermissionDescriptor/name}} member set to the permission name
associated with <var>kind</var> (e.g. {{PermissionName/"camera"}} for
<code>"video"</code>, {{PermissionName/"microphone"}} for <code>"audio"</code>), and,
optionally, consider its {{DevicePermissionDescriptor/deviceId}} member set to any appropriate
device's <var>deviceId</var>.</p>
</li>
<li>
<p>If the number of unique devices sourcing tracks of
media type <var>kind</var> in <var>candidateSet</var>
is greater than <code>1</code> and
<var>semantics</var> is <code>"user-chooses"</code>,
then <a>prompt the user to choose</a> a device with
<var>descriptor</var>, resulting in provided media.
Otherwise, <a>request permission to use</a> a
device with <var>descriptor</var>, while considering
all devices being attached to a live and
<a>same-permission</a> MediaStreamTrack in the current
[=browsing context=] to mean having permission status {{PermissionState/"granted"}},
resulting in provided media.</p>
<p><dfn>Same-permission</dfn> in this context means a
{{MediaStreamTrack}} that required the same level of
permission to obtain as what is being requested.</p>
<p>When asking the user’s permission, the user agent
MUST disclose whether permission will be granted only to
the device chosen, or to all devices of that
<var>kind</var>.</p>
<p>Let <var>track</var> be the provided media, which
MUST be precisely one track of type <var>kind</var> from
<var>finalSet</var>. If <var>semantics</var> is
<code>"browser-chooses"</code> then the decision of
which track to choose from <var>finalSet</var> is up
to the User Agent, which MAY use the value of the computed
"fitness distance" from the <a href=
"https://www.w3.org/TR/mediacapture-streams/#dfn-selectsettings">
SelectSettings</a>
algorithm, the value of <var>semanticsPresent</var>,
or any other internally-available information about
the devices, as inputs to its decision.
If <var>semantics</var> is <code>"user-chooses"</code>,
and the application has not narrowed down the choices
to one, then the user agent MUST ask the user to make
the final selection.</p>
<p>Once selected, the source of the
{{MediaStreamTrack}} MUST NOT change.</p>
<p>User Agents are encouraged to default to or present
a default choice based primarily on fitness distance,
and secondarily on the user's primary or system default
device for <var>kind</var> (when possible). User Agents
MAY allow users to use any media source, including
pre-recorded media files.</p>
</li>
</ol>
</li>
</ol>
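<p>The encouraged default ordering (fitness distance first, system default
device second) could be sketched as follows. The record shape
<code>{deviceId, fitnessDistance, isSystemDefault}</code> and the name
<code>defaultChoice</code> are hypothetical; user agents are free to weigh
other information as well.</p>

```javascript
// Sketch: pick the entry to preselect (or to choose outright under
// "browser-chooses") from a stand-in for finalSet.
// Primary key: smaller fitness distance. Tie-breaker: system default device.
function defaultChoice(finalSet) {
  const ordered = [...finalSet].sort((a, b) =>
    a.fitnessDistance - b.fitnessDistance ||
    Number(b.isSystemDefault) - Number(a.isSystemDefault)
  );
  return ordered[0]; // undefined when finalSet is empty
}
```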
</section>
<section>
<h2>Examples</h2>
<div>
<p>This example shows a setup with a start button and a camera selector
using the new semantics (microphone is not shown for brevity but is
equivalent).</p>
<pre class="example">
&lt;button id="start"&gt;Start&lt;/button&gt;
&lt;button id="chosenCamera" disabled&gt;Camera: none&lt;/button&gt;
&lt;script&gt;

let cameraTrack = null;

// Start with the camera remembered from a previous visit, if any.
start.onclick = async () => {
  try {
    const stream = await navigator.mediaDevices.getUserMedia({
      video: {deviceId: localStorage.cameraId}
    });
    setCameraTrack(stream.getVideoTracks()[0]);
  } catch (err) {
    console.error(err);
  }
};

// Let the user switch cameras through the user agent's picker.
chosenCamera.onclick = async () => {
  try {
    const stream = await navigator.mediaDevices.getUserMedia({
      video: true,
      semantics: "user-chooses"
    });
    setCameraTrack(stream.getVideoTracks()[0]);
  } catch (err) {
    console.error(err);
  }
};

function setCameraTrack(track) {
  cameraTrack = track;
  // deviceId is a track setting; the label is exposed on the track itself.
  const {deviceId} = track.getSettings();
  localStorage.cameraId = deviceId;
  chosenCamera.innerText = `Camera: ${track.label}`;
  chosenCamera.disabled = false;
}
&lt;/script&gt;
</pre>
</div>
</section>
</body>
</html>
