Only Firefox turns off device on disabled track. Stronger language needed? #642

Closed
jan-ivar opened this issue Nov 18, 2019 · 18 comments · Fixed by #662

Comments

@jan-ivar
Member

Live camera/mic tracks can be turned on/off with the enabled attribute.

The obvious (and only?) use-case is the mute button:

muted.onclick = () => { video.srcObject.getVideoTracks()[0].enabled = !muted.checked; }

Unfortunately, only Firefox turns off the device (and hardware light) in this case.

This has led to web compat issues:

Sites like Hangouts use a hack to get this desired property on Chrome: relinquish camera track on mute, and reacquire it w/getUserMedia on unmute (the microphone track is left alone, presumably because few mics have a light?)

The hack backfires in Firefox, where Hangouts users get a needless extra prompt on unmute. Frustratingly, it would have worked beautifully in Firefox without the hack.
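To make the contrast concrete, here is a minimal sketch of the two mute strategies described above. The function names are hypothetical, and `getUserMedia` is passed in as a parameter (an assumption made so the sketch also runs outside a browser); this is not Hangouts' actual code.

```javascript
// Strategy 1: keep the track and toggle `enabled`.
// Whether the camera light goes off is left to the browser.
function setMutedByDisabling(track, muted) {
  track.enabled = !muted;
  return track;
}

// Strategy 2 (the workaround): stop the track on mute, request a new one on unmute.
// In a page, `getUserMedia` would be (c) => navigator.mediaDevices.getUserMedia(c).
async function setMutedByStopping(track, muted, getUserMedia) {
  if (muted) {
    track.stop(); // releases the device; this track is ended for good
    return null;
  }
  // A fresh capture request: the browser may prompt the user again here.
  const stream = await getUserMedia({ video: true });
  return stream.getVideoTracks()[0];
}
```

With the first strategy the page keeps the same track object throughout; with the second it has to swap the new track into every consumer (for example via `replaceTrack`) after each unmute.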

We worked hard to get this right in #389, but do we need stronger language to fix this web compat issue?

Right now we have: "when a track becomes either muted or disabled, and this brings all tracks connected to the device to be either muted, disabled, or stopped, then the UA MAY, using the device's deviceId, deviceId, set [[devicesLiveMap]][deviceId] to false"
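As a rough illustration of the condition in that sentence, the check below reports whether every track on a device is muted, disabled, or stopped. The helper name `allTracksIdle` is hypothetical, and it assumes the caller can enumerate all tracks for the device; a real user agent would use its own internal bookkeeping rather than page-level script.

```javascript
// Returns true when no track backed by `deviceId` can currently deliver media,
// i.e. every such track is muted, disabled, or stopped (readyState "ended").
// `tracks` is assumed to be the complete list of tracks the page holds.
function allTracksIdle(tracks, deviceId) {
  return tracks
    .filter((t) => t.getSettings().deviceId === deviceId)
    .every((t) => t.muted || !t.enabled || t.readyState === "ended");
}
```

When this becomes true for a device, the quoted text says the UA MAY set `[[devicesLiveMap]][deviceId]` to false.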

Should this be a MUST? And do we need to say something about actually relinquishing the device?

If we can't get consensus on compatible behavior, should we scrap enabled?

@henbos
Contributor

henbos commented Nov 19, 2019

Is the desired behavior from Hangouts' perspective "mute the track but don't turn the device off"? It would seem like this behavior is already supported with replaceTrack(null), and that using "enabled = false" for this effect is simply using the API incorrectly.

Remind me, is there a difference between sending a disabled track and sending a null track?

If on the other hand we do turn off the device and the light, then re-prompting seems like the right thing to do. For privacy reasons we don't want a site to turn on the device without prompting.

@henbos
Contributor

henbos commented Nov 19, 2019

Hang on, if the desired effect is "turn off the device", why not use track.stop()? The way Chrome handles track.enabled kind of makes sense to me actually.

@jan-ivar
Member Author

The goal is to turn off the hardware light on mute. See blog.

@fippo
Contributor

fippo commented Nov 20, 2019

Ending the track + using replaceTrack to send it again is cumbersome and error-prone, so I strongly prefer Firefox's behaviour.

@henbos
Contributor

henbos commented Nov 20, 2019

Quoting my favorite blog ;)

Firefox 60’s indicator additionally turns gray on mute. Importantly, it does not go away, because the website still has permission to access your device(s), and can resume recording you at its discretion, at which point the indicator would begin to blink red again. When the indicator goes away completely, it cannot come back, and the website can no longer record you without asking for permission anew. Preserving this user guarantee was key.

OK I love it, this is good and motivates enabled as an API that deserves to exist separately from other "hacks".

@youennf
Contributor

youennf commented Nov 20, 2019

I see some value in 'enabled' as it stands but also see value in making it evolve.
There is a trick in trying to make it closer to muted though.

A web page can set enabled to false and set it back to true whenever it wants.
For instance when the page is in the background, which might be hard to notice especially in case of microphone.
On the contrary, if a web page stops capture, the web page will not be able to restart capture in the background. If the web page stops capture for a long time, Safari will also for instance trigger a reprompt.
Prompting behaviour can be adjusted naturally with async getUserMedia, this is not as easy with 'enabled'.

We could try to emulate enabled=true as a sort of calling getUserMedia again with a mixture of enabled/muted/ended/user gesture restriction/focus mitigation. Not sure whether that is worth it.

I also find the argument of the excessive reprompting a bit odd.
Nothing prevents Firefox from skipping the prompt for a getUserMedia call in situations where enabled would have been used to the same effect. Firefox could for instance render the 'gray' icon instead of reprompting.

@henbos
Contributor

henbos commented Nov 20, 2019

If reprompting is or could be involved, async would be the way to go.

What does Firefox do when you perform "track.enabled = true;"?

@henbos
Contributor

henbos commented Nov 20, 2019

Firefox could for instance render the 'gray' icon instead of reprompting.

I like this too.

@guest271314

The hack backfires in Firefox, where Hangouts users get a needless extra prompt on unmute. Frustratingly, it would have worked beautifully in Firefox without the hack.

If that is the only use case

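// Guard with typeof: the interface is not defined in every browser,
// and referencing an undefined global would throw a ReferenceError.
if (typeof CanvasCaptureMediaStream !== "undefined" &&
    "requestFrame" in CanvasCaptureMediaStream.prototype) {
  // Nightly, Firefox
  // do nothing
} else if ("OffscreenCanvasRenderingContext2D" in self) {
  // do Chrome, Chromium stuff
}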

What is the significance of "hardware light"?

The current language of the specification provides a means to clone, add, remove, replace tracks while not providing substantive content.

Can the track be changed to suit either expected output?

The term "MAY" implies that implementers are free to do what they want. If implementers did not actually provide meaningful input to the specification themselves, either individually or on behalf of an entity, "MAY" allows for discretion. However, since in this case implementers appear to be involved with language in the primary document, "MAY" is ambiguous. Say "MUST". Leave no quarter for ambiguity.

Some cases from the wild might be difficult, if possible at all, to specify so that they are congruous with all existing specifications or implementations without updating all relevant language to be consistent. For example, the description of the resize event for media elements in the HTML Standard has been interpreted to mean that at the first resize event the videoWidth and videoHeight "MUST" be equal to the underlying pixel dimensions of the frame; however, the language "first" and "MUST" re specific pixel dimensions is not actually in the current specification, from perspective here, and in any event is not guaranteed to be the case when srcObject is a MediaStream from WebRTC. How could the relevant standards be changed to reflect what actually occurs without proving that at least one of the specifications is not entirely accurate? If those are the facts: write it out. Use direct language; "MUST" does not leave quarter for speculation or inference without substantive proof thereof by means of citation to the actual language in the specification.

@guest271314

If we can't get consensus on compatible behavior, should we scrap enabled?

Note that the enabled attribute, because it is not readonly, can be used with a proxy to substitute for the mute (which links to Media Capture Main from Media Capture from DOM Elements), unmute, and ended events of MediaStreamTrack when those are not fired at all; see "mute and unmute events of MediaStreamTrack from canvas.captureStream() do not fire under any circumstances".

@jan-ivar
Member Author

A web page can set enabled to false and set it back to true whenever it wants.
For instance when the page is in the background, which might be hard to notice especially in case of microphone.

@youennf Thanks for catching that! I've filed bug 1598374 on Firefox. We plan to mitigate this by firing the mute event and keeping the track muted until the page receives focus. Seems within the realm of what UAs are allowed.

On the contrary, if a web page stops capture, the web page will not be able to restart capture in the background. If the web page stops capture for a long time, Safari will also for instance trigger a reprompt.

Recall Firefox's prompt includes camera/mic selection; horribly misplaced on unmute, and semantically different unless the app remembers

{video: {deviceId: {exact: track.getSettings().deviceId}}}

Poorly written apps may forget chosen cam/mic on unmute, bc they didn't test w/multiple devices.
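A small sketch of how an app taking the stop-and-reacquire route could remember the chosen device, generalizing the constraint shown above to audio as well. The helper name `sameDeviceConstraints` is hypothetical.

```javascript
// Build getUserMedia constraints that ask for exactly the device behind `track`.
// Works for both "audio" and "video" tracks, keyed off track.kind.
function sameDeviceConstraints(track) {
  const { deviceId } = track.getSettings();
  const kind = track.kind === "audio" ? "audio" : "video";
  return { [kind]: { deviceId: { exact: deviceId } } };
}
```

In a page the result would be passed to `navigator.mediaDevices.getUserMedia(...)` on unmute. Note that an `exact` constraint makes the request fail if that device is no longer available, so a fallback is probably needed.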

Safari uses timed permissions. The spec works hard to not mandate a singular permission model. This accommodates UA differentiation in this still evolving space, which I think is good. Had we mandated one 7 years ago, it likely would have been Chrome's. I'm glad that didn't happen.

One-off permission, like Firefox's, is expressly permitted and follows the spec closely. For instance, look how track stop ties into the Accessible privacy indicator (distinct from Live), e.g. gray vs red in Firefox.

One-off permissions are not possible without a semantic difference between disabled and ended.

The priority of constituencies suggests the spec's job is to define the best API abstraction for app writers and end-users, regardless of user agent.

Do web authors find semantic value in using track.enabled as the API meant to successfully implement mute/unmute in all browsers? In a way that turns off the hardware light? wo/prompt?

Will end-users be able to temporarily turn off/on their camera/mic in conference calls without a prompt in all browsers? Will indicators give them confidence they're not being watched?

There is nothing preventing Safari from muting on track.enabled = true if too much time has passed, and prompting the user for confirmation that they're still there before unmuting.

@guest271314

Do web authors find semantic value in using track.enabled as the API meant to successfully implement mute/unmute in all browsers? In a way that turns off the hardware light? wo/prompt?

Sure, as long as the caveat is clearly explained that track.enabled relevant to mute/unmute is only applicable to a MediaStreamTrack from getUserMedia() and not any derivative specifications which though discrete still refer and link to Media Capture and Streams specification (i.e., track.getSettings() does not provide the same output for HTMLMediaElement.captureStream() or canvas.captureStream()).

Will end-users be able to temporarily turn off/on their camera/mic in conference calls without a prompt in all browsers?

Remains to be seen.

Will indicators give them confidence they're not being watched?

Depends on whom is asked. There is no expectation of privacy for any signal communications.

@youennf
Contributor

youennf commented Nov 22, 2019

Do web authors find semantic value in using track.enabled as the API meant to successfully implement mute/unmute in all browsers? In a way that turns off the hardware light? wo/prompt?

enabled is synchronous which makes it not straightforward to prompt on enabled value change.
If we believe this is an important pattern, we should add a proper API for that.

Will end-users be able to temporarily turn off/on their camera/mic in conference calls without a prompt in all browsers? Will indicators give them confidence they're not being watched?

Safari shows a browser UI to mute/unmute capture, this indicator is in sync with the hardware light and should give confidence for users. It is true that these indicators are not in sync with the page UI.

This area is still evolving in browsers and its UI.
I am not sure we are able to make a decision on the exact design.

@alvestrand
Contributor

(not following the details)
We had this discussion at the IETF in Sapporo. The conclusion was clear: a muted track should produce no data, and turn the indicator off, but not require reacquiring the track when unmuted.

It's a bug in Chrome. I think the spec language is fine.

@jan-ivar
Member Author

jan-ivar commented Nov 25, 2019

enabled is synchronous which makes it not straightforward to prompt on enabled value change.
If we believe this is an important pattern, we should add a proper API for that.

@youennf The user agent is allowed to fire the mute and unmute events at any time. So you can do this:

  1. If a track is disabled for too long, fire the mute event.
  2. Then if JS re-enables the track, prompt the user before reacquiring the device and fire the unmute event.
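From the page's side, coping with those two steps might look like the sketch below: the page treats a track as delivering media only when it is both enabled (the page's choice) and not muted (the user agent's choice). The helper name `watchTrack` and the shape of the callback argument are hypothetical; only `enabled`, `muted`, and the `mute`/`unmute` events come from the API.

```javascript
// Calls onChange({ sending }) now and whenever the UA fires mute/unmute.
// `sending` is true only if the page enabled the track AND the UA has not muted it.
// Returns a function that removes the listeners again.
function watchTrack(track, onChange) {
  const report = () => onChange({ sending: track.enabled && !track.muted });
  track.addEventListener("mute", report);
  track.addEventListener("unmute", report);
  report(); // report the initial state
  return () => {
    track.removeEventListener("mute", report);
    track.removeEventListener("unmute", report);
  };
}
```

A page could use this to keep its own mute button honest, for example by showing a "waiting for camera" state while the track is enabled but still muted.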

@jan-ivar
Member Author

jan-ivar commented Nov 25, 2019

@alvestrand Firefox does not require JS to do any reacquiring. Firefox does this in the background. So the semantics of the API are maintained. The only downside is a ~1 second delay on unmute before video shows.

Try it in Firefox here.

@alvestrand
Contributor

@youennf
Contributor

youennf commented Mar 12, 2020

If we want to get stronger wording in the spec, we should probably define the behavior in terms of the tracks' muted state, at the page scope level. The scope level is important since the page is the granularity of capture indicators for all browsers.

Something like: if all 'live' tracks using a given device of a page are disabled, all of these tracks are marked as muted=true.

Additionally, the spec should provide a model for when the web page might at some point want to restart capturing, i.e. unmute these tracks.
All security measures enforced for getUserMedia must also be enforceable by the browser and discoverable by the web page when unmuting tracks.

Mediacapture-Main to PR automation moved this from In progress to Done Mar 12, 2020