
What is the relationship between toggle microphone/camera actions and MediaStreamTrack mute/unmute events? #307

Closed
youennf opened this issue Nov 27, 2023 · 11 comments

Comments

@youennf
Contributor

youennf commented Nov 27, 2023

When the toggle capture action is executed, MediaStreamTrack muted state will change, which will fire mute or unmute events.
It would help to define whether the mute/unmute events are fired before or after the corresponding MediaSession action callback.
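For concreteness, here is a toy model of the ordering question, using plain objects rather than the real APIs. It assumes the ordering proposed later in this thread (action callback first, muted state change second); the function name is invented for illustration:

```javascript
// Toy model: `track` and `log` stand in for a real MediaStreamTrack
// and the page's observations. Assumes the action callback runs
// before the muted state changes.
const track = { muted: false };
const log = [];

function userTogglesCaptureInUA() {
  // 1. The UA runs the togglemicrophone/togglecamera action callback;
  //    track.muted still holds the old value at this point.
  log.push(`action callback: muted=${track.muted}`);
  // 2. The UA then updates the muted state and fires the mute event.
  track.muted = true;
  log.push(`mute event: muted=${track.muted}`);
}

userTogglesCaptureInUA();
```

With the opposite ordering, the action callback would instead observe the already-updated muted state.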

@youennf
Contributor Author

youennf commented Nov 27, 2023

This came up as part of w3c/mediacapture-extensions#39.

@youennf
Contributor Author

youennf commented Nov 27, 2023

To me, it makes more sense that the action callback is executed first and the mute/unmute events second.
If the muted state is not updated synchronously with the callback, the MediaStreamTrack muted state observed inside the callback would still be the old one.

Maybe it would help for MediaSessionActionDetails to contain more information, something like:

  • whether the action is about muting or unmuting
  • which devices are being toggled (via deviceId maybe)

The first bullet is also related to the question we have about how setMicrophoneActive and setCameraActive are expected to be used.
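If MediaSessionActionDetails were extended along those lines, a handler might consume it as below. Both field names (`isMuting`, `deviceId`) are hypothetical labels for this sketch, not anything specified:

```javascript
// Hypothetical extended details for togglemicrophone/togglecamera.
// `isMuting` and `deviceId` are invented names for illustration only.
function describeToggle(details) {
  const verb = details.isMuting ? "muting" : "unmuting";
  const device = details.deviceId ?? "default device";
  return `${verb} ${device}`;
}

describeToggle({ isMuting: true, deviceId: "mic-1" }); // → "muting mic-1"
```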

@jan-ivar
Member

jan-ivar commented Jan 2, 2024

There is currently no relationship AFAIK.

When the toggle capture action is executed, MediaStreamTrack muted state will change, which will fire mute or unmute events.

I don't think this describes any browser today.

I've confirmed this with this fiddle, which reacts to the mic mute / cam mute buttons in Picture-in-Picture mode (which end-users might be surprised to learn are chrome buttons).

Tracks are never muted in Chrome (easily observed by commenting out the JS that clears audioTrack.enabled and videoTrack.enabled). IOW, these chrome buttons are 100% webpage-controlled.

Chrome is the only browser to implement togglemicrophone and togglecamera, so I was unable to test Safari or Firefox.
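The page-controlled model described above can be sketched as follows. `mediaSession` and `audioTrack` are assumed to be `navigator.mediaSession` and an audio track from getUserMedia, passed in so the wiring is visible; this is a sketch of the observed Chrome behavior, not specified semantics:

```javascript
// Page-controlled mute: the UA never mutes the track itself; the page
// flips `enabled` on the toggle action and reports the new state back
// via setMicrophoneActive so the chrome UI stays in sync.
function registerMicToggle(mediaSession, audioTrack) {
  mediaSession.setActionHandler("togglemicrophone", () => {
    audioTrack.enabled = !audioTrack.enabled;
    mediaSession.setMicrophoneActive(audioTrack.enabled);
  });
}
```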

@jan-ivar
Member

jan-ivar commented Jan 2, 2024

That said, a UA COULD mute tracks here (since it can mute at any time), as I mention in #279 (comment).

The first decision such a UA would need to make is whether to trust the webpage to maintain the mic muted / cam muted states, and simply enforce them with mute/unmute.

Not trusting invites the double-mute problem.

Trust doesn't seem like a big deal in the PiP example, but that might be deceiving, as many end-users might not even consider them chrome buttons in the first place.

If we instead consider Safari's pause feature in the URL bar:

[Screenshot: Safari's capture pause control in the address bar]

...then it seems less obvious that the webpage should control it. E.g. Safari might wish to A) keep it as a double-mute, or B) use heuristics on unmute based on how the user muted (from the web page or the chrome).

To that end, it might be useful to fail the setters and return a promise to allow prompting. E.g. (proposal):

  // API modification proposal
  try {
    await navigator.mediaSession.setMicrophoneActive(true);
    // unmute succeeded
  } catch (e) {
    if (e.name != "NotAllowedError") throw e;
    // unmute denied
  }

UAs might also take transient activation into account.

@jan-ivar
Member

jan-ivar commented Jan 2, 2024

To me, it makes more sense that the action callback is executed first and the mute/unmute events second.
If these are not done synchronously, this would mean the MediaStreamTrack muted state is the old one.

That seems the most deterministic, since mute/unmute may fire for other reasons.

Maybe it would help for MediaSessionActionDetails to contain ... something like: ...whether the action is about muting or unmuting

This might be helpful for a webpage that has gotten out of sync, but also seems redundant until I understand how they can get out of sync.

@youennf
Contributor Author

youennf commented Jan 8, 2024

@jan-ivar, I think we are mostly aligned, basically:

  • When the user clicks the capture icon (PiP window or Safari address bar), the UA calls the corresponding action callback and then schedules tasks to fire mute/unmute events on the tracks.
  • Pages can at any point call setMicrophoneActive/setCameraActive to try muting/unmuting tracks, subject to UA-specific privacy mitigation heuristics (transient activation, prompts...).

@steimelchrome, this is not exactly how Chrome implements these APIs.
I am hoping it does not depart too much and that this is OK. Thoughts?

This might be helpful for a webpage that has gotten out of sync

I do not think this is mandatory to settle this particular point, we can discuss it as a follow-up once we agree on the interaction between track muted and media session API.

Out-of-sync scenarios might happen with tracks being stopped in workers, setActive being asynchronous, and getUserMedia being called concurrently with these two.
It might have been better to reuse the play/pause model instead of a single toggle, but it is probably too late for that.

@youennf
Contributor Author

youennf commented Jan 9, 2024

Discussed in today's media WG meeting.
Plan is to:

  • Write a PR to update setMicrophoneActive/setCameraActive to return a promise and potentially fail
  • Write a PR to state that, if a UA decides to mute/unmute tracks in response to a user action that triggers toggle mic/camera, the mute/unmute events should fire after the corresponding action event handlers.
  • File a separate issue about whether there is a need for the new state as part of the toggle mic/camera action handlers.

@chrisn
Member

chrisn commented Jan 10, 2024

Minutes from 9 January 2024 Media WG meeting: https://www.w3.org/2024/01/09-mediawg-minutes.html

@guidou

guidou commented Jan 30, 2024

For VC applications it is necessary to know/specify what microphone or camera the event/action refers to.
Otherwise, the API is useful only in systems that have no more than one microphone and no more than one camera.

@guidou

guidou commented Jan 30, 2024

For VC applications it is necessary to know/specify what microphone or camera the event/action refers to. Otherwise, the API is useful only in systems that have no more than one microphone and no more than one camera.

Filed issue #317 to track this.

@youennf
Contributor Author

youennf commented Feb 15, 2024

Fixed by #313 and #312

@youennf youennf closed this as completed Feb 15, 2024