Constraint to exclude application audio (echo) #79

Closed
henbos opened this issue Oct 3, 2018 · 5 comments

@henbos
Contributor

henbos commented Oct 3, 2018

If I screen share in a presentation that is also a conference with remote participants, and my screen sharing includes audio (e.g. I want to show a media clip as part of the presentation) we have a risk of echo.

  • If screen share contains remote participants talking, when this stream arrives at a remote participant for playout, they will hear themselves AND other remote participants twice (once because they are receiving a stream from them directly and once more because of the screen share).
  • We can't just throw "echo cancellation" on the problem.

Constraints are currently not allowed to limit the user's choice, and this is generally a good thing; which sources are present is none of the application's business. However, the application needs to be able to constrain the user agent not to include audio from the application that performed the getDisplayMedia() request, to avoid echo. Otherwise these applications would end up presenting the user with false choices: options that, if chosen, would seemingly cause "echo bugs".

Proposal: {audio:{excludeApplicationAudio:true}} limits the user agent to providing a stream that does not contain the application audio. Note that this does not say what audio the user agent must provide - the implementation/user still very much has freedom of choice - it only specifies that a particular audio source must not be present. This constraint can be fulfilled in multiple ways, e.g.:

  1. Exclude choices from the user. For example, "tab audio" or "window audio" are still valid choices, as long as the "tab" chosen is not the application tab. "System audio" is not a valid choice.
  2. "No audio" is a valid choice. (Though in this case no audio track should be produced)
  3. Manipulate the audio sources to subtract the application audio.

1 and 2 are easy to implement. 3 is likely infeasible for most platforms, but conceivable. In any case, the application should not have to care about user agent capabilities - as long as the application gets a stream that does not produce echo it is happy.
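
A minimal sketch of how an application might pass the proposed constraint, assuming a user agent that recognizes it. The name `excludeApplicationAudio` is the one proposed in this issue and may not match what any browser ships; the `DisplayMediaProvider` interface and both helper functions are hypothetical, introduced only so the sketch can run outside a browser.

```typescript
// Constraint shape as proposed in this issue (assumption: the user agent
// understands this key; it is not taken from any shipped API).
interface ProposedAudioConstraints {
  excludeApplicationAudio?: boolean;
}

interface DisplayMediaRequest {
  video: boolean;
  audio: boolean | ProposedAudioConstraints;
}

// Hypothetical minimal shape of the object being called. In a browser this
// role would be played by navigator.mediaDevices.
interface DisplayMediaProvider<S> {
  getDisplayMedia(request: DisplayMediaRequest): Promise<S>;
}

// Build a request that asks for audio but not the requesting
// application's own audio.
function buildEchoSafeRequest(): DisplayMediaRequest {
  return { video: true, audio: { excludeApplicationAudio: true } };
}

// Forward the request to the provider. How the constraint is satisfied
// (fewer picker choices, no audio track, or subtracting the application
// audio) is left to the user agent.
async function startEchoSafeShare<S>(provider: DisplayMediaProvider<S>): Promise<S> {
  return provider.getDisplayMedia(buildEchoSafeRequest());
}
```

In a page, the call would look something like `startEchoSafeShare(navigator.mediaDevices)`, possibly with a type cast since the standard typings do not know the proposed key. The application should also be prepared for the returned stream to have no audio track at all, per option 2.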

This was referenced Oct 3, 2018
@henbos henbos self-assigned this Oct 11, 2018
@martinthomson
Member

The point is to exclude audio that "this" application (i.e., origin) is generating, right? As you say, I think that you should allow the UA to implement that as it sees fit. That said, this is only a problem if the share encompasses the current tab/window and includes audio.

Given the acknowledged difficulty in isolating different audio sources we have, it seems like this is a difficult feature to implement. And echo cancellation will go some way to address the problem. How important is this?

@henbos
Contributor Author

henbos commented Oct 12, 2018

And echo cancellation will go some way to address the problem.

Perhaps, but I'm skeptical. Screen sharing the application source would not only reflect it back at 100% volume; each participant is also a new possible source of echo, with different delays and at different quality. It's not just your microphone.

I am operating under the assumption that echo cancellation is not good enough (I don't have data to back this up) and we want to give browsers an easy way out.

How important is this?

Under this assumption, this feature is essential. We should not present false choices to the user that cause "echo bugs".

Given the acknowledged difficulty in isolating different audio sources we have, it seems like this is a difficult feature to implement.

I don't expect UAs to implement excluding audio from any of the audio source configurations. Most likely they will exclude choices (e.g. instead of presenting all tabs as choices, presenting all tabs except the application tab as choices), but certain configurations might allow you to exclude a particular audio source (e.g. sharing a browser window and including audio sources from all tabs except the application tab) and then the browser should be allowed to implement something usable without being mandated to. There's always an easy way to satisfy this constraint: don't present problematic configurations as choices.

If we don't have a constraint like this, we might implicitly be forcing the browser to have a really good technical solution to the problem, which we want to avoid, or leaving it to the user to figure out whether or not a choice is valid for their application's use case.
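
The "exclude choices" approach could be modeled roughly as follows. This is only an illustration of the idea, not how any browser implements its picker: the `AudioChoice` type, the `includesApplicationAudio` flag, and `filterChoices` are all hypothetical names.

```typescript
// Hypothetical model of the audio options a picker could offer.
type SourceKind = "tab" | "window" | "system" | "none";

interface AudioChoice {
  kind: SourceKind;
  label: string;
  // True when audio captured from this source would contain the
  // requesting application's own output.
  includesApplicationAudio: boolean;
}

// When the constraint is set, drop every choice that would feed the
// application's audio back into the shared stream. Without the constraint,
// all choices stay available.
function filterChoices(
  choices: AudioChoice[],
  excludeApplicationAudio: boolean,
): AudioChoice[] {
  if (!excludeApplicationAudio) {
    return choices.slice();
  }
  return choices.filter((c) => !c.includesApplicationAudio);
}

// Example picker contents (made-up labels for illustration).
const offered: AudioChoice[] = [
  { kind: "tab", label: "Conference app (this tab)", includesApplicationAudio: true },
  { kind: "tab", label: "Video clip tab", includesApplicationAudio: false },
  { kind: "system", label: "System audio", includesApplicationAudio: true },
  { kind: "none", label: "No audio", includesApplicationAudio: false },
];

// With the constraint set, only "Video clip tab" and "No audio" remain.
const safeChoices = filterChoices(offered, true);
```

System audio is dropped here because it necessarily contains the application tab's output, matching option 1 of the proposal; "No audio" always survives the filter, which is why the constraint can always be satisfied.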

@suhasHere
Contributor

I tend to agree with @martinthomson on leaving it to the UA to make the best decision.
I do feel sharing a media clip is a valid use case in many scenarios, and when that happens the presenter typically doesn't fully talk over it while he/she is sharing the media clip.
I also wonder if we should start assuming that echo cancellation is not that bad either.

Thoughts ?

@henbos
Contributor Author

henbos commented Dec 6, 2018

If the echo cancellation is good enough, this constraint is not necessary, and it would probably be a bad idea to add it. Then we'd require implementations to be really good at it if they want to support audio.

More information is needed. But if it's not good enough then...

The UA can't make a "best decision" without knowing what the application use case is. If the application doesn't specify, how would the UA know that origin audio is problematic? I think the UA would end up being overly defensive and exclude origin audio even when it would not be problematic, i.e. not being able to share the desktop with audio even in cases where the platform/browser supports it and the application could not be affected by echo. That's a shame for people who want to be able to do that.

Alternatively, let the user choose and the user has nobody to blame but themselves if they make a bad choice. Perhaps this is valid.

With regard to "not talking": I'm not a fan of "hey can everyone please be quiet, I'm about to hold a presentation". No more question-and-answer sections, since those would involve someone talking. Or we are only interested in solving this use case for applications that implement manual mute/unmute buttons, meaning the presenter loses the ability to hear anyone not physically present while presenting. This sounds like a workaround rather than a solution.

@henbos henbos closed this as completed in #94 Feb 1, 2019
@danbriggs5

@henbos Just wanted to thank you for pushing this forward. As an application developer, I'd love to have this, and I know our users would as well. It's nasty trying to explain the echo situation to users and help them avoid it. Especially since we can't detect if they selected a window, application, or tab. Looking forward to seeing browsers implement this.


4 participants