Echo cancellation: Need to specify the source of the echo cancellation reference signal #31

Open
alvestrand opened this issue Aug 30, 2021 · 12 comments · May be fixed by #32

@alvestrand
Contributor

alvestrand commented Aug 30, 2021

At the moment, echo cancellation is defined on a MediaStreamTrack, but the source of the signal to be cancelled is not specified, leaving this up to the implementation.

Since most echo comes from a specific output device feeding into an input device, it makes sense to specify that a given input (made visible as a MediaStreamTrack) is echo-cancelled against the output that the application thinks will most affect it. Most of the time this will be the system default output device, but sometimes (as with headphones on the non-default device with mechanical-path echo), the right device is something else.

This seems to be addressable with a means of specifying which output device the input is to be echo-cancelled against; the most logical source of such identifiers is the output device ID from [[mediacapture-output]].

Tagging @o1ka
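
A minimal TypeScript sketch of what such a mechanism could look like, assuming a hypothetical `echoCancellationReferenceDeviceId` constraint. The constraint name and the `buildEchoConstraints` helper are invented for illustration and are not taken from any specification; only the `audiooutput` kind and the `deviceId` field mirror the existing `MediaDeviceInfo` shape.

```typescript
// Subset of the fields exposed by MediaDeviceInfo (from enumerateDevices()).
interface DeviceLike {
  kind: "audioinput" | "audiooutput" | "videoinput";
  deviceId: string;
  label: string;
}

// HYPOTHETICAL: `echoCancellationReferenceDeviceId` is an invented name used
// only to illustrate the proposal; it is not a real constraint.
interface HypotheticalAudioConstraints {
  echoCancellation: boolean;
  echoCancellationReferenceDeviceId?: string;
}

// Build audio constraints that name the output device the application
// plays to, if (and only if) that device is a known audio output.
function buildEchoConstraints(
  devices: DeviceLike[],
  playbackDeviceId: string | undefined,
): HypotheticalAudioConstraints {
  const constraints: HypotheticalAudioConstraints = { echoCancellation: true };
  const match = devices.find(
    (d) => d.kind === "audiooutput" && d.deviceId === playbackDeviceId,
  );
  if (match) {
    constraints.echoCancellationReferenceDeviceId = match.deviceId;
  }
  // With no match, the choice of reference stays with the implementation.
  return constraints;
}
```

In this sketch, an application that routes the call to a headset would pass the same device ID it gave to its output element, and an unknown or missing ID would fall back to whatever the implementation picks.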

@jan-ivar
Member

> sometimes (as with headphones on the non-default device with mechanical-path echo), the right device is something else.

Does it matter which device is outputting the echo? The sound being captured is the same. What information is the user agent missing to cancel echo coming from a headset?

@alvestrand
Contributor Author

In the case where you have the conference on headphones, and have a completely different sound playing on the room-directed speakers, and strong coupling between the headphones and the microphone, cancelling against the room sound is useless; that's not the sound that can cause echoes.

@o1ka

o1ka commented Sep 2, 2021

The echo canceller needs to receive a reference output signal (the audio going to the output device) so that it can be cancelled from the microphone signal. The echo canceller does not necessarily have the ability to deal with multiple reference output signals (audio streams going to different output devices): cancelling multiple outputs with different acoustic paths is a much more challenging and resource-demanding problem. Being able to cancel the playback of only one physical output is a common limitation of echo cancellation solutions.

In most cases, the audio is played to a default output device. But the application may choose to output audio to a non-default device. If echo cancellation is limited to cancelling playback of a single output device (a common situation), the application needs to be able to specify the output device whose playback it wants to cancel from the microphone (i.e. the device the application plays audio to).

@o1ka

o1ka commented Sep 4, 2021

There may be a source of confusion here: echo cancellation does not happen on the side which experiences (plays) the echo. It happens on the side which generates the echo. The echo is a playback signal picked up by the microphone and eventually transmitted to the remote end, so that the remote end hears it. The echo canceller processes the microphone signal to remove the playback signal from it before sending it over the network. To do so, the echo canceller needs to know what is being played out (the reference signal). The rest - see above.
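
To make the role of the reference signal concrete, here is a minimal TypeScript sketch of a normalized LMS (NLMS) adaptive filter, a textbook technique for this kind of cancellation. It is an illustration only and not a description of what any browser ships; the function name `nlmsEchoCancel` and the `taps` and `stepSize` defaults are assumptions chosen for the example.

```typescript
// NLMS adaptive filter: learns how the reference (playback) signal shows up
// in the microphone signal and subtracts that estimate, sample by sample.
// Textbook sketch; production echo cancellers are considerably more complex.
function nlmsEchoCancel(
  mic: number[],
  reference: number[],
  taps = 8,
  stepSize = 0.5,
): number[] {
  const weights = new Array<number>(taps).fill(0); // estimated echo path
  const history = new Array<number>(taps).fill(0); // recent reference samples, newest first
  const output: number[] = [];
  const epsilon = 1e-8; // avoids division by zero during silence

  for (let n = 0; n < mic.length; n++) {
    // Shift the newest reference sample into the history window.
    history.pop();
    history.unshift(reference[n] ?? 0);

    // Predict the echo from the reference and measure the reference energy.
    let echoEstimate = 0;
    let energy = epsilon;
    for (let k = 0; k < taps; k++) {
      echoEstimate += weights[k] * history[k];
      energy += history[k] * history[k];
    }

    // What is left after subtraction is sent to the remote end.
    const residual = mic[n] - echoEstimate;
    output.push(residual);

    // Normalized gradient step towards a better echo-path estimate.
    for (let k = 0; k < taps; k++) {
      weights[k] += (stepSize / energy) * residual * history[k];
    }
  }
  return output;
}
```

The filter can only remove what is correlated with the reference it is given. If `reference` holds the audio of a different output device than the one the microphone actually picks up, the echo passes through largely untouched, which is why the choice of reference matters.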

@padenot

padenot commented Sep 21, 2021

> In most cases, the audio is played to a default output device. But the application may choose to output audio to a non-default device. If echo cancellation is limited to cancelling playback of a single output device (a common situation), the application needs to be able to specify the output device whose playback it wants to cancel from the microphone (i.e. the device the application plays audio to).

The implementation knows the device to which the important output is routed for a given application, including after processing/mixing via Web Audio or via multiple HTMLMediaElement in a row, and this should be used as the reverse stream.

Although not perfect, it is a very robust technique in practice, and will work better in the field than something application developers have to do. This has been implemented for years in Gecko.

@o1ka

o1ka commented Nov 29, 2021

> The implementation knows the device to which the important output is routed for a given application, including after processing/mixing via Web Audio or via multiple HTMLMediaElement in a row, and this should be used as the reverse stream.
>
> Although not perfect, it is a very robust technique in practice, and will work better in the field than something application developers have to do. This has been implemented for years in Gecko.

The main problem is what to consider the important output if an application chooses to play different elements to different output devices (simultaneously, or at different points in time).
Also, configuring echo cancellation is not "for free". Depending on the order in which capture and playback are set up by an application, the implementation may have to decide what the important output is and reconfigure echo cancellation multiple times.

For example, an application may call getUserMedia(), then set up WebAudio playback, and then set up media element playback to a non-default device.
Should it be left to the implementation to decide which output will be cancelled? In that case, an application would have zero control over echo cancellation.
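
The order dependence can be sketched with a small TypeScript simulation. The heuristic used here ("cancel against the most recently started output") is purely hypothetical and is not claimed to be what Gecko, Chrome, or any other implementation does; `AppEvent` and `simulateReference` are names invented for this example.

```typescript
// Events an application might trigger, in the order it triggers them.
type AppEvent =
  | { type: "startCapture" }
  | { type: "startPlayback"; outputDeviceId: string };

// HYPOTHETICAL heuristic: while capturing, use the most recently started
// output as the echo cancellation reference (default device if none yet).
function simulateReference(
  events: AppEvent[],
  defaultDeviceId = "default",
): { reference: string | null; reconfigurations: number } {
  let capturing = false;
  let lastOutput = defaultDeviceId;
  let reference: string | null = null;
  let reconfigurations = 0;

  for (const ev of events) {
    if (ev.type === "startCapture") {
      capturing = true;
    } else {
      lastOutput = ev.outputDeviceId;
    }
    // The canceller only exists while capture is running.
    if (capturing && reference !== lastOutput) {
      // The first assignment is initial setup; later ones are reconfigurations.
      if (reference !== null) reconfigurations++;
      reference = lastOutput;
    }
  }
  return { reference, reconfigurations };
}
```

Feeding it the sequence from the example above (capture, playback on the default device, then playback on a non-default device) forces a reconfiguration, while starting the same two playbacks in the opposite order before capture ends with a different reference device. The application has no say in either outcome.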

@alvestrand
Contributor Author

@padenot can you reply to the Nov 29 comment?

@alvestrand
Contributor Author

@padenot after more discussion inside Google, we've decided to follow the "make a heuristic" course at least for the current iteration.
I wonder if I can ask you to explain more about how the Gecko heuristic works - in particular how you tell what the important output is?

@alvestrand
Contributor Author

Of course, the moment it was shipped, someone came up with an example:

https://crbug.com/687574

@zoschfrosch

Hi, I was asked to describe my use case in this thread because I made a feature request for a flag to programmatically switch off the new chrome-wide-echo-cancellation: https://bugs.chromium.org/p/chromium/issues/detail?id=1372451

My use case is a platform for vocal coaches and their pupils. They are connected by a WebRTC video session. The coach - let's say he's sitting at the far end of the connection - is able to remotely start and stop the playback of audio sources like MIDI singing lessons or songs. The pupil sings at the near end of the connection. The lesson audio is played through loudspeakers, and all sound is picked up by a microphone and sent back to the vocal coach.

With the new chrome-wide-echo-cancellation, all audio produced inside the Chrome browser is filtered out. The coach does not hear the playback, and even worse, the voice of the singer is completely mangled.

If the chrome://flags/#chrome-wide-echo-cancellation flag were removed in a later Chrome version, my platform would be absolutely useless.

I guess that the ability to specify the source of the echo cancellation reference signal would have the same effect as switching off chrome-wide-echo-cancellation.

@Annoraaq

The situation described by @zoschfrosch does seem like a valid use case to me. Are there any updates on this?

@alvestrand
Contributor Author

@padenot missing your response on the request from October 2022.
