Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix echo cancellation bug + add command line option to dump AudioInput streams #4167

Merged
merged 2 commits into from May 26, 2020

Conversation

fedetft
Copy link
Contributor

@fedetft fedetft commented May 12, 2020

Dear mumble developers,
I think I found a bug in the echo canceller.

Basically, the way the current mumble code works is to put speaker readback samples in a queue (qlEchoFrames) in the addEcho() callback, and then in the addMic() callback take both the microphone and speaker samples and pass them to the echo canceller.

However,
https://www.speex.org/docs/manual/speex-manual/node7.html#SECTION00740000000000000000
explicitly says that "It is important that, at any time, any echo that is present in the input has already been sent to the echo canceller as echo_frame."
and adding a queue to the speaker samples makes them arrive after the microphone ones.

As a result, the echo canceller is only effective against periodic signals, but not for voice.

To verify that, the first commit of this pull request,
fedetft@b3aa5de
adds a command line option to tap and synchronously dump pcm streams of the raw microphone, speaker readback, and processed microphone.

The result is shown in this image: https://imgur.com/a/FpeB6Mp

The top figure shows the original mumble code with only the profiling patch applied.
As can be seen, the echo canceller receives the speaker readback with a high delay, I experienced up to 300ms, and is thus effective only against periodic sounds, not voice audio.

The bottom figure shows the effect of this pull request, a 20ms lead is forced by delaying the audio path, so that the echo canceller is reasonably certain to receive the data in the correct order even when callbacks are jittery.

The patch has been tested with the mixed echo cancellation on Linux with PulseAudio.

The patch fixes the issue but some more work needs to be done, in particular either the multichannel echo cancellation is broken and passes garbage data to the echo canceller, or I didn't understand how the PCM streams are passed. In any case, it does not seem to cancel echo, so there appears to be an issue, although I haven't looked it up yet.

Moreover, I could not understand if the addMic() and addEcho() callbacks can be called concurrently by multiple threads, and thus if a mutex is requred in the Resynchronizer class I added.

@fedetft
Copy link
Contributor Author

fedetft commented May 13, 2020

Closed and reopened as the CI was failing due to network errors, but it appears every time there's at least one target that has network issues.

Don't know what's wrong with translations.

src/mumble/main.cpp Show resolved Hide resolved
@Krzmbrzl
Copy link
Member

Closed and reopened as the CI was failing due to network errors, but it appears every time there's at least one target that has network issues.

The issue is actually a different one and closing and reopening doesn't do anything (I think). Anyways I restarted the CI.
As for the translations: See my review :)

@streaps
Copy link

streaps commented May 13, 2020

The patch fixes the issue but some more work needs to be done, in particular either the multichannel echo cancellation is broken and passes garbage data to the echo canceller, or I didn't understand how the PCM streams are passed. In any case, it does not seem to cancel echo, so there appears to be an issue, although I haven't looked it up yet.

Last time I asked, how multichannel echo cancellation works and what is it good for, nobody could explain it. Does anybody care about that (broken) feature or could it be removed?

The OpusFAQ recommends the Google AEC from WebRTC.
https://wiki.xiph.org/index.php?title=OpusFAQ&mobileaction=toggle_view_desktop#Does_Opus_have_an_echo_canceller_like_Speex_does.3F

@fedetft
Copy link
Contributor Author

fedetft commented May 13, 2020

My branch is called webrtc because the original intention was to replace mumble's echo cancellation algorithm, which to me at least performed poorly, with webrtc's Google AEC.

However, when trying to understand the code I found this bug, and after fixing it, it looks like libspeexdsp's echo canceller isn't bad at all, it was just not operating due to a precondition violation.

For what concerns the multichannel echo canceller and the possibility to remove it, I see an UX/UI issue, and a code issue.

The UX issue is that it is not clear at all to the end user the difference between mixed and multichannel. I only understood it after reading the C++ code...
If you want to keep it, it would be good to add some text like
"Mixed echo cancellation mixes all speaker outputs in one mono stream and passes that stream to the echo canceller, while multichannel echo cancellation passes all audio channels to the echo canceller directly.
Multichannel echo cancellation requires more CPU, so you should try mixed first"

The code issue is that it looks like the multichannel pcm data is somehow corrupted. Since the echo readback is never heard by anyone and only passed to the echo canceller, it was difficult to spot this bug, until I added the --dump-input-streams command line option.

@streaps
Copy link

streaps commented May 13, 2020

thanks for the explanation.

Copy link
Member

@Krzmbrzl Krzmbrzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the actual review, we'll have to wait for @davidebeatrici as I don't really know the audio system...

src/mumble/AudioInput.cpp Outdated Show resolved Hide resolved
@fedetft
Copy link
Contributor Author

fedetft commented May 15, 2020

Update: yesterday I started having a look at why the multichannel echo canceller receives garbage data. I found a small issue but the signal didn't improve at all.
Should I add the commits related to the multichannel echo canceller to this PR?

@Krzmbrzl
Copy link
Member

I think it'd be a good idea, yes. This facilitates the work of a potential future contributor that wants to look into this issue :)

@fedetft
Copy link
Contributor Author

fedetft commented May 15, 2020

I think that to make it easy for future contributors to perform regression tests I would need to write somewhere how to use the --dump-input-streams option to view in Audacity the various signals being passed to AudioInput's dsp algorithms. I don't know if a pull request is the best place for that, though.

@Krzmbrzl
Copy link
Member

Krzmbrzl commented May 15, 2020

You can create a new markdown document n docs for that purpose ☝️
EDIT: And maybe reference it in a comment somewhere in the related code sections :)

@fedetft
Copy link
Contributor Author

fedetft commented May 15, 2020

Found it! here's the bug that was corrupting the pcm data for the multichannel echo cancellation!
The addEcho function is called with smaller chunks of data, and accumulates them until it fills an entire 10ms of data, then the buffer is passed to the signal processing chain.

Foe every chunk the multichannel code path always started form the beginning of the buffer, thus not accumulating anything. This overwrote data and left the end of the buffer uninitialized too. No wonder I was hearing garbage when playing those data...

Now multichannel echo works.

@fedetft
Copy link
Contributor Author

fedetft commented May 16, 2020

Added documentation in the docs directory. Should help make sure issues in the echo canceller don't get reintroduced.

Copy link
Member

@Krzmbrzl Krzmbrzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also document the different functions and member fields in the header files using Doxygen /// comments, please?
I know that this is not done in the existing code but it's an ongoing quest of ours to improve the state of the in-source documentation :)

And if you're done with the PR, please squash the commits. I think here it'd be a good idea to make these 3 commits:

  • Addition of the new cli options + documentation
  • Fix of mixed channel echo cancellation
  • Fix of multichannel echo cancellation

We also have a practice of prefixing our commit messages with the changed path, e.g.

src/mumble/AudioInput: <your message>

You can have a look at other commits in this repo to see what I mean by that as well :)

src/mumble/AudioInput.h Outdated Show resolved Hide resolved
private:
void printQueue(char who);

//TODO: there was a mutex (qmEcho), but can the callbacks be called concurrently?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. @davidebeatrici might now better though :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know they are never called concurrently.

@fedetft
Copy link
Contributor Author

fedetft commented May 18, 2020

Had to make four commits as the documentation covers also part of the other commits.
For me this PR is complete.

@Krzmbrzl
Copy link
Member

Alright thank you!
I'll try to have a look at this in the coming days :)

@Krzmbrzl
Copy link
Member

What's the best approach to verifying that the echo canceller works as expected? Can I even do this locally on a single machine? 🤔

@fedetft
Copy link
Contributor Author

fedetft commented May 20, 2020

Good question: I've pushed another commit: it only touches the docs and adds a step-by-step guide to reproducibly verify the correct operation of the echo canceller. I think you may find it useful also in the future to avoid regressions.

By doing the test before and after the patches that fix the echo, you can appreciate the effect of the changes.

@Krzmbrzl
Copy link
Member

Krzmbrzl commented May 20, 2020

Hm okay I tried it out using the built-in microphone of my laptop and its speakers but it doesn't seem to do all that much. I clearly see that something was done (especially noise reduction) but I still hear most parts of the YouTube video.
I guess though that this might be because the built-in microphone I was using is really shit. Probably the noise was so intense that the echo canceller was confused by it. I'll try again with my headset.

Could someone else please also try this out and report back here?

EDIT: The results with my normal headset are pretty much the same. Significant parts of from the YT video are still hearable...

@TerryGeng
Copy link
Contributor

TerryGeng commented May 20, 2020

Hi! I'm amazed by your work and it also solved my problem that why echo cancellation isn't working for us. I'm not actually a mumble developer, but I have just read your code and have a little question.

I have seen your finite state design that is used to keep the speaker audio chunk ahead of mic chunk. But then I read these lines:

if(drop == false)
{
result = AudioChunk(micQueue.front(), speaker);
micQueue.pop_front();
}

It seems that you have assumed speaker audio chunk comes later than the mic chunk, so you wait until mic chunk piles up and then return the "now" speaker chunk and "past" mic chunk, which is just the opposite of requiring speaker chunk comes earlier than mic chunk.

I'm not sure what mistake I have made in understand your code and I'd appreciate for your reply!

@fedetft
Copy link
Contributor Author

fedetft commented May 25, 2020

@davidebeatrici
The use of alloca() is to minimize the number of dynamic memory allocations in code that is called 200 times a second. I've tried to allocate on the stack all the buffers that don't have to outlive the stack frame they're in, which is true for anything but the echo cancel queue.
This way you get better performance and you also can't forget a delete and leak memory.
Some of those can be replaced with just an array declaration on the stack, but in some cases the array size is a variable, and I was getting CI errors with MSVC as it doesn't support C99 style variable length arrays.

In the long run, it would be best to just wrap the buffers in an object that manages its lifetime, but such a change would be too invasive and is unrelated to fixing the echo issue. Moreover, if there are rumors about rewriting everything, would a separate PR that refactors the code even be useful or just a waste of time?

About the mutex, I can remove it if you're sure it's not needed, but I can't test on any platform other than linux/pulseaudio.

@Krzmbrzl
Copy link
Member

if there are rumors about rewriting everything, would a separate PR that refactors the code even be useful or just a waste of time?

I think for now you should leave it until we have decided on what to do about the rewrite/refactor

About the mutex, I can remove it if you're sure it's not needed, but I can't test on any platform other than linux/pulseaudio.

Does it hurt to have it there? If not I think I'd play it safe and just leave it as it is 🤔

@fedetft
Copy link
Contributor Author

fedetft commented May 25, 2020

Leaving a potentially unnecessary mutex is a missed performance optimization.
There was a mutex also before, so this code is no slower than the previous one.

@Krzmbrzl
Copy link
Member

Yeah but if it turns out that there is some concurrency in there after all (well hidden in layers of cryptic code) we'll have a problem xD

But as I don't know the audio code, I'll leave the decision about that up to you and @davidebeatrici :)

@fedetft fedetft mentioned this pull request May 25, 2020
@fedetft
Copy link
Contributor Author

fedetft commented May 25, 2020

I've removed it locally and will test it this evening on Linux.
If @davidebeatrici is sure to remove it and it doesn't segfault on my machine, I'll push the change as a separate commit (so as to keep the previous version in the history).

Copy link
Member

@Krzmbrzl Krzmbrzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last thing: I think the last commit should be documentation only. The code style fixes should be squashed into the other commits ☝️

After that though I'd say we can merge this :)

@fedetft
Copy link
Contributor Author

fedetft commented May 26, 2020

I don't know in git an automated way of doing that squash without going through line by line in the previous commits to change the parentheses and spaces, which given the scale of the changes requested is not a productive use of my time.

Please consider setting up an automated code formatting tool in the future. Personally I'm used to a completely different code style and this part of the review process is becoming a barrier to future contributions.

@Krzmbrzl
Copy link
Member

You can simply reset all commits and add the respective files and commit them in the chunks you want. That way you don't have to do line by line edits.

@fedetft
Copy link
Contributor Author

fedetft commented May 26, 2020

Ok,
the final version of main.cpp uses a variable that has been introduced only in the first echo cancellation fix, so it can't stay in its own commit without creating a commit that won't compile, and the two echo cancellation fixes touch the same file, so that means the only option is to make a single commit with all the code, and one with the documentation.
Apparently the issue is just that I value the git history much more than you guys, but if that's how you want your commits, that's fine for me too.

@Krzmbrzl Krzmbrzl merged commit 12ce17e into mumble-voip:master May 26, 2020
@Krzmbrzl
Copy link
Member

Thank you very much for your contribution! Much appreciated :)

Krzmbrzl pushed a commit to Krzmbrzl/mumble that referenced this pull request May 26, 2020
…nput audio data and echo canceller queue state

(Backported from mumble-voip#4167 and adapted to work with 1.3.x)
Krzmbrzl pushed a commit to Krzmbrzl/mumble that referenced this pull request May 26, 2020
…nput audio data and echo canceller queue state

(Backported from mumble-voip#4167 and adapted to work with 1.3.x)
Krzmbrzl pushed a commit to Krzmbrzl/mumble that referenced this pull request May 27, 2020
…nput audio data and echo canceller queue state

(Backported from mumble-voip#4167 and adapted to work with 1.3.x)
Krzmbrzl pushed a commit to Krzmbrzl/mumble that referenced this pull request May 27, 2020
…nput audio data and echo canceller queue state

(Backported from mumble-voip#4167 and adapted to work with 1.3.x)
Krzmbrzl pushed a commit to Krzmbrzl/mumble that referenced this pull request May 27, 2020
…nput audio data and echo canceller queue state

(Backported from mumble-voip#4167 and adapted to work with 1.3.x)
Krzmbrzl pushed a commit to Krzmbrzl/mumble that referenced this pull request May 28, 2020
…nput audio data and echo canceller queue state

(Backported from mumble-voip#4167 and adapted to work with 1.3.x)
Krzmbrzl pushed a commit to Krzmbrzl/mumble that referenced this pull request May 28, 2020
…nput audio data and echo canceller queue state

(Backported from mumble-voip#4167 and adapted to work with 1.3.x)
Krzmbrzl pushed a commit to Krzmbrzl/mumble that referenced this pull request May 28, 2020
…nput audio data and echo canceller queue state

(Backported from mumble-voip#4167 and adapted to work with 1.3.x)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants