Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT(client): Positional interaural delay #5094

Merged
merged 1 commit into from
Jun 14, 2021

Conversation

Epicalert
Copy link
Contributor

@Epicalert Epicalert commented Jun 3, 2021

Adds a slight 0.6 millisecond delay between ears depending on where the sound source is coming from.

There is a small time delay (interaural time delay or ITD) between your
ears depending on the sound source position on the horizontal plane and
the distance between your ears. This commit will add this delay by using
the extra sound data in the audio buffer.

Fixes #2324

Checks

Copy link
Member

@Krzmbrzl Krzmbrzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things that come to my mind here:

  1. There should be more comments in the code that explain what interaural delay is about and what the current code part is doing to achieve this goal
  2. From a first glance it seems that the delay is now always accounted for, even if there is no positional data or positional audio is disabled. Is this indeed the case?

src/mumble/Audio.h Outdated Show resolved Hide resolved
src/mumble/AudioOutput.cpp Outdated Show resolved Hide resolved
src/mumble/AudioOutputUser.h Outdated Show resolved Hide resolved
src/mumble/AudioOutput.cpp Outdated Show resolved Hide resolved
src/mumble/AudioOutput.cpp Outdated Show resolved Hide resolved
src/mumble/AudioOutputSample.cpp Show resolved Hide resolved
src/mumble/AudioOutputUser.cpp Outdated Show resolved Hide resolved
@Krzmbrzl Krzmbrzl added client feature-request This issue or PR deals with a new feature positional audio labels Jun 3, 2021
@Kissaki
Copy link
Member

Kissaki commented Jun 3, 2021

I have added the description from the first commit to the MR description to provide some more context and descriptiveness.

How did you arrive at exactly 0.6 seconds? I would prefer something based on actual studies/audio reference over using just any random value (which may fit less for some people than others).

@Epicalert
Copy link
Contributor Author

I have added the description from the first commit to the MR description to provide some more context and descriptiveness.

How did you arrive at exactly 0.6 seconds? I would prefer something based on actual studies/audio reference over using just any random value (which may fit less for some people than others).

This article claims the average distance between ears is 20 cm. Assuming the speed of sound is 343 m/s, the ITD would be 0.58 ms if the distance between ears is 20 cm. This Wikipedia article claims the average distance is 15.2 cm for men (0.44 ms) and 14.4 cm for women (0.42 ms), but it does not cite a source. The width of this binaural microphone is 19 cm (0.55 ms).

It would also be nice if we had a slider in the settings menu that let the user set their head width.

@Krzmbrzl
Copy link
Member

Krzmbrzl commented Jun 3, 2021

Please don't create merge commits in your branch. If you want to update against another branch, please do a rebase instead :)

@vimpostor
Copy link
Contributor

vimpostor commented Jun 3, 2021

I think this is very interesting, thanks for this PR!

I have attached a direct comparision between the old and new: The sentence "This is an example of speex, an audio compression codec specifically tuned for the reproduction of human speech" is first heard once with the old behaviour, then after that twice with the new behaviour, i.e. with interaural delay.

mumble_positional.zip
To be honest, I don't hear much of a difference :D (Maybe the rotation sounds a bit smoother, but could be just Placebo)

average distance is 15.2 cm for men (0.44 ms) and 14.4 cm for women (0.42 ms)

I think we should then reduce the delay from 0.6 to something like 0.43.

@Epicalert
Copy link
Contributor Author

Please don't create merge commits in your branch. If you want to update against another branch, please do a rebase instead :)

Sorry, should I remove the merge commit or is it too late now?

@Epicalert
Copy link
Contributor Author

I think this is very interesting, thanks for this PR!

I have attached a direct comparision between the old and new: The sentence "This is an example of speex, an audio compression codec specifically tuned for the reproduction of human speech" is first heard once with the old behaviour, then after that twice with the new behaviour, i.e. with interaural delay.

mumble_positional.zip
To be honest, I don't hear much of a difference :D (Maybe the rotation sounds a bit smoother, but could be just Placebo)

average distance is 15.2 cm for men (0.44 ms) and 14.4 cm for women (0.42 ms)

I think we should then reduce the delay from 0.6 to something like 0.43.

I tried 0.43 and it seems like it works better. What do you think?

itd-test-0.43ms.zip

@Krzmbrzl
Copy link
Member

Krzmbrzl commented Jun 3, 2021

Sorry, should I remove the merge commit or is it too late now?

Yes please. We don't have issues with force-pushing branches in PRs. In fact it is requires in the and to straighten up the history before merging :)

@vimpostor
Copy link
Contributor

I tried 0.43 and it seems like it works better. What do you think?

itd-test-0.43ms.zip

Thanks! But just to be sure, your attached audio file is 2x the new behaviour, right? Because I literally do not hear any difference between the first and the second iteration.

@Epicalert
Copy link
Contributor Author

But just to be sure, your attached audio file is 2x the new behaviour, right? Because I literally do not hear any difference between the first and the second iteration.

Yep! The whole file is the new 0.43 ms ITD.

@Kissaki
Copy link
Member

Kissaki commented Jun 4, 2021

I think a positionally static sound source would be better to hear the difference. With how dynamic and changing the source is in the circling around audio it’s pretty hard to make out.

Practically PA sound will come mostly from one/similar side too, rather than actually circling around you.

@Epicalert
Copy link
Contributor Author

I think a positionally static sound source would be better to hear the difference. With how dynamic and changing the source is in the circling around audio it’s pretty hard to make out.

Practically PA sound will come mostly from one/similar side too, rather than actually circling around you.

First sentence: With ITD (e335b23)
Second sentence: No ITD (4fab188)

The sound source is about 50 degrees to the left in both samples.

comparison-static.zip

@vimpostor
Copy link
Contributor

vimpostor commented Jun 4, 2021

Yes, I think one can hear the difference clearly now. I did some testing right now too, and here is an example where I crossfade both versions mid-sentence and the difference is even more obvious now.

old-new-old.zip

The sentence is said twice.

  • The first time it starts with the old behaviour and then switches to the new interaural version on the word "compression", the rest of the sentence is the new version.
  • The second time it starts with the new interaural version and switches to the old version on the word "compression".

Not only is the difference clearly audible, I think with this direct comparision I can say that I prefer the new version. I think one can really hear the direction more clearly, but it is very subtle.

@davidebeatrici
Copy link
Member

davidebeatrici commented Jun 8, 2021

@vimpostor Thank you for providing a clear demonstration, I can definitely hear the difference.

I also prefer the new version.

@Krzmbrzl
Copy link
Member

Krzmbrzl commented Jun 8, 2021

I finally came to actually listen to the provided demo and wow! The difference is very subtle and yet it is much more pleasant to listen to the new version. It definitely sounds more natural 👍

Copy link
Member

@Krzmbrzl Krzmbrzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good to me. These review comments are mainly concerned about comment / readability improvements.

Note also that before this will get merged I would kindly ask you to squash all your commits in this PR into a single FEAT one (the first one).

src/mumble/Audio.h Outdated Show resolved Hide resolved
src/mumble/AudioOutput.cpp Outdated Show resolved Hide resolved
src/mumble/AudioOutput.cpp Outdated Show resolved Hide resolved
src/mumble/AudioOutput.cpp Outdated Show resolved Hide resolved
src/mumble/AudioOutput.cpp Outdated Show resolved Hide resolved
src/mumble/AudioOutput.cpp Outdated Show resolved Hide resolved
src/mumble/AudioOutputUser.h Outdated Show resolved Hide resolved
src/mumble/Audio.h Outdated Show resolved Hide resolved
@Epicalert Epicalert requested a review from Krzmbrzl June 11, 2021 06:57
src/mumble/Audio.h Outdated Show resolved Hide resolved
src/mumble/AudioOutput.cpp Outdated Show resolved Hide resolved
There is a small time delay (interaural time delay or ITD) between your
ears depending on the sound source position on the horizontal plane and
the distance between your ears. This commit will add this delay by using
the extra sound data in the audio buffer.

For me at least, implementing this makes it much easier to identify
where a sound is coming from when positional audio is enabled.

Implements mumble-voip#2324
@Krzmbrzl Krzmbrzl merged commit 20cd335 into mumble-voip:master Jun 14, 2021
@Krzmbrzl
Copy link
Member

Thank you very much for implementing this 👍

@vimpostor
Copy link
Contributor

Will this make it into the 1.4 release?
Would be cool to have this great feature in it, but I do understand if you don't want to make any exemptions from "no more features after the 1.4 release candidate".

@Krzmbrzl
Copy link
Member

No 1.4 is feature-frozen. But we intend to release 1.5 approximately half a year after 1.4 is released, so you shouldn't have to wait too long :)

@Krzmbrzl
Copy link
Member

Krzmbrzl commented Oct 6, 2023

Note: This very likely caused #6149

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client feature-request This issue or PR deals with a new feature positional audio
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Binaural audio on positional audio
5 participants