This repository has been archived by the owner on Mar 22, 2022. It is now read-only.

Remote audio feature #99

Merged 39 commits from user/stephenatwork/remote-audio-92 into master on Apr 24, 2020

Conversation

@stephenatwork (Member) commented on Oct 14, 2019

Remote audio can now be redirected away from the default audio device.

RemoteAudioSource on Unity now plays audio using the OnAudioFilterRead API.

This PR also adds a new high-level API which manages buffering and provides a way to consume the audio as a stream, resampling and/or changing the number of channels on the fly.

Open issues:

  • Add proper interface to enable/disable individual tracks
  • Refactor AudioTrackReadBuffer to buffer an individual track, and tidy up interface according to comments.
  • Tidy up Unity integration.

@eanders-ms (Contributor) commented

Did you consider using OnAudioFilterRead? It allows you to write the streaming audio data (after decoding it to WAVE_FORMAT_IEEE_FLOAT) directly to an AudioSource, bypassing the need for an AudioClip.

@djee-ms (Member) commented on Oct 14, 2019

> Did you consider using OnAudioFilterRead? It allows you to write the streaming audio data (after decoding it to WAVE_FORMAT_IEEE_FLOAT) directly to an AudioSource, bypassing the need for an AudioClip.

I saw that technique yesterday in that Gamasutra article and was wondering about the difference with the streaming AudioClip. @eanders-ms do you have any experience with this? How do the two compare? It sounds (pun intended) like OnAudioFilterRead might be better for performance and/or latency, no? I assume it also avoids the issue with Unity object access when changing channel count and sample rate, at the expense of re-sampling, since you seem to be limited to the internal sampling rate of the Unity DSP (but I am pretty sure Unity does that resampling on the AudioSource anyway).

Ping @stephenatwork, what do you think? The docs explicitly mention the technique for procedural audio:

> If this is the first filter in the chain and a clip isn't attached to the audio source, this filter will be played as the audio source. In this way you can use the filter as the audio clip, procedurally generating audio.
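For reference, a minimal sketch of the procedural-audio technique the docs describe, assuming a plain Unity component (the class and field names are illustrative, not from this PR):

using UnityEngine;

// Minimal sketch, not from this PR: with no AudioClip attached, the first
// OnAudioFilterRead in the chain acts as the audio source itself, so writing
// samples here produces procedural audio.
[RequireComponent(typeof(AudioSource))]
public class ProceduralTone : MonoBehaviour
{
    private double _phase;
    private int _sampleRate;

    void Start()
    {
        _sampleRate = AudioSettings.outputSampleRate;
        GetComponent<AudioSource>().Play(); // no clip attached, on purpose
    }

    // Runs on the audio thread; 'data' is interleaved WAVE_FORMAT_IEEE_FLOAT.
    void OnAudioFilterRead(float[] data, int channels)
    {
        const double freq = 440.0; // A4 test tone
        for (int i = 0; i < data.Length; i += channels)
        {
            float sample = (float)System.Math.Sin(_phase) * 0.1f;
            for (int c = 0; c < channels; c++)
                data[i + c] = sample; // duplicate across channels
            _phase += 2.0 * System.Math.PI * freq / _sampleRate;
        }
    }
}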

@eanders-ms (Contributor) commented

I was hoping to sign up for this work but @stephenatwork beat me to it :)

> @eanders-ms do you have any experience with this? How do the two compare?

My prior experience is one-sided in that I've only used the OnAudioFilterRead approach. On a previous project I incorporated a streaming mp3 player into Unity. It would read output from ffmpeg and pipe it to an AudioSource via OnAudioFilterRead. There was some timing code to ensure the video stream stayed in sync with the audio. I think the same approach is applicable here, given the similarities.

That said, if the solution in this PR works and it's a good Unity integration, then fantastic.

@stephenatwork (Member Author) commented

> Did you consider using OnAudioFilterRead? It allows you to write the streaming audio data (after decoding it to WAVE_FORMAT_IEEE_FLOAT) directly to an AudioSource, bypassing the need for an AudioClip.

Thanks for the tip. That looks like a much better choice than the one I found. I did a few local tests and it seems to have very little delay (which is an issue in the AudioClip version). It shouldn't be much work to port the implementation, except to add the resampling.

@stephenatwork (Member Author) commented

I've updated this PR to use the OnAudioFilterRead API.

Also, managing buffers and resampling in C# seems inefficient, so this PR adds a new high-level API which manages buffering and provides a way to consume the audio as a stream, resampling and/or changing the number of channels on the fly.
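As a rough illustration of how such a stream could be consumed from Unity's audio callback (the Read signature and the _audioReadStream field are assumptions, not the exact API added here):

// Assumed shape: Read() fills 'data', resampling and remixing to the
// requested rate/channel count on the fly. '_audioReadStream' and the
// Read signature are illustrative, not the PR's exact API.
private int _outputSampleRate;

void OnAudioConfigurationChanged(bool deviceWasChanged)
{
    // Cache the DSP rate outside the audio thread.
    _outputSampleRate = AudioSettings.outputSampleRate;
}

void OnAudioFilterRead(float[] data, int channels)
{
    _audioReadStream?.Read(_outputSampleRate, channels, data);
}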

@djee-ms (Member) left a comment:

Agreed in principle on the design. The C# interface is a nice touch. Needs some clean-up and finishing the implementation, obviously. Thanks!

libs/Microsoft.MixedReality.WebRTC.Native/src/api.cpp (outdated review threads, resolved)
libs/Microsoft.MixedReality.WebRTC/PeerConnection.cs (outdated review thread, resolved)
@stephenatwork (Member Author) commented

All the scaffolding is in place; we just need to choose and wire up a resampler. Luckily, webrtc seems to include at least two, which I'll need to evaluate:
webrtc\xplatform\opus\silk
webrtc\xplatform\webrtc\common_audio\resampler\include
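For context, any of these options boils down to mapping output sample positions back into the input signal; a naive linear-interpolation sketch (illustration only, not a candidate implementation):

// Naive linear-interpolation resampler (mono), for illustration only; the
// real choice will be one of the WebRTC resamplers listed above.
static float[] ResampleLinear(float[] input, int srcRate, int dstRate)
{
    int outLen = (int)((long)input.Length * dstRate / srcRate);
    var output = new float[outLen];
    double step = (double)srcRate / dstRate; // input samples per output sample
    for (int i = 0; i < outLen; i++)
    {
        double srcPos = i * step;
        int i0 = (int)srcPos;
        int i1 = System.Math.Min(i0 + 1, input.Length - 1);
        double frac = srcPos - i0;
        // Blend the two nearest input samples.
        output[i] = (float)(input[i0] * (1.0 - frac) + input[i1] * frac);
    }
    return output;
}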

@stephenatwork (Member Author) commented

The current branch supports (only) spatial audio. Objects with a RemoteAudioSource component can have an AudioSource attached. Remote audio tracks are not piped to the output device. When there is no remote audio data, either because of an underrun or because we are not yet connected, the RemoteAudioSource outputs a low buzz for debugging.
TODO: remove the dial tone; allow per-track routing of output to speakers or an AudioSource.

@@ -450,6 +453,11 @@ MRS_API void MRS_CALL mrsPeerConnectionRegisterRemoteAudioFrameCallback(
PeerConnectionAudioFrameCallback callback,
void* user_data) noexcept;

// Experimental. Render or not remote audio tracks on the audio device.
Member:

nit: Explain about remote audio callbacks? It looks like a "mute" function with this comment alone.

Member:

Fixed

@@ -622,6 +630,21 @@ MRS_API mrsResult MRS_CALL mrsPeerConnectionRemoveLocalVideoTrack(
MRS_API void MRS_CALL mrsPeerConnectionRemoveLocalAudioTrack(
PeerConnectionHandle peerHandle) noexcept;

MRS_API mrsResult MRS_CALL
mrsAudioReadStreamCreate(PeerConnectionHandle peerHandle,
int bufferMs,
Member:

Can we document those parameters, especially bufferMs? And for mrsAudioReadStreamRead() below too.

Member Author:

I'm not sure bufferMs should even be in the API. (I added it originally because I needed to test different values).

While on the topic of buffering, it's worth mentioning an issue that occurs if there's a hiccup (delay) in Unity. There is no logic for 'catching up', so in the worst case, even with a great connection, the buffer can end up completely full and all it's doing is adding latency. Half a second by default!

Member:

> There is no logic for 'catching up'

This can be solved in AudioReadStream::Read by taking the newest n frames that fill the requested buffer, rather than taking the oldest ones, right? It seems that should be easy to do - or am I missing some fundamental issue?

Member Author:

I think the issue is that if you always take the latest, then you run the risk of dropouts, which is what the buffering is trying to avoid! So the idea is that you're willing to trade N ms of delay for a reduced probability of dropouts.

You can do this by skipping some part of the buffer, but it's likely (?) that this creates pops/artifacts. Maybe there's something better? Perhaps temporarily speed up the audio by messing with the sample rates, or something more complex which doesn't change the pitch.

Member:

Yes, you're right (also, if frames are pushed more frequently than they are pulled, that won't work at all, but that shouldn't be our case).

Are we sure this is a problem though? I would expect OnAudioFilterRead to request more data after a hiccup to get up to speed.

In any case, I have looked at how WebRTC handles audio packet buffering; doing it properly (and generically) is, unsurprisingly, complicated (see modules/audio_coding/net_eq/net_eq.cc). They do have an Accelerate class that we might use directly (modules/audio_coding/net_eq/accelerate.h), though. That looks like some work, so I'd be for logging this separately (if it is indeed an issue that can happen).

Member Author:

I didn't look into the pipeline of buffers, but I would expect OnAudioFilterRead to be much more dumb/realtime and always ask for the same number of samples. I think it's only a short hop from there to handing off to the output device.

Absolutely, it's complex and can be deferred. Perhaps there is some simple adaptive scheme we can use to choose the buffer size based on network conditions. Or perhaps webrtc is already doing all the smart stuff and we can reduce bufferMs to 20 or 30.

Member:

I'll mark the API as experimental then and log an issue to investigate this more.

Member:

> I didn't look into the pipeline of buffers, but I would expect OnAudioFilterRead to be much more dumb/realtime and always ask for the same number of samples.

That seems what's happening actually 😞
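To make the trade-off concrete, here is a rough sketch of the buffering policy under discussion (drop the oldest samples on overrun, pad on underrun). This is illustrative rather than the actual AudioTrackReadBuffer, and a real implementation would also need locking between the WebRTC delivery thread and the Unity audio thread:

// Illustrative only, not the PR's AudioTrackReadBuffer: a queue-based buffer
// that drops the oldest samples on overrun (bounding the added latency) and
// pads on underrun so the audio callback never starves.
class SimpleAudioBuffer
{
    private readonly System.Collections.Generic.Queue<float> _samples
        = new System.Collections.Generic.Queue<float>();
    private readonly int _capacity; // e.g. bufferMs worth of samples

    public SimpleAudioBuffer(int capacity) { _capacity = capacity; }

    public void Write(float[] frame) // called when a frame arrives from webrtc
    {
        foreach (var s in frame)
        {
            if (_samples.Count >= _capacity)
                _samples.Dequeue(); // overrun: drop oldest, keep latency bounded
            _samples.Enqueue(s);
        }
    }

    public void Read(float[] data) // called from OnAudioFilterRead
    {
        for (int i = 0; i < data.Length; i++)
            data[i] = _samples.Count > 0 ? _samples.Dequeue() : 0f; // underrun: silence
    }
}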


/// Fill data with samples at the given sampleRate and number of channels.
/// If the internal buffer overruns, the oldest data will be dropped.
/// If the internal buffer is exhausted, the data is padded with white noise.
Member:

Implementation pads with sine, not white noise.

Member:

From #99 (comment) this should be debugging only.

@stephenatwork any reason to pad with noise rather than simply silence here?

Member Author:

It was @djee-ms's suggestion to help debugging, since there are many states which can lead to silence. It was moderately useful for experimentally choosing a buffer size. I suppose it may be more useful to have the underrun/overrun status available programmatically.

Member:

We can have Read return the actual consumed samples then.

Also I'd say we can make sine padding opt-in (easier than leaving it to higher layers, and not too cumbersome).

Member:

I'd rather have it opt-out. Better to have strong indications that something is wrong and let the user ignore them (opt out and get silence instead) than to hide a potential issue from the user without them knowing.
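A sketch of what the opt-out could look like; the PadWithDebugTone flag and FillUnderrun helper are hypothetical names, not from this PR:

// Hypothetical opt-out knob: underruns are audible by default (a strong
// indication that something is wrong), and users can opt out to get silence.
public bool PadWithDebugTone = true; // hypothetical name

private double _padPhase;
private const double PadFreq = 440.0;

private void FillUnderrun(float[] data, int start, int sampleRate)
{
    for (int i = start; i < data.Length; i++)
    {
        data[i] = PadWithDebugTone
            ? (float)System.Math.Sin(_padPhase) * 0.1f // audible debug tone
            : 0f;                                      // opted out: silence
        _padPhase += 2.0 * System.Math.PI * PadFreq / sampleRate;
    }
}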

libs/Microsoft.MixedReality.WebRTC/IAudioReadStream.cs (outdated review thread, resolved)
libs/Microsoft.MixedReality.WebRTC/PeerConnection.cs (outdated review thread, resolved)
}

/// <summary>
/// High level interface for consuming WebRTC audio streams.
Member:

This doesn't say much about why one would use that and what features it enables. Can we add an example of feature to show why/how to use an audio read stream?

Comment on lines 69 to 75
    if (!IsPlaying)
    {
        IsPlaying = true;
        //PeerConnection.Peer.RemoteAudioFrameReady += RemoteAudioFrameReady;
        OnAudioConfigurationChanged(deviceWasChanged: false);
        _audioTrackReadBuffer = PeerConnection.Peer.CreateAudioTrackReadBuffer();
    }
}
Member:

@stephenatwork @djee-ms it feels weird that we can control play state on the linked AudioSource and on this object independently. Looks like an easy source of mistakes. Also we need to decide a sane behavior for when the AudioSource plays but this doesn't (at the moment you get beeping).

I propose to

  1. create/destroy the read buffer automatically when a track is added/removed
  2. return empty frames in OnAudioFilterRead if there is no track
  3. remove Play/Stop/IsPlaying from this class, or make them a shortcut for _linkedAudioSource.(Play|Stop|IsPlaying)

Thoughts?
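A rough sketch of what proposals 2 and 3 could look like on the component (the _linkedAudioSource, _audioTrackReadBuffer, and _sampleRate members are assumed names, not the PR's exact code):

// Sketch of proposals 2-3 above; member names are assumptions.
public bool IsPlaying => _linkedAudioSource.isPlaying; // 3: shortcut only
public void Play() => _linkedAudioSource.Play();
public void Stop() => _linkedAudioSource.Stop();

void OnAudioFilterRead(float[] data, int channels)
{
    if (_audioTrackReadBuffer == null) // 2: no track attached yet
    {
        System.Array.Clear(data, 0, data.Length); // silence instead of beeping
        return;
    }
    _audioTrackReadBuffer.Read(_sampleRate, channels, data);
}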

Member Author:

Sounds good to me. I suppose that there is a small overhead even when paused via the Unity AudioSource, because IIRC the buffering/transcoding still happens internally even if we're not consuming the data via OnAudioFilterRead. Not sure if it's worth doing anything about that.

Member:

At the moment we transcode only on read so we don't have this issue*. For the sake of correctness though I suppose we can override OnEnable/OnDisable to turn the buffer on/off.

*Maybe we shouldn't wait for a Read() call (I am not sure if the processing adds any meaningful latency) but we can deal with this later.

Member:

Yes, we should continue to transcode on read, because this means transcoding inside the audio thread, which is the right place to do so, instead of some other thread like the WebRTC signaling thread or the main Unity app thread, which are busy doing other stuff. And yes, we should return silence when there's no track, to be consistent with the WebRTC behavior of sending silence when there's no track on a transceiver. I also agree on removing the playing state from here and using only the audio source's, to avoid confusion.

Member Author:

Filippo - when I say "on read" I mean OnAudioFilterRead. Do you mean something else? IIRC we're transcoding when the frame arrives from webrtc (addFrame), not when it is requested from OnAudioFilterRead (that part is a memcpy). The transcoding is the extra work I mean if the Unity source is paused but data continues to arrive from webrtc.

Remote audio interface has been refactored so that it compiles, but still needs to be ported to the multi-track world.

Part of RemoteAudioSource has been moved to AudioReceiver, but the class is not
functional yet.
…remote-audio-92

CustomAudioMixer moved to peer_connection.h due to include order issues.
@fibann force-pushed the user/stephenatwork/remote-audio-92 branch from 52c6ea0 to b20e16d on April 23, 2020 18:18
Will be refactored later.
@fibann requested a review from djee-ms on April 24, 2020 10:53
@djee-ms (Member) left a comment:

Taken offline: Let's merge as is and improve from there.

@fibann merged commit 48b9429 into master on Apr 24, 2020
@fibann deleted the user/stephenatwork/remote-audio-92 branch on April 24, 2020 10:59
mr-webrtc-buildbot added a commit that referenced this pull request Apr 24, 2020
…oft/user/stephenatwork/remote-audio-92)
}

mrsResult MRS_CALL
mrsAudioTrackReadBufferRead(AudioTrackReadBufferHandle readStream,
Member Author:

nit: readStream -> readBuffer or readHandle

fibann added a commit to fibann/MixedReality-WebRTC that referenced this pull request May 19, 2020
The API has been refactored to create a buffer from a track (rather than the PeerConnection) and to address comments on microsoft#99.
Labels: enhancement (New feature or request)
5 participants