
Add stat for inputAudioLevel, before the audio filter #271

Closed
huibk opened this issue Nov 2, 2017 · 17 comments

Comments

@huibk

huibk commented Nov 2, 2017

It could be useful to have a stat that represents the audio level before noise filtering or gain control. Use cases are:

  • distinguish silence caused by missing microphone input from silence caused by noise suppression
  • determine how much processing is being applied to the audio signal
@alvestrand
Contributor

Doodling: This would logically be a stat on the source for a track, not a stat on the track itself, no?

if you have:

getUserMedia(audio, id=foo, volume=1.0) => track1
getUserMedia(audio, id=foo, volume=0.5) => track2

and get an input signal at 0.5 (-6dBov)

then track2 would get 0.25 (-12dBov) and track1 would get 0.5 as "level"; an input stat would get 0.5 for both.

One way of getting "unprocessed volume" would be

getUserMedia(audio, id=foo) => track1
getUserMedia(audio, id=foo, processing=none) => track2

track2 should then get the number you want.
Don't know if that works now.
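
For reference, a sketch of what the "processing=none" request could look like with today's constraint names (echoCancellation, noiseSuppression and autoGainControl are standard mediacapture-main constraints); whether a browser actually honours differing processing on two tracks from the same source is exactly the open question above:

```js
// Hypothetical sketch: request one processed and one unprocessed track
// from the same device. Whether implementations allow differing
// processing on tracks sharing a source is what is in doubt here.
async function getProcessedAndRawTracks(deviceId) {
  const processed = await navigator.mediaDevices.getUserMedia({
    audio: { deviceId: { exact: deviceId } }  // default processing on
  });
  const raw = await navigator.mediaDevices.getUserMedia({
    audio: {
      deviceId: { exact: deviceId },
      echoCancellation: false,
      noiseSuppression: false,
      autoGainControl: false
    }
  });
  return {
    track1: processed.getAudioTracks()[0],  // processed "level"
    track2: raw.getAudioTracks()[0]         // hopefully the unprocessed level
  };
}
```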

@huibk
Author

huibk commented Nov 3, 2017

Good point; getting a second track without processing may achieve the same result, presumably at a higher cost though.
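
(For illustration, a rough sketch of the extra cost in question: the application would have to measure the second, unprocessed track itself, e.g. with Web Audio, rather than reading a stat. `rawTrack` here is a hypothetical unprocessed track obtained as sketched above.)

```js
// Sketch: compute an RMS level for an unprocessed track via Web Audio.
// This needs a second capture plus an AudioContext, which is the extra cost.
function pollRawLevel(rawTrack, onLevel) {
  const ctx = new AudioContext();
  const source = ctx.createMediaStreamSource(new MediaStream([rawTrack]));
  const analyser = ctx.createAnalyser();
  source.connect(analyser);
  const samples = new Float32Array(analyser.fftSize);
  setInterval(() => {
    analyser.getFloatTimeDomainData(samples);
    const rms = Math.sqrt(
      samples.reduce((sum, s) => sum + s * s, 0) / samples.length
    );
    onLevel(rms);  // 0.0..1.0, before noise suppression / gain control
  }, 100);
  return ctx;  // caller should close() it when done
}
```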

@vr000m
Contributor

vr000m commented Jan 10, 2018

from #288:
My expectation was that input audio level and output level would match. And if someone did not hear anything then the volume stat being 0 would identify the problem.

I am now trying to see: if we compare the input audio level with the audio output level, and the audio output takes the volume into account, how do I diagnose whether the issue is with the post-decoding filter...?

@alvestrand
Contributor

An implementation experiment showed that on current Chrome, you can't turn echo cancellation on for one track and off for another track from the same source, which limits the usefulness of my other idea: https://crbug.com/802198
Firefox at least reports that echo cancellation has been turned off, but shows no discernible volume difference in simple tests. webrtc/samples#993 is the test page.

@na-g

na-g commented Feb 14, 2018

Would RTCRtp{Contributing,Synchronization}Source (from webrtc-pc) be an appropriate place for this? It seems like this is data that one might want to poll frequently.
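
For context, that webrtc-pc API looks roughly like the sketch below; it is synchronous and suited to frequent polling, but it reports levels on the receive side (carried in RTP), not the local pre-filter level this issue asks about:

```js
// Sketch: poll per-SSRC audio levels on the receive side (webrtc-pc).
// Note this is remote/receive-side data, not the local pre-processing level.
function logReceivedAudioLevels(pc) {
  for (const receiver of pc.getReceivers()) {
    if (receiver.track.kind !== 'audio') continue;
    for (const source of receiver.getSynchronizationSources()) {
      // audioLevel is a linear value in 0.0..1.0.
      console.log(`ssrc ${source.source}: level ${source.audioLevel}`);
    }
  }
}
```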

@na-g

na-g commented Feb 14, 2018

It took me too long to realize this was about local media. Still, a better API than getStats (one that is synchronous and fast) may be possible for audio data.

@alvestrand
Contributor

It turns out present Chrome doesn't support having two tracks from the same source with differing processing requirements. So it makes sense, sort of.
@huibk is this still a burning desire?

@alvestrand
Contributor

Thought: one possibility is to define a "source" stats object, referenced from the "track" stats object. That would be a place to hang both the input audio level and the input frame width and height (which have been requested in other contexts; see the non-standard googFrameWidthInput stat).
@henbos @burnburn what do you think?

@alvestrand
Contributor

Note: no matter what, it should report "accumulated energy", not "instant volume", for all the reasons given.
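
To make the "accumulated energy" point concrete: given a totalAudioEnergy / totalSamplesDuration pair (as later webrtc-stats drafts expose; the exact stat names are an assumption relative to this thread), the average level over a polling interval can be derived from two samples, so nothing that happens between polls is missed:

```js
// Sketch: derive the average audio level between two getStats() polls from
// accumulated energy, instead of reading an instantaneous "audioLevel".
// Assumes a stats entry exposing totalAudioEnergy and totalSamplesDuration.
function averageLevel(prev, curr) {
  const energyDelta = curr.totalAudioEnergy - prev.totalAudioEnergy;
  const durationDelta = curr.totalSamplesDuration - prev.totalSamplesDuration;
  if (durationDelta <= 0) return 0;
  // totalAudioEnergy accumulates level^2 * duration, so the RMS level over
  // the interval is sqrt(deltaEnergy / deltaDuration).
  return Math.sqrt(energyDelta / durationDelta);
}
```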

@vr000m
Contributor

vr000m commented May 16, 2018

👍 If we can do video input as well, it would be worthwhile to consider doing.

@henbos
Collaborator

henbos commented Jun 21, 2018

1. Source: Camera resolution
->
2. Constraints: E.g. downscaled video
->
Entering the WebRTC pipeline: the track is attached to a sender.
3. Sender knows input resolution (the per-constraint downscaled video).
->
The encoder is not exposed, but the sender's encoder encodes the video.
4. Sender knows output resolution (encoder might decide to downscale even more).
->
Sender creates RTP packets
->
IceTransport
->
Receiver gets RTP packets
->
Jitter buffer, concealment, whatever happens to prepare the stuff for the decoder.
->
The decoder is not exposed, but the receiver's decoder decodes the video.
5. Receiver knows the resolution of the final track.
->
6. Possible post-processing. I don't know if this happens, but it's conceivable that the WebRTC implementation decides that, if it's audio, "this is just silence", and mutes it, OR this could have been part of the decoding step.
Exiting the WebRTC pipeline.
->
7. The application might do additional processing through canvas etc, but now we have left "WebRTC land".
->
Render on screen.

Resolution may change at any of 1-7; we can only provide getStats() for what is in the "WebRTC pipeline", i.e. 3-6.

Our current stats are for 3 and for 6 (or, if we don't do anything at 6, then the stats are for 5).
We could choose to expose more of these, but we cannot expose stuff that happens outside of the "WebRTC pipeline" without getStats() or an equivalent on non-WebRTC primitives or on getUserMedia objects like MediaStreamTrack (note that the WebRTC getStats() for track is not actually MediaStreamTrack stats but is based on sender/receiver stats).

Am I missing anything?
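
As a sketch of what exposing steps 3 and 4 could look like to an application, using the media-source / outbound-rtp stat names that later webrtc-stats drafts settled on (treat the exact names as assumptions in the context of this thread):

```js
// Sketch: compare pre-encoder (step 3) and encoded (step 4) resolution.
// Stat types and fields follow later webrtc-stats drafts, not this thread.
async function logSenderResolutions(pc) {
  const report = await pc.getStats();
  report.forEach(stats => {
    if (stats.type === 'media-source' && stats.kind === 'video') {
      console.log(`source (step 3): ${stats.width}x${stats.height}`);
    }
    if (stats.type === 'outbound-rtp' && stats.kind === 'video') {
      console.log(`encoded (step 4): ${stats.frameWidth}x${stats.frameHeight}`);
    }
  });
}
```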

@henbos
Collaborator

henbos commented Jun 21, 2018

2 and 3 are the same resolution

@henbos
Collaborator

henbos commented Jun 21, 2018

I think some of this issue stems from wanting stats before/after the application does additional processing, but that is outside the scope of the WebRTC pipeline.

@henbos
Collaborator

henbos commented Jun 21, 2018

@huibk / @vr000m Can you pinpoint which of the steps described above you want metrics for? (I mostly used resolution as the example, but the same applies to audioLevel.)

@vr000m
Contributor

vr000m commented Jun 21, 2018

My original concern was to have input and output metrics for all the components that transform media in some way.

For example, components of the media pipeline that downscale/upscale video or suppress/conceal audio.

@henbos
Collaborator

henbos commented Jun 21, 2018

The sender comprises multiple components, and it might be worth splitting the dictionary up to reflect that: a "sender"/"receiver" dictionary and an "encoder"/"decoder" dictionary. Conceptually: https://photos.app.goo.gl/fyqGKxYMhM247dK47

The encoder should have input/output stats (whether resolution, audio energy, etc.). Or we put them in the "sender" stats for now, but are clear about what is input and what is output; none of the current names make it clear which step in the pipeline they reflect.

But let's make sure this is something we want before we change it. It could be that what is being asked for is metrics from before or after the WebRTC pipeline, which might be outside the scope of this spec.

@henbos
Collaborator

henbos commented Jun 21, 2018

Based on discussion with @huibk and @vr000m:

This has to do with MediaStreamTrack and getUserMedia()/devices: stats for before and after processing/applying constraints. That is outside of WebRTC (and could be of interest to non-WebRTC applications) and outside this spec.

Closing this bug. Feel free to file a bug on https://github.com/w3c/mediacapture-main/issues.
