RTCMediaStreamTrackStats.audioLevel clarification #193
Comments
The intent wasn't to claim that it had to produce this result by remapping (remapping through a 7-bit value loses precision when we don't have to). It was intended to say that this definition is the same as the definition of the input value to the RFC appendix.
The referenced algorithm takes `length` samples in the range 0..MAX_VALUE (127 for one byte per sample), normalizes them to 0..1, and computes the RMS (root mean square), also in 0..1. It then calculates db = 20 * log10(rms), clamps it to [-127, 0], and rounds it to an integer. So... our audioLevel is the RMS value? I understand not wanting to use the raw sample value, 0..MAX_VALUE, so as to be agnostic about the number of bytes per sample, but why do we pick an audioLevel value that is the RMS value? Why not the dB value? We don't have to round it to an integer if we're afraid of losing precision.
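The computation described above (normalize, RMS, convert to dB, clamp, round) can be sketched as follows. This is a hypothetical helper for illustration, not spec text; the function name and `max_value` parameter are assumptions, and the constants mirror the description in the comment (MAX_VALUE = 127 for one byte per sample, clamp range [-127, 0]).

```python
import math

def audio_level_dbov(samples, max_value=127):
    """Sketch of the RFC-appendix-style level computation described above.

    Normalizes samples to 0..1, takes the RMS, converts to dB
    (20 * log10(rms)), clamps to [-127, 0], and rounds to an integer.
    """
    normalized = [abs(s) / max_value for s in samples]
    rms = math.sqrt(sum(x * x for x in normalized) / len(normalized))
    if rms == 0:
        return -127  # silence: dB is -infinity, which clamps to the floor
    db = 20 * math.log10(rms)
    return round(max(-127.0, min(0.0, db)))
```

Under this sketch, audioLevel as currently defined would be the intermediate `rms` value (0..1), while the dB alternative discussed above would be `db` before rounding.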
Do you know the answer to my comment @taylor-b ?
I've wondered myself for a long time why it's not in dBov... Following the git history, I found it was defined this way to match "volume" in mediacapture-main... Which was introduced in a PR by Cullen... Which they decided to merge in this meeting: https://www.w3.org/2014/10/30-mediacap-minutes.html ... where it turns out it was just an arbitrary choice. Anyway, I think the definition in mediacapture-main makes more sense; this one confused me for a while the first time I read it.
Some other things that aren't clear to me about
@hlundin can you help clarify audioLevel? We're confused. :)
I'd define it as "RMS of audio samples divided by the maximum encodable value". Is there a better term for these units? And/or we can specifically point to the
I found this definition:
Does that make it clear? It says nothing about what chunk of audio we are measuring, and whether to use smoothing or not. I think that smoothing should not be used, but for the duration I have no good answer. |
@hlundin This is the definition we don't think is very clear. It's not obvious to me what "linear" means, or that the "approximately 6" value is the result of
We have totalAudioEnergy defined now, which is computable. Is it possible to define audioLevel in terms of a computation over totalAudioEnergy? I worry about audioLevel since it's inherently a "windowing" operator: instantaneous audioLevel doesn't make much sense at all. We could leave it implementation-defined, but that will give implementation variance. https://webrtc.github.io/samples/src/content/getusermedia/volume/ gives a JS implementation that shows two versions of exponential-decay calculation. "Linear" kind of makes sense if you think that each halving of audio level should represent another 6 dB change (0.25 = -12 dB compared to 0 dBov). There are many other curves that could go through the other two defined points (0 dBov and -6 dBov) if the word "linear" is dropped. Suggestions for clearer description?
Is
Can I reassign to you Harald? |
Let's see if #288 (where I claim that it's an average over an implementation-defined interval) makes sense to you. Agree that current Chrome implementation doesn't. |
Closing. |
This is in reference to the member definition for RTCMediaStreamTrackStats.audioLevel.
Is the second sentence indicating that RTCMediaStreamTrackStats.audioLevel is obtained by remapping the result of the method described in Appendix A of RFC 6465 (from 0..127 to 1.0..0.0)?