Time domain RMS #407
You may consider a
@stefan-balke Haha, yeah... 🐹 I'll squash soon. Sorry for the notifications.
Do these give the same (numerical) results?
RMSE also means "root mean square energy".
They give similar results. Example: https://gist.github.com/carlthome/048942b1369c374508f56b0d567abe2f
Of course, but not numerically equivalent: they'll differ due to windowing. (They'd be the same if stft is called with This means that we're breaking backwards compatibility here, and more generally, breaking the usual librosa convention for the I agree that using an stft is overkill for rmse calculation, so having an option for calculation in the time domain makes sense. So, the question for us: how much do we care about preserving backwards compatibility and time/frequency equivalence for rmse? If we do adopt the proposed modification, it will need to be documented. |
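To illustrate the windowing point above, here is a minimal sketch in plain Python (no librosa). It assumes a periodic Hann window purely for illustration; the helper names `hann` and `frame_rms` are hypothetical and are not part of librosa's API. It computes the RMS of one frame with and without a window applied.

```python
import math


def hann(n):
    # Periodic Hann window of length n (an assumption for this sketch).
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def frame_rms(frame, window=None):
    # Root-mean-square of one frame, optionally after applying a window.
    if window is None:
        window = [1.0] * len(frame)  # rectangular window: samples unchanged
    energy = sum((w * x) ** 2 for w, x in zip(window, frame))
    return math.sqrt(energy / len(frame))


frame = [1.0] * 2048                    # constant test signal
rms_rect = frame_rms(frame)             # un-windowed, time-domain RMS
rms_hann = frame_rms(frame, hann(len(frame)))  # RMS after Hann windowing

print(rms_rect, rms_hann)
```

For a constant signal, the windowed value is scaled by the RMS of the window itself, which is why a windowed STFT and an un-windowed time-domain calculation give similar but not identical numbers.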
The default could be the time-frequency method even when passed an audio series, and a bool would have to be set for the time domain method to be used. Then backwards compatibility is not affected.
Sure. We could also just tell people to use their own TF representation if that's what they want, and make no promises about it being equivalent to the un-windowed time domain implementation. There's a kind of precedent for this with melspectrogram, which uses squared energy rather than energy, so you only get equivalent results if called with I guess this is all to say: I'm okay with changing the default behavior to break backwards compatibility, and keeping the api simple. If we document it properly, there shouldn't be any major headaches.
@bmcfee, how do you like this version with
@carlthome I think I prefer breaking backwards compatibility in this case. The additional parameter is redundant with whether the user supplied S or not; I think we can assume that a user would not supply S and expect a time-domain calculation. As long as we include the example with
Hmm, even with a constant window function of 1.0, the RMS output from magnitudes or raw samples are not precisely equal. Any insight, @bmcfee?
Are they off by sqrt(2pi) from FFT normalization?
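One way to sanity-check the normalization question on a single frame is Parseval's relation. The sketch below assumes the unnormalized forward DFT convention (the one `numpy.fft` uses) and a full two-sided spectrum; the helpers `dft`, `rms_time`, and `rms_from_spectrum` are hypothetical names for illustration. Under that convention the scaling factor is the frame length N, not sqrt(2pi).

```python
import cmath
import math


def dft(frame):
    # Naive unnormalized forward DFT, O(N^2); fine for a small demo frame.
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def rms_time(frame):
    # RMS computed directly from the samples.
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def rms_from_spectrum(spectrum):
    # Parseval: sum(x^2) == sum(|X|^2) / N, so RMS == sqrt(sum(|X|^2)) / N.
    n = len(spectrum)
    return math.sqrt(sum(abs(c) ** 2 for c in spectrum)) / n


frame = [0.5, -1.0, 0.25, 2.0, -0.75, 1.5, 0.0, -0.25]
print(rms_time(frame), rms_from_spectrum(dft(frame)))
```

Note that this uses all N bins. A one-sided spectrum of 1 + N/2 bins, as returned for real input by STFT implementations such as librosa's, would need the bins other than DC and Nyquist counted twice for the identity to hold.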
And for reference, I'm trimming the input signal to be divisible by the frame length before doing anything else, e.g.:
    frame_length = 2048
    y = util.fix_length(y, y.size - y.size % frame_length)
    assert y.size % frame_length == 0
Aha! They won't be the same. stft first pads the signal and then center-aligns the frames. If you frame the signal directly, you lose padding and centering. I'm okay with the difference here. If you call
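The effect of padding and centering on frame alignment can be sketched without librosa. The example below assumes the signal is padded by `frame_length // 2` samples on each side before framing; the function names are hypothetical. It lists where each frame starts, in original-signal coordinates, for direct framing versus centered framing.

```python
def frame_starts_direct(n_samples, frame_length, hop_length):
    # Frames taken straight from the signal: the first frame starts at sample 0.
    return list(range(0, n_samples - frame_length + 1, hop_length))


def frame_starts_centered(n_samples, frame_length, hop_length):
    # Pad frame_length // 2 on both sides (assumed), then frame the padded signal.
    pad = frame_length // 2
    padded_length = n_samples + 2 * pad
    # Subtract the padding to express frame starts in original-signal coordinates.
    return [s - pad for s in range(0, padded_length - frame_length + 1, hop_length)]


n_samples, frame_length, hop_length = 8192, 2048, 512
direct = frame_starts_direct(n_samples, frame_length, hop_length)
centered = frame_starts_centered(n_samples, frame_length, hop_length)

print(len(direct), direct[0])      # 13 frames, first frame starts at sample 0
print(len(centered), centered[0])  # 17 frames, first frame starts at sample -1024
```

With centering, the first frame is centered on sample 0 and half of it consists of padding, so the two framings cover different samples and produce different frame counts even when the window is constant.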
@bmcfee, cool, stuff looks similar enough: https://gist.github.com/carlthome/048942b1369c374508f56b0d567abe2f
Looks great! One more nitpick in the tests (style), but otherwise I think this can merge. Thanks for doing this!
tests/test_features.py, line 342: Please simplify this test to not use maps or lambdas. In this case, since the test signals are known to be strictly positive, I see no harm in just doing
Merged. Thanks again!
If a precomputed spectrogram is not given, calculate rmse() in the time domain (to avoid doing a costly STFT).
(Side note: could rmse() be renamed to rms()? RMSE implies root-mean-square error and it's confusing.)