Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Time domain RMS #407

Merged
merged 12 commits into from
Sep 21, 2016
Merged

Time domain RMS #407

merged 12 commits into from
Sep 21, 2016

Conversation

carlthome
Copy link
Contributor

@carlthome carlthome commented Sep 6, 2016

If a precomputed spectrogram is not given, calculate rmse() in the time domain (to avoid doing a costly STFT).

(side note: could rmse() be renamed to rms()? RMSE implies root-mean-square error and it's confusing.)


This change is Reviewable

@stefan-balke
Copy link
Member

stefan-balke commented Sep 6, 2016

You may consider a git rebase first. :)

@carlthome
Copy link
Contributor Author

carlthome commented Sep 6, 2016

@stefan-balke Haha, yeah... 🐹 I'll squash soon. Sorry for the notifications.

@bmcfee
Copy link
Member

bmcfee commented Sep 6, 2016

calculate rmse() in the time domain

Do these give the same (numerical) results?

RMSE implies root-mean-square error and it's confusing

RMSE also means "root mean square energy".

@carlthome
Copy link
Contributor Author

They give similar results. Example: https://gist.github.com/carlthome/048942b1369c374508f56b0d567abe2f

@bmcfee bmcfee added enhancement Does this improve existing functionality? discussion Open-ended discussion for developers and users API change Does this change the behavior of existing API? labels Sep 6, 2016
@bmcfee
Copy link
Member

bmcfee commented Sep 6, 2016

They give similar results.

Of course, but not numerically equivalent: they'll differ due to windowing. (They'd be the same if stft is called with window=np.ones).

This means that we're breaking backwards compatibility here, and more generally, breaking the usual librosa convention for the feature module that equivalent results are produced using time-domain or spectrogram input (for appropriately parameterized spectrograms).

I agree that using an stft is overkill for rmse calculation, so having an option for calculation in the time domain makes sense.

So, the question for us: how much do we care about preserving backwards compatibility and time/frequency equivalence for rmse? If we do adopt the proposed modification, it will need to be documented.

@bmcfee bmcfee added this to the 0.5 milestone Sep 6, 2016
@bmcfee bmcfee self-assigned this Sep 6, 2016
@carlthome
Copy link
Contributor Author

carlthome commented Sep 6, 2016

The default could be the time-frequency method even when passed an audio series, and a bool would have to be set for the time domain method to be used. Then backwards compatibility is not affected.

@bmcfee
Copy link
Member

bmcfee commented Sep 8, 2016

The default could be the time-frequency method even when passed an audio series, and a bool would have to be set for the time domain method to be used. Then backwards compatibility is not affected.

Sure. We could also just tell people to use their own TF representation if that's what they want, and make no promises about it being equivalent to the un-windowed time domain implementation. There's a kind of precedent for this with melspectrogram, which uses squared energy rather than energy, so you only get equivalent results if called with melspectrogram(D=np.abs(S**2), ...).

I guess this is all to say: I'm okay with changing the default behavior to break backwards compatibility, and keeping the api simple. If we document it properly, there shouldn't be any major headaches.

@carlthome
Copy link
Contributor Author

@bmcfee, how do you like this version with mode? It's backwards compatible.

@bmcfee
Copy link
Member

bmcfee commented Sep 13, 2016

@carlthome I think I prefer breaking backwards compatibility in this case. The additional parameter is redundant with whether the user supplied S or not; I think we can assume that a user would not supply S and expect a time-domain calculation.

As long as we include the example with window=np.ones in the docstring to demonstrate how to recover the old behavior, that's good enough.

@carlthome
Copy link
Contributor Author

carlthome commented Sep 20, 2016

Hmm, even with a constant window function of 1.0, the RMS output from magnitudes or raw samples are not precisely equal. Any insight, @bmcfee?

@bmcfee
Copy link
Member

bmcfee commented Sep 20, 2016

Hmm, even with a constant window function of 1.0, the RMS output from magnitudes or raw samples are not precisely equal. Any insight, @bmcfee?

Are they off by sqrt(2pi) from FFT normalization?

@carlthome
Copy link
Contributor Author

It's a problem with the framing. Given x, stft(..., n_fft=x) and util.frame(..., frame_length=x) should segment the signal exactly the same, no? However, the resulting RMS envelopes are slightly shifted in time:

image

@carlthome
Copy link
Contributor Author

And for reference, I'm trimming the input signal to be divisible by the frame length before doing anything else, e.g.:

frame_length = 2048
y = util.fix_length(y, y.size - y.size % frame_length)
assert y.size % frame_length == 0

@bmcfee
Copy link
Member

bmcfee commented Sep 20, 2016

It's a problem with the framing. Given x, stft(..., n_fft=x) and util.frame(..., frame_length=x) should segment the signal exactly the same, no?

Aha! They won't be the same. stft first pads the signal and then center-aligns the frames. If you frame the signal directly, you lose padding and centering. I'm okay with the difference here.

If you call stft(..., center=False) then it ought to line up.

@carlthome
Copy link
Contributor Author

@bmcfee, cool, stuff looks similar enough: https://gist.github.com/carlthome/048942b1369c374508f56b0d567abe2f

@bmcfee
Copy link
Member

bmcfee commented Sep 20, 2016

Looks great! One more nitpick in the tests (style), but otherwise I think this can merge. Thanks for doing this!


Reviewed 1 of 2 files at r5.
Review status: 1 of 2 files reviewed at latest revision, 1 unresolved discussion.


tests/test_features.py, line 342 at r5 (raw file):

        rms1, rms2 = map(lambda x: librosa.util.normalize(x, axis=1), [rms1, 
                                                                       rms2])

Please simplify this test to not use maps or lambdas. In this case, since the test signals are known to be strictly positive, I see no harm in just doing rms1 /= rms1.max() and likewise for 2.


Comments from Reviewable

@carlthome
Copy link
Contributor Author

Review status: 1 of 2 files reviewed at latest revision, 1 unresolved discussion.


tests/test_features.py, line 342 at r5 (raw file):

Previously, bmcfee (Brian McFee) wrote…

Please simplify this test to not use maps or lambdas. In this case, since the test signals are known to be strictly positive, I see no harm in just doing rms1 /= rms1.max() and likewise for 2.

Done.

Comments from Reviewable

@bmcfee
Copy link
Member

bmcfee commented Sep 21, 2016

Reviewed 1 of 1 files at r6.
Review status: all files reviewed at latest revision, all discussions resolved.


Comments from Reviewable

@bmcfee bmcfee merged commit 9ca2da3 into librosa:master Sep 21, 2016
@bmcfee
Copy link
Member

bmcfee commented Sep 21, 2016

Merged. Thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API change Does this change the behavior of existing API? discussion Open-ended discussion for developers and users enhancement Does this improve existing functionality?
Development

Successfully merging this pull request may close these issues.

3 participants