Quality issues? #75
Comments
Concerning indeed! I remember doing some tests along these lines when initially developing resampy, and it was never quite as good as libsamplerate because I never bothered with the iterative optimization procedure for producing the window parameters. In principle, we could do that at some point, and should get ~equivalent results. The noise differences are in the -90dB range (from what I can tell), so not audible, but not great to see. I wonder if there are some numerical precision factors at play here? I haven't done a source-dive on libsamplerate, but it wouldn't surprise me if they're operating internally at higher precision than resampy is, and those round-off errors could add up. (Just a guess!) Failing that, it's entirely possible that we have some kind of a bug, but for now I'd chalk it up to less-than-optimal windowing.
Following up on this - I don't think it's actually a numerical precision discrepancy, though I suppose it's possible. A couple of observations here:
Both of these suggest to me that the algorithm itself is implemented properly, and it all comes down to the precomputed window coefficients. Per the docstring in our filters module: Lines 9 to 13 in c32f0bf
So there are three things at play here: the number of zero crossings retained, the beta parameter of the kaiser filter, and the roll-off frequency. There's also the It's not exactly trivial to compare this to the libsamplerate implementation, as it's parametrized a little differently, but we can try. For example, adding (
"resampy_custom",
lambda: resampy.resample(wave1, sr1, sr2, filter='sinc_window', num_zeros=69, precision=15)
), to the benchmark script above gives a decent improvement, mainly by using better filter interpolation. At a glance, this drops the artifacts from -90dB down to around -110dB. I also tried pushing the precision up to 20 (plot not reproduced here). Probably we could tune this better to find a more optimal setting of precision and filter shape. IIRC libsamplerate did this with some kind of automated parameter search. That might be a nice thing to implement here as well, but I think I'm convinced that the observed behavior is not a "bug" per se.
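For a sense of scale, the dB figures quoted in this thread can be converted to linear amplitude ratios. This is only a sketch: it assumes the plotted levels are amplitude dB (the 20·log10 convention) relative to the signal peak, which the thread does not state, and `db_to_amplitude` is a hypothetical helper, not part of resampy.

```python
def db_to_amplitude(level_db):
    """Convert a level in dB to a linear amplitude ratio.

    Assumes the 20*log10 (amplitude) convention; if the plots use
    power dB instead, the divisor would be 10 rather than 20.
    """
    return 10.0 ** (level_db / 20.0)


# Noise floors mentioned in the discussion.
levels_db = [-90, -110, -120]
ratios = {db: db_to_amplitude(db) for db in levels_db}

for db, ratio in ratios.items():
    print(f"{db:5d} dB -> amplitude ratio {ratio:.2e}")

# Going from -90 dB to -110 dB is a 20 dB drop, i.e. a factor of 10 in amplitude.
improvement = ratios[-90] / ratios[-110]
print(f"-90 dB vs -110 dB: artifacts are {improvement:.1f}x smaller in amplitude")
```

Under that assumption, the custom filter settings shrink the artifacts by roughly one order of magnitude in amplitude.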
Following up on this a bit, the libsamplerate Octave code for parameter tuning is here: https://github.com/libsndfile/libsamplerate/tree/master/Octave . If you unpack this a bit, it looks like they're using beta=16.05 for their kaiser-best filter (compared to our 14.76). There's also a bit of over-sampling (fudge factor) in their filter, which is only reported in the docs for the fast filter (not best).
After a bit of a dive into the libsamplerate filters, I think we can get some easy mileage out of changing the balance between the number of zero crossings (64 for kaiser_best) and the precision (number of interpolation points) in our filters. For reference, the libsamplerate high-quality filter has a half-length of 340239, whereas our current "best" filter has about 1/10 as many coefficients: 32769 (64 zeros and precision=9). Adding three bits of precision gets us in the same coefficient ballpark, but we can actually do better by going further with precision and removing some zeros from the filter. Here's a prototype using 40 zeros, precision of 13 bits, beta=12.56ish, rolloff=0.90ish. (Beta and rolloff were optimized using the method noted in #96, with some modifications to come.) I've adjusted the colormap for these plots so that the midpoint at -120dB is black; anything tinted red is louder, blue is quieter. The idea here is that artifacts in blue should be tolerable. TL;DR: we can bring the noise level down to the -120dB range without any appreciable loss in efficiency. It remains to be seen how far we can push the low-quality filter, as the trade-off between length and interpolation might be qualitatively different in that regime.
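The coefficient counts in the comment above can be checked with a few lines of arithmetic. This sketch assumes the interpolation table holds `num_zeros * 2**precision + 1` samples, which is consistent with the 32769 figure quoted for 64 zeros and precision=9; `half_length` is a hypothetical helper written for illustration, not a resampy function.

```python
def half_length(num_zeros, precision):
    """Number of stored filter samples on one side of the sinc table.

    Assumes 2**precision samples per zero crossing plus one endpoint,
    matching the 32769 figure quoted for (64 zeros, precision=9).
    """
    return num_zeros * 2 ** precision + 1


# Half-length quoted in the thread for the libsamplerate high-quality filter.
LIBSAMPLERATE_BEST = 340239

configs = {
    "current best (64 zeros, 9 bits)": (64, 9),
    "three more bits (64 zeros, 12 bits)": (64, 12),
    "prototype (40 zeros, 13 bits)": (40, 13),
}

sizes = {name: half_length(*params) for name, params in configs.items()}

for name, size in sizes.items():
    fraction = size / LIBSAMPLERATE_BEST
    print(f"{name}: {size} samples ({fraction:.2f}x libsamplerate)")
```

On these numbers, the prototype stores about as many coefficients as the libsamplerate filter while spending them on finer interpolation rather than on more zero crossings.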
Closing this one out now - 0.3 release improves things a bit, still not perfect, but within reasonable tolerances.
Out of curiosity I did a test similar to this on the various resampling methods available in librosa. The test resamples an exponentially swept sine from 96kHz down to 44.1kHz. The test signal is 8 seconds long, and at around 7.2 seconds the swept sine passes the Nyquist frequency of the downsampled rate.
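The timing of the Nyquist crossing follows from the sweep parameters. As a rough sketch, assuming the sweep runs from 20 Hz up to 48 kHz (the Nyquist frequency of the 96 kHz source), the crossing lands at about 7.2 seconds. Those two endpoint frequencies are assumptions, since the reproduction code is not shown here, and `crossing_time` is a hypothetical helper.

```python
import math


def crossing_time(f_start, f_end, duration, f_target):
    """Time at which an exponential sweep reaches f_target.

    Models the instantaneous frequency as
        f(t) = f_start * (f_end / f_start) ** (t / duration)
    and solves f(t) = f_target for t.
    """
    return duration * math.log(f_target / f_start) / math.log(f_end / f_start)


SR_IN = 96_000      # source sample rate (from the post)
SR_OUT = 44_100     # target sample rate (from the post)
DURATION = 8.0      # seconds (from the post)
F_START = 20.0      # ASSUMPTION: sweep start frequency in Hz
F_END = SR_IN / 2   # ASSUMPTION: sweep ends at the source Nyquist frequency

t_cross = crossing_time(F_START, F_END, DURATION, SR_OUT / 2)
print(f"Sweep passes {SR_OUT / 2:.0f} Hz at t = {t_cross:.2f} s")
```

Any sweep content after that point lies above the new Nyquist frequency, so it should be removed by the resampler's low-pass filter rather than aliased back into the audible band.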
Considering that the resampy algorithm has a sound theoretical background straight from a JOS publication and is librosa's default, I expected it to be very high quality. However, what I got was somewhat surprising:
full reproduction code
Any ideas why resampy shows such strong distortion?