Quality issues? #75

Closed
bluenote10 opened this issue Nov 2, 2020 · 5 comments
bluenote10 commented Nov 2, 2020

Out of curiosity I did a test similar to this on the various resampling methods available in librosa. The test resamples an exponentially swept sine from 96kHz down to 44.1kHz. The test signal is 8 seconds long, and at around 7.2 seconds the swept sine passes the Nyquist frequency of the downsampled rate.

Considering that the resampy algorithm has a sound theoretical background straight from a JOS (Julius O. Smith) publication and is librosa's default, I expected it to be of very high quality. However, what I got was somewhat surprising:

[Spectrograms of the resampled sweep, one per method:]

  • resampy_best
  • resampy_fast
  • scipy.signal.resample
  • scipy.signal.resample_poly
  • samplerate.converters.resample_sinc_best
  • samplerate.converters.resample_sinc_medium
  • samplerate.converters.resample_sinc_fastest
  • samplerate.converters.resample_linear
  • samplerate.converters.resample_zero_order_hold

Full reproduction code:
import time

import numpy as np
import matplotlib.pyplot as plt

import librosa
import librosa.display


def exp_swept_sine(f1, f2, sr, amp=1.0, t=1.0):
    # Exponential sine sweep from f1 to f2 Hz over t seconds at sample rate sr;
    # the instantaneous frequency is f1 * exp(ts / L).
    num_samples = int(t * sr)
    ts = np.arange(num_samples) / sr
    L = t / np.log(f2 / f1)
    wave = amp * np.sin(2.0 * np.pi * f1 * L * (np.exp(ts / L) - 1))
    return wave


def analyze_and_plot(wave, sr, method_name, runtime):
    hop_length = 256
    S = librosa.stft(wave, hop_length=hop_length)

    fig, ax = plt.subplots(1, 1, figsize=(20, 10))
    img = librosa.display.specshow(
        librosa.amplitude_to_db(np.abs(S), ref=np.max, amin=1e-10, top_db=180.0),
        y_axis='log',
        x_axis='time',
        sr=sr,
        ax=ax,
        hop_length=hop_length,
    )
    plt.colorbar(img, ax=ax)
    fig.suptitle("{} ({:.1f} ms)".format(method_name, runtime * 1000))
    fig.tight_layout()
    fig.savefig("/tmp/{}.png".format(method_name))
    plt.show()


def multi_check():
    sr1 = 96000
    sr2 = 44100
    wave1 = exp_swept_sine(f1=20, f2=sr1/2, sr=sr1, amp=0.5, t=8.0)

    # positional (orig_sr, target_sr) arguments follow the librosa API at the
    # time (0.8); newer librosa versions require keyword arguments here
    methods = [(
        "resampy_best",
        lambda: librosa.resample(wave1, sr1, sr2, res_type="kaiser_best")
    ), (
        "resampy_fast",
        lambda: librosa.resample(wave1, sr1, sr2, res_type="kaiser_fast")
    ), (
        "scipy.signal.resample",
        lambda: librosa.resample(wave1, sr1, sr2, res_type="scipy")
    ), (
        "scipy.signal.resample_poly",
        lambda: librosa.resample(wave1, sr1, sr2, res_type="polyphase")
    ), (
        "samplerate.converters.resample_sinc_best",
        lambda: librosa.resample(wave1, sr1, sr2, res_type="sinc_best")
    ), (
        "samplerate.converters.resample_sinc_medium",
        lambda: librosa.resample(wave1, sr1, sr2, res_type="sinc_medium")
    ), (
        "samplerate.converters.resample_sinc_fastest",
        lambda: librosa.resample(wave1, sr1, sr2, res_type="sinc_fastest")
    ), (
        "samplerate.converters.resample_linear",
        lambda: librosa.resample(wave1, sr1, sr2, res_type="linear")
    ), (
        "samplerate.converters.resample_zero_order_hold",
        lambda: librosa.resample(wave1, sr1, sr2, res_type="zero_order_hold")
    )]

    for method_name, func in methods:
        t1 = time.time()
        wave2 = func()
        t2 = time.time()
        runtime = t2 - t1
        print("{:<50s} runtime: {:.1f} ms".format(method_name, runtime * 1000))
        analyze_and_plot(wave2, sr2, method_name, runtime)


if __name__ == "__main__":
    multi_check()

Any ideas why resampy shows such strong distortion?

bmcfee commented Nov 3, 2020

Concerning indeed! I remember doing some tests along these lines when initially developing resampy, and it was never quite as good as libsamplerate because I never bothered with the iterative optimization procedure for producing the window parameters. In principle, we could do that at some point, and should get ~equivalent results.

The noise differences are in the -90dB range (from what I can tell), so not audible, but not great to see. I wonder if there are some numerical-precision factors at play here? I haven't done a source-dive on libsamplerate, but it wouldn't surprise me if they're operating internally at higher precision than resampy is, and those round-off errors could add up. (Just a guess!)

Failing that, it's entirely possible that we have some kind of a bug, but for now I'd chalk it up to less-than-optimal windowing.

bmcfee commented Jun 27, 2022

Following up on this - I don't think it's actually a numerical precision discrepancy, though I suppose it's possible.

A couple of observations here:

  • Artifacts do not appear when upsampling (e.g. 44100→96000), only downsampling appears to be affected.
  • Downsampling at an integer ratio (e.g. 44100→22050) is also unaffected, and in fact marginally better than libsamplerate's analogous configuration (a quick check is sketched below).
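
Both can be spot-checked with a small variation of the benchmark script above; this is a hypothetical sketch reusing its exp_swept_sine helper, not the exact code I ran:

    # hypothetical variation of the benchmark above, checking both observations
    wave_44k = exp_swept_sine(f1=20, f2=22050, sr=44100, amp=0.5, t=8.0)

    # upsampling (44100 -> 96000): no visible artifacts
    up = librosa.resample(wave_44k, 44100, 96000, res_type="kaiser_best")

    # integer-ratio downsampling (44100 -> 22050): also clean
    down = librosa.resample(wave_44k, 44100, 22050, res_type="kaiser_best")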

Both of these suggest to me that the algorithm itself is implemented properly, and it all comes down to the precomputed window coefficients. Per the docstring in our filters module:

- `kaiser_best` : 64 zero-crossings, a Kaiser window with beta=14.769656459379492,
and a roll-off frequency of Nyquist * 0.9475937167399596.
- `kaiser_fast` : 16 zero-crossings, a Kaiser window with beta=8.555504641634386,
and a roll-off frequency of Nyquist * 0.85.

So there are three things at play here: the number of zero crossings retained, the beta parameter of the Kaiser filter, and the roll-off frequency. There's also the precision parameter to think about, which controls how many coefficients to retain per zero crossing.
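
For intuition, here's a rough sketch (not resampy's exact implementation) of how those parameters combine into the precomputed half-filter of Kaiser-windowed sinc coefficients:

    import numpy as np

    # rough sketch: Kaiser-windowed sinc half-filter from the kaiser_best parameters
    num_zeros = 64                # zero crossings retained on each side
    precision = 9                 # 2**precision coefficients per zero crossing
    beta = 14.769656459379492     # Kaiser window shape parameter
    rolloff = 0.9475937167399596  # cutoff as a fraction of Nyquist

    n = num_zeros * 2**precision            # coefficients in the half-filter
    t = np.linspace(0, num_zeros, n + 1)    # time axis in zero-crossing units
    taper = np.kaiser(2 * n + 1, beta)[n:]  # right half of the Kaiser window
    half_filter = rolloff * np.sinc(rolloff * t) * taper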

It's not exactly trivial to compare this to the libsamplerate implementation, as it's parametrized a little differently, but we can try. For example, adding

    (
        "resampy_custom",
        lambda: resampy.resample(wave1, sr1, sr2, filter='sinc_window', num_zeros=69, precision=15)
    ),

to the benchmark script above (with `import resampy` added) gives a decent improvement, mainly by using finer filter interpolation:
[spectrogram: resampy_custom]
compared to stock:
[spectrogram: kaiser_best]

At a glance, this drops the artifacts from -90dB down to around -110dB. Pushing the precision up to 20 gives
[spectrogram: resampy_custom, precision=20]
bringing artifacts down to around -150dB (at a significant performance hit).

We could probably tune this further to find a more optimal setting of precision and filter shape. IIRC, libsamplerate did this with some kind of automated parameter search. That might be a nice thing to implement here as well, but I'm fairly convinced at this point that the observed behavior is not a "bug" per se.

bmcfee commented Jun 27, 2022

Following up on this a bit: the libsamplerate Octave code for parameter tuning is here: https://github.com/libsndfile/libsamplerate/tree/master/Octave

If you unpack this a bit, it looks like they're using beta=16.05 for their kaiser-best filter (compared to our 14.76). There's also a bit of over-sampling (fudge factor) in their filter, which is only reported in the docs for the fast filter (not best).
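
For context, beta maps to stopband attenuation through the standard Kaiser design formula beta = 0.1102 * (A - 8.7) for A > 50 dB, so the gap between the two betas works out to roughly 12 dB of extra stopband rejection:

    # stopband attenuation (dB) implied by each beta, inverting the standard
    # Kaiser design formula beta = 0.1102 * (A - 8.7), valid for A > 50 dB
    def kaiser_attenuation(beta):
        return beta / 0.1102 + 8.7

    print(kaiser_attenuation(14.769656459379492))  # resampy kaiser_best: ~142.7 dB
    print(kaiser_attenuation(16.05))               # libsamplerate best:  ~154.3 dB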

bmcfee commented Jun 28, 2022

After a bit of a dive into the libsamplerate filters, I think we can get some easy mileage out of changing the balance between the number of zero crossings (64 for kaiser_best) and the precision (number of interpolation points) in our filters. For reference, the libsamplerate high-quality filter has a half-length of 340239, whereas our current "best" filter has about 1/10 as many: 32769 (64 zeros and precision=9). Adding three bits of precision gets us in the same coefficient ballpark, but we can actually do better by going further with precision and removing some zeros from the filter.
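
The length bookkeeping, assuming a half-length of num_zeros * 2**precision + 1 (which matches the 32769 figure above):

    # half-filter length as a function of zero crossings and precision bits
    def half_length(num_zeros, precision):
        return num_zeros * 2**precision + 1

    print(half_length(64, 9))   # 32769  -> current kaiser_best
    print(half_length(64, 12))  # 262145 -> three more bits, near libsamplerate's 340239
    print(half_length(40, 13))  # 327681 -> the prototype described below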

Here's a prototype using 40 zeros, a precision of 13 bits, beta≈12.56, and rolloff≈0.90. (Beta and rolloff were optimized using the method noted in #96, with some modifications to come.) I've adjusted the colormap for these plots so that the midpoint at -120dB is black; anything tinted red is louder, blue is quieter. The idea here is that artifacts in blue should be tolerable.
[spectrogram: prototype filter]
compared to our current kaiser_best:
[spectrogram: kaiser_best]
(note the similar runtimes)
and libsamplerate (much slower, but higher quality):
[spectrogram: libsamplerate best]
(Side note: resampy runtimes here are using the new parallel implementation, hence the speedup relative to other algorithms.)
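
For concreteness, here's a hypothetical sketch of how such a prototype could be invoked through resampy's sinc_window filter; passing the Kaiser taper via a partial is my assumption about the window-callable convention:

    import functools
    import resampy
    import scipy.signal

    # hypothetical invocation of the prototype filter described above
    wave2 = resampy.resample(
        wave1, sr1, sr2,
        filter='sinc_window',
        num_zeros=40,
        precision=13,
        rolloff=0.90,
        window=functools.partial(scipy.signal.windows.kaiser, beta=12.56),
    )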

TLDR is that we can bring the noise level down to the -120dB range without any appreciable loss in efficiency. It remains to be seen how far we can push the low-quality filter, as the trade-off between length and interpolation might be qualitatively different in that regime.

bmcfee commented Jun 29, 2022

Closing this one out now - 0.3 release improves things a bit, still not perfect, but within reasonable tolerances.

bmcfee closed this as completed Jun 29, 2022