Add kaiser window support to resampling #1509
torchaudio/functional/functional.py
- rolloff: float):
+ rolloff: float,
+ resampling_method: str,
+ beta: float = 6.):
According to https://numpy.org/doc/stable/reference/generated/numpy.kaiser.html, a beta of 6 gives a shape similar to a Hanning window (the window used for sinc_interpolation), but I'm open to other default values as well. Resampy uses ~8.55 for its kaiser_fast filter and ~14.77 for kaiser_best.
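To make the comparison above concrete, here is a minimal stdlib-only sketch of the Kaiser window formula from the numpy.kaiser docs, evaluated next to a Hann window. The helper names (`bessel_i0`, `kaiser_window`, `hann_window`) and the window length of 33 are assumptions for illustration; this is not the code in this PR, which builds the window with torch.

```python
import math


def bessel_i0(x, terms=50):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total = 0.0
    term = 1.0  # series term for k = 0: (x/2)^(2k) / (k!)^2
    for k in range(1, terms + 1):
        total += term
        term *= (x / 2.0) ** 2 / (k * k)
    return total


def kaiser_window(length, beta):
    """Symmetric Kaiser window: I0(beta * sqrt(1 - r^2)) / I0(beta), r in [-1, 1]."""
    if length == 1:
        return [1.0]
    alpha = (length - 1) / 2.0
    denom = bessel_i0(beta)
    window = []
    for n in range(length):
        r = (n - alpha) / alpha
        window.append(bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / denom)
    return window


def hann_window(length):
    """Symmetric Hann (Hanning) window."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1)) for n in range(length)]


if __name__ == "__main__":
    length = 33  # hypothetical window length, chosen only for the comparison
    hann = hann_window(length)
    # 6.0 is the proposed default; 8.55 and 14.77 are the approximate resampy values.
    for beta in (6.0, 8.55, 14.77):
        kaiser = kaiser_window(length, beta)
        max_diff = max(abs(k - h) for k, h in zip(kaiser, hann))
        print(f"beta={beta}: max |kaiser - hann| = {max_diff:.4f}")
```

Larger beta values narrow the window, which is why the resampy settings trade a wider transition band for stronger stopband attenuation.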
@@ -1352,15 +1354,20 @@ def _get_sinc_resample_kernel(
  # they will have a lot of almost zero values to the left or to the right...
  # There is probably a way to evaluate those filters more efficiently, but this is kept for
  # future work.
- idx = torch.arange(-width, width + orig_freq)
+ idx = torch.arange(-width, width + orig_freq, dtype=torch.float64)
By running separate scripts, I realized that #1499 introduced rounding errors when the waveform dtype has higher precision than float32. I'm adding dtype=torch.float64 here to retain the accuracy of the prior implementation. This will likely be improved after further discussion on which dtype/device the transforms should use for kernel computation.
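As a rough illustration of the kind of rounding error described above, the sketch below compares `idx / orig_freq` computed in float64 against the same division with every value rounded to float32. It emulates float32 with `struct` rather than running the torch code path, and `orig_freq = 44100` and `width = 6` are hypothetical values, so the numbers only indicate the order of magnitude.

```python
import struct


def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


orig_freq = 44100  # hypothetical sample rate
width = 6          # hypothetical filter half-width

indices = range(-width, width + orig_freq)

# float64 path: the time offsets keep full double precision.
t64 = [i / orig_freq for i in indices]

# Emulated float32 path: operands and the quotient are rounded to float32.
t32 = [to_float32(to_float32(float(i)) / to_float32(float(orig_freq))) for i in indices]

max_err = max(abs(a - b) for a, b in zip(t64, t32))
print(f"max |float64 - float32| over idx / orig_freq: {max_err:.3e}")
```

An error of this size is negligible for float32 waveforms but shows up when the kernel is applied to float64 input, which is consistent with computing `idx` in float64.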
@@ -657,19 +657,22 @@ class Resample(torch.nn.Module):
  Args:
  orig_freq (float, optional): The original frequency of the signal. (Default: ``16000``)
  new_freq (float, optional): The desired frequency. (Default: ``16000``)
- resampling_method (str, optional): The resampling method. (Default: ``'sinc_interpolation'``)
+ resampling_method (str, optional): The resampling method.
+ Options: [``sinc_interpolation``, ``kaiser_window``] (Default: ``'sinc_interpolation'``)
Follow-up: this docstring can be improved.
Co-authored-by: Holly Sweeney <77758406+holly1238@users.noreply.github.com>
Add kaiser window as an option for resampling
Jupyter notebook of some results: kaiser_resampling__1_.pdf
cc #1487