Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions torchaudio/compliance/kaldi.py
Original file line number Diff line number Diff line change
Expand Up @@ -770,6 +770,8 @@ def resample_waveform(waveform: Tensor,
but less efficient. We suggest around 4 to 10 for normal use. (Default: ``6``)
rolloff (float, optional): The roll-off frequency of the filter, as a fraction of the Nyquist.
Lower values reduce anti-aliasing, but also reduce some of the highest frequencies. (Default: ``0.99``)
resampling_method (str, optional): The resampling method to use.
Options: [``sinc_interpolation``, ``kaiser_window``] (Default: ``'sinc_interpolation'``)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mthrok any thoughts on whether or not we should also add beta as a parameter to kaldi's resample waveform? this would make it more consistent with the trasnforms/functional form, but not "kaldi-compliant". resampling_method was added for easier testing which was done in compliance_kaldi_test -- if we remove beta, do we also want to remove resampling_method once we migrate the tests to functional?


Returns:
Tensor: The waveform at the new frequency
Expand Down
13 changes: 4 additions & 9 deletions torchaudio/functional/functional.py
Original file line number Diff line number Diff line change
Expand Up @@ -1424,26 +1424,21 @@ def resample(
resampling_method: str = "sinc_interpolation",
beta: Optional[float] = None,
) -> Tensor:
r"""Resamples the waveform at the new frequency. This matches Kaldi's OfflineFeatureTpl ResampleWaveform
which uses a LinearResample (resample a signal at linearly spaced intervals to upsample/downsample
a signal). LinearResample (LR) means that the output signal is at linearly spaced intervals (i.e
the output signal has a frequency of ``new_freq``). It uses sinc/bandlimited interpolation to
upsample/downsample the signal.
r"""Resamples the waveform at the new frequency using bandlimited interpolation.

https://ccrma.stanford.edu/~jos/resample/Theory_Ideal_Bandlimited_Interpolation.html
https://github.com/kaldi-asr/kaldi/blob/master/src/feat/resample.h#L56

Args:
waveform (Tensor): The input signal of dimension (..., time)
orig_freq (float): The original frequency of the signal
new_freq (float): The desired frequency
lowpass_filter_width (int, optional): Controls the sharpness of the filter, more == sharper
but less efficient. We suggest around 4 to 10 for normal use. (Default: ``6``)
but less efficient. (Default: ``6``)
rolloff (float, optional): The roll-off frequency of the filter, as a fraction of the Nyquist.
Lower values reduce anti-aliasing, but also reduce some of the highest frequencies. (Default: ``0.99``)
resampling_method (str, optional): The resampling method.
resampling_method (str, optional): The resampling method to use.
Options: [``sinc_interpolation``, ``kaiser_window``] (Default: ``'sinc_interpolation'``)
beta (float, optional): The shape parameter used for kaiser window.
beta (float or None): The shape parameter used for kaiser window.

Returns:
Tensor: The waveform at the new frequency of dimension (..., time).
Expand Down
6 changes: 3 additions & 3 deletions torchaudio/transforms.py
Original file line number Diff line number Diff line change
Expand Up @@ -666,13 +666,13 @@ class Resample(torch.nn.Module):
Args:
orig_freq (float, optional): The original frequency of the signal. (Default: ``16000``)
new_freq (float, optional): The desired frequency. (Default: ``16000``)
resampling_method (str, optional): The resampling method.
resampling_method (str, optional): The resampling method to use.
Options: [``sinc_interpolation``, ``kaiser_window``] (Default: ``'sinc_interpolation'``)
lowpass_filter_width (int, optional): Controls the sharpness of the filter, more == sharper
but less efficient. We suggest around 4 to 10 for normal use. (Default: ``6``)
but less efficient. (Default: ``6``)
rolloff (float, optional): The roll-off frequency of the filter, as a fraction of the Nyquist.
Lower values reduce anti-aliasing, but also reduce some of the highest frequencies. (Default: ``0.99``)
beta (float, optional): The shape parameter used for kaiser window.
beta (float or None): The shape parameter used for kaiser window.

Note: If resampling on waveforms of higher precision than float32, there may be a small loss of precision
because the kernel is cached once as float32. If high precision resampling is important for your application,
Expand Down