Julius, fast PyTorch based DSP for audio and 1D signals
Julius contains different Digital Signal Processing algorithms implemented with PyTorch, so that they are differentiable and available on CUDA. Note that all the modules implemented here can be used with TorchScript.
For now, I have implemented:
- julius.resample: fast sinc resampling.
- julius.fftconv: FFT based convolutions.
- julius.lowpass: FIR low pass filter banks.
- julius.filters: FIR high pass and band pass filters.
- julius.bands: Decomposition of a waveform signal over mel-scale frequency bands.
Alongside these, you might find useful utilities in:
- julius.core: DSP related functions.
- julius.utils: Generic utilities.
julius 0.2.7 released: fixed ONNX compat (thanks @iver56). I know I missed the 0.2.6 one...
julius 0.2.5 released: support for setting a custom output length when resampling.
julius 0.2.4 released: adding high pass and band pass filters. Extra linting and type checking of the code. New unfold implementation, up to x6 faster FFT convolutions and more efficient memory usage.
julius 0.2.2 released: fixing normalization of filters in lowpass and resample to avoid leaking very low frequencies. Switch from zero padding to replicate padding (uses first/last value instead of 0) to avoid discontinuities with strong artifacts.
The julius implementation of resampling is now officially part of Torchaudio.
julius requires python 3.6. To install:
pip3 install -U julius
See the Julius documentation for the usage of Julius. Hereafter you will find a few examples to get you quickly started:
    import julius
    import torch

    signal = torch.randn(6, 4, 1024)
    # Resample from a sample rate of 100 to 70. The old and new sample rate must be integers,
    # and resampling will be fast if they form an irreducible fraction with small numerator
    # and denominator (here 10 and 7). Any shape is supported, last dim is time.
    resampled_signal = julius.resample_frac(signal, 100, 70)

    # Low pass filter with a `0.1 * sample_rate` cutoff frequency.
    low_freqs = julius.lowpass_filter(signal, 0.1)

    # Fast convolutions with FFT, useful for large kernels
    conv = julius.FFTConv1d(4, 10, 512)
    convolved = conv(signal)

    # Decomposition over frequency bands in the Waveform domain
    bands = julius.split_bands(signal, n_bands=10, sample_rate=100)
    # Decomposition with n_bands frequency bands evenly spaced in mel space.
    # Input shape can be `[*, T]`, output will be `[n_bands, *, T]`.
    random_eq = (torch.rand(10, 1, 1, 1) * bands).sum(0)
julius.resample is an implementation of the sinc resample algorithm by Julius O. Smith. It is the same algorithm as the one used in resampy, but to run efficiently on GPU it is limited to fractional changes of the sample rate. It will be fast if the old and new sample rates are small after dividing them by their GCD. For instance, going from a sample rate of 2000 to 3000 (2, 3 after removing the GCD) will be extremely fast, while going from 20001 to 30001 will not. Julius resampling is faster than resampy even on CPU, and when running on GPU it makes resampling a completely negligible part of your pipeline (except of course for weird cases like going from a sample rate of 20001 to 30001).
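To make the GCD argument concrete, here is a small sketch using only the Python standard library. The helper name `reduced_rates` is hypothetical and not part of julius; it only reproduces the arithmetic described above.

```python
import math
from typing import Tuple


def reduced_rates(old_sr: int, new_sr: int) -> Tuple[int, int]:
    """Divide both sample rates by their greatest common divisor."""
    gcd = math.gcd(old_sr, new_sr)
    return old_sr // gcd, new_sr // gcd


# Small reduced values mean fast resampling, large ones mean slow resampling.
for old_sr, new_sr in [(100, 70), (2000, 3000), (20001, 30001)]:
    print((old_sr, new_sr), "->", reduced_rates(old_sr, new_sr))
```

The first two pairs reduce to (10, 7) and (2, 3), while 20001 and 30001 share no common factor and stay as they are, which is the slow case.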
Computing convolutions with very large kernels (>= 128) and a stride of 1 can be much faster
using FFT. julius.fftconv implements the same API as torch.nn.Conv1d
but with an FFT backend. Dilation and groups are not supported.
FFTConv will be faster on CPU even for relatively small tensors (a few dozen channels, kernel size
of 128). On CUDA, due to the higher parallelism, regular convolution can be faster in many cases,
but for kernel sizes above 128, for a large number of channels or batch size, FFTConv1d
will eventually be faster (basically when you no longer have idle cores that can hide
the true complexity of the operation).
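For readers who want to see the idea behind an FFT based convolution, the following is a minimal sketch written directly with `torch.fft`. It is not the julius implementation: the function name `fft_conv1d_sketch` is hypothetical, and it only handles stride 1, no padding, and no bias. It is checked against `torch.nn.functional.conv1d`.

```python
import torch
import torch.nn.functional as F


def fft_conv1d_sketch(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    """Stride-1, unpadded 1D convolution computed in the frequency domain.

    x: [batch, in_channels, time], weight: [out_channels, in_channels, kernel].
    """
    length = x.shape[-1]
    kernel = weight.shape[-1]
    # Asking for an n-point FFT zero-pads the kernel to the signal length.
    x_f = torch.fft.rfft(x, n=length)
    w_f = torch.fft.rfft(weight, n=length)
    # conv1d is a cross-correlation, hence the complex conjugate of the kernel.
    # The einsum multiplies per frequency bin and sums over input channels.
    y_f = torch.einsum("bcf,ocf->bof", x_f, w_f.conj())
    y = torch.fft.irfft(y_f, n=length)
    # Only the first `length - kernel + 1` samples are free of circular wrap-around.
    return y[..., : length - kernel + 1]


torch.manual_seed(0)
signal = torch.randn(6, 4, 1024)
weight = torch.randn(10, 4, 512)
reference = F.conv1d(signal, weight)
approx = fft_conv1d_sketch(signal, weight)
print("max abs difference:", (reference - approx).abs().max().item())
```

The cost of the FFT path depends mostly on the signal length rather than on the kernel size, which is why it pays off for large kernels.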
julius.lowpass is a classical Finite Impulse Response windowed sinc low pass filter. It will use FFT convolutions automatically
if the filter size is large enough. This is the basic block from which you can build
high pass and band pass filters (see julius.filters).
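As an illustration of the windowed sinc construction, and of how a high pass filter can be derived from a low pass one, here is a dependency-free sketch. The Hann window, the `half_size` parameter, and the function names are assumptions made for this example; the actual julius code may choose them differently. The cutoff is expressed as a fraction of the sample rate, as in the usage example.

```python
import math
from typing import List


def sinc(x: float) -> float:
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def lowpass_taps(cutoff: float, half_size: int) -> List[float]:
    """Windowed sinc low pass taps; `cutoff` is a fraction of the sample rate."""
    size = 2 * half_size + 1
    taps = []
    for n in range(size):
        window = 0.5 * (1 - math.cos(2 * math.pi * n / (size - 1)))  # Hann window
        taps.append(2 * cutoff * sinc(2 * cutoff * (n - half_size)) * window)
    total = sum(taps)
    # Normalize so that a constant (0 Hz) input goes through with gain 1.
    return [tap / total for tap in taps]


def highpass_taps(cutoff: float, half_size: int) -> List[float]:
    """High pass as identity minus low pass: a unit impulse minus the low pass taps."""
    taps = [-tap for tap in lowpass_taps(cutoff, half_size)]
    taps[half_size] += 1.0
    return taps


low = lowpass_taps(cutoff=0.1, half_size=20)
high = highpass_taps(cutoff=0.1, half_size=20)
print("low pass gain at 0 Hz:", sum(low))
print("high pass gain at 0 Hz:", sum(high))
```

A band pass filter can be sketched the same way, as the difference between two low pass filters with different cutoffs.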
Decomposition of a signal over frequency bands in the waveform domain. This can be useful for instance to perform parametric EQ (see Usage above).
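To give an intuition for what "evenly spaced in mel space" means, the sketch below computes band edges with a commonly used mel formula. The formula, the function names, and the 40 Hz to 8000 Hz range are assumptions chosen for illustration; the cutoffs julius actually uses may differ.

```python
import math
from typing import List


def hz_to_mel(freq: float) -> float:
    """Convert a frequency in Hz to mels (one common convention)."""
    return 2595 * math.log10(1 + freq / 700)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700 * (10 ** (mel / 2595) - 1)


def mel_band_edges(n_bands: int, low_hz: float, high_hz: float) -> List[float]:
    """Return n_bands + 1 edges in Hz, evenly spaced on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / n_bands
    return [mel_to_hz(low_mel + i * step) for i in range(n_bands + 1)]


edges = mel_band_edges(n_bands=10, low_hz=40.0, high_hz=8000.0)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
for lo, hi in zip(edges, edges[1:]):
    print(f"{lo:8.1f} Hz to {hi:8.1f} Hz")
```

Bands that are equally wide in mels get wider in Hz as the frequency increases, so the low end of the spectrum is split more finely than the high end.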
You can find speed tests (and comparisons to reference implementations) in the
benchmark. The CPU benchmarks are run on a 2020 MacBook Pro with a 2.4 GHz
8-core Intel Core i9 CPU. The GPU benchmarks are run on an Nvidia V100 with 16GB of memory.
We also compare the validity of our implementations, as compared to reference ones like
Clone this repository, then
pip3 install '.[dev]'
python3 tests.py
To run the benchmarks:
pip3 install '.[dev]'
python3 -m bench.gen
julius is released under the MIT license.
This package is named in honor of Julius O. Smith, whose books and website were a gold mine of information for me to learn about DSP. Go check out his website if you want to learn more about DSP.