
biquad filter similar to SoX #275

Merged: 32 commits, Sep 18, 2019

Conversation

engineerchuan (Contributor)

Use a biquad filter similar to SoX's.

@engineerchuan (Contributor, Author) commented Sep 11, 2019

Hi Community,

We are currently exploring ways to make torchaudio's dependence on SoX optional (per #260).

A subset of SoX's functionality is its frequency-filtering operations (highpass, lowpass, bandpass, etc.). SoX's implementations are based on the digital biquad filter (https://en.wikipedia.org/wiki/Digital_biquad_filter). This WIP PR ports SoX's implementation of the biquad effect and of the lowpass and highpass filters.

Question: which parts of the frequency-filtering functionality would be helpful to include in torchaudio, and which should be kept separate?

  • One approach would be to provide only the core "execution layer" and leave the filter design outside the library.
  • For an IIR filter like the biquad, this could mean implementing the biquad effect but asking the user to supply the coefficients (b0, b1, b2, a0, a1, a2), or perhaps implementing a general IIR difference-equation execution engine (a rough sketch of this option is shown right after this list).
  • For a general FIR filter, the user could use a filter-design library like scipy.signal to design the impulse response; we could then use torch's GPU-accelerated convolution functions to apply it.
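
For illustration, here is a minimal sketch of the "user supplies the coefficients" option. The helper below is hypothetical (it is not the code in this PR); it simply spells out the standard biquad difference equation on a 1-D tensor.

import torch

def biquad_sketch(waveform, b0, b1, b2, a0, a1, a2):
    # Hypothetical helper, for illustration only. Applies the standard
    # biquad difference equation
    #   a0*y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
    # to a 1-D waveform tensor, assuming zero initial conditions. A real
    # implementation would vectorize over channels and avoid the Python
    # loop over time.
    x = waveform.tolist()
    y = [0.0] * len(x)
    for n in range(len(x)):
        x1 = x[n - 1] if n >= 1 else 0.0
        x2 = x[n - 2] if n >= 2 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y[n] = (b0 * x[n] + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
    return torch.tensor(y)

For the FIR case, the analogous split would be to design the taps with something like scipy.signal.firwin and then apply them with torch's conv1d.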

@vincentqb

Thank you,

Chuan

@vincentqb (Contributor)

Relates to #260

@vincentqb (Contributor) left a comment:

I took a quick look, and good job so far :)

A test is failing, but does not appear related to your changes.

test/test_datasets_vctk.py::TestVCTK::test_make_manifest FAILED

Inline review comments were left on examples/filter_file.py, test/test_functional_filtering.py, torchaudio/functional.py, torchaudio/functional_filtering.py, and torchaudio/__init__.py.

One review thread was opened on this hunk:

assert(output_waveform.size(0) == n_channels);
assert(output_waveform.size(1) == n_frames);

auto input_accessor = input_waveform.accessor<float,2>();

Contributor:

We might need to look into using the appropriate C/GPU interface here. Unfortunately, we do not yet have CI tooling for anything other than Linux on CPU.

Contributor (Author):

I added GPU typedefs based on my understanding of the code, but we need to test this.

int n_order = a_coeffs.size(0); // n'th order - 1 filter
assert(a_coeffs.size(0) == b_coeffs.size(0));

for (int64_t i_channel = 0; i_channel < n_channels; ++i_channel) {

Contributor:

Can we do the channels in one pass for each frame?

Contributor (Author):

I'm not sure there is necessarily a faster way. Would you use tensors to slice all channels at each frame? Would that be much faster?

Contributor:

Slicing would be great, yes. I'd expect this to work faster on GPU.

Contributor:

@yf225 -- We want to implement a transformation, lfilter, that could "almost" be implemented with convolutions; we can't, because each output term depends on earlier output terms (the filter has feedback). @engineerchuan suggests implementing it in C++ (see torchaudio/filtering.cpp), since the for-loop over time is much faster and comparable to scipy in speed on CPU. We shouldn't need a for-loop over the channels, though. Thoughts on things to be careful about here?

Contributor:

Since the computations for the different channels don't depend on each other, we might be able to use at::parallel_for (https://github.com/pytorch/pytorch/blob/cc61af3c3d8ca8b46f7234383513b5166e10150c/aten/src/ATen/Parallel.h#L48) to speed things up on CPU, and thrust::for_each (or a custom CUDA kernel launch) on GPU.

Further inline review comments were left on torchaudio/functional_sox_convenience.py and test/test_functional_filtering.py.

@engineerchuan (Contributor, Author) commented Sep 17, 2019

Three different implementations of lfilter were explored in this commit.

  1. lfilter "Element-wise" computation using tensor accessors (https://pytorch.org/cppdocs/notes/tensor_basics.html#efficient-access-to-tensor-elements)
  2. lfilter_tensor "Slice" computation where the difference equation is evaluated simultaneously across all channels by adding and subtracting slices (a rough sketch is shown below).
  3. lfilter_tensor_matrix "Matrix" computation where we recast the difference equation computation as a matrix multiply.

All three implementations return the same result to within 1e-5 tolerance for a variety of random inputs, so we are confident they are doing the same math.
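
For reference, here is a rough, hypothetical sketch of the "slice" formulation (implementation 2). It is illustrative only, not the code from the commit: all channels are updated together at each frame, so the Python-level loops are only over time and filter order.

import torch

def lfilter_slice_sketch(waveform, a_coeffs, b_coeffs):
    # Illustrative only. waveform is (n_channels, n_frames); a_coeffs and
    # b_coeffs are 1-D tensors of equal length with a_coeffs[0] != 0.
    # Evaluates a[0]*y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k]
    # for all channels at once by slicing zero-padded buffers, assuming
    # zero initial conditions.
    n_channels, n_frames = waveform.shape
    n_order = a_coeffs.size(0)
    pad = n_order - 1
    x = torch.cat([waveform.new_zeros(n_channels, pad), waveform], dim=1)
    y = waveform.new_zeros(n_channels, n_frames + pad)
    for n in range(n_frames):
        acc = waveform.new_zeros(n_channels)
        for k in range(n_order):
            acc = acc + b_coeffs[k] * x[:, n + pad - k]
        for k in range(1, n_order):
            acc = acc - a_coeffs[k] * y[:, n + pad - k]
        y[:, n + pad] = acc / a_coeffs[0]
    return y[:, pad:]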

Performance (all on CPU; times in seconds):

Input (2 channel x 100K samples):

Looped Implementation took       :  0.0026006698608398438
Tensor Implementation took       :  7.084993362426758
Tensor Matrix Implementation took:  2.870526075363159

Input (10 channel x 10K samples):

Looped Implementation took       :  0.0015499591827392578
Tensor Implementation took       :  0.8290731906890869
Tensor Matrix Implementation took:  0.29657483100891113

Input (100 channel x 10K samples):

Looped Implementation took       :  0.01202535629272461
Tensor Implementation took       :  0.9394674301147461
Tensor Matrix Implementation took:  0.4104940891265869

The basic element-wise implementation seems much faster. I need help evaluating my tensor-based implementation. Is the slicing shown the most efficient way to access the data? Do I need to call contiguous() at all?

Thank you,

Chuan

@yf225 @vincentqb

@engineerchuan (Contributor, Author)

Unwound the C++ lfilter implementations; we will move them to a separate PR.

@vincentqb (Contributor) left a comment:

Thanks for working on this! We're essentially ready to merge this :)

Inline review comments were left on docs/source/functional_sox_compatibility.rst, docs/source/index.rst, test/test_datasets_vctk.py, test/test_functional_filtering.py, and torchaudio/functional_sox_compatibility.py.

vincentqb merged commit 8273c3f into pytorch:master on Sep 18, 2019.
vincentqb changed the title from "[WIP] Exploring Filtering Capabilities" to "biquad filter similar to SoX" on Sep 18, 2019.