-
Notifications
You must be signed in to change notification settings - Fork 634
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CPP Implementation of lfilter for CPU #290
Conversation
Raw performance test results:
|
We talked about order 40. Is this too high to test? :) |
@vincentqb, here are the results for 40'th, code in commit. The python is now about 200x slower than the elementwise, so the CPP is about 100x times slower.
|
Hi, I refactored the CPP implementations to use a Python wrapper. Basically I tried to write the bare minimum necessary in CPP.
Performance results are similar as before. For small data, may be affected by Python wrapper addition but its a fixed cost that does not scale with data size. Chuan
|
…to lfilter_cpp_try_2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this! The gain in performance may be worth offering this C++ implementation to the user, so I'm ok with offering it. I'd like having a simple mechanism to offer/remove the interface to the implementation.
What's the status with GPU? Is it only for CPU in its current form? Should we have a forking mechanism when on CUDA and use the python implementation instead?
This PR has a lot of commits from other PRs (e.g. deltas, SpecAugment) that are tangle with it, and so needs to be rebased to isolate what is meant to be part of this PR.
# Perform sanity checks, input check, and memory allocation in python | ||
|
||
# Current limitations to be removed in future | ||
assert(waveform.dtype == torch.float32 or waveform.dtype == torch.float64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this implementation in C++ restricting the functionality w/r to the python one?
def biquad(waveform, b0, b1, b2, a0, a1, a2): | ||
# type: (Tensor, float, float, float, float, float, float) -> Tensor | ||
r"""Performs a biquad filter of input tensor. Initial conditions set to 0. | ||
https://en.wikipedia.org/wiki/Digital_biquad_filter | ||
|
||
Args: | ||
waveform (torch.Tensor): audio waveform of dimension of `(n_channel, n_frames)` | ||
waveform (torch.Tensor): Audio waveform of dimension of `(n_channel, n_frames)`. | ||
Currently only supports float32. Normalized [-1, 1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about other float types?
By the way, it'd be nice to make a comparison with jit compiled python code, #326. |
cc #1238 |
To compare with the current python implementation of
lfilter
(implementation 0), we developed 2 cpp implementations.1.) an element wise implementation using accessors (implementation 1)
2.) a matrix based implementation (implementation 2)
We test these implementations against each other using random inputs, and a variety of signal lengths and filter orders:
Performance
Correctness
Questions / Comments