
ENH: Improve performance of lfilter / filtfilt by using scipy.ndimage.convolve1d #17080

Open
maxnoe opened this issue Sep 23, 2022 · 6 comments
Labels: enhancement, scipy.signal

Comments

@maxnoe
Contributor

maxnoe commented Sep 23, 2022

Is your feature request related to a problem? Please describe.

scipy.signal.lfilter uses np.apply_along_axis with np.convolve:

out_full = np.apply_along_axis(lambda y: np.convolve(b, y), axis, x)

In my simple tests, this is much slower than what seems to be exactly equivalent:

scipy.ndimage.convolve1d(x, b, axis)

Describe the solution you'd like.

Switch to using convolve1d if these calls are really equivalent?
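One caveat worth noting up front: the calls as written are not quite identical, because convolve1d centers the kernel by default while lfilter is causal, so the raw outputs differ by a shift of len(b)//2 samples. A small sketch (firwin taps used purely as an example):

```python
import numpy as np
from scipy import ndimage, signal

rng = np.random.default_rng(0)
x = rng.normal(size=50)
b = signal.firwin(8, 0.2)

# lfilter with a = 1 is a causal FIR filter; convolve1d centers the
# kernel by default, so the two raw outputs are shifted copies:
y_lfilter = signal.lfilter(b, 1, x)
y_centered = ndimage.convolve1d(x, b, mode="constant", cval=0)

print(np.allclose(y_lfilter, y_centered))        # False: off by len(b)//2
print(np.allclose(y_lfilter[len(b) // 2:],
                  y_centered[:-(len(b) // 2)]))  # True after re-aligning
```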

Describe alternatives you've considered.

No response

Additional context (e.g. screenshots, GIFs)

No response

@maxnoe added the enhancement label Sep 23, 2022
@roryyorke
Contributor

Interesting - I would have thought np.convolve is C-level and unlikely to be much (if at all) slower. Does scipy.ndimage.convolve1d support all the types (float32, float64, complex64, complex128, I think) that np.convolve etc. do?

Some timing results would be good.

Could you attach the scipy.signal label? (I'm not a dev or triager, so I can't do that.)

@maxnoe
Contributor Author

maxnoe commented Oct 9, 2022

Interesting - I would have thought np.convolve is C-level and unlikely to be much (if at all) slower.

np.convolve is, but the problem is apply_along_axis, which is essentially a Python loop.
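The overhead is easy to make visible: apply_along_axis invokes the Python-level function once per 1-D slice, so the per-call overhead scales with the number of slices. A minimal sketch:

```python
import numpy as np

x = np.zeros((1000, 8))
b = np.array([0.5, 0.5])
calls = 0

def conv(row):
    # one Python-level invocation per row of x
    global calls
    calls += 1
    return np.convolve(b, row)

out = np.apply_along_axis(conv, 1, x)
print(calls, out.shape)  # 1000 calls; 'full' convolution gives shape (1000, 9)
```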

@roryyorke
Contributor

I get these timings for different orders and input lengths (see script at end). I was a little surprised at the low-order result, but I ran it again and got much the same figure.

[Figure: firfilt-time.png — log-log plots of run time vs input length nu for lfilter, ndimage.convolve1d, and np.convolve, one panel per FIR order (2, 12, 22)]

ndimage is slowest for low order and large input length, but fastest for intermediate and high(-ish?) order. As input length and order increase, the differences become small.

ndimage is about 3 times faster for short inputs and higher order filters, but the absolute times in that case are smaller (under 1ms).

import sys
import timeit

import numpy as np
import scipy
import matplotlib.pyplot as plt

from scipy import signal, ndimage

print(f'{sys.version = !s}\n{np.__version__ = !s}\n{scipy.__version__ = !s}')

def firfilter(b, u):
    return ndimage.convolve1d(u, b, mode="constant", cval=0, origin=-(len(b)//2))

def firfilter2(b, u):
    return np.convolve(b, u)[:len(u)-(len(b)-1)]

stats = []

orders = [2, 12, 22]
nus = 10**np.arange(3,8)

for order in orders:
    b = signal.firwin(order, 0.2)
    ostats = []
    for nu in nus:
        print(order, nu, flush=True)
        u = np.random.normal(size=nu)
        ostats.append(timeit.Timer('signal.lfilter(b, 1, u)', globals=globals()).autorange())
        ostats.append(timeit.Timer('firfilter(b, u)', globals=globals()).autorange())
        ostats.append(timeit.Timer('firfilter2(b, u)', globals=globals()).autorange())
        
    stats.append(ostats)

astats = np.array(stats)
# axes are order, nu, func
times = np.reshape(astats[...,1] / astats[...,0], [len(orders),-1,3])
fig, ax = plt.subplots(1, 3, sharey='row')

for i in range(3):
    ax[i].loglog(nus, times[i])
    ax[i].legend(['lflt','ndimg','npconv'])
    ax[i].set_xlabel('nu')
    ax[i].set_title(f'FIR order {orders[i]}')

ax[0].set_ylabel('time [s]')

fig.tight_layout()
fig.savefig('firfilt-time.png')

@maxnoe
Contributor Author

maxnoe commented Oct 22, 2022

But you compared applying it to a single 1-D array. The point is that the scipy filter functions use np.convolve inside an np.apply_along_axis call, which creates a pure Python loop and calls convolve for each subarray.

convolve1d supports axis without going through a Python loop.

@roryyorke
Contributor

Ah, I see. This is a much simpler example demonstrating the problem:

In [27]: u = np.random.normal(size=(3,10**6))

In [29]: %timeit signal.lfilter([0.5, 0.5], 1, u)
9.86 ms ± 37.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [30]: %timeit signal.lfilter([0.5, 0.5], 1, u, axis=0)
4.74 s ± 50.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [35]: %timeit firfilter([0.5, 0.5], u)
16.4 ms ± 27.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [36]: %timeit firfilter([0.5, 0.5], u, axis=0)
30 ms ± 125 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

These timings use the following definition of firfilter:

def firfilter(b, u, axis=-1):
    return ndimage.convolve1d(u, b, mode="constant", cval=0, origin=-(len(b)//2), axis=axis)
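With that definition, the equivalence to lfilter (with a = 1, i.e. a pure FIR filter with zero initial conditions) can be checked directly along either axis. A minimal sketch:

```python
import numpy as np
from scipy import ndimage, signal

def firfilter(b, u, axis=-1):
    return ndimage.convolve1d(u, b, mode="constant", cval=0,
                              origin=-(len(b) // 2), axis=axis)

rng = np.random.default_rng(1)
u = rng.normal(size=(16, 1000))
b = signal.firwin(9, 0.2)

# The shifted origin makes convolve1d causal, and the zero-padded
# 'constant' mode matches lfilter's zero initial conditions:
print(np.allclose(signal.lfilter(b, 1, u), firfilter(b, u)))
print(np.allclose(signal.lfilter(b, 1, u, axis=0), firfilter(b, u, axis=0)))
```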

signal already depends on ndimage, specifically ndimage.convolve1d in _savitzky_golay.py, presumably for similar speed-of-execution reasons.

@AtsushiSakai
Member

According to the documentation for np.convolve, with the default mode 'full', the size of the output vector is N (size of x) + M (size of b) - 1.
https://numpy.org/doc/stable/reference/generated/numpy.convolve.html

On the other hand, according to the documentation for ndimage.convolve1d, the size of the output vector is the same as N (the size of x).
https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.convolve1d.html
Therefore, whenever b has more than one tap, the output sizes of these two functions differ, so the calls are not directly equivalent.
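That is true of the raw np.convolve output, but lfilter trims the 'full' convolution back to len(x) samples, which is exactly the size convolve1d produces directly when its origin is shifted to make it causal. A sketch comparing the three:

```python
import numpy as np
from scipy import ndimage, signal

rng = np.random.default_rng(2)
x = rng.normal(size=100)
b = np.array([0.25, 0.5, 0.25])

full = np.convolve(b, x)                   # length len(x) + len(b) - 1 = 102
trimmed = ndimage.convolve1d(x, b, mode="constant", cval=0,
                             origin=-(len(b) // 2))  # length len(x) = 100
y = signal.lfilter(b, 1, x)                # length len(x) = 100

print(full.shape, trimmed.shape, y.shape)
# lfilter keeps only the first len(x) samples of the full convolution,
# and that is what the shifted-origin convolve1d computes directly:
print(np.allclose(full[:len(x)], y), np.allclose(y, trimmed))
```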
