Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use FFT in np.correlate/convolve? (Trac #1260) #1858

Open
thouis opened this issue Oct 19, 2012 · 11 comments
Open

Use FFT in np.correlate/convolve? (Trac #1260) #1858

thouis opened this issue Oct 19, 2012 · 11 comments

Comments

@thouis
Copy link
Contributor

thouis commented Oct 19, 2012

Original ticket http://projects.scipy.org/numpy/ticket/1260 on 2009-10-12 by trac user roger, assigned to unknown.

The convolve and correlate functions appear to be much slower than their MATLAB 2009a equivalents.

The MATLAB command xcorr(randn(1e6,1)) takes about 0.35s to execute, while the Python equivalent x = randn(1e6);correlate(x, x) takes more than a minute (then killed). MATLAB is also much faster for arrays of 1e5 elements.

fftconvolve in scipy.signal is even slower.

Tested with Python 2.6.3 and numpy 1.30 (x86) under x64 Windows 7 and x64 Ubuntu on an i7.

@numpy-gitbot
Copy link

@josef-pkt wrote on 2009-10-13

I don't know what xcorr does, but the full convolution in standard matlab (2009b):

tmp = randn(1e6,1);
conv(tmp, tmp(end:-1:1);

also takes forever (around or close to an hour).

np.correlate(x,x) takes no time since it only calculates one value.

np.correlate(x,x,'full') is also taking around an hour.

Maybe a truncated convolution/correlation would be good.

@numpy-gitbot
Copy link

@pv wrote on 2009-10-15

xcorr(x,x) is the same as correlate(x,x,'full').

One thing is that fftconvolve uses 2*M-1 sized FFT. It's never a power of two, which seems to make it slow. I fixed that in http://projects.scipy.org/scipy/changeset/5968

The remaining question then is whether we want to use FFT also in Numpy for these functions.

@numpy-gitbot
Copy link

Milestone changed to 1.4.0 by @pv on 2009-10-15

@numpy-gitbot
Copy link

Title changed from Poor convolve/correlate performance to Use FFT in np.correlate/convolve? by @pv on 2009-10-15

@numpy-gitbot
Copy link

@josef-pkt wrote on 2009-10-16

I'm using np.correlate for time series analysis where the number of observations is more in the range of a few hundred to 10000 (in finance), and also with integers (for bias correction.)

Is there a disadvantage in using the fft for this?
If yes, then I think, I would be more in favor of adding a 1d np.fftconvolve, similar to the duplication 1d <-> nd between np.correlate/convolve and scipy.signal.correlate/convolve

(xcorr in matlab signal toolbox seems to use fft.)

@numpy-gitbot
Copy link

@pv wrote on 2009-10-16

I don't think there is any significant disadvantage in using FFT. It might be a bit slower for small N than explicit convolution, and might for intermediate N introduce more numerical error (we're talking maybe about 1e-11 or so). IIRC, for large N FFT should be better both for the numerical error and performance.

I don't believe you can perform FFT convolution with integers using the tools currently in Numpy -- cast to complex is probably always needed.

(An interesting exercise might be to write an explicit divide-and-conquer version of convolution, it's possible to do since it can be done via FFT works :)

@numpy-gitbot
Copy link

@endolith wrote on 2009-11-20

  1. The [http://docs.scipy.org/doc/scipy-0.7.x/reference/generated/scipy.stsci.convolve.correlate2d.html#scipy.stsci.convolve.correlate2d STSCI correlate2d/convolve2d] functions (no longer included with Scipy) have an fft=True option for choosing whether to use it or not. Would this be a good way to go? (Fold the functionality of fftconvolve into the other functions as an option.)

  2. Using the real FFT [http://projects.scipy.org/scipy/ticket/973 halves the computation time](and memory?) when the inputs are real. It seems fftconvolve uses the [http://docs.scipy.org/doc/scipy/reference/fftpack.html scipy version of fftpack], while [http://docs.scipy.org/doc/numpy/reference/generated/numpy.fft.rfftn.html#numpy.fft.rfftn rfftn] only exists in the numpy version of fftpack. Otherwise, would something like this work?

    if not complex_result:
        IN1 = rfftn(in1,size)
        IN1 *= rfftn(in2,size)
        ret = irfftn(IN1)
        # ret = ret.real not needed?
    else:
        IN1 = fftn(in1,size)
        IN1 *= fftn(in2,size)
        ret = ifftn(IN1)
    del IN1
    
  3. Improving the FFT itself to work with prime sizes seems like a better solution than modifying every function that uses it: Implement bluestein algorithm for prime size FFT (Trac #579) #1177 http://projects.scipy.org/scipy/ticket/949

@numpy-gitbot
Copy link

@cournape wrote on 2009-11-25

FFT has several disadvantages: it does not work for object arrays, can be memory hungry, and slower when only a few values of the correlation is needed.

I think we should keep the current implementation based on naive implementation, and add a FFT-based one. The current function could take an argument to force naive, fft or heuristic (to try to pick the fastest one).

@numpy-gitbot
Copy link

Milestone changed to Unscheduled by @cournape on 2009-11-25

@numpy-gitbot
Copy link

@endolith wrote on 2009-12-01

Also, in http://projects.scipy.org/scipy/changeset/5968, shouldn't

163     ret = ifftn(IN1)

have been removed?

@numpy-gitbot
Copy link

@endolith wrote on 2011-02-18

Replying to [comment:3 pv]:

One thing is that fftconvolve uses 2*M-1 sized FFT. It's never a power of two, which seems to make it slow. I fixed that in http://projects.scipy.org/scipy/changeset/5968

Shouldn't that be optimized in the fft function itself, instead of modifying all the functions that use it? It could be optimized to use the real FFT when appropriate, too.

Replying to [comment:8 cdavid]:

FFT has several disadvantages: it does not work for object arrays

I think we should keep the current implementation based on naive implementation, and add a FFT-based one. The current function could take an argument to force naive, fft or heuristic (to try to pick the fastest one).

This is what I was asking for. Can object arrays be automatically handled by the naive implementation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants