Implement weighted moving average (pass array of weights) #886

Closed
wesm opened this Issue Mar 8, 2012 · 6 comments


I gave it a shot. Weights currently do not have to be normalized, and when the minimum period is less than the length of the weights it doesn't renormalize the average. Not sure what the behavior should be.

https://github.com/jeffhsu3/pandas/compare/weighted_average
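For readers skimming the thread, here is a minimal sketch of the behavior described above (a plain numpy/pandas illustration, not the code from that branch; weighted_rolling_mean is a hypothetical name):

import numpy as np
import pandas

def weighted_rolling_mean(s, weights):
    # Apply weights as given (no normalization); the oldest point in each
    # window gets weights[0], the newest gets weights[-1].
    w = np.asarray(weights, dtype=float)
    # np.convolve reverses its second argument, so flip to keep window order.
    vals = np.convolve(s.values, w[::-1], mode='valid')
    # Pad the first len(w) - 1 positions with NaN instead of renormalizing
    # short windows -- the open question in the comment above.
    out = np.empty(len(s))
    out[:len(w) - 1] = np.nan
    out[len(w) - 1:] = vals
    return pandas.Series(out, index=s.index)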

Contributor
meteore commented Nov 2, 2012

This would make it possible to do convolutions and FIR filtering with Pandas.

I'm hoping to get this functionality as well.
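For the FIR use case specifically, something along these lines already works on a Series' values (a sketch using scipy.signal.lfilter, not proposed pandas API):

import numpy as np
import pandas
from scipy.signal import lfilter

x = pandas.Series(np.random.random(100))
taps = np.repeat(0.2, 5)  # a 5-tap moving-average FIR filter
# a=[1.0] makes lfilter a pure FIR (convolution) filter; the output is
# causal, so the first few points carry a startup transient.
y = pandas.Series(lfilter(taps, [1.0], x.values), index=x.index)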

Contributor
jseabold commented Mar 2, 2013

I'm not sure I understand why you wouldn't use the existing convolution code in numpy/scipy to do this. Would Cython really be faster than using fftconvolve on big series? I didn't check.

In [49]: import numpy as np; import pandas

In [50]: np.random.seed(12345)

In [51]: x = np.random.random(size=100)

In [52]: x = pandas.Series(x)

In [53]: weights = np.random.random(10)

In [54]: weights /= weights.sum()

In [55]: from scipy.signal import fftconvolve, convolve  # depends on size

In [56]: fftconvolve(x, weights, mode='valid')
Out[56]:
array([ 0.48806114,  0.4699973 ,  0.56282001,  0.62870073,  0.68197355,
        0.57545891,  0.62169473,  0.64379092,  0.65297064,  0.60404881,
        0.52361615,  0.50352046,  0.47251414,  0.58907332,  0.69035363,
        0.71600119,  0.71412992,  0.72127673,  0.70794396,  0.65661495,
        0.55435558,  0.52407079,  0.49853205,  0.58023334,  0.7016371 ,
        0.65212416,  0.49190644,  0.47329566,  0.46863786,  0.45679194,
        0.46234051,  0.58391968,  0.54200453,  0.48114622,  0.51340279,
        0.53233502,  0.51497697,  0.5358541 ,  0.62072028,  0.56308463,
        0.49212517,  0.40754766,  0.33825073,  0.35123196,  0.42332693,
        0.51606583,  0.52950296,  0.44079394,  0.42975181,  0.49261896,
        0.59498324,  0.6180508 ,  0.60367819,  0.48303673,  0.46025441,
        0.49618106,  0.53169628,  0.54171016,  0.59075358,  0.53381996,
        0.44129289,  0.49348725,  0.51075933,  0.55902604,  0.60853367,
        0.6697937 ,  0.71312713,  0.62902879,  0.59838022,  0.55089196,
        0.59655967,  0.53044648,  0.53761734,  0.53084328,  0.47724668,
        0.51884146,  0.63705283,  0.64955511,  0.59763594,  0.56413901,
        0.50962114,  0.49927102,  0.44665032,  0.52229761,  0.50640236,
        0.5093921 ,  0.54369071,  0.52473202,  0.68901516,  0.67945055,
        0.72106766])

Is the idea just some syntactic sugar for doing this on pandas objects? You might also be interested in the time-series filters in statsmodels, depending on your domain of interest.

http://statsmodels.sourceforge.net/devel/tsa.html#other-time-series-filters
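If the "sugar" in question is just preserving the index, a thin wrapper would suffice. A sketch under that assumption (the helper name is hypothetical, and aligning results to the trailing edge of each window is one arbitrary convention):

import pandas
from scipy.signal import fftconvolve

def convolve_series(s, weights):
    # mode='valid' drops the first len(weights) - 1 points, so trim the
    # index to match. Note that convolution reverses the weights relative
    # to the window; pass weights[::-1] if their order matters.
    vals = fftconvolve(s.values, weights, mode='valid')
    return pandas.Series(vals, index=s.index[len(weights) - 1:])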

Contributor
jreback commented Sep 21, 2013

Closing; this is more in statsmodels' domain.

@jreback jreback closed this Sep 21, 2013
Contributor
cpcloud commented Sep 21, 2013

FWIW, fftconvolve will be faster than rolling your own Cython version. You'd have to implement your own FFT to beat the "definition" implementation (the direct one used by correlate and convolve), and doing that is basically useless unless you're learning about FFTs (there are high-quality implementations that have been around for a while: FFTW, FFTPACK, maybe more).

Of course this also means you should use fftconvolve for cross-correlation. Why numpy decided to default to the slower version, we'll never know.
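A quick sanity check of the speed claim for one problem size (timings vary with machine, series length, and kernel length; for very short kernels the direct version can still win):

import timeit
import numpy as np
from scipy.signal import fftconvolve

x = np.random.random(100000)
w = np.random.random(100)

# Direct ("definition") convolution vs. FFT-based convolution.
t_direct = timeit.timeit(lambda: np.convolve(x, w, mode='valid'), number=10)
t_fft = timeit.timeit(lambda: fftconvolve(x, w, mode='valid'), number=10)
print('direct: %.4fs, fft: %.4fs' % (t_direct, t_fft))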
