Implement weighted moving average (pass array of weights) #886

Closed
wesm opened this Issue · 6 comments

7 participants

Wes McKinney Jeffrey Hsu meteore Troy Ponthieux Skipper Seabold jreback Phillip Cloud
Jeffrey Hsu

I gave it a shot. Weights do not have to be normalized at the moment. When the minimum period is less than the length of the weights, it doesn't renormalize the average; I'm not sure what the behavior should be.

https://github.com/jeffhsu3/pandas/compare/weighted_average
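A minimal numpy sketch of one possible behavior for that edge case (renormalizing the weights over however many observations are actually in the window) — the function name and signature here are illustrative, not the code in the linked branch:

```python
import numpy as np

def weighted_rolling_mean(values, weights, min_periods=1):
    """Rolling weighted average (illustrative sketch).

    When a window holds fewer observations than there are weights,
    the trailing weights are renormalized over the observations
    actually present, provided at least min_periods are available.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n, k = len(values), len(weights)
    out = np.full(n, np.nan)
    for i in range(n):
        m = min(i + 1, k)            # observations available in this window
        if m < min_periods:
            continue                 # leave NaN, as rolling_* functions do
        w = weights[-m:]             # most recent m weights
        out[i] = np.dot(values[i - m + 1:i + 1], w) / w.sum()
    return out
```

For example, `weighted_rolling_mean([1, 2, 3, 4], [0.5, 0.5])` gives `[1.0, 1.5, 2.5, 3.5]`: the first window has only one observation, so its single weight renormalizes to 1.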

meteore

This would make it possible to do convolutions and FIR filtering with pandas.
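As a sketch of the FIR-filtering use case this would cover, assuming scipy is available (a 5-tap moving-average filter; the taps here are just an example):

```python
import numpy as np
from scipy.signal import lfilter

# An FIR filter is lfilter with denominator a = [1]; b holds the taps.
taps = np.ones(5) / 5.0          # 5-tap moving average
x = np.arange(10, dtype=float)
y = lfilter(taps, [1.0], x)      # y[n] = mean of x[n-4:n+1] (zero-padded start)
```

Once a full window is available, each output is the plain 5-point mean, e.g. `y[4] == 2.0` for this input.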

Troy Ponthieux

I'm hoping to get this functionality as well.

Skipper Seabold
Collaborator

I'm not sure I understand why you wouldn't use the existing convolution code in numpy/scipy to do this. Would Cython really be faster than fftconvolve on big series? I didn't check.

In [50]: np.random.seed(12345)

In [51]: x = np.random.random(size=100)

In [52]: x = pandas.Series(x)

In [53]: weights = np.random.random(10)

In [54]: weights /= weights.sum()

In [55]: from scipy.signal import fftconvolve, convolve  # depends on size

In [56]: fftconvolve(x, weights, mode='valid')
Out[56]:
array([ 0.48806114,  0.4699973 ,  0.56282001,  0.62870073,  0.68197355,
        0.57545891,  0.62169473,  0.64379092,  0.65297064,  0.60404881,
        0.52361615,  0.50352046,  0.47251414,  0.58907332,  0.69035363,
        0.71600119,  0.71412992,  0.72127673,  0.70794396,  0.65661495,
        0.55435558,  0.52407079,  0.49853205,  0.58023334,  0.7016371 ,
        0.65212416,  0.49190644,  0.47329566,  0.46863786,  0.45679194,
        0.46234051,  0.58391968,  0.54200453,  0.48114622,  0.51340279,
        0.53233502,  0.51497697,  0.5358541 ,  0.62072028,  0.56308463,
        0.49212517,  0.40754766,  0.33825073,  0.35123196,  0.42332693,
        0.51606583,  0.52950296,  0.44079394,  0.42975181,  0.49261896,
        0.59498324,  0.6180508 ,  0.60367819,  0.48303673,  0.46025441,
        0.49618106,  0.53169628,  0.54171016,  0.59075358,  0.53381996,
        0.44129289,  0.49348725,  0.51075933,  0.55902604,  0.60853367,
        0.6697937 ,  0.71312713,  0.62902879,  0.59838022,  0.55089196,
        0.59655967,  0.53044648,  0.53761734,  0.53084328,  0.47724668,
        0.51884146,  0.63705283,  0.64955511,  0.59763594,  0.56413901,
        0.50962114,  0.49927102,  0.44665032,  0.52229761,  0.50640236,
        0.5093921 ,  0.54369071,  0.52473202,  0.68901516,  0.67945055,
        0.72106766])

Is the idea just some syntactic sugar for doing this on pandas objects? You might also be interested in the time-series filters in statsmodels, depending on your domain of interest.

http://statsmodels.sourceforge.net/devel/tsa.html#other-time-series-filters
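If it is just sugar, a minimal sketch of such a wrapper (the name `rolling_weighted_mean` is hypothetical, not an actual pandas API) could convolve and re-wrap the result as a Series aligned to the end of each full window:

```python
import numpy as np
import pandas as pd

def rolling_weighted_mean(s, weights):
    """Hypothetical sugar: weighted rolling mean of a Series.

    Returns one value per full window, indexed by the window's
    last observation, so the result aligns with the input Series.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()        # normalize so it's a mean
    # np.convolve reverses its kernel, so flip the weights to keep
    # weights[-1] applied to the most recent observation.
    vals = np.convolve(s.to_numpy(), weights[::-1], mode='valid')
    return pd.Series(vals, index=s.index[len(weights) - 1:])
```

E.g. `rolling_weighted_mean(pd.Series([1, 2, 3, 4]), [0.25, 0.75])` yields `1.75, 2.75, 3.75` at index labels 1, 2, 3.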

jreback
Owner

Closing, as this is more in the statsmodels domain.

jreback closed this
Phillip Cloud
Collaborator

FWIW, fftconvolve will be faster than rolling your own Cython version. You'd have to implement your own FFT to beat the "definition" implementation (used by correlate and convolve), and doing that is basically useless unless you're learning about FFTs; high-quality implementations such as FFTW and FFTPACK have been around for a while.

Of course, this also means that you should use fftconvolve for cross-correlation. Why numpy decided to default to the slower version, we'll never know.
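For instance, a sketch of cross-correlating via fftconvolve by flipping one input (the standard identity for real signals; the array sizes here are arbitrary):

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
a = rng.standard_normal(4096)
b = rng.standard_normal(4096)

# For real signals, correlate(a, b, 'full') == convolve(a, b[::-1], 'full'),
# so the FFT-based convolution gives the cross-correlation directly.
xc_fft = fftconvolve(a, b[::-1], mode='full')
xc_def = np.correlate(a, b, mode='full')   # direct "definition" version
```

Both arrays agree to floating-point tolerance; the FFT route wins once the inputs are large.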
