acf / pacf do not work on pandas objects #322

Closed
jseabold opened this issue Jun 20, 2012 · 8 comments

@jseabold
Member

commented Jun 20, 2012

Something I noticed writing some examples.

@jseabold

Member Author

commented Jun 26, 2012

Update. They work fine on pandas objects. It's only when you have missing data that they break. E.g.,

import statsmodels.api as sm
from statsmodels.datasets.macrodata import load_pandas

cpi = load_pandas().data["cpi"]
# diff() leaves a leading NaN, i.e. missing data
sm.tsa.pacf(cpi.diff())

Not really sure what to do here yet. Thoughts?
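
One possible caller-side workaround, sketched with NumPy only: strip missing values at the ends of the series (such as the leading NaN that `diff()` produces) and refuse to guess about gaps in the middle, since silently dropping interior points would shift the lags. The helper name `trim_missing` is hypothetical and is not part of statsmodels.

```python
import numpy as np


def trim_missing(x):
    """Hypothetical helper: strip leading/trailing NaNs from a 1d series.

    Interior NaNs raise instead of being dropped, because removing them
    would change the spacing between observations.
    """
    arr = np.asarray(x, dtype=float)
    valid = np.flatnonzero(~np.isnan(arr))
    if valid.size == 0:
        raise ValueError("no non-missing observations")
    trimmed = arr[valid[0]: valid[-1] + 1]
    if np.isnan(trimmed).any():
        raise ValueError("missing values inside the series; handle them explicitly")
    return trimmed


# A differenced series starts with NaN; trimming leaves the usable part.
diffed = np.array([np.nan, 0.5, -0.2, 0.1])
cleaned = trim_missing(diffed)
```

The cleaned array could then be passed to `sm.tsa.pacf` in place of `cpi.diff()`. Whether the library itself should do this is the open question in the comment above.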

@jseabold

Member Author

commented Jul 16, 2012

Spoke too soon. ACF doesn't work:

import pandas
import statsmodels.api as sm

dta = sm.datasets.sunspots.load_pandas().data
dta.index = pandas.Index(sm.tsa.datetools.dates_from_range('1700', '2008'))
del dta["YEAR"]
sm.tsa.acf(dta['SUNACTIVITY'])
# AssertionError: Index length did not match values
sm.tsa.acf(dta)  # needs a .values.squeeze()
# ValueError: object too deep for desired array

@jseabold

Member Author

commented Nov 13, 2012

There's actually another issue here. The "object too deep for desired array" error happens because DataFrames can't be 1d. So we need not only an asarray but also a squeeze, and probably a dims check for a better error message.
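
A minimal sketch of the asarray + squeeze + dims check described above. The helper name `as_1d_array` and the error wording are assumptions for illustration; this is not the code that landed in statsmodels.

```python
import numpy as np


def as_1d_array(x, name="x"):
    """Hypothetical helper: coerce array-like input to a 1d float ndarray."""
    arr = np.asarray(x, dtype=float)
    # squeeze drops length-1 axes, so an (n, 1) input becomes (n,)
    arr = np.squeeze(arr)
    if arr.ndim != 1:
        raise ValueError(
            f"{name} must be 1d or a single column, got shape {np.shape(x)}"
        )
    return arr


# An (n, 1) input, like the values of a one-column DataFrame, becomes 1d.
one_column = np.arange(5.0).reshape(-1, 1)
flat = as_1d_array(one_column)
```

Because `np.asarray` accepts pandas objects, the same helper would cover a Series or a single-column DataFrame, while a two-column frame gets a readable error instead of "object too deep for desired array".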

@jseabold jseabold reopened this Nov 13, 2012

@jseabold jseabold closed this in bcdb025 Nov 13, 2012

@josef-pkt josef-pkt reopened this May 30, 2013

@josef-pkt

Member

commented May 30, 2013

The asarray was only added to acovf. acf calls acovf if fft=False, but acf has its own calculation if fft=True.

Suggested fix: add the same asarray and dim change to acf directly.
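
To illustrate the suggested fix, here is a rough sketch in which the coercion happens once at the top of the function, so the direct and FFT branches both see the same 1d array. `acf_sketch` is a hypothetical stand-in written with plain NumPy, not the statsmodels implementation, and its padding length is an arbitrary choice.

```python
import numpy as np


def acf_sketch(x, nlags=40, fft=False):
    """Hypothetical autocorrelation function with shared input coercion."""
    # Coerce once, before branching, so both paths get a 1d array.
    x = np.squeeze(np.asarray(x, dtype=float))
    if x.ndim != 1:
        raise ValueError("x must be 1d or a single column")
    n = len(x)
    xd = x - x.mean()
    if fft:
        # Zero-pad to 2n so the circular correlation does not wrap around.
        nfft = 2 * n
        spec = np.fft.rfft(xd, n=nfft)
        acov = np.fft.irfft(spec * np.conj(spec), n=nfft)[:n]
    else:
        # Index n - 1 of the full correlation is lag 0.
        acov = np.correlate(xd, xd, mode="full")[n - 1:]
    return acov[: nlags + 1] / acov[0]


# A single-column 2d input gives the same 1d result on both paths.
column = np.random.default_rng(0).standard_normal((50, 1))
direct = acf_sketch(column, nlags=10, fft=False)
via_fft = acf_sketch(column, nlags=10, fft=True)
```

With the coercion hoisted like this, `fft=True` and `fft=False` agree up to floating-point error for the same input.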

@tshauck

Contributor

commented May 31, 2013

Alright, I'll try to look into the issue this weekend.

Trent Hauck

@josef-pkt

Member

commented Jun 22, 2013

This issue never mentions or links to pull request #486.

PierreBdR pushed a commit to PierreBdR/statsmodels that referenced this issue Sep 2, 2014

Merge branch 'fix-322'. Closes statsmodels#322 and statsmodels#486.
* fix-322:
  STY: Whitespace cleanup
  Explicity convert x to numpy array to allow pandas.Series to be passed
  Passing test for acovf with pandas Series, clean acovf import
  Added test to confirm it works with pandas Series
  Actually address issue statsmodels#322 with passing tests
  Addresses statsmodels#322: when passed a pandas Series acf will return a numpy array

PierreBdR pushed a commit to PierreBdR/statsmodels that referenced this issue Sep 2, 2014

@bashtage

Contributor

commented Sep 19, 2014

@josef-pkt This appears to work in 0.5.0. Maybe ready to be closed, or is it just missing a test?

@josef-pkt

Member

commented Sep 19, 2014

It's still broken

using the sunspot data from Skipper's example above

>>> dta
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 309 entries, 1700-12-31 00:00:00 to 2008-12-31 00:00:00
Data columns (total 1 columns):
SUNACTIVITY    309  non-null values
dtypes: float64(1)
>>> sm.tsa.acf(dta, fft=False)
array([ 1.        ,  0.82020129,  0.45126849,  0.03957655, -0.27579196,
       -0.42523943, -0.37659509, -0.15737391,  0.15820254,  0.47309753,
        0.65898002,  0.65029082,  0.45666254,  0.16179329, -0.12205105,
       -0.3161808 , -0.37471125, -0.30605753, -0.1348069 ,  0.09158727,
        0.2975632 ,  0.4207074 ,  0.41183954,  0.27020758,  0.04496208,
       -0.17428715, -0.33045026, -0.37287834, -0.28555061, -0.11794414,
        0.08293231,  0.24897507,  0.32752101,  0.28335919,  0.1375272 ,
       -0.05526386, -0.22973205, -0.31338879, -0.29355684, -0.17897285,
       -0.01769038])
>>> sm.tsa.acf(dta, fft=True)
array([[  1.00000000e+00,              nan,   1.00000000e+00, ...,
          1.00000000e+00,   1.00000000e+00,              nan],
       [  7.49831457e-01,              nan,  -1.00000000e+00, ...,
         -1.00000000e+00,  -1.00000000e+00,              nan],
       [  5.68819900e-01,              nan,   5.00000000e-01, ...,
          5.00000000e-01,   5.00000000e-01,              nan],
       ..., 
       [  1.21401450e+01,              nan,   0.00000000e+00, ...,
          0.00000000e+00,   0.00000000e+00,              nan],
       [  8.49950450e+00,              nan,   7.09642837e-17, ...,
         -4.37578782e-17,   7.09642837e-17,              nan],
       [  1.74907666e+00,              nan,  -8.87053546e-18, ...,
          5.46973478e-18,  -8.87053546e-18,              nan]])
>>> sm.tsa.acf(dta, fft=True).shape
(41, 625)

>>> sm.tsa.acf(dta.values.squeeze(), fft=True)
array([ 1.        ,  0.82020129,  0.45126849,  0.03957655, -0.27579196,
       -0.42523943, -0.37659509, -0.15737391,  0.15820254,  0.47309753,
        0.65898002,  0.65029082,  0.45666254,  0.16179329, -0.12205105,
       -0.3161808 , -0.37471125, -0.30605753, -0.1348069 ,  0.09158727,
        0.2975632 ,  0.4207074 ,  0.41183954,  0.27020758,  0.04496208,
       -0.17428715, -0.33045026, -0.37287834, -0.28555061, -0.11794414,
        0.08293231,  0.24897507,  0.32752101,  0.28335919,  0.1375272 ,
       -0.05526386, -0.22973205, -0.31338879, -0.29355684, -0.17897285,
       -0.01769038])
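
One plausible reading of the (41, 625) shape above, offered as an assumption rather than a trace of the statsmodels source: if the FFT path pads to a length of 625 and transforms along the last axis, a (309, 1) input is padded along its length-1 axis instead of along time. The NumPy-only sketch below reproduces just the shapes. The padded length `nfft = 625` and the random stand-in data are assumptions.

```python
import numpy as np

nobs, nlags = 309, 40
# Stand-in for the values of a one-column DataFrame (shape (309, 1)).
col = np.random.default_rng(0).standard_normal((nobs, 1))

nfft = 625  # assumed padded FFT length

# np.fft.fft works along the last axis by default, so the length-1 axis
# is zero-padded to 625 and the time axis is left untouched.
spec_2d = np.fft.fft(col, n=nfft)
shape_2d = spec_2d[: nlags + 1].shape

# After squeezing, the time axis is the one that gets transformed.
spec_1d = np.fft.fft(col.squeeze(), n=nfft)
shape_1d = spec_1d[: nlags + 1].shape

print(shape_2d, shape_1d)
```

Under that assumption the 2d input yields a (41, 625) result and the squeezed input yields 41 values, which matches the difference between the two `fft=True` calls above.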

jseabold added a commit to jseabold/statsmodels that referenced this issue Sep 26, 2014

jseabold added a commit that referenced this issue Sep 26, 2014

Merge pull request #2012 from jseabold/acf-dataframe
BUG: 2d 1 columns -> 1d. Closes #322.

yarikoptic added a commit to yarikoptic/statsmodels that referenced this issue Oct 23, 2014

Merge commit 'v0.5.0-1491-g850e0e4' into debian-experimental
* commit 'v0.5.0-1491-g850e0e4': (178 commits)
  DOC: Fix versions to match other docs.
  REF/ENH: Use clip pattern. Use it for resid_dev in Poisson.
  STY: Pep-8
  ENH: More numerically stable inv. nbinom.
  STY: Pep-8
  ENH: More numerically stable version of invlogit.
  TST: Test invlogit stability.
  BUG: Fix prediction for ARIMA d > 1. Closes statsmodels#1562.
  TST: Test predict for ARIMA with d > 1
  TST: Test forecast with ARIMA d > 1.
  BUG: Fix ARIMA.forecast for d > 1.
  ENH: Cleanup unintegrate. Add unintegrate_levels
  STY: Cleanup imports
  ENH: Better error message on object dtype. Closes statsmodels#880
  TST: Test dtype object error
  TST: Test DataFrame ACF with FFT.
  BUG: 2d 1 columns -> 1d. Closes statsmodels#322.
  TST: Silence convergence warnings in tests.
  ENH: Do not warn on intermediate results convergence.
  TST: Silence test warnings.
  ...