BUG: rolling.corr() produces wrong result with equal values #18430

Closed
byospe opened this Issue Nov 22, 2017 · 3 comments

byospe commented Nov 22, 2017

Code Sample, a copy-pastable example if possible

import pandas as pd

s = pd.Series([1,1,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,7,0,0,0])
pd.rolling_corr(s,s,6)  # deprecated spelling of s.rolling(6).corr(s)

Problem description

rolling_corr is producing the wrong result:

pd.rolling_corr(s,s,6)

0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 1.0
6 1.0
7 1.0
8 1.0
9 0.0
10 0.0
11 0.0
12 0.0
13 0.0
14 0.0
15 0.0
16 0.0
17 0.0
18 0.0
19 0.0
20 0.0
21 0.0
22 0.0
23 0.0
24 0.0
25 0.0
26 1.0
27 1.0
28 1.0
29 1.0
30 1.0
31 1.0
32 1.0
33 1.0

Rows 9 through 25 should be NaN instead of 0.0: every value in those windows is the same, so the standard deviation is zero and the correlation is undefined.
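To make the expected values concrete, here is a minimal reference sketch in plain Python that recomputes each window from scratch and returns NaN whenever a window is constant. The helper names `window_corr` and `rolling_corr_reference` are made up for illustration and are not pandas functions; the list is rebuilt to match the 34 rows of printed output above.

```python
import math

def window_corr(xs, ys):
    """Pearson correlation of two equal-length windows, NaN if undefined."""
    # A constant window has zero variance, so the correlation is 0/0.
    if min(xs) == max(xs) or min(ys) == max(ys):
        return math.nan
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def rolling_corr_reference(xs, ys, window):
    """Recompute the correlation for every full window; NaN before that."""
    out = []
    for end in range(1, len(xs) + 1):
        if end < window:
            out.append(math.nan)
        else:
            out.append(window_corr(xs[end - window:end], ys[end - window:end]))
    return out

# Same values as the Series in the report: 4 leading values, 22 zeros, 8 trailing.
s = [1, 1, 2, 2] + [0] * 22 + [5, 0, 0, 0, 7, 0, 0, 0]
expected = rolling_corr_reference(s, s, 6)

for i, value in enumerate(expected):
    print(i, value)
```

This is slow compared with an incremental rolling implementation, but it is handy as a ground truth when checking which rows ought to be NaN.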

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.1.35-pv-ts2
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.3
pytest: 3.0.7
pip: 9.0.1
setuptools: 36.4.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.6.2
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.3.0
numexpr: 2.6.2
feather: None
matplotlib: 1.5.1
openpyxl: 2.4.8
xlrd: 1.0.0
xlwt: None
xlsxwriter: 0.9.8
lxml: None
bs4: 4.5.3
html5lib: 0.999
sqlalchemy: 1.1.10
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Contributor

jreback commented Nov 22, 2017

I think this is a very similar problem to #18044, recently fixed in #18085. We NaN out results whose denominator (std * std) is zero, but in this case the denominator is only numerically very close to zero, not identically zero, so the check misses it.
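A small sketch of the effect described above, in plain Python. It does not reproduce pandas internals: `var_x` is a hypothetical stand-in for a variance that should be exactly zero but carries floating-point round-off, and both helper functions are invented for illustration. The tolerance-based check is one possible guard, not necessarily the change made in #18085.

```python
import math

# Hypothetical stand-in for a window variance that is mathematically 0
# but ends up as a tiny positive number because of round-off.
var_x = 0.1 + 0.2 - 0.3
var_y = var_x
cov = 0.0

def corr_exact_check(cov, var_x, var_y):
    """Return NaN only when the denominator is identically zero."""
    denom = math.sqrt(var_x * var_y)
    if denom == 0.0:
        return math.nan
    return cov / denom

def corr_tolerant_check(cov, var_x, var_y, tol=1e-12):
    """Return NaN when either variance is within `tol` of zero (assumed tolerance)."""
    if var_x <= tol or var_y <= tol:
        return math.nan
    return cov / math.sqrt(var_x * var_y)

print(var_x == 0.0)                            # False: close to zero, not zero
print(corr_exact_check(cov, var_x, var_y))     # 0.0, like the rows in the report
print(corr_tolerant_check(cov, var_x, var_y))  # nan
```

An absolute tolerance like this depends on the scale of the data, so it is only a starting point; making the variance calculation itself more accurate avoids having to pick a threshold.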

Contributor

jreback commented Nov 22, 2017

@byospe want to try a PR to fix?

@jreback jreback added this to the Next Major Release milestone Nov 22, 2017

@jreback jreback changed the title from pd.rolling_corr produces wrong result to BUG:rolling.corr() produces wrong result with equal values Nov 22, 2017

@jreback jreback changed the title from BUG:rolling.corr() produces wrong result with equal values to BUG: rolling.corr() produces wrong result with equal values Nov 22, 2017

Contributor

Licht-T commented Nov 25, 2017

I am working on this and the fix is almost done.
It seems that the numerical calculation is what matters here.
https://github.com/pandas-dev/pandas/blob/master/pandas/core/window.py#L1064
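For readers wondering how the numerical calculation can matter, here is a hedged sketch comparing two ways to compute a rolling sample variance in plain Python. `rolling_var_online` uses a generic Welford-style add/remove update, which is an assumption about the general technique and not a copy of the pandas code at the link; `rolling_var_two_pass` recomputes every window from scratch. Both names are hypothetical.

```python
import math

def rolling_var_online(values, window):
    """Sliding sample variance with Welford-style add/remove updates."""
    n, mean, m2 = 0, 0.0, 0.0
    out = []
    for i, x in enumerate(values):
        # Add the newest value to the running mean and sum of squared deviations.
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if i >= window:
            # Remove the value that just left the window (reverse update).
            old = values[i - window]
            n -= 1
            delta = old - mean
            mean -= delta / n
            m2 -= delta * (old - mean)
        # Round-off can push m2 slightly below zero, so clamp it.
        out.append(math.nan if n < window else max(m2, 0.0) / (n - 1))
    return out

def rolling_var_two_pass(values, window):
    """Recompute mean and squared deviations for every full window."""
    out = []
    for end in range(1, len(values) + 1):
        if end < window:
            out.append(math.nan)
            continue
        w = values[end - window:end]
        m = sum(w) / window
        out.append(sum((v - m) ** 2 for v in w) / (window - 1))
    return out

# Same values as the Series in the report.
s = [1, 1, 2, 2] + [0] * 22 + [5, 0, 0, 0, 7, 0, 0, 0]
online = rolling_var_online(s, 6)
two_pass = rolling_var_two_pass(s, 6)

for i, (a, b) in enumerate(zip(online, two_pass)):
    print(i, a, b)
```

The two results agree to within round-off, but the online version carries state from earlier windows, so an all-zero window may come out as a tiny non-zero number rather than exactly 0.0. The two-pass version returns exactly 0.0 for those windows because every intermediate value is zero.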

@Licht-T Licht-T referenced this issue Nov 25, 2017

Merged

BUG: Fix inaccurate rolling.var calculation #18481

4 of 4 tasks complete

@jreback jreback modified the milestones: Next Major Release, 0.21.1 Nov 25, 2017
