BUG: Offset-based rolling window, with only one row in dataframe and closed='left', max and min functions make python crash #24718

Closed
jh-wu opened this issue Jan 11, 2019 · 2 comments

@jh-wu (Contributor) commented Jan 11, 2019

Code Sample

import pandas as pd

# Case 1
df = pd.Series(data=[2], index=pd.date_range('2000', periods=1))
s = df.rolling('10D', closed='left').max()
print(s)

# Case 2 -- by groupby
df = pd.DataFrame(data={'A':[1,1,2],'B': [3,2,1]}, index=pd.date_range('2000', periods=3))
s = df.groupby('A', sort = False)['B'].rolling('10D', closed='left').max()
print(s)

Problem description

This issue is very similar to #21704, so I have copied its problem description below. The only difference is that here the series contains only one entry, which is what triggers the crash.

(From #21704)
"With this rolling and aggregation function, python just crashes. It does too with .min() or .agg(np.max) as aggregation steps. It does not when closed='right' or with mean as an aggregation function."

Expected Output

#Case 1
2000-01-01    NaN
Freq: D, dtype: float64

#Case 2
A            
1  2000-01-01    NaN
   2000-01-02    3.0
2  2000-01-03    NaN
Name: B, dtype: float64
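
To make the expected output above easier to check, here is a minimal brute-force sketch in plain Python (no pandas). It assumes that closed='left' with a '10D' offset means the half-open window [t - 10D, t), so the current row is excluded. The helper name rolling_max_closed_left is hypothetical and exists only for this illustration; it is not part of pandas.

```python
from datetime import date, timedelta
import math


def rolling_max_closed_left(times, values, window):
    """Brute-force rolling max over the half-open window [t - window, t)."""
    out = []
    for t in times:
        # Keep only observations strictly before t and no older than the window.
        in_window = [v for u, v in zip(times, values) if t - window <= u < t]
        # An empty window has no maximum, so the result is NaN.
        out.append(float(max(in_window)) if in_window else math.nan)
    return out


days = [date(2000, 1, 1) + timedelta(days=i) for i in range(3)]
ten_days = timedelta(days=10)

# Case 1: a single row, so the window before it is empty.
case1 = rolling_max_closed_left(days[:1], [2], ten_days)

# Case 2: group A == 1 holds the first two rows, group A == 2 holds the third.
case2_group1 = rolling_max_closed_left(days[:2], [3, 2], ten_days)
case2_group2 = rolling_max_closed_left(days[2:], [1], ten_days)
```

Under that assumption, the first row of every series or group always sees an empty window, which is why NaN is expected there rather than a crash.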


Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.2.final.0
python-bits: 64
OS: Darwin
OS-release: 18.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_AU.UTF-8
LOCALE: en_AU.UTF-8

pandas: 0.24.0.dev0+1563.gd46d2a5ea
pytest: 4.1.0
pip: 18.1
setuptools: 40.6.2
Cython: 0.29.2
numpy: 1.15.4
scipy: 1.2.0
pyarrow: 0.11.1
xarray: 0.11.2
IPython: 7.2.0
sphinx: 1.8.3
patsy: 0.5.1
dateutil: 2.7.5
pytz: 2018.9
blosc: 1.7.0
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.9
feather: None
matplotlib: 3.0.2
openpyxl: 2.5.12
xlrd: 1.2.0
xlwt: 1.3.0
xlsxwriter: 1.1.2
lxml.etree: 4.3.0
bs4: 4.7.1
html5lib: 1.0.1
sqlalchemy: 1.2.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: 0.2.0
fastparquet: 0.2.1
pandas_gbq: None
pandas_datareader: None
gcsfs: None

@WillAyd (Member) commented Jan 11, 2019

OK thanks for the report. Investigation and PRs are always welcome!

@jh-wu (Contributor, Author) commented Jan 17, 2019

Hi @WillAyd, I spent some time investigating this bug today. It appears to be caused by the function _roll_min_max_variable in window.pyx (line 1342) not handling the case where the deque Q is empty. I have created a PR for this; please review it when you have time. Thanks!
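
For readers unfamiliar with the technique, the following is a rough pure-Python sketch of a deque-based rolling max over variable windows, with an explicit guard for the empty-deque case. It is not the Cython code from window.pyx and not the fix in the PR; the function name rolling_max_variable and the start/end index arrays are assumptions made for illustration only.

```python
from collections import deque
import math


def rolling_max_variable(values, start, end):
    """Rolling max where output i covers values[start[i]:end[i]].

    start and end are assumed to be non-decreasing, with start[i] <= end[i],
    as they would be for a time-ordered index.
    """
    q = deque()  # candidate indices; their values are in decreasing order
    out = []
    added = 0  # number of values pushed so far
    for s, e in zip(start, end):
        # Push values entering on the right, dropping dominated candidates.
        while added < e:
            while q and values[q[-1]] <= values[added]:
                q.pop()
            q.append(added)
            added += 1
        # Drop candidates that have left the window on the left side.
        while q and q[0] < s:
            q.popleft()
        # Guard: an empty window has no maximum, so emit NaN
        # instead of reading the front of an empty deque.
        out.append(float(values[q[0]]) if q else math.nan)
    return out


# Single row with closed='left': the only window is values[0:0], i.e. empty.
single_row = rolling_max_variable([2], start=[0], end=[0])

# Two rows with closed='left': windows are values[0:0] and values[0:1].
two_rows = rolling_max_variable([3, 2], start=[0, 0], end=[0, 1])
```

Without the `if q` guard, the single-row case would read the front of an empty deque; in Python that raises an IndexError, whereas in compiled code the equivalent unchecked access can crash the interpreter.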
