pandas.DataFrame.resample behaves inconsistently when the rule is set to 24H/1D and 48H/2D #24127

lucky06688 · 2018-12-06T09:26:05Z

Code Sample, a copy-pastable example if possible

>>> pd.__version__
'0.23.4'
>>> index = pd.date_range('11/11/2000 06:00:00', periods=50, freq='H')
>>> series = pd.Series(range(50), index=index)
>>> series.head(10)
2000-11-11 06:00:00    0
2000-11-11 07:00:00    1
2000-11-11 08:00:00    2
2000-11-11 09:00:00    3
2000-11-11 10:00:00    4
2000-11-11 11:00:00    5
2000-11-11 12:00:00    6
2000-11-11 13:00:00    7
2000-11-11 14:00:00    8
2000-11-11 15:00:00    9
Freq: H, dtype: int64
>>> series.resample('24H').count()
2000-11-11    18
2000-11-12    24
2000-11-13     8
Freq: 24H, dtype: int64
>>> series.resample('1D').count()
2000-11-11    18
2000-11-12    24
2000-11-13     8
Freq: D, dtype: int64
>>> series.resample('48H').count()
2000-11-11    42
2000-11-13     8
Freq: 48H, dtype: int64
>>> series.resample('2D').count()
2000-11-11 06:00:00    48
2000-11-13 06:00:00     2
dtype: int64

Problem description

As you can see, when I set rule to '24H' or '1D', resample behaves consistently (in both cases the bins start at 0 o'clock on the first day). With '48H' and '2D' it does not: the '48H' bins start at midnight, while the '2D' bins start at the first timestamp (06:00).
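
To make the two anchoring behaviors concrete, here is a minimal pure-Python sketch that reproduces the counts above without pandas. It is only an illustration of the binning arithmetic, not pandas' implementation; `bin_counts` is a hypothetical helper introduced for this example.

```python
from collections import Counter
from datetime import datetime, timedelta

# Same data as the report: 50 hourly timestamps starting at 06:00.
start = datetime(2000, 11, 11, 6)
stamps = [start + timedelta(hours=i) for i in range(50)]


def bin_counts(stamps, origin, width):
    """Count timestamps per left-closed bin of `width`, with bins aligned to `origin`.

    Hypothetical helper for illustration only; not part of pandas.
    """
    counts = Counter()
    for ts in stamps:
        k = (ts - origin) // width  # index of the bin containing ts
        counts[origin + k * width] += 1
    return dict(sorted(counts.items()))


width = timedelta(hours=48)

# Bins anchored at midnight of the first day (what '48H' does above).
midnight_anchored = bin_counts(stamps, datetime(2000, 11, 11), width)

# Bins anchored at the first timestamp (what '2D' does above).
first_stamp_anchored = bin_counts(stamps, start, width)

print(list(midnight_anchored.values()))     # [42, 8]
print(list(first_stamp_anchored.values()))  # [48, 2]
```

The only difference between the two results is the origin the bins are aligned to, which is why the same 48-hour width yields 42/8 in one case and 48/2 in the other.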

Expected Output

If the consistent behavior of '24H' and '1D' is correct, then '48H' and '2D' should also behave consistently with each other.

Output of `pd.show_versions()`

commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.23.4
pytest: None
pip: 18.1
setuptools: 40.4.3
Cython: None
numpy: 1.15.2
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

aa1371 · 2018-12-07T15:39:06Z

Seems the issue is in pandas.core.resample._get_range_edges, line 1591.

I think `day_nanos % offset.nanos == 0` should be the other way around. As written, it will only ever evaluate to True for Day(n=1). I'm guessing this wasn't the intention, as it would be overly complicated compared to `day_nanos == offset.nanos`.
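
The modulo argument can be checked with plain integer arithmetic. The sketch below is not the pandas source; it assumes only that `offset.nanos` for `Day(n)` equals `n` times the number of nanoseconds in a day, and the function names are hypothetical.

```python
# Nanoseconds in one day, standing in for day_nanos.
DAY_NANOS = 24 * 60 * 60 * 10**9


def check_as_written(n_days):
    """The condition as quoted: day_nanos % offset.nanos == 0."""
    offset_nanos = n_days * DAY_NANOS  # assumed value of Day(n).nanos
    return DAY_NANOS % offset_nanos == 0


def check_reversed(n_days):
    """The operands swapped: offset.nanos % day_nanos == 0."""
    offset_nanos = n_days * DAY_NANOS
    return offset_nanos % DAY_NANOS == 0


for n in (1, 2, 3):
    print(n, check_as_written(n), check_reversed(n))
# 1 True True
# 2 False True
# 3 False True
```

For any n > 1 the offset is larger than one day, so `DAY_NANOS % offset_nanos` is just `DAY_NANOS`, which is never zero. With the operands swapped, every whole-day multiple passes.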

aa1371 · 2018-12-07T22:48:40Z

Added the fix in this commit since I was already updating _get_range_edges. Tests pass, and resample now gives the expected output.

eb05501

Edit: It was suggested that a new PR be opened for this (#24195)

This was referenced Dec 7, 2018

BUG/ENH - base argument no longer ignored in period resample #23941

Merged

BUG - anchoring dates for resample with Day(n>1) #24159

Merged

jreback added Bug Resample resample method labels Dec 13, 2018

jreback added this to the 0.24.0 milestone Dec 13, 2018

jreback closed this as completed in #24159 Dec 13, 2018

jwenfai mentioned this issue Dec 22, 2018

CFTimeIndex Resampling pydata/xarray#2593

Merged

3 tasks

cwhanse mentioned this issue May 30, 2019

aggregation returns error with 1D frequency. '24h' frequency is ok. NREL/rdtools#114

Closed
