New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resample() silently ignores misstyped str kwargs, e.g. for label #19303

Closed
cchwala opened this Issue Jan 18, 2018 · 1 comment

Comments

Projects
None yet
3 participants
@cchwala
Contributor

cchwala commented Jan 18, 2018

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd

In [2]: df = pd.DataFrame(index=pd.date_range(start='2010-01-23', periods=1000, freq='min'), data={'foo': pd.np.random.rand(1000)})

In [3]: df.resample('h').mean().tail()
Out[3]: 
                          foo
2010-01-23 12:00:00  0.477997
2010-01-23 13:00:00  0.488294
2010-01-23 14:00:00  0.426095
2010-01-23 15:00:00  0.549081
2010-01-23 16:00:00  0.466148

In [4]: df.resample('h', label='right').mean().tail()
Out[4]: 
                          foo
2010-01-23 13:00:00  0.477997
2010-01-23 14:00:00  0.488294
2010-01-23 15:00:00  0.426095
2010-01-23 16:00:00  0.549081
2010-01-23 17:00:00  0.466148

In [5]: 

In [5]: df.resample('h', label='right_wrong').mean().tail()
Out[5]: 
                          foo
2010-01-23 12:00:00  0.477997
2010-01-23 13:00:00  0.488294
2010-01-23 14:00:00  0.426095
2010-01-23 15:00:00  0.549081
2010-01-23 16:00:00  0.466148

# The last command should fail!

Problem description

Recently (and maybe many times before without noticing...) I mistyped the string passed as kwarg label in resample, e.g. df.resample('5min', label='rigth') and wondered later why the alignment of my resampled data with the raw data was wrong. It turns out that the actual string passed as kwarg label is never checked to be either left or right. Hence, if it is

Is this on purpose, or should this be fixed?

If yes, a check could be inserted here. Or maybe deeper in the class hierarchy? Maybe other kwargs, like closed could also be affected.

If I get a pointer I could come up with a PR.

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.14.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: None.None

pandas: 0.21.0
pytest: 3.2.5
pip: 9.0.1
setuptools: 36.7.2
Cython: None
numpy: 1.13.3
scipy: 1.0.0
pyarrow: 0.7.1
xarray: 0.10.0
IPython: 5.5.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: None
feather: 0.4.0
matplotlib: 2.1.0
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: 2.7.3.2 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
@chris-b1

This comment has been minimized.

Contributor

chris-b1 commented Jan 18, 2018

I agree this should raise, and I think where you pointed (TimeGrouper.__init__ ) is the right spot to do it.

I also agree that closed can & should be checked. convention looks like it raises an error eventually.

df.index = df.index.to_period()
df.resample('h', convention='junk').mean().tail()
ValueError: How must be one of S or E

So a PR is definitely welcome!

@chris-b1 chris-b1 added this to the Next Major Release milestone Jan 18, 2018

@jreback jreback modified the milestones: Next Major Release, 0.23.0 Jan 18, 2018

@jreback jreback changed the title from `resample()` silently ignores misstyped str kwargs, e.g. for `label` to resample() silently ignores misstyped str kwargs, e.g. for label Jan 18, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment