Resampling converts int to float, but only in group by #12202

Closed
m313 opened this Issue Feb 2, 2016 · 1 comment


m313 commented Feb 2, 2016

import pandas as pd
import numpy as np

df = pd.DataFrame({'date': pd.date_range(start='2016-01-01', periods=4, freq='W'),
                   'group': [1, 1, 2, 2],
                   'val': [5, 6, 7, 8]})
df['val'] = df['val'].astype(np.int32)
df.set_index('date', inplace=True)

df['val'].dtype
#[out] dtype('int32')

Calling resample() on the DataFrame does not change the dtype:

df1 = df.resample('1D', fill_method='ffill')
df1['val'].dtype
#[out] dtype('int32')

However, calling resample() after a groupby returns float64 instead:

df2 = df.groupby('group').resample('1D', fill_method='ffill')
df2['val'].dtype
#[out] dtype('float64')

Why is val converted to float in the groupby case?
I originally posted this question on Stack Overflow.
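
For readers wondering how an integer column can turn into float at all, here is a minimal sketch of one general pandas mechanism. It is an illustration only; nothing in this thread confirms that this is what happened inside the groupby path in 0.17.1. The Series `s` and the `daily` index are made up for the example. Reindexing onto a finer index without a fill method introduces NaN, which forces an upcast to float64, and filling afterwards does not restore the integer dtype. Filling during the reindex never creates NaN, so the dtype survives.

```python
import numpy as np
import pandas as pd

# Hypothetical two-point weekly series, mirroring one group from the report.
s = pd.Series([5, 6],
              index=pd.to_datetime(['2016-01-03', '2016-01-10']),
              dtype=np.int32)
daily = pd.date_range(s.index[0], s.index[-1], freq='D')

# Reindex first, fill later: the intermediate NaN values force float64,
# and ffill() keeps that dtype.
gaps = s.reindex(daily)
filled_after = gaps.ffill()

# Fill while reindexing: no NaN is ever created, so int32 is preserved.
filled_during = s.reindex(daily, method='ffill')

print(gaps.dtype, filled_after.dtype, filled_during.dtype)
```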

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8

pandas: 0.17.1
nose: None
pip: 7.1.2
setuptools: 18.2
Cython: 0.23.4
numpy: 1.10.1
scipy: None
statsmodels: None
IPython: 4.0.0
sphinx: None
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: None
numexpr: 2.4.6
matplotlib: 1.5.0
openpyxl: None
xlrd: 0.9.4
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
Jinja2: None
Contributor

jreback commented Feb 2, 2016

I don't think the dtype was getting passed through correctly; this is fixed by the recently merged
#11841 (in master, 0.18.0 coming soon).

It just needs confirming tests.

Note the new API:

In [13]: df.groupby('group').resample('1D').ffill()
Out[13]: 
            val
date           
2016-01-03    5
2016-01-10    6
2016-01-17    7
2016-01-24    8

In [14]: df.groupby('group').resample('1D').ffill().val.dtype
Out[14]: dtype('int32')
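
A self-contained way to check this on your own installation, written as a sketch against the deferred resample API shown above (0.18.0 and later). The helper name `grouped_ffill_val` is made up for this example, and selecting the `val` column before resampling is a choice made here to keep the result a single Series; the session above resamples the whole grouped frame instead.

```python
import numpy as np
import pandas as pd


def grouped_ffill_val(frame):
    """Upsample 'val' to daily frequency within each group, forward-filling gaps."""
    # Hypothetical helper: the column is selected before resampling so that
    # the result is a Series indexed by (group, date).
    return frame.groupby('group')['val'].resample('1D').ffill()


# Same data as in the original report.
df = pd.DataFrame({'date': pd.date_range(start='2016-01-01', periods=4, freq='W'),
                   'group': [1, 1, 2, 2],
                   'val': [5, 6, 7, 8]})
df['val'] = df['val'].astype(np.int32)
df = df.set_index('date')

result = grouped_ffill_val(df)
# If the dtype is preserved, this prints the same dtype as df['val'].
print(result.dtype)
```

If the printed dtype differs from `df['val'].dtype`, the installed version is showing the behaviour reported in this issue.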

jreback added this to the 0.18.0 milestone Feb 2, 2016

jreback added the Testing label Feb 10, 2016

jreback added a commit to jreback/pandas that referenced this issue Feb 12, 2016

TST: validation tests for resample/groupby preservation
closes #12202
bde178a

jreback closed this in 311b9a9 Feb 12, 2016
