Resample category data with timedelta index #12169
Comments
Hmm, this does appear a little buggy. You shouldn't need to specify the dtype on aggregations; they are inferred. Here I think there is an embedded exception which is caught instead of actually computing correctly.
jreback added this to the 0.18.0 milestone Jan 28, 2016
I'll look at this after #11841, as the timedelta resampling is tested a bit more there (but not enough!)
The root cause of this issue is that when a Series is constructed from a dict with a TimedeltaIndex as the keys, the values are treated as float64. See pandas/core/series.py, lines 172 to 185:

    try:
        if isinstance(index, DatetimeIndex):
            if len(data):
                # coerce back to datetime objects for lookup
                data = _dict_compat(data)
                data = lib.fast_multiget(data, index.astype('O'),
                                         default=np.nan)
            else:
                data = np.nan
        elif isinstance(index, PeriodIndex):
            data = ([data.get(i, nan) for i in index]
                    if data else np.nan)
        else:
            data = lib.fast_multiget(data, index.values,
                                     default=np.nan)

I believe just a small change here fixes the output.

Before

    In [5]: fxx = d2.resample('10s', how=lambda x: (x.value_counts().index[0]))
    In [6]: fxx
    Out[6]:
             Group_obj Group
    00:00:00         A   NaN
    00:00:10         A   NaN

After

    In [5]: fxx = d2.resample('10s', how=lambda x: (x.value_counts().index[0]))
    In [6]: fxx
    Out[6]:
             Group_obj Group
    00:00:00         A     A
    00:00:10         A     A
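To make the mechanism concrete, here is a rough stand-in written with only the standard library. It is not the pandas code: `multiget`, `raw_keys` and `object_keys` are hypothetical names, and the integer nanosecond keys are only an assumption about what a lookup against raw index values amounts to. It shows how a keyed lookup with a NaN default silently returns all-NaN (hence float) results when the lookup keys are not the same kind of object as the dict keys.

```python
import math
from datetime import timedelta

# Hypothetical stand-in for the dict handed to the Series constructor:
# keys are timedelta objects, values are category labels.
data = {timedelta(seconds=0): "A", timedelta(seconds=10): "A"}

# Assumed stand-in for raw index values: plain nanosecond integers.
raw_keys = [0, 10_000_000_000]
# Stand-in for the index converted to Python objects before the lookup.
object_keys = [timedelta(seconds=0), timedelta(seconds=10)]


def multiget(mapping, keys, default=math.nan):
    """Look up every key, falling back to the default on a miss."""
    return [mapping.get(key, default) for key in keys]


missed = multiget(data, raw_keys)     # no key matches, so every entry is NaN
found = multiget(data, object_keys)   # keys match, so the labels come back

print(missed)
print(found)
```

No exception is raised on the missed lookups, which fits the symptom in the report: the result is simply filled with NaN.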
BranYang added a commit to BranYang/pandas that referenced this issue Feb 9, 2016 (commit 7cf1be9 by BranYang)
BranYang referenced this issue Feb 9, 2016: Fix #12169 - Resample category data with timedelta index #12271 (Closed)
jreback closed this in e9558d3 Feb 10, 2016
cldy added a commit to cldy/pandas that referenced this issue Feb 11, 2016 (commit fa1e2c8 by BranYang + cldy)
mapa17 commented Jan 28, 2016
Hi,
I get very strange behavior when I try to resample categorical data with a timedelta index, as compared to a datetime index.
It seems to me that the aggregated result is always NaN when a timedelta index is used for the categorical data.
Is this expected?
Thx
PS: is there a way to specify the dtype for the aggregated columns?
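
For anyone trying this on a recent pandas, a minimal sketch of the scenario might look like the following. The frame `d2` and its `Group` column are made-up sample data, not the data from the report, and the sketch assumes a pandas version where the `how=` argument to `resample` has been removed in favour of calling `.agg(...)` on the resampler.

```python
import pandas as pd

# Made-up sample data: 20 one-second steps, all in category "A".
idx = pd.to_timedelta(range(20), unit="s")
d2 = pd.DataFrame({"Group": pd.Categorical(["A"] * 20)}, index=idx)

# Most frequent category per 10-second bin; every bin here is non-empty,
# so value_counts().index[0] always has something to return.
result = d2["Group"].resample("10s").agg(lambda x: x.value_counts().index[0])

print(result)
```

With the fix in place, each bin should report the category label rather than NaN.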