DataFrame.apply not working with datetimes #6125

dbew · 2014-01-27T15:19:18Z

Calling apply on a DataFrame that contains datetimes gives an unexpected result. With a DataFrame that holds only integers and strings, apply returns the market names as expected:

import pandas as pd

positions = pd.DataFrame([[1, 'ABC', 50], [1, 'YUM', 20],
                          [1, 'DEF', 20], [2, 'ABC', 50],
                          [2, 'YUM', 20], [2, 'DEF', 20]],
                         columns=['a', 'market', 'position'])
positions.apply(lambda r: r['market'], axis=1)
Out[210]: 
0    ABC
1    YUM
2    DEF
3    ABC
4    YUM
5    DEF
dtype: object

If we replace the data in column 'a' with datetimes, the result is wrong. The first value in the market column is repeated for every row:

import datetime

positions = pd.DataFrame([[datetime.datetime(2013, 1, 1), 'ABC', 50], 
                           [datetime.datetime(2013, 1, 1), 'YUM', 20],
                           [datetime.datetime(2013, 1, 1), 'DEF', 20],
                           [datetime.datetime(2013, 1, 2), 'ABC', 50],
                           [datetime.datetime(2013, 1, 2), 'YUM', 20], 
                           [datetime.datetime(2013, 1, 2), 'DEF', 20]],
                          columns=['a', 'market', 'position'])
positions.apply(lambda r: r['market'], axis=1)
Out[213]: 
0    ABC
1    ABC
2    ABC
3    ABC
4    ABC
5    ABC
dtype: object

If you replace the lambda with a function that prints the object passed in, you can see that it only ever receives the values of the first row of the dataframe (only the Name changes):

def print_input(r):
    print(r)
    return 1

positions.apply(print_input, axis=1)
a           2013-01-01 00:00:00
market                      ABC
position                     50
Name: 0, dtype: object
a           2013-01-01 00:00:00
market                      ABC
position                     50
Name: 1, dtype: object
a           2013-01-01 00:00:00
market                      ABC
position                     50
Name: 2, dtype: object
a           2013-01-01 00:00:00
market                      ABC
position                     50
Name: 3, dtype: object
a           2013-01-01 00:00:00
market                      ABC
position                     50
Name: 4, dtype: object
a           2013-01-01 00:00:00
market                      ABC
position                     50
Name: 5, dtype: object
Out[215]: 
0    1
1    1
2    1
3    1
4    1
5    1
dtype: int64

This is new in master. I didn't see it in pandas 0.11.0 or 0.13.0.
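
For anyone who wants to check their own build, here is a minimal sketch of a self-contained check based on the report above. It assumes only that pandas is importable as `pd`; the variable names `result` and `matches` are illustrative and not part of the original report.

```python
import datetime

import pandas as pd

# Same frame as in the report: a datetime column mixed with strings and ints.
positions = pd.DataFrame(
    [[datetime.datetime(2013, 1, 1), 'ABC', 50],
     [datetime.datetime(2013, 1, 1), 'YUM', 20],
     [datetime.datetime(2013, 1, 1), 'DEF', 20],
     [datetime.datetime(2013, 1, 2), 'ABC', 50],
     [datetime.datetime(2013, 1, 2), 'YUM', 20],
     [datetime.datetime(2013, 1, 2), 'DEF', 20]],
    columns=['a', 'market', 'position'])

result = positions.apply(lambda r: r['market'], axis=1)

# On an unaffected build, each row yields its own market name,
# so the applied result matches the 'market' column element by element.
matches = result.tolist() == positions['market'].tolist()
print(matches)
```

On an affected build, `matches` would be False because every element of `result` would be 'ABC'.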

jreback · 2014-01-27T15:35:45Z

To improve apply performance, I am not copying the data that is passed to the applied function and am just overwriting it in place. This doesn't work when datelike types are intermixed (they are themselves a view on the underlying data). So a mixed-type frame has to do this reduction using a slower, Python-based method.
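
As a rough illustration of what a slower, Python-based row reduction can look like, the sketch below builds a fresh Series for every row instead of overwriting a shared buffer. This is not the code from the fix: `apply_rowwise_slow` is a hypothetical helper written for this example, and it relies only on the public `DataFrame.iterrows` method.

```python
import datetime

import pandas as pd


def apply_rowwise_slow(df, func):
    """Hypothetical helper: call func on a freshly built Series for each row."""
    # iterrows() constructs a new Series per row, so no row data is
    # overwritten in place between calls to func.
    return pd.Series({idx: func(row) for idx, row in df.iterrows()})


# Small mixed-type frame: datetimes alongside strings and ints.
positions = pd.DataFrame(
    [[datetime.datetime(2013, 1, 1), 'ABC', 50],
     [datetime.datetime(2013, 1, 2), 'YUM', 20]],
    columns=['a', 'market', 'position'])

markets = apply_rowwise_slow(positions, lambda r: r['market'])
print(markets.tolist())
```

The trade-off is the one described above: allocating a new object per row is safe for mixed dtypes but costs more than reusing a single buffer.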

dbew · 2014-01-28T10:55:22Z

Thanks, that's working for me now (on head of master).

jreback mentioned this issue Jan 27, 2014

BUG: DataFrame.apply when using mixed datelike reductions (GH6125) #6126

Merged

jreback closed this as completed in #6126 Jan 27, 2014
