
No way of keeping groupby column with ffill/bfill/pad #27751

@goriccardo

Description

import pandas as pd

df = pd.DataFrame(0, index=[1,2,3,4], columns=['a', 'b', 'c'])
# Starting from 0.25, groupby pad behaves like other groupby functions and removes the grouping column
df.groupby(df.a).mean()
   b  c
a      
0  0  0
df.groupby(df.a).pad()
   b  c
1  0  0
2  0  0
3  0  0
4  0  0

# However, for other functions it is possible to keep the grouping column using as_index=False
df.groupby(df.a, as_index=False).mean()
   a  b  c
0  0  0  0
# For ffill/bfill/pad, however, as_index does not help, because the grouping column is not used as the index in the first place
df.groupby(df.a, as_index=False).pad()
   b  c
1  0  0
2  0  0
3  0  0
4  0  0
# The only way to keep the column is to group by an otherwise useless copy of the column Series
df.groupby(df.a.copy()).pad()
   a  b  c
1  0  0  0
2  0  0  0
3  0  0  0
4  0  0  0

Problem description

Starting from 0.25, ffill behaves like other groupby functions and removes the grouping column. As far as I understand, there is no non-hacky way of getting the old behaviour back. This is inconsistent with other functions, where the keyword argument as_index can be used to keep the grouping column.

The only way to get the old behaviour is to group by a copy of the column. However, this view-dependent behaviour is not only very confusing (at least to me) but also inefficient, as it requires a useless copy. Moreover, it is not a backward-compatible solution.

I would suggest either adding a keep_group_columns keyword that works consistently with all groupby functions, or adding a special keyword only for the groupby functions that keep the original index.
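
To illustrate the requested behaviour, here is a minimal sketch of a user-side helper. The name ffill_keep_groups is hypothetical and not part of pandas, and the sketch assumes the grouping key is a single column name. It selects the value columns explicitly, so it does not depend on whether groupby drops the grouping column, and then joins the key column back on the original index.

```python
import numpy as np
import pandas as pd


def ffill_keep_groups(frame, key):
    """Hypothetical helper: forward-fill within groups and keep the `key` column."""
    value_cols = [col for col in frame.columns if col != key]
    # Select the value columns explicitly so the result does not depend on
    # whether groupby drops the grouping column.
    filled = frame.groupby(key)[value_cols].ffill()
    # ffill keeps the original index, so the key column can be joined back
    # by index alignment.
    return frame[[key]].join(filled)


df = pd.DataFrame(
    {
        "a": [0, 0, 1, 1],
        "b": [1.0, np.nan, 2.0, np.nan],
        "c": [np.nan, 3.0, np.nan, 4.0],
    },
    index=[1, 2, 3, 4],
)
result = ffill_keep_groups(df, "a")
print(result)
```

The result keeps column a alongside the group-wise filled b and c, with the original index. A built-in keyword would make a wrapper like this unnecessary.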
