Description
import pandas as pd

df = pd.DataFrame(0, index=[1, 2, 3, 4], columns=['a', 'b', 'c'])

# Starting from 0.25, groupby pad behaves like other groupby functions and removes the grouping column
df.groupby(df.a).mean()
#    b  c
# a
# 0  0  0
df.groupby(df.a).pad()
#    b  c
# 1  0  0
# 2  0  0
# 3  0  0
# 4  0  0

# For other functions, however, the grouping column can be kept with as_index=False
df.groupby(df.a, as_index=False).mean()
#    a  b  c
# 0  0  0  0

# For ffill/bfill/pad, as_index does not help, because the grouping column is not used as the index anyway
df.groupby(df.a, as_index=False).pad()
#    b  c
# 1  0  0
# 2  0  0
# 3  0  0
# 4  0  0

# The only way to keep the column is to group by a useless copy of the column Series
df.groupby(df.a.copy()).pad()
#    a  b  c
# 1  0  0  0
# 2  0  0  0
# 3  0  0  0
# 4  0  0  0
Problem description
Starting from 0.25, ffill behaves like other groupby functions and removes the grouping column from the result. As far as I understand, there is no non-hacky way of getting the old behaviour back. This is inconsistent with other groupby functions, where the keyword argument as_index=False can be used to keep the grouping column.
The only way to get the old behaviour is to group by a copy of the column. However, having the result depend on whether the grouper is a view or a copy is very confusing (at least to me), and it is also inefficient, since it requires a useless copy. Moreover, it is not a backward-compatible solution.
I would suggest either adding a keep_group_columns keyword that works consistently with all groupby functions, or adding a special keyword only for the groupby functions that keep the original index (ffill/bfill/pad).
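Until something like that exists, one possible stopgap is to select the value columns explicitly and reattach the grouping column afterwards. This is only a sketch, not an official pandas recipe: the sample data and the names value_cols, filled and result are made up for illustration, and it assumes the grouping column is an ordinary column named 'a'. It avoids relying on the view-versus-copy distinction, but it still builds a new frame.

```python
import numpy as np
import pandas as pd

# Illustrative data (not from the report above): two groups with gaps to fill.
df = pd.DataFrame(
    {
        "a": [1, 1, 2, 2],
        "b": [1.0, np.nan, 3.0, np.nan],
        "c": [np.nan, 5.0, np.nan, 7.0],
    },
    index=[1, 2, 3, 4],
)

# Select the value columns explicitly so the result does not depend on
# whether this pandas version drops the grouping column.
value_cols = [col for col in df.columns if col != "a"]
filled = df.groupby("a")[value_cols].ffill()

# ffill keeps the original index, so the grouping column can be put back
# by aligning on that index.
result = pd.concat([df[["a"]], filled], axis=1)
print(result)
```

Because the fill is done per group, the missing value in c on row 3 stays missing instead of being filled from the previous group.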