Description
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
The attributes of groupby objects are currently accessible from the groupby directly, but they are hidden, i.e. they don't show up in dir calls:
>>> df = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3]})
>>> dfg = df.groupby("a")
>>> dfg.keys
'a'
>>> "keys" in dir(dfg)
False
>>> dfg._hidden_attrs
frozenset({'as_index',
'axis',
'dropna',
...,
'observed',
'sort'})
I assume this has been done because we want the groupby namespace to consist mainly of groupby methods and don't want to make it noisy.
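To illustrate the behaviour described above, here is a minimal sketch of how an attribute can stay accessible while being left out of dir. ToyGroupBy is a hypothetical stand-in written for this example, not pandas code, and it only borrows the _hidden_attrs name from the output above. The actual pandas implementation may differ.

```python
class ToyGroupBy:
    """Toy stand-in for a groupby object; hypothetical, not pandas code."""

    # Names that stay accessible but are left out of dir().
    _hidden_attrs = frozenset({"keys", "sort"})

    def __init__(self, keys, sort=True):
        self.keys = keys
        self.sort = sort

    def __dir__(self):
        # Start from the default listing and drop the hidden names.
        return sorted(set(super().__dir__()) - self._hidden_attrs)


g = ToyGroupBy("a")
print(g.keys)            # attribute access still works
print("keys" in dir(g))  # the name is filtered out of the dir() listing
```

The point of the sketch is that hiding only affects discoverability: the attribute lookup itself is unchanged.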
Feature Description
It is beneficial to be able to access the attributes. Instead of using hidden attributes, I propose a public/non-hidden attrs namespace, so to access an attribute, users can do e.g. dfg.attrs.keys.
This could also form the basis for a groupby repr, which could take its data from the groupby attrs.
I'm not sure about the attrs name because we already have DataFrame.attrs, so I'm definitely open to suggestions for better names.
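As a rough sketch of what such a namespace could look like from the user's side, the helper below collects a few groupby attributes into a single object today, without changing pandas. The groupby_attrs function and the ATTR_NAMES tuple are hypothetical names made up for this example. The selected attribute names are assumed to be a subset of the hidden attributes listed above.

```python
from types import SimpleNamespace

import pandas as pd

# Assumption: these names are a subset of the hidden groupby attributes.
ATTR_NAMES = ("keys", "as_index", "sort", "dropna")


def groupby_attrs(gb, names=ATTR_NAMES):
    """Collect selected groupby attributes into one discoverable namespace."""
    return SimpleNamespace(**{name: getattr(gb, name) for name in names})


df = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3]})
dfg = df.groupby("a")
attrs = groupby_attrs(dfg)

print(attrs.keys)            # same value as dfg.keys
print("keys" in dir(attrs))  # the names are visible to dir()
print(attrs)                 # the namespace repr lists every collected value
```

Printing the namespace shows all collected values at once, which is roughly the kind of information a groupby repr could draw on.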
Alternative Solutions
The alternatives are:
1. keep things as they are / keep the attributes hidden
2. make the hidden attributes public
IMO both have disadvantages. With option 1, the attributes remain difficult to discover. With option 2, the groupby namespace becomes very large, and groupby methods and attributes become mixed, which makes the groupby methods harder to discover.
An attrs attribute would avoid both of those disadvantages.
Additional Context
No response