BUG: groupby.describe on a frame with duplicate column names #50806

Closed
rhshadrach opened this issue Jan 18, 2023 · 0 comments · Fixed by #50846

rhshadrach commented Jan 18, 2023

xref #46944

import pandas as pd
from pandas import DataFrame

pd.set_option("display.max_columns", None)

df = DataFrame([[0, 1, 2, 3]])
df.columns = [0, 1, 2, 0]
gb = df.groupby(df[1])
print(gb.describe(percentiles=[]).to_string())
#       0                                                           2                        
#   count mean std  min  50%  max count mean std  min  50%  max count mean std  min  50%  max
# 1                                                                                          
# 1   1.0  0.0 NaN  0.0  0.0  0.0   1.0  3.0 NaN  3.0  3.0  3.0   1.0  2.0 NaN  2.0  2.0  2.0

With duplicate column names, describe only outputs the values for the first column; there should be two columns named 0 here. The issue occurs in any operation where the groupby code uses _selected_obj within _group_selection_context.
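Until a fix is available, one way to sidestep the problem is to make the column labels unique before grouping. The following is only a minimal sketch of that idea, not the change made in the linked fix: `dedupe_columns` is a hypothetical helper (not part of pandas), and its suffix scheme (`0`, `0_1`, ...) is an arbitrary choice that converts every label to a string.

```python
import pandas as pd


def dedupe_columns(df):
    """Return a copy of df with string labels; repeats get a _<n> suffix.

    Hypothetical helper for illustration only. It does not guard against a
    generated label (e.g. "0_1") colliding with a label that already exists.
    """
    seen = {}
    new_labels = []
    for label in df.columns:
        n = seen.get(label, 0)
        # First occurrence keeps its name; later ones get a counter suffix.
        new_labels.append(str(label) if n == 0 else f"{label}_{n}")
        seen[label] = n + 1
    out = df.copy()
    out.columns = new_labels
    return out


df = pd.DataFrame([[0, 1, 2, 3]])
df.columns = [0, 1, 2, 0]

# Labels that occur more than once in the original frame.
dup_labels = df.columns[df.columns.duplicated()].unique().tolist()

unique_df = dedupe_columns(df)
# Group by the (now string-labelled) column "1" and describe the rest.
result = unique_df.groupby("1").describe()
print(result.to_string())
```

With unique labels, each original column gets its own block in the `describe` output, at the cost of the renamed labels having to be mapped back afterwards.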
