BUG: groupby.describe on a frame with duplicate column names #50806

rhshadrach · 2023-01-18T00:30:38Z

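import pandas as pd
from pandas import DataFrame

pd.set_option("display.max_columns", None)

# Build a one-row frame in which two columns share the label 0.
df = DataFrame([[0, 1, 2, 3]])
df.columns = [0, 1, 2, 0]
gb = df.groupby(df[1])
print(gb.describe(percentiles=[]).to_string())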
#       0                                                           2                        
#   count mean std  min  50%  max count mean std  min  50%  max count mean std  min  50%  max
# 1                                                                                          
# 1   1.0  0.0 NaN  0.0  0.0  0.0   1.0  3.0 NaN  3.0  3.0  3.0   1.0  2.0 NaN  2.0  2.0  2.0

With duplicate column names, describe only outputs the values for the first column. There should be two columns named 0 here. This issue occurs for any operation where the groupby code uses _selected_obj within the _group_selection_context.
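
For anyone who needs per-column statistics before a fix is available, here is a minimal sketch of one possible workaround: detect the duplicated labels, then give every column a temporary unique name before grouping. The `{name}_{pos}` renaming scheme and the variable names are assumptions made for illustration; they are not part of pandas or of the fix in the linked PR.

```python
import pandas as pd

df = pd.DataFrame([[0, 1, 2, 3]])
df.columns = [0, 1, 2, 0]

# Flag every column whose label occurs more than once.
dup_mask = df.columns.duplicated(keep=False)

# Hypothetical workaround: append each column's position to its label
# so that all labels are unique ("0_0", "1_1", "2_2", "0_3").
unique = df.copy()
unique.columns = [f"{name}_{pos}" for pos, name in enumerate(df.columns)]

# Group by the renamed copy of column 1 and describe the rest.
result = unique.groupby("1_1").describe(percentiles=[])

# Top-level labels of the described columns, in order of appearance.
described = list(result.columns.get_level_values(0).unique())
print(described)
```

With unique labels, each of the original columns gets its own block of statistics, so both columns originally named 0 can be told apart by position.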

rhshadrach added Bug Groupby labels Jan 18, 2023

rhshadrach mentioned this issue Jan 18, 2023

BUG: groupby.describe on a frame with duplicate column names #50846

Merged

5 tasks

mroeschke closed this as completed in #50846 Feb 3, 2023

rhshadrach mentioned this issue Jul 11, 2023

BUG: groupby.describe on a frame with MultiIndex describes groupings #50804

Closed
