BUG: crosstab with duplicate column or index labels #37997

Merged
merged 57 commits into from
Nov 28, 2020

@arw2019 arw2019 commented Nov 22, 2020

Picking up from #28474

cc @jreback in case this can go in in time for 1.2

Given a list of row or column names, creates a mapper of unique names to
column/row names.

Parameters
Member Author

addressed

raise ValueError("values cannot be used without an aggfunc.")
# We create our own mapping of row and columns names
# to prevent issues with duplicate columns/row names. GH Issue: #22529
shared_col_row_names = set(rownames).intersection(set(colnames))
Member Author

addressed
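
For readers following the thread, here is a minimal sketch of why names shared between rows and columns are a problem. It assumes the input arrays end up keyed by name in a single dict at some point, which is an assumption about the internals rather than something shown in this conversation; the sample names and values are hypothetical.

```python
# Hypothetical inputs: one row factor and one column factor that share a name.
rownames = ["a"]
colnames = ["a"]
row_values = [[1, 1, 2]]
col_values = [[3, 4, 4]]

# Names that appear on both axes, computed as in the diff snippet above.
shared_col_row_names = set(rownames).intersection(set(colnames))

# Assumption: the arrays are collected into one dict keyed by name.
# The second "a" overwrites the first, so one input array is silently lost.
data = {**dict(zip(rownames, row_values)), **dict(zip(colnames, col_values))}

print(shared_col_row_names)  # {'a'}
print(len(data))             # 1, although two arrays were supplied
```

Giving each colliding name a unique temporary key before this step, then mapping back afterwards, avoids the overwrite.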

@arw2019 arw2019 self-assigned this Nov 22, 2020
@arw2019 arw2019 marked this pull request as ready for review November 22, 2020 17:55
@arw2019 arw2019 changed the title [WIP] BUG: crosstab with duplicate column or index labels BUG: crosstab with duplicate column or index labels Nov 22, 2020
@arw2019 arw2019 added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Nov 23, 2020
@arw2019 arw2019 added the Needs Review Waiting for review/response from a maintainer. label Nov 23, 2020
@arw2019 arw2019 closed this Nov 23, 2020
@arw2019 arw2019 reopened this Nov 23, 2020
@jreback jreback left a comment

Looks good, thanks for picking this up. Some comments; ping on green.

def _build_names_mapper(
    rownames: List[str], colnames: List[str]
) -> Tuple[Dict[str, str], List[str], Dict[str, str], List[str]]:
    def get_duplicates(names):
Contributor

can you add a doc-string here

Member Author

Done
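
The signature under review returns a mapper and a list of unique names for each axis. The sketch below shows roughly what such a helper might do; it is not the code merged in this PR. The function name `build_names_mapper_sketch`, the `row_{i}` / `col_{i}` placeholder format, and the body of `get_duplicates` are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def build_names_mapper_sketch(
    rownames: List[str], colnames: List[str]
) -> Tuple[Dict[str, str], List[str], Dict[str, str], List[str]]:
    """Illustrative only: replace ambiguous names with unique placeholders."""

    def get_duplicates(names: List[str]) -> Set[str]:
        # Collect every name that occurs more than once in the list.
        seen: Set[str] = set()
        dups: Set[str] = set()
        for name in names:
            if name in seen:
                dups.add(name)
            seen.add(name)
        return dups

    # A name is ambiguous if it repeats within an axis or appears on both axes.
    shared = set(rownames).intersection(set(colnames))
    ambiguous = get_duplicates(rownames) | get_duplicates(colnames) | shared

    def remap(names: List[str], prefix: str) -> Tuple[Dict[str, str], List[str]]:
        # Placeholder -> original name, only for the ambiguous entries.
        mapper = {
            f"{prefix}_{i}": name for i, name in enumerate(names) if name in ambiguous
        }
        # Same length and order as the input, with ambiguous names swapped out.
        unique = [
            f"{prefix}_{i}" if name in ambiguous else name
            for i, name in enumerate(names)
        ]
        return mapper, unique

    rownames_mapper, unique_rownames = remap(rownames, "row")
    colnames_mapper, unique_colnames = remap(colnames, "col")
    return rownames_mapper, unique_rownames, colnames_mapper, unique_colnames


row_map, rows, col_map, cols = build_names_mapper_sketch(["a", "b"], ["a", "c", "c"])
print(rows)  # ['row_0', 'b']
print(cols)  # ['col_0', 'col_1', 'col_2']

# The mappers restore the original labels once the table has been built.
print([row_map.get(name, name) for name in rows])  # ['a', 'b']
print([col_map.get(name, name) for name in cols])  # ['a', 'c', 'c']
```

One limitation of this sketch: a placeholder such as `row_0` could itself collide with a user-supplied name, which the code above does not guard against.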

@jreback jreback added this to the 1.2 milestone Nov 26, 2020
@jreback jreback added Bug and removed Needs Review Waiting for review/response from a maintainer. labels Nov 26, 2020
arw2019 commented Nov 27, 2020

Green + addressed comments

@jreback jreback merged commit e8085a7 into pandas-dev:master Nov 28, 2020
jreback commented Nov 28, 2020

thanks @arw2019 and @cuchoi

Labels
Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Development

Successfully merging this pull request may close these issues.

Crosstab Not Working with Duplicate Column Labels
4 participants