BUG: pd.crosstab not working when margin and normalize are set together #27663

Merged
merged 17 commits into from Aug 6, 2019

charlesdong1991
Member

doc/source/whatsnew/v0.25.1.rst Outdated
pandas/core/reshape/pivot.py Outdated
pandas/core/reshape/pivot.py Outdated
index_margin = table.loc[margins_name, :].drop(margins_name)
# separate cases between multiindex and index
if isinstance(table_index, MultiIndex):
index_margin = table.loc[margins_name, :].drop(margins_name, axis=1)
Contributor

Why are we not always passing axis= here? I don't like the need for this if/then.

Member Author

Yeah, I found a way to work around it that also avoids the MultiIndex problem, and the tests even seem to run faster ^^ @jreback
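
For readers following the thread, here is a small sketch of why the if/then appeared in the first place. With a plain Index, `table.loc[margins_name, :]` returns a Series, while with a MultiIndex the same call is a partial-key lookup and returns a one-row DataFrame, so `drop` needs `axis=1`. The toy frames and the positional `iloc` selection below are assumptions for illustration only; the thread does not say which workaround the PR finally adopted.

```python
import pandas as pd

values = [[1, 2, 3], [4, 5, 9], [5, 7, 12]]
columns = ["x", "y", "All"]

# Plain Index on the rows: selecting the margin row yields a Series.
flat = pd.DataFrame(values, index=["a", "b", "All"], columns=columns)
flat_margin = flat.loc["All", :]

# MultiIndex on the rows: the same selection is a partial-key lookup,
# so it yields a one-row DataFrame instead.
multi_index = pd.MultiIndex.from_tuples([("a", "p"), ("b", "q"), ("All", "")])
multi = pd.DataFrame(values, index=multi_index, columns=columns)
multi_margin = multi.loc["All", :]

print(type(flat_margin).__name__, type(multi_margin).__name__)

# Positional selection (an assumption, not necessarily the PR's fix)
# returns a Series in both cases, provided the margin is the last
# row and the last column.
flat_pos = flat.iloc[-1, :-1]
multi_pos = multi.iloc[-1, :-1]
```

Selecting by position only works if the margin really is last, which is the guarantee discussed later in this review.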

index_margin = index_margin / index_margin.sum()
# index_margin is a dataframe, and use a hacky way: sum(axis=1)[0]
# to get the normalized result, and use sum() instead for series
if isinstance(index_margin, ABCDataFrame):
Contributor

Why do you need to distinguish the MultiIndex case here?

Member Author

Yes, after I changed the code above, this issue is solved as well, so I removed this if/then. Thank you for pointing it out @jreback
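
As a rough illustration of why the Series and DataFrame cases behaved differently in the quoted snippet: dividing a Series by its `sum()` gives shares of the total, but `DataFrame.sum()` totals each column, so a one-row frame divided by it becomes all ones. The values below are made up, and `.iloc[0]` stands in for the `[0]` mentioned in the code comment.

```python
import pandas as pd

# Margin row held as a Series (plain Index case).
margin_series = pd.Series([5, 7], index=["x", "y"])
series_share = margin_series / margin_series.sum()

# The same margin held as a one-row DataFrame (MultiIndex case).
margin_frame = pd.DataFrame([[5, 7]], index=[""], columns=["x", "y"])

# DataFrame.sum() totals each column, so every cell is divided by itself.
column_wise = margin_frame / margin_frame.sum()

# Dividing by the row total instead gives the intended shares.
row_total = margin_frame.sum(axis=1).iloc[0]
frame_share = margin_frame / row_total
```

Keeping the margin as a Series in both cases removes the need for the second branch.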

@jreback jreback added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Jul 31, 2019
@jreback jreback added this to the 0.25.1 milestone Aug 4, 2019
@jreback
Contributor

jreback commented Aug 4, 2019

ok this looks good, can you merge master and fix the conflict? ping on green.

@charlesdong1991
Member Author

Thanks for your follow-up review, @jreback! I merged master and resolved the conflict.

Contributor

@jreback jreback left a comment

lgtm, but one comment.

# to keep index and columns names
table_index_names = table.index.names
table_columns_names = table.columns.names
# save the column and index margin
Contributor

If margins is True, are we guaranteed to have margins_name as the last row / column? Can you add an assert for this, i.e. that the last row.name / col.name == margins_name?

Member Author

Yes. This normalization runs one step after the output of the pivot_table function, and in that function, if margins is set to True, a new row/column ('All' or 'New_Margin_Name') is added at the end. But you are right, it's better to add an assertion for this; will do later today! @jreback

Contributor

Great, this just makes it clear to a future reader. Otherwise lgtm; ping on green.

Member Author

I have added the assertion; please feel free to take a look @jreback
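
The kind of check requested above might look roughly like the sketch below. `check_margins_last` is a hypothetical helper written for this illustration, not the code that was merged; it assumes the margin label is either the last label itself or, for a MultiIndex, one element of the last tuple.

```python
import pandas as pd


def check_margins_last(table, margins_name="All"):
    """Hypothetical helper: raise if the margin is not the last row and column."""
    for axis_name, labels in (("index", table.index), ("columns", table.columns)):
        last = labels[-1]
        # With a MultiIndex the last label is a tuple such as ("All", "").
        if isinstance(last, tuple):
            found = margins_name in last
        else:
            found = last == margins_name
        if not found:
            raise ValueError(f"{margins_name!r} is not the last label of the {axis_name}")


df = pd.DataFrame(
    {"A": ["foo", "foo", "bar", "bar"], "B": ["one", "two", "one", "two"]}
)
# margins=True appends an 'All' row and an 'All' column at the end.
counts = pd.crosstab(df["A"], df["B"], margins=True)
check_margins_last(counts)
```

A check like this documents the assumption that the normalization step relies on, which is the point made in the review.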

@charlesdong1991
Member Author

Yeah, this time it passes all checks. @jreback

Successfully merging this pull request may close these issues.

crosstabs doesn't work with margin and normalize together
2 participants