BUG: Fix pivot_table margins to include NaN groups when dropna=False #61524
This change fixes the margin behavior in Before I go ahead and update those tests to match the new behavior, I just wanted to double-check if this is the direction we want to take, treating
@iabhi4 - thanks for putting this up. The crosstab failures here look like bugs to me as well. E.g. in the penultimate column
@rhshadrach Thanks for confirming! I’ve updated the test cases to reflect the corrected behavior.
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.
Looks good, just a small request.
[1, 0, 1],
[1, 0, 1],
[0, 0, np.nan],
[2, 0, 2.0],
[1, 1, 2.0],
[0, 1, np.nan],
[5, 2, 7.0],
[2, 0, 2],
[1, 1, 2],
[0, 1, 1],
[5, 2, 7],
Can you revert this change? The numbers in the third column remain floats because of `np.nan`. The change here makes it look like the result is object dtype.
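The dtype point can be checked with a small sketch. The values below are illustrative only and are not the rows from the test under review; the assumption is simply that the expected frame is built from a list of rows, one of which holds `np.nan` in the third position.

```python
import numpy as np
import pandas as pd

# Illustrative rows only (not the test data from this PR).
# The third column mixes integer literals with np.nan.
expected = pd.DataFrame(
    [
        [1, 0, 1],
        [0, 0, np.nan],
        [2, 0, 2],
    ]
)

# pandas infers one dtype per column: the first two columns stay integer,
# while the third is upcast to float because np.nan is a float.
print(expected.dtypes)
```

Writing `2.0` rather than `2` in such a column therefore does not change the resulting dtype; it only makes the float column explicit to the reader.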
Fix incorrect margin computation in `pivot_table` when index or columns contain NA values

This PR fixes an issue where the `"All"` row or column (i.e., `margins=True`) in `pd.pivot_table` does not account for rows that contain `NA` values in the index or column dimensions. These rows were incorrectly excluded from the overall aggregation used to compute the margin, leading to incorrect totals.

The fix modifies the margin calculation to ensure that rows with `NA` values are included in the aggregation, consistent with how the data is treated in the main table when `dropna=False`.

doc/source/whatsnew/v3.0.0.rst under Reshaping
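A minimal sketch of the scenario described above. The frame and its column names (`key`, `col`, `value`) are made up for illustration and are not taken from the PR's tests. The hand-computed totals show what the margins should contain once NA groups are counted; whether the `"All"` row and column printed by `pivot_table` agree with them depends on whether the installed pandas includes this fix.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row has a missing value in the index key.
df = pd.DataFrame(
    {
        "key": ["a", "a", np.nan, "b"],
        "col": ["x", "y", "x", "x"],
        "value": [1, 2, 4, 8],
    }
)

# Pivot with margins while keeping NA groups in the main table.
table = pd.pivot_table(
    df,
    values="value",
    index="key",
    columns="col",
    aggfunc="sum",
    margins=True,
    dropna=False,
)

# Totals computed by hand over every row, including the NaN-keyed one.
expected_by_col = df.groupby("col")["value"].sum()
expected_grand_total = df["value"].sum()

print(table)
print("hand-computed column totals:")
print(expected_by_col)
print("hand-computed grand total:", expected_grand_total)
```

Comparing the `"All"` row of `table` against `expected_by_col` is a quick way to see whether the NaN-keyed row was included in the margin.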