BUG: df.agg(sum, axis=1) uses different method than when axis=0 #21222

Closed
wants to merge 1 commit into from

topper-123
Contributor

@topper-123 topper-123 commented May 27, 2018

This is split off from #21123 and fixes only #21134. #19629 will be fixed in a separate PR afterwards.

Passing builtins to df.agg works when axis=0, but can give wrong results when axis=1 and the frame contains NaNs.
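To illustrate why NaNs matter here, the following is a minimal sketch (not code from this PR) that compares Python's builtin `sum` applied directly to each row with the NaN-skipping pandas reduction. The frame `df` and the variable names are made up for illustration; the output of `df.agg(sum, axis=1)` itself depends on the pandas version, so it is deliberately not shown.

```python
import numpy as np
import pandas as pd

# Illustrative frame (an assumption, not taken from the PR): one NaN in row 0.
df = pd.DataFrame({"a": [1.0, 2.0], "b": [np.nan, 3.0]})

# Python's builtin sum adds the values one by one, so a NaN propagates.
builtin_row_sums = [sum(row) for _, row in df.iterrows()]

# The pandas reduction skips NaN by default (skipna=True).
pandas_row_sums = df.sum(axis=1)

print(builtin_row_sums)          # row 0 is nan, row 1 is 5.0
print(pandas_row_sums.tolist())  # [1.0, 5.0]
```

The two approaches disagree only on the row that contains a NaN, which is the kind of discrepancy described above.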

Explanation

Passing the functions in SelectionMixin._cython_table to df.agg should dispatch to the relevant cython functions. This currently works as expected when axis=0, but not always when axis=1.
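The idea of such a lookup can be sketched as follows. This is only an illustration under stated assumptions: `HYPOTHETICAL_TABLE` and `resolve` are made-up names standing in for the table-based dispatch, not the actual contents or API of `SelectionMixin._cython_table`.

```python
import builtins

import numpy as np
import pandas as pd

# Hypothetical stand-in for the lookup table: maps a builtin callable
# to the name of the equivalent pandas reduction.
HYPOTHETICAL_TABLE = {
    builtins.sum: "sum",
    builtins.max: "max",
    builtins.min: "min",
}


def resolve(func):
    """Return the pandas method name for a known builtin, else func itself."""
    return HYPOTHETICAL_TABLE.get(func, func)


s = pd.Series([1.0, np.nan, 3.0])

resolved = resolve(sum)          # "sum"
result = getattr(s, resolved)()  # calls s.sum(), which skips the NaN
print(resolved, result)          # sum 4.0
```

A callable that is not in the table is returned unchanged, so it would be applied as-is.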

The reason for this difference is that df.aggregate currently defers to df._aggregate when axis=0 but to df.apply when axis=1, and the two give different results when they are passed functions and the series/frame contains NaN values. I've solved this by transposing df in _aggregate when axis=1.
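The transposition idea can be sketched with the public API only. This is not the patch itself (which changes the private `_aggregate` method); it merely shows that a row-wise aggregation is equivalent to a column-wise aggregation on the transposed frame. The string `"sum"` is used instead of the builtin so the example behaves the same across pandas versions; the frame is an illustrative assumption.

```python
import numpy as np
import pandas as pd

# Illustrative frame (an assumption, not taken from the PR).
df = pd.DataFrame({"a": [1.0, 2.0], "b": [np.nan, 3.0]})

# Row-wise aggregation requested directly.
row_wise = df.agg("sum", axis=1)

# The same aggregation expressed as a column-wise (axis=0) call
# on the transposed frame.
via_transpose = df.T.agg("sum")

pd.testing.assert_series_equal(row_wise, via_transpose)
print(row_wise.tolist())  # [1.0, 5.0]
```

Routing axis=1 through the same code path as axis=0 in this way means both axes handle NaNs identically.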

The tests have been heavily parametrized, helping ensure that the various ways to call df.agg now give the correct result.

@pep8speaks

pep8speaks commented May 27, 2018

Hello @topper-123! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on May 27, 2018 at 10:02 Hours UTC

@topper-123
Contributor Author

topper-123 commented May 27, 2018

I’ve thought of a couple of issues that should be tested. I'm marking this as WIP until that is done.

Temporarily closed.

@topper-123 topper-123 changed the title BUG: bug where df.agg(..., axis=1) gives wrong result WIP/BUG: bug where df.agg(..., axis=1) gives wrong result May 27, 2018
@topper-123 topper-123 closed this May 27, 2018
@topper-123 topper-123 changed the title WIP/BUG: bug where df.agg(..., axis=1) gives wrong result BUG: bug where df.agg(..., axis=1) uses different method than when axis=0 May 27, 2018
@topper-123 topper-123 changed the title BUG: bug where df.agg(..., axis=1) uses different method than when axis=0 BUG: df.agg(sum, axis=1) uses different method than when axis=0 May 27, 2018
Development

Successfully merging this pull request may close these issues.

BUG: df.agg(sum, axis=1) gives wrong result when Nan value is in frame
2 participants