BUG: Fix bug for kendall corr when in DF num and bool #11560


roman-khomenko
Contributor

Hi,

  1. When a DataFrame contains both numeric and boolean columns, the underlying NumPy array has dtype object,
    so np.isfinite(mat) raises an exception.

I've fixed this by using com._ensure_float64, as the other correlation methods do.

  2. I've skipped half of the computation, because the correlation matrix is symmetric.
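
The first point can be reproduced without touching pandas internals. This is a minimal sketch, assuming a reasonably recent pandas and NumPy; the `astype(np.float64)` cast is only a stand-in for the internal `com._ensure_float64` helper named above, not the patch itself.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [True, False], "b": [1, 0]})

# Bool and integer columns have no common numeric dtype, so the combined
# 2-D array falls back to object.
mat = df.to_numpy()
print(mat.dtype)

try:
    np.isfinite(mat)
    raised = False
except TypeError:
    # np.isfinite is not defined for object arrays.
    raised = True

# Casting to float64 first (stand-in for the internal helper) makes the
# finite-value mask computable.
as_float = mat.astype(np.float64)
mask = np.isfinite(as_float)
print(raised, mask.all())
```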

    if i > j:
        continue
    elif i == j:
        correl[i, i] = 1.

This is not correct: if all of the values are NaN, then the result is NaN, so you can either check for that case or let it fall through. Please test for this (if it's not done already).
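
To illustrate the two points under review (filling only the upper triangle, and letting an all-NaN column fall through to NaN instead of forcing 1.0 on the diagonal), here is a rough sketch. Both `kendall_tau_a` and `pairwise_kendall` are hypothetical names written for this example; the naive tau-a is a simplified stand-in, not the routine pandas actually calls.

```python
import numpy as np


def kendall_tau_a(x, y):
    """Naive O(n^2) Kendall tau-a; illustrative stand-in only."""
    n = len(x)
    if n < 2:
        return np.nan  # not enough valid pairs, so the result is NaN
    s = 0.0
    for p in range(n):
        for q in range(p + 1, n):
            s += np.sign(x[p] - x[q]) * np.sign(y[p] - y[q])
    return s / (n * (n - 1) / 2)


def pairwise_kendall(mat):
    """Fill a symmetric correlation matrix from the upper triangle only."""
    mat = np.asarray(mat, dtype=np.float64)  # object -> float64 first
    mask = np.isfinite(mat)
    k = mat.shape[1]
    correl = np.empty((k, k), dtype=np.float64)
    for i in range(k):
        for j in range(i, k):  # skip j < i: the matrix is symmetric
            valid = mask[:, i] & mask[:, j]
            # The diagonal is computed too, so an all-NaN column yields NaN.
            c = kendall_tau_a(mat[valid, i], mat[valid, j])
            correl[i, j] = c
            correl[j, i] = c
    return correl


mat = np.array([[True, 1, np.nan], [False, 0, np.nan]], dtype=object)
corr = pairwise_kendall(mat)
print(corr)
```

Computing the diagonal rather than hard-coding it costs one extra call per column, but it avoids a special case for columns with no valid values.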

@jreback jreback added the Numeric Operations Arithmetic, Comparison, and Logical operations label Nov 9, 2015
@roman-khomenko roman-khomenko force-pushed the roman-khomenko/fix-kendall-for-num-and-bool branch from 2091994 to 171ec36 Compare November 9, 2015 12:41
@roman-khomenko
Contributor Author

@jreback Jeff,
I've fixed the NaN handling and added a test for it.

@roman-khomenko roman-khomenko force-pushed the roman-khomenko/fix-kendall-for-num-and-bool branch from 171ec36 to b04da65 Compare November 9, 2015 13:40
# so it needs to be properly handled
df = DataFrame({"a": [True, False], "b": [1, 0]})

expected = np.ones((2, 2), dtype=np.float64)

construct an actual DataFrame here and use assert_frame_equal for comparison
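
A sketch of what that suggestion could look like, assuming a recent pandas where `assert_frame_equal` is available from `pandas.testing` and SciPy is installed (pandas relies on it for `method="kendall"`). The row and column labels on `expected` are spelled out here for illustration; this is not the test that was merged.

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal

df = pd.DataFrame({"a": [True, False], "b": [1, 0]})

# Requires SciPy for the Kendall method.
result = df.corr(method="kendall")

# Build a labelled DataFrame rather than a bare ndarray, so that the index
# and columns are checked along with the values.
expected = pd.DataFrame(
    np.ones((2, 2), dtype=np.float64),
    index=["a", "b"],
    columns=["a", "b"],
)

assert_frame_equal(result, expected)
```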

@jreback
Contributor

jreback commented Nov 10, 2015

couple of comments. pls add a whatsnew (put in bug fixes), use this PR number as the issue number. squash, then ping when green.

@jreback jreback added this to the 0.17.1 milestone Nov 10, 2015
@jreback jreback added Bug Dtype Conversions Unexpected or buggy dtype conversions labels Nov 10, 2015
@roman-khomenko roman-khomenko force-pushed the roman-khomenko/fix-kendall-for-num-and-bool branch 2 times, most recently from babfe01 to fe30837 Compare November 10, 2015 14:26
@roman-khomenko roman-khomenko force-pushed the roman-khomenko/fix-kendall-for-num-and-bool branch from fe30837 to f6b11fe Compare November 10, 2015 14:55
@roman-khomenko
Contributor Author

@jreback Done

jreback added a commit that referenced this pull request Nov 13, 2015
…all-for-num-and-bool

BUG: Fix bug for kendall corr when in DF num and bool
@jreback jreback merged commit 49cd89b into pandas-dev:master Nov 13, 2015
@jreback
Contributor

jreback commented Nov 13, 2015

thanks!

@roman-khomenko roman-khomenko deleted the roman-khomenko/fix-kendall-for-num-and-bool branch November 13, 2015 15:15
@roman-khomenko
Contributor Author

@jreback Thank you for pandas!

Labels
Bug Dtype Conversions Unexpected or buggy dtype conversions Numeric Operations Arithmetic, Comparison, and Logical operations

2 participants