BUG: Closes #5027 distance function always casts bool to double #5384
@@ -107,7 +107,6 @@

from . import _distance_wrap
from ..linalg import norm
import collections
unused import.
Sorry for the delay:
I would suggest the following for clarity: instead of
write
and require that each
Then it's possible to check no casts are missing.
The |
1a093bb to 58da4e8
@@ -1249,34 +1249,41 @@ def dfun(u, v):
    elif isinstance(metric, string_types):
        mstr = metric.lower()

        #if X.dtype != np.double and \
        # (mstr != 'hamming' and mstr != 'jaccard'):
        # TypeError('A double array must be passed.')
Removed this 7-year-old commented-out code.
@pv, right I agree with all those things you said. I also did the This should be ready to (re-)review. |
This needs a rebase as #5665 got rid of the matching distance (because it's a duplicate of the Hamming distance).
Always casting to double probably incurred a performance hit for the boolean-based distance metrics such as the "matching distance".
Thanks, @larsmans, I've rebased it.
@@            master   #5384   diff @@
======================================
  Files          235     235
  Stmts        43211   43281    +70
  Branches      8163    8163
  Methods          0       0
======================================
+ Hit          33537   33609    +72
+ Partial       2602    2598     -4
- Missed        7072    7074     +2
I'm merging this as it's a big improvement. Thanks @Garrett-R!
BUG: Closes #5027 distance function always casts bool to double
Thanks!
Closes #5027: Always casting to double probably incurred a performance hit for the boolean-based distance metrics such as the "matching distance".
I'm a new SciPy contributor, so let me know if I can do anything better.
Two questions:
Wasn't sure if I should add to the release notes since it seems like it was all done by rgommers in the [last release|https://github.com/scipy/scipy/commit/75c17a04530c2604a241209b1e1ed5c79c59cdff], but on the other hand the new developer docs [say you should add a release note|https://github.com/scipy/scipy/blame/master/HACKING.rst.txt#L243].
I didn't add a test since nothing about the functionality of this function (pdist) actually changed and I couldn't think how to test it. Is that fine?
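To make the concern in the description concrete, here is a rough, standard-library-only sketch of why an unconditional cast to double is wasteful for boolean metrics. It is not SciPy's implementation: the `hamming` helper and the `array`-based storage are hypothetical stand-ins for the real C routines and NumPy arrays, and the 8x figure assumes a platform where a C double is 8 bytes.

```python
from array import array


def hamming(u, v):
    """Fraction of positions at which u and v differ (illustrative helper)."""
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")
    return sum(a != b for a, b in zip(u, v)) / len(u)


# Boolean observations stored one byte per element ('B' = unsigned char).
u_bool = array("B", [1, 0, 1, 1, 0, 0, 1, 0])
v_bool = array("B", [1, 1, 0, 1, 0, 1, 1, 0])

# The same data after an unconditional cast to double ('d').
# The cast is an extra pass over the data and a second, larger buffer.
u_double = array("d", (float(x) for x in u_bool))
v_double = array("d", (float(x) for x in v_bool))

bool_bytes = u_bool.itemsize * len(u_bool)
double_bytes = u_double.itemsize * len(u_double)

# The cast changes the storage cost, not the distance.
d_bool = hamming(u_bool, v_bool)
d_double = hamming(u_double, v_double)

print(f"bool storage:   {bool_bytes} bytes, distance {d_bool}")
print(f"double storage: {double_bytes} bytes, distance {d_double}")
```

Since the result is the same either way, a check along these lines only shows that the cast is unnecessary for a boolean metric; it says nothing about how much time the real `pdist` saves, which would need a benchmark.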