[MRG +1] Support metric='precomputed' in nearest neighbors #4090
I think the
Good point.
(Also the regressor and possibly other classes; I didn't look at the details yet.)
Of course; anything that can
@amueller I've added
c85c0f3 to 3e22ac1
@@ -919,7 +931,9 @@ def chi2_kernel(X, Y=None, gamma=1.):
    'euclidean': euclidean_distances,
    'l2': euclidean_distances,
    'l1': manhattan_distances,
    'manhattan': manhattan_distances, }
    'manhattan': manhattan_distances,
    'precomputed': None,  # HACK: precomputed is always allowed, never called
Why did this become necessary here?
Never mind, got it. For some reason GitHub thinks this is part of `chi2_kernel`.
I wonder if we can have a good test that ensures whether we forgot to add
@@ -1092,6 +1106,7 @@ def pairwise_distances(X, Y=None, metric="euclidean", n_jobs=1, **kwds):
                         "callable" % (metric, _VALID_METRICS))

    if metric == "precomputed":
`metric="precomputed"` is not documented in the function, right?
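For readers following the two hunks above, here is a minimal pure-Python sketch of how a lookup table with a `None` placeholder for `'precomputed'` can coexist with a short-circuit in the dispatcher. The names `VALID_METRICS`, `pairwise`, `euclidean` and `manhattan` are hypothetical stand-ins, and the logic is an assumption about the general pattern, not the scikit-learn implementation.

```python
import math


def euclidean(a, b):
    """Euclidean (l2) distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (l1) distance between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical lookup table: 'precomputed' is a valid name, but its entry
# is a placeholder that must never be called.
VALID_METRICS = {
    "euclidean": euclidean,
    "l2": euclidean,
    "l1": manhattan,
    "manhattan": manhattan,
    "precomputed": None,
}


def pairwise(X, Y=None, metric="euclidean"):
    """Return the len(X) x len(Y) distance matrix as a list of lists."""
    if metric not in VALID_METRICS:
        raise ValueError("Unknown metric %r; valid metrics are %s"
                         % (metric, sorted(VALID_METRICS)))
    if metric == "precomputed":
        # X already is the distance matrix, so return it untouched
        # before the table entry (None) could ever be looked up.
        return X
    if Y is None:
        Y = X
    func = VALID_METRICS[metric]
    return [[func(x, y) for y in Y] for x in X]
```

Because the name check and the short-circuit both run before the table lookup, the `None` entry only serves to mark the name as allowed.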
Apart from my minor comments, LGTM.
Rebased and awaiting final review.
Travis is unhappy. Btw in the initial comment you said something about
Thanks for both those pointers (a while ago), @amueller. I hope this is fixed now.
So again, someone else's review is welcome.
Travis is still failing.
boo!
Also, handle radius_neighbors case identically to kneighbors for precomputed
@@ -100,6 +102,75 @@ def test_unsupervised_inputs():
    assert_array_almost_equal(ind1, ind2)


def test_precomputed():
    """Tests unsupervised NearestNeighbors with a distance matrix."""
    # Note: smaller samples may result in spurious test success
We need to use a random number generator created in this test, to avoid having side effects across tests.
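The point about side effects can be illustrated with a minimal sketch that uses only the standard library. The helper `make_distance_matrix` and the test name are hypothetical; in a NumPy-based test the analogous local object would be a `numpy.random.RandomState` instance.

```python
import random


def make_distance_matrix(n_samples, rng):
    """Build a symmetric matrix of random distances with a zero diagonal."""
    D = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(i + 1, n_samples):
            D[i][j] = D[j][i] = rng.random()
    return D


def test_precomputed_sketch():
    # The generator is created inside the test, so the module-level state
    # of `random` is neither read nor modified.
    rng = random.Random(42)
    D = make_distance_matrix(5, rng)
    assert all(D[i][i] == 0.0 for i in range(5))
    assert all(D[i][j] == D[j][i] for i in range(5) for j in range(5))
```

With a local generator, the data seen by this test does not depend on which other tests ran before it, and running it does not change what later tests see.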
Support metric='precomputed' in nearest neighbors [rebased version of #4090]
Merged via #5188
This fixes up #2532 (sorry to not offer the changes back there, @robertlayton: the rebase was too messy with my changes to validation in `sklearn.metrics.pairwise`) to handle queries that differ from the index data. (This case is a little awkward in that the index data doesn't matter, but really should be supported for `KNNClassifier` et al.)
This also provides stronger validation to precomputed metrics in `pairwise_{distances,kernels}`.
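To make the "queries that differ from the index data" case concrete, here is a small pure-Python sketch. It assumes the precomputed query matrix has one row per query point and one column per indexed point; `kneighbors_precomputed` is a hypothetical helper written for illustration, not a scikit-learn function.

```python
def kneighbors_precomputed(D_query, n_neighbors):
    """Return (distances, indices) of the nearest indexed points per query.

    D_query[i][j] is the distance from query i to indexed point j.
    """
    distances, indices = [], []
    for row in D_query:
        if n_neighbors > len(row):
            raise ValueError("n_neighbors exceeds the number of indexed points")
        # Sort column indices by distance; ties keep the lower index first.
        order = sorted(range(len(row)), key=row.__getitem__)[:n_neighbors]
        indices.append(order)
        distances.append([row[j] for j in order])
    return distances, indices


# Two query points measured against three indexed points.
D_query = [
    [0.5, 0.1, 0.9],
    [0.2, 0.7, 0.3],
]
dist, ind = kneighbors_precomputed(D_query, n_neighbors=2)
print(ind)   # [[1, 0], [0, 2]]
print(dist)  # [[0.1, 0.5], [0.2, 0.3]]
```

Note that the sketch never touches the indexed points themselves; only the number of columns matters, which is the awkwardness the description mentions.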