[MRG+1] removed precomputed support in nearestcentroid #8515
Codecov Report
@@            Coverage Diff             @@
##           master    #8515      +/-   ##
==========================================
+ Coverage   95.48%   95.53%   +0.05%
==========================================
  Files         342      333       -9
  Lines       60913    61209     +296
==========================================
+ Hits        58160    58477     +317
+ Misses       2753     2732      -21
|
Hm that is really odd. codecov says the coverage decreased, but shows no changes. Looks like a bug in codecov? |
I'm guessing that a unit test counts as a documented part of the code, and so removing it slightly lowers the covered code. |
But isn't that test trying to define what X should be in the precomputed case, i.e. not make it ambiguous? |
Ah, with our settings, tests are included in the coverage check, and removing a test therefore decreases coverage. Maybe we should change our threshold? -0.01 seems pretty strict. |
@jnothman X is a distance matrix between samples and centroids, but we do not know what the centroids are. Then the question is how to predict classes for new data samples. If X is supposed to be the precomputed centroids, that is a different story: we would not need to run fit in that case. |
So I would actually raise an error if someone passes "precomputed". I have no idea what it's supposed to do, and the documentation doesn't clarify. |
Yes, better to raise an exception than to just remove tests and leave
the behaviour unspecified.
…On 8 Mar 2017 1:23 am, "Andreas Mueller" ***@***.***> wrote:
So I would actually raise an error if someone passes "precomputed". I have
no idea what it's supposed to do, and the documentation doesn't clarify.
|
This reverts commit 71ace3a.
OK, now it raises an error if "precomputed" is used as a metric. |
I think this is right. The current test makes sense in predict but not in
fit. The only way I can imagine doing fit with precomputed pairwise distances
is to select the medoid as the "centroid" of each class. This might be a
nice feature, but better saved for a NearestMedoid estimator.
LGTM
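A minimal sketch of the medoid idea mentioned above, assuming a square matrix of pairwise training distances. The helper names `class_medoids` and `predict_from_distances` are hypothetical and are not part of scikit-learn; this is only meant to show why the precomputed case fits a medoid-based estimator rather than NearestCentroid.

```python
def class_medoids(D, y):
    """Return {label: index of the medoid sample}.

    D is a square matrix (list of lists) of pairwise distances between
    training samples; y holds one class label per sample.
    """
    medoids = {}
    for label in sorted(set(y)):
        members = [i for i, lab in enumerate(y) if lab == label]
        # The medoid is the class member with the smallest total
        # distance to all members of the same class.
        medoids[label] = min(members, key=lambda i: sum(D[i][j] for j in members))
    return medoids


def predict_from_distances(D_new, medoids):
    """Assign each new sample to the class of its nearest medoid.

    D_new[k][i] is the distance from new sample k to training sample i.
    """
    return [min(medoids, key=lambda lab: row[medoids[lab]]) for row in D_new]


# Toy data: points on a line, with distances given by absolute differences.
train = [0, 1, 2, 10, 11, 15]
labels = [0, 0, 0, 1, 1, 1]
D = [[abs(a - b) for b in train] for a in train]
medoids = class_medoids(D, labels)
D_new = [[abs(p - b) for b in train] for p in [3, 12]]
predictions = predict_from_distances(D_new, medoids)
```

Unlike a centroid, a medoid is always one of the training samples, so it can be found and used from distances alone, without access to the feature vectors.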
…On 8 Mar 2017 12:01 pm, "Sergul Aydore" ***@***.***> wrote:
OK, now it raises an error if "precomputed" is used as a metric.
|
@@ -133,6 +133,8 @@ def fit(self, X, y):
        self.centroids_[cur_class] = np.median(X[center_mask], axis=0)
    else:
        self.centroids_[cur_class] = csc_median_axis_0(X[center_mask])
elif self.metric == 'precomputed':
MechCoder
Mar 27, 2017
Member
I think we should raise this error earlier?
jnothman
Mar 27, 2017
Member
I guess so. And perhaps the error message should be more informative and explicitly say that precomputed is not supported.
MechCoder
May 18, 2017
Member
You can raise it as early as here (https://github.com/sergulaydore/scikit-learn/blob/38ad1d24690fea9f8d1867c2f8b9587c0204268d/sklearn/neighbors/nearest_centroid.py#L98), with an explicit error message saying metric="precomputed" not supported.
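The suggestion above can be sketched roughly as follows. This is a simplified stand-in written in plain Python, not the actual scikit-learn code: the class name `NearestCentroidSketch` and the exact error wording are assumptions. The point is only that the check sits at the top of `fit`, before any centroid is computed.

```python
class NearestCentroidSketch:
    """Hypothetical, simplified stand-in for the estimator under discussion."""

    def __init__(self, metric="euclidean"):
        self.metric = metric

    def fit(self, X, y):
        # Reject the unsupported metric before doing any other work,
        # with a message that names the offending value explicitly.
        if self.metric == "precomputed":
            raise ValueError("metric='precomputed' is not supported by this estimator.")

        # Group the rows of X by class label.
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append(row)

        # One centroid per class: the feature-wise mean of its rows.
        self.classes_ = sorted(groups)
        self.centroids_ = [
            [sum(col) / len(groups[c]) for col in zip(*groups[c])]
            for c in self.classes_
        ]
        return self
```

Failing at the start of `fit` means the user gets the error before any per-class work is done, and the message says which argument to change.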
+1 otherwise, also this might deserve a whatsnew entry. |
Merged commit 4397e7e into scikit-learn:master.
Thanks! |
Thanks everyone for your patience. It was my first PR :)
… On May 18, 2017, at 11:22 PM, Manoj Kumar ***@***.***> wrote:
Thanks!
|
congrats!
On Fri, May 19, 2017 at 12:02 AM, Sergul Aydore <notifications@github.com>
wrote:
… Thanks everyone for your patience. It was my first PR :)
--
Manoj,
http://github.com/MechCoder
|
Reference Issue #8505
What does this implement/fix? Explain your changes.
It is ambiguous what self.centroids_ should be when X is already a distance matrix, so there is no reason to test the precomputed case.
Any other comments?
This was added in commit cf811d2 by @mblondel