-
-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAINT Clean up deprecations for 1.5: in NearestCentroid #28813
Merged
adrinjalali
merged 7 commits into
scikit-learn:main
from
jeremiedbb:cln-deprecations-1.5-nearest_centroid_metrics
May 2, 2024
Merged
Changes from 1 commit
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
0187b96
cln deprecations + rework docstring
jeremiedbb 41e5919
review comments
jeremiedbb 9f33af8
Merge remote-tracking branch 'upstream/main' into pr/jeremiedbb/28813
jeremiedbb c986f52
lint
jeremiedbb ea96e44
Update sklearn/neighbors/_nearest_centroid.py
jeremiedbb ba2c3a5
Update sklearn/neighbors/_nearest_centroid.py
jeremiedbb f68fe40
Update sklearn/neighbors/_nearest_centroid.py
jeremiedbb File filter
Filter by extension
Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,7 +7,6 @@ | |
# | ||
# License: BSD 3 clause | ||
|
||
import warnings | ||
from numbers import Real | ||
|
||
import numpy as np | ||
|
@@ -34,20 +33,13 @@ class NearestCentroid(ClassifierMixin, BaseEstimator): | |
|
||
Parameters | ||
---------- | ||
metric : str or callable, default="euclidean" | ||
Metric to use for distance computation. See the documentation of | ||
`scipy.spatial.distance | ||
<https://docs.scipy.org/doc/scipy/reference/spatial.distance.html>`_ and | ||
the metrics listed in | ||
:class:`~sklearn.metrics.pairwise.distance_metrics` for valid metric | ||
values. Note that "wminkowski", "seuclidean" and "mahalanobis" are not | ||
supported. | ||
|
||
The centroids for the samples corresponding to each class is | ||
the point from which the sum of the distances (according to the metric) | ||
of all samples that belong to that particular class are minimized. | ||
If the `"manhattan"` metric is provided, this centroid is the median | ||
and for all other metrics, the centroid is now set to be the mean. | ||
metric : {"euclidean", "manhattan"}, default="euclidean" | ||
Metric to use for distance computation. | ||
|
||
If `metric="euclidean"`, the centroid for the samples corresponding to each | ||
class is the arithmetic mean, which minimizes the sum of squared L1 distances. | ||
If `metric="manhattan"`, the centroid is the feature-wise median, which | ||
minimizes the sum of L1 distances. | ||
|
||
.. deprecated:: 1.3 | ||
Support for metrics other than `euclidean` and `manhattan` and for | ||
|
@@ -111,12 +103,7 @@ class NearestCentroid(ClassifierMixin, BaseEstimator): | |
_valid_metrics = set(_VALID_METRICS) - {"mahalanobis", "seuclidean", "wminkowski"} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This attribute can probably also be deleted, no? |
||
|
||
_parameter_constraints: dict = { | ||
"metric": [ | ||
StrOptions( | ||
_valid_metrics, deprecated=_valid_metrics - {"manhattan", "euclidean"} | ||
), | ||
callable, | ||
], | ||
"metric": [StrOptions({"manhattan", "euclidean"})], | ||
"shrink_threshold": [Interval(Real, 0, None, closed="neither"), None], | ||
} | ||
|
||
|
@@ -143,19 +130,6 @@ def fit(self, X, y): | |
self : object | ||
Fitted estimator. | ||
""" | ||
if isinstance(self.metric, str) and self.metric not in ( | ||
"manhattan", | ||
"euclidean", | ||
): | ||
warnings.warn( | ||
( | ||
"Support for distance metrics other than euclidean and " | ||
"manhattan and for callables was deprecated in version " | ||
"1.3 and will be removed in version 1.5." | ||
), | ||
FutureWarning, | ||
) | ||
|
||
# If X is sparse and the metric is "manhattan", store it in a csc | ||
# format is easier to calculate the median. | ||
if self.metric == "manhattan": | ||
|
@@ -195,14 +169,7 @@ def fit(self, X, y): | |
self.centroids_[cur_class] = np.median(X[center_mask], axis=0) | ||
else: | ||
self.centroids_[cur_class] = csc_median_axis_0(X[center_mask]) | ||
else: | ||
# TODO(1.5) remove warning when metric is only manhattan or euclidean | ||
if self.metric != "euclidean": | ||
warnings.warn( | ||
"Averaging for metrics other than " | ||
"euclidean and manhattan not supported. " | ||
"The average is set to be the mean." | ||
) | ||
else: # metric==euclidean | ||
jeremiedbb marked this conversation as resolved.
Show resolved
Hide resolved
|
||
self.centroids_[cur_class] = X[center_mask].mean(axis=0) | ||
|
||
if self.shrink_threshold: | ||
|
@@ -231,7 +198,6 @@ def fit(self, X, y): | |
self.centroids_ = dataset_centroid_[np.newaxis, :] + msd | ||
return self | ||
|
||
# TODO(1.5) remove note about precomputed metric | ||
def predict(self, X): | ||
"""Perform classification on an array of test vectors `X`. | ||
|
||
|
@@ -246,12 +212,6 @@ def predict(self, X): | |
------- | ||
C : ndarray of shape (n_samples,) | ||
The predicted classes. | ||
|
||
Notes | ||
----- | ||
If the metric constructor parameter is `"precomputed"`, `X` is assumed | ||
to be the distance matrix between the data to be predicted and | ||
`self.centroids_`. | ||
""" | ||
check_is_fitted(self) | ||
|
||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section should be updated as a
versionchanged
(as done for themetric='precomputed'
case in 0.19) or just removed (but then the previousversionchanged
should also be removed for consistency).