
API Allow users to pass DistanceMetric objects to metric keyword argument in neighbors.KNeighborsRegressor #26267

Merged
merged 22 commits into scikit-learn:main from metric_obj on Aug 1, 2023

Conversation

Micky774
Contributor

@Micky774 Micky774 commented Apr 23, 2023

Reference Issues/PRs

Addresses #26010
Addresses #26329

What does this implement/fix? Explain your changes.

Allow users to pass DistanceMetric objects to metric keyword argument in neighbors.KNeighborsRegressor.

Any other comments?

I have discussed with @betatim the idea of developing an Engine for DistanceMetric, however that is probably not feasible in the near future without adding to the complexity of the already-open work on the plugin system. It is probably the best strategy once that system is a bit more mature, but until then, directly passing DistanceMetric objects provides a fast and simple route for the development and use of third-party DistanceMetric implementations.

@jjerphan jjerphan self-requested a review May 19, 2023 15:13
Member

@jjerphan jjerphan left a comment


Thank you, @Micky774.

As discussed directly with you, I think having efficient vectorized distance metric implementations is of value for scikit-learn generally. In this regard I am +5 on merging this PR.

As you mention, the use of third-party packages (such as yours, https://github.com/Micky774/distmetric-xsimd/) is limited by the absence of the plugin system.

Yet, I am not sure we want to relax our API, change scikit-learn's UX, and advertise it publicly if this is meant to be experimental or temporary: what this PR allows (i.e. passing DistanceMetric{32,} instances to the metric keyword) loosens that contract. I think there are notable UX benefits in constraining users to a single, package-uniform specification (mainly clarity and uniformity across all APIs using metric). In this regard, I am -1 on merging this PR.

Still, we might want to have a way for advanced users (or developers of third-party packages like yours) to use those packages in the meantime. An alternative temporary approach would be to have third-party packages monkey-patch scikit-learn internals when they are imported at runtime. This is the approach https://github.com/intel/scikit-learn-intelex uses, and it has the benefit of keeping scikit-learn's UX unchanged. Using this approach, you could change those mappings to point to your implementations instead:

######################################################################
# metric mappings
# These map from metric id strings to class names
METRIC_MAPPING{{name_suffix}} = {
    'euclidean': EuclideanDistance{{name_suffix}},
    'l2': EuclideanDistance{{name_suffix}},
    'minkowski': MinkowskiDistance{{name_suffix}},
    'p': MinkowskiDistance{{name_suffix}},
    'manhattan': ManhattanDistance{{name_suffix}},
    'cityblock': ManhattanDistance{{name_suffix}},
    'l1': ManhattanDistance{{name_suffix}},
    'chebyshev': ChebyshevDistance{{name_suffix}},
    'infinity': ChebyshevDistance{{name_suffix}},
    'seuclidean': SEuclideanDistance{{name_suffix}},
    'mahalanobis': MahalanobisDistance{{name_suffix}},
    'wminkowski': WMinkowskiDistance{{name_suffix}},
    'hamming': HammingDistance{{name_suffix}},
    'canberra': CanberraDistance{{name_suffix}},
    'braycurtis': BrayCurtisDistance{{name_suffix}},
    'matching': MatchingDistance{{name_suffix}},
    'jaccard': JaccardDistance{{name_suffix}},
    'dice': DiceDistance{{name_suffix}},
    'kulsinski': KulsinskiDistance{{name_suffix}},
    'rogerstanimoto': RogersTanimotoDistance{{name_suffix}},
    'russellrao': RussellRaoDistance{{name_suffix}},
    'sokalmichener': SokalMichenerDistance{{name_suffix}},
    'sokalsneath': SokalSneathDistance{{name_suffix}},
    'haversine': HaversineDistance{{name_suffix}},
    'pyfunc': PyFuncDistance{{name_suffix}},
}

This also has the benefit of making other methods of scikit-learn use your implementations.
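The import-time patching pattern described here can be sketched with a stand-in registry; the real METRIC_MAPPING lives in a private Cython module whose layout varies across versions, so the names below are purely illustrative:

```python
# Stand-in for a library-internal metric registry (illustrative only; not
# scikit-learn's actual private mapping).
METRIC_MAPPING = {
    "euclidean": "BuiltinEuclideanDistance",
    "manhattan": "BuiltinManhattanDistance",
}

def patch_metric(registry, name, implementation):
    """Swap a registry entry in place, returning the original so it can be restored."""
    original = registry.get(name)
    registry[name] = implementation
    return original

# A third-party package would run this from its own __init__ when imported.
original = patch_metric(METRIC_MAPPING, "euclidean", "SimdEuclideanDistance")

assert METRIC_MAPPING["euclidean"] == "SimdEuclideanDistance"
assert original == "BuiltinEuclideanDistance"
```

Because every code path that consults the registry then resolves to the patched entry, the substitution applies codebase-wide without any user-visible API change.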

What do you think?

@Micky774
Contributor Author

Micky774 commented May 24, 2023

As you mention, the use of third-party packages (such as yours, https://github.com/Micky774/distmetric-xsimd/) is limited by the absence of the plugin system.

Yet, I am not sure we want to relax our API, change scikit-learn's UX, and advertise it publicly if this is meant to be experimental or temporary: what this PR allows (i.e. passing DistanceMetric{32,} instances to the metric keyword) loosens that contract. I think there are notable UX benefits in constraining users to a single, package-uniform specification (mainly clarity and uniformity across all APIs using metric). In this regard, I am -1 on merging this PR.

We actually discussed this a bit during the most recent monthly meeting, and from my conversation with @ogrisel my current understanding is that the plugin system is intended to expose estimator-specific computational routines rather than wide-used computational backends (e.g. PairwiseDistanceReduction or DistanceMetric{32} implementations). In this way, the expansion of the metric keyword to accept DistanceMetric{32} objects is independent of that and would not need deprecation to favor plugins.

Still, we might want to have a way for advanced users (or developers of third-party packages like yours) to use those packages in the meantime. An alternative temporary approach would be to have third-party packages monkey-patch scikit-learn internals when they are imported at runtime. This is the approach https://github.com/intel/scikit-learn-intelex uses, and it has the benefit of keeping scikit-learn's UX unchanged. Using this approach, you could change those mappings to point to your implementations instead:

[METRIC_MAPPING{{name_suffix}} snippet quoted above omitted]

This also has the benefit of making other methods of scikit-learn use your implementations.

What do you think?

This is definitely an approach worth considering, but truth be told I also think it would be good for scikit-learn to more publicly expose opportunities for third-party acceleration of computational backends even if not through the plugin system. We have recently been discussing what the role of scikit-learn in the community is, and what our obligations are. One of the more critical questions, I think, is whether or not maximizing performance optimization is our obligation.

It's a bit tricky, imo. We should aim for performance wherever possible -- and indeed work like your PairwiseDistanceReduction backend is a massive effort in this regard -- but at the same time there are so many niche tools and system-specific boons that we could use but that would be incredibly costly to implement/maintain. I favor trying to make it as easy and canonical as possible for third-party programs to accelerate scikit-learn, in parallel to our own internal efforts.

In this way, I see the relaxation of the API for the metric keyword as a feature, not a bug. It is an invitation and advertisement for interested users/developers to devise their own solutions for DistanceMetric{32} and contribute them to the community -- it makes it more obvious to users that this is something they could write their own implementation of. Plus, we can still ensure the stability and uniformity of metric semantics across scikit-learn. Otherwise, I worry that if our go-to solution for third-party accelerators is just "monkey-patch us", then it poses a higher barrier to entry.

I will admit my inexperience with this though. If other core-devs believe that the cost to relaxing the API outweighs the benefits of offering a more obvious avenue of contribution for third-party acceleration, then I am okay with the monkey-patching solution. I am not too attached to the specific path taken, just to the results of a faster scikit-learn and -- if at all possible -- an invitation to the community to try out their own acceleration strategies on DistanceMetric{32}.

cc: @ogrisel @adrinjalali since we had actively discussed the role of the plugins system, and the burden of this proposed API change during the monthly meeting

@betatim
Member

betatim commented May 24, 2023

Could you post an example snippet of what user code with KNeighborsRegressor would look like with this PR merged? I looked at the tests to get a feeling for it, but the tests are quite tricky because they test a lot of things at once. Hence I think having a snippet of user code would be nice.

I also looked at https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KDTree.html where you can already pass a DistanceMetric to the metric keyword. Is this a precedent for this new API?


I'm not sure I have a firm opinion yet, but my first reaction is that we should preferably use one way to allow users to extend/modify scikit-learn. I'm not sure this is the case here, but imagine we'd end up with a world where there were "estimator plugins" and "distance metric plugins" (using "plugin" as a generic word for "extending scikit-learn"). I think that would be quite confusing to explain to users. I think having one system would be better.

@Micky774
Contributor Author

Could you post an example snippet of what user code with KNeighborsRegressor would look like with this PR merged? I looked at the tests to get a feeling for it, but the tests are quite tricky because they test a lot of things at once. Hence I think having a snippet of user code would be nice.

Sure! For the sake of example I'll assume the user has installed my sample accelerated implementations with an installation of scikit-learn from this branch.

Then it would look like:

from sklearn.neighbors import KNeighborsRegressor
import numpy as np
# Step 1
from distance_metrics import get_distance_metric # package name is a WIP and will soon be slsdm :)

random_state = 0
n_samples_X = 100
n_features = 10
n_classes = 3

rng = np.random.RandomState(random_state)
X = rng.randn(n_samples_X, n_features)
y = rng.randint(n_classes, size=n_samples_X)

# Step 2
dst = get_distance_metric(X, "manhattan")

# Step 3
neigh = KNeighborsRegressor(n_neighbors=2, metric=dst)
neigh.fit(X, y)

print(neigh.predict(X[:10]))

I also looked at https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KDTree.html where you can already pass a DistanceMetric to the metric keyword. Is this a precedent for this new API?

Indeed! This actually makes the implementation very straightforward for many estimators, since the functionality is largely already available (we just need to relax the estimators' own validation).

I'm not sure I have a firm opinion yet, but my first reaction is that we should preferably use one way to allow users to extend/modify scikit-learn. I'm not sure this is the case here, but imagine we'd end up with a world where there were "estimator plugins" and "distance metric plugins" (using "plugin" as a generic word for "extending scikit-learn"). I think that would be quite confusing to explain to users. I think having one system would be better.

I definitely agree that having one system would be better. I personally would be okay with multiple paths if it made sense and reduced barriers to entry. I don't know if this API change I propose actually accomplishes that or not, but that is the underlying hope.

I've attached a list of potential paths forward. What do you think of option four:

Partially accept the API change as an experimental feature and remove without deprecation in favor of the Engine API later


Some Potential Paths Forward

I just want to go ahead and summarize some of the paths forward we have discussed.

1. Accept the API change and do not extend the Engine API to cover DistanceMetric{32}

If you want to create a new distance metric then you probably either:

  1. Want to use a new (perhaps exotic) metric, in which case you may extend DistanceMetric{32} and flesh it out.
  2. Want to use a fancy implementation (e.g. SIMD/GPU) of an existing metric, in which case you may extend the default implementation for the metric and overwrite whatever you need.

And in either case instantiate the object and pass it directly via metric.

There is a clear, single path to accelerating DistanceMetric{32}.

2. Accept the API change and extend the Engine API to cover DistanceMetric{32}

The prior options still apply, however now you can also develop a new distance metric using its corresponding Engine, a stricter API that includes usability checks and allows for use through entry-point registration. From a user-experience perspective, Engine-API-compliant implementations are almost always preferred, except perhaps for cases where a user wants to write a new distance metric as quickly as possible, in which case simply extending the classes as mentioned above may be preferred (even if not recommended).

This is where the confusion you mentioned would set in, making it unclear when one should make an implementation of the Engine or DistanceMetric{32} directly.

3. Reject the API and instead extend the Engine API to cover DistanceMetric{32}

Developers must write Engine API compliant implementations, but get all the benefits of doing so.

There is a clear, single path to accelerating DistanceMetric{32}.

4. Partially accept the API change as an experimental feature and remove without deprecation in favor of the Engine API later

Although the feature will be more obscure, it will still be available. Users with a strong need may cross the extra hurdle to enable the feature and use a third-party implementation while average users see no change. Later, when the Engine API is more mature and is extended to cover DistanceMetric{32} the experimental feature can be disabled and we can advocate for the Engine API as the sole path for extending DistanceMetric{32}.

At any given time, there is exactly one path for extending DistanceMetric{32}, though initially it is less discoverable. However, the less-discoverable path is likely available much sooner than the full Engine API.
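Option four's "extra hurdle" could mirror scikit-learn's existing enable-by-explicit-import convention (as with sklearn.experimental.enable_halving_search_cv). The toy sketch below shows only the gating pattern; none of it is real scikit-learn code:

```python
# Toy feature-flag gate in the spirit of sklearn.experimental's enable_* modules.
_EXPERIMENTAL = {"distance_metric_objects": False}

def enable_distance_metric_objects():
    """Stand-in for what importing a hypothetical enable_* module would do."""
    _EXPERIMENTAL["distance_metric_objects"] = True

def validate_metric(metric):
    """Accept strings always; accept arbitrary metric objects only once enabled."""
    if not isinstance(metric, str) and not _EXPERIMENTAL["distance_metric_objects"]:
        raise TypeError("pass a metric name, or enable the experimental feature")
    return metric

# Before enabling: non-string metrics are rejected.
try:
    validate_metric(object())
    raised = False
except TypeError:
    raised = True

# After enabling: they pass validation.
enable_distance_metric_objects()
enabled_ok = validate_metric(object()) is not None
assert raised and enabled_ok
```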

@adrinjalali
Member

I'm in favor of accepting this API enhancement no matter what we decide on the engine API side (I slightly prefer this to the engine API acceleration path I think, but no strong feelings there).

It would be nice to unify metrics across the code base, rather than having different sets of accepted metrics for different classes.

Member

@jjerphan jjerphan left a comment


It would be nice to unify metrics across the code base, rather than having different sets of accepted metrics for different classes.

I also agree. Since some classes like KDTree and BallTree support DistanceMetric for metric as indicated by @betatim in #26267 (comment), the natural solution would be to have metric support DistanceMetric for all other interfaces exposing it.

Hence I am in favor of this PR, especially given that @Micky774 has come up with compelling performance improvement in the meantime. 🌟

I think some preliminary refactoring might be needed before approving this PR.

@Micky774
Contributor Author

Micky774 commented Jun 2, 2023

As an update regarding the motivation of this API, it turns out for KNeighborsRegressor on data with many features (e.g. n_features=500) we see some drastic speedups of the predict method as a whole. We see anywhere from 5.5-8x speedup on float32 data, and a 3-4x speedup on float64 data. Note that this is when n_jobs=None for algorithm='brute' which parallelizes using all available threads (I believe it is equivalent to n_jobs=-1).

Note that the following benchmarks were generated with this gist

Plots: two benchmark figures attached (float32 and float64 speedup comparisons; images not reproduced here).

@betatim
Member

betatim commented Jun 2, 2023

With this PR merged I'd be able to use a DistanceMetric object (or instance? but this is a detail) from anywhere, it doesn't have to be part of scikit-learn, right? This in turn means I could use an implementation that uses SIMD instructions or other dark magic.

If yes, can we spell this out in the docs somewhere and maybe even link to an alternative source of DistanceMetrics? Given the speedups shown I think it is worth investing in some docs to make sure the average person on the internet highway understands the potential gains this PR brings. Put differently, I worry that without docs this is too obscure/too many steps for people to understand and as a result won't get used that much.

@Micky774
Contributor Author

Micky774 commented Jun 2, 2023

With this PR merged I'd be able to use a DistanceMetric object (or instance? but this is a detail) from anywhere, it doesn't have to be part of scikit-learn, right? This in turn means I could use an implementation that uses SIMD instructions or other dark magic.

Precisely 😄

If yes, can we spell this out in the docs somewhere and maybe even link to an alternative source of DistanceMetrics? Given the speedups shown I think it is worth investing in some docs to make sure the average person on the internet highway understands the potential gains this PR brings. Put differently, I worry that without docs this is too obscure/too many steps for people to understand and as a result won't get used that much.

Fully agreed. I originally intended to include it in documentation once the functionality was exposed through the major pairwise distance reduction backends mentioned in the issue, but I suppose there's not really any reason to wait that long. We can have a maintained list of estimators for which the functionality is available, and then remove the list once the use of the metric keyword is made consistent across the codebase.

I'll start on that, and can provide my library as an example implementation (this'll kick me into gear to actually improve the library's documentation as well 😅).

@github-actions

github-actions bot commented Jul 3, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: e4d8413.

@Micky774
Contributor Author

Micky774 commented Jul 19, 2023

For those interested in trying this, there are now published wheels for Linux/macOS x86-64 systems for my SIMD-accelerated DistanceMetric implementation. You can also build from source if you want anything more than the SSE3 support that the wheels ship with.

If you want to quickly sanity check the speedups you can:

  1. Build scikit-learn from source on my metric_obj branch (this PR)
  2. pip install slsdm or follow the instructions to build from source (requires gcc)
  3. Follow along the quick demo in the repo

With that aside, I wanted to go ahead and ping @scikit-learn/core-devs to see if anyone has the capacity/interest in reviewing this PR. I think the API proposed here is a reasonable extension of what we already have for {KD, Ball}Tree, and in concert with #25914 and #25561 would bring us closer to a unified API for the metric keyword argument while inviting third-party acceleration.

Member

@jjerphan jjerphan left a comment


LGTM.

Could we treat other estimators in other dedicated PRs?

@jjerphan jjerphan added the Waiting for Second Reviewer First reviewer is done, need a second one! label Jul 20, 2023
@Micky774 Micky774 changed the title API Allow users to pass DistanceMetric objects to metric keyword argument API Allow users to pass DistanceMetric objects to metric keyword argument in neighbors.KNeighborsRegressor Jul 20, 2023
@Micky774
Contributor Author

@thomasjpfan @ogrisel In case either of you were interested in reviewing

elif (
callable(self.metric)
or self.metric in VALID_METRICS["ball_tree"]
or isinstance(self.metric, DistanceMetric)
Member


API-wise, self.algorithm selects ball_tree all the time if self.metric is a DistanceMetric. Is this the preferred default for self.algorithm="auto"?

Contributor Author


This preserves the standard behavior, e.g. metric="euclidean" and metric=DistanceMetric.get_metric("euclidean") will both result in a BallTree with an EuclideanDistance object.

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@thomasjpfan thomasjpfan enabled auto-merge (squash) August 1, 2023 17:39
@thomasjpfan thomasjpfan merged commit a16d367 into scikit-learn:main Aug 1, 2023
26 checks passed
@Micky774 Micky774 deleted the metric_obj branch August 1, 2023 19:06
9Y5 pushed a commit to 9Y5/scikit-learn that referenced this pull request Aug 2, 2023
…argument in `neighbors.KNeighborsRegressor` (scikit-learn#26267)

Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
…argument in `neighbors.KNeighborsRegressor` (scikit-learn#26267)

Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>