[MRG] add _pairwise property to BaseSearchCV #13925

Closed

Conversation

eyes-robson

Fixes #13920

Previously, BaseSearchCV did not support estimators that use a 'precomputed' distance metric.

This PR adds a _pairwise property to BaseSearchCV so that an X computed with the 'precomputed' distance metric is split correctly by _safe_split during cross-validation.

@eyes-robson eyes-robson changed the title from "add _pairwise property to BaseSearchCV" to "[MRG] add _pairwise property to BaseSearchCV" May 22, 2019
Member

@jnothman jnothman left a comment

Please add a test

Member

@jnothman jnothman left a comment

Maybe what I really want here is a non-regression test for the example posted in the original issue.

setattr(est, '_pairwise', _pairwise_setting)
cv = PairwiseCV(est)

# check if cv is pairwise
Member

this comment is redundant with the variable name

cv = PairwiseCV(est)

# check if cv is pairwise
cv_is_pairwise = getattr(cv, '_pairwise', False)
Member

why not just use cv._pairwise?

Author

I was following the convention of _safe_split but hit the 80-character limit, so I broke it into a separate line. Since accessing cv._pairwise directly won't raise an error here, we can just use it. I'll make the swap.
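
The trade-off in this exchange can be sketched in plain Python. `getattr` with a default is the defensive form, useful when an arbitrary object may not define the attribute at all, while direct access is enough once the class is known to define it. The two classes below are hypothetical stand-ins, not scikit-learn estimators.

```python
class PlainEstimator:
    """Hypothetical estimator that never defines _pairwise."""


class PrecomputedEstimator:
    """Hypothetical estimator that declares itself pairwise."""
    _pairwise = True


def is_pairwise(estimator):
    # Defensive lookup: a missing attribute is treated as False.
    return getattr(estimator, "_pairwise", False)


plain_flag = is_pairwise(PlainEstimator())
precomputed_flag = is_pairwise(PrecomputedEstimator())

# Direct access only works when the attribute is guaranteed to exist.
try:
    PlainEstimator()._pairwise
    direct_access_failed = False
except AttributeError:
    direct_access_failed = True
```

In a test where the search object always defines the property, the direct form `cv._pairwise` is the shorter and clearer choice.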

@eyes-robson
Author

I made your suggested edits and added a classification test similar to #13920 with KNeighborsClassifier, although I used the make_classification dataset instead of iris. This is a bit slimmer computationally and saves an import.

We could get rid of the old test case I'd written, but the original is a bit broader in scope as it uses BaseSearchCV for any custom CV object instead of just using GridSearchCV.

Member

@jnothman jnothman left a comment

Please add an entry to the change log at doc/whats_new/v0.22.rst. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors, if applicable) with :user:.

Member

@NicolasHug NicolasHug left a comment

a few comments but looks good

:mod:`sklearn.model_selection`
..............................

- |Fix| :class:`model_selection.BaseSearchCV` now supports the
Member

Please refer to GridSearchCV and RandomizedSearchCV instead.

Test implementation of BaseSearchCV has the _pairwise property
which matches the _pairwise property of its estimator.
"""
class PairwiseCV(BaseSearchCV):
Member

Why not use GridSearchCV instead of creating a dummy class?

def __init__(self, estimator, **kwargs):
    super().__init__(estimator, **kwargs)

# first test: check BaseSearchCV children copy _pairwise
Member

I'm not sure what that means.
What about:

Make sure _pairwise is delegated to the base estimator
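
A minimal sketch of what "delegated to the base estimator" could look like, together with the kind of check the test performs. `ToyEstimator` and `ToySearch` are hypothetical stand-ins for illustration only. They are not the classes in this PR and do not reproduce BaseSearchCV.

```python
class ToyEstimator:
    """Hypothetical estimator whose pairwise flag depends on its metric."""

    def __init__(self, metric="euclidean"):
        self.metric = metric

    @property
    def _pairwise(self):
        return self.metric == "precomputed"


class ToySearch:
    """Hypothetical search wrapper that forwards the flag from its estimator."""

    def __init__(self, estimator):
        self.estimator = estimator

    @property
    def _pairwise(self):
        # Read the flag from the wrapped estimator on every access, so a
        # later change to the estimator is reflected immediately.
        return getattr(self.estimator, "_pairwise", False)


# The wrapper should report the same flag as the estimator in both cases.
results = {}
for metric in ("euclidean", "precomputed"):
    est = ToyEstimator(metric=metric)
    search = ToySearch(est)
    results[metric] = (est._pairwise, search._pairwise)
```

Because the property is computed from the wrapped estimator instead of being copied at construction time, the wrapper cannot drift out of sync with it.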

@NicolasHug NicolasHug added this to "To do" in Sprint Scipy 2019 Jul 13, 2019
@NicolasHug NicolasHug added the "Easy" (well-defined and straightforward way to resolve) and "help wanted" labels Jul 13, 2019
@xun-tang
Contributor

xun-tang commented Nov 2, 2019

There has been no activity for several months, so I'm picking this one up and will try to address the lingering comments.

@cmarmo
Member

cmarmo commented Dec 12, 2019

@jnothman, @NicolasHug, if I understand correctly this PR has been superseded by #15524, already merged.
Should this one be closed?

@jnothman
Member

Thanks @cmarmo

@jnothman jnothman closed this Dec 12, 2019
Labels
Easy (well-defined and straightforward way to resolve), help wanted
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

Nested Cross Validation for precomputed KNN
5 participants