Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH Add routing to LogisticRegressionCV #26525

Merged
merged 20 commits into from Jul 14, 2023

Conversation

OmarManzoor
Copy link
Contributor

@OmarManzoor OmarManzoor commented Jun 7, 2023

Reference Issues/PRs

Closes #25906
Fixes #8950
Follow up of #24498

What does this implement/fix? Explain your changes.

  • Adds routing of additional parameters including sample weight to LogisticRegressionCV
  • The routing is added in the fit and score methods
  • A test is added for checking that the scores different when sample weight is requested compared to when it is not

Any other comments?

cc: @adrinjalali
Should I also add the tests that I added for scorers and splitters in test_metaestimators_metadata_routing.py? Also should any additional tests be added now that we have two scenarios, one where the config enable_metadata_routing is True and another where it is False?

@adrinjalali
Copy link
Member

re:tests: yes, they were quite nice, it'd be nice to have them here.

@OmarManzoor OmarManzoor marked this pull request as ready for review June 13, 2023 15:23
Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, I'd also add a changelog to 1.4.

sklearn/linear_model/_logistic.py Outdated Show resolved Hide resolved
sklearn/linear_model/_logistic.py Outdated Show resolved Hide resolved
sklearn/linear_model/_logistic.py Outdated Show resolved Hide resolved
sklearn/linear_model/_logistic.py Outdated Show resolved Hide resolved
sklearn/linear_model/_logistic.py Outdated Show resolved Hide resolved
sklearn/linear_model/_logistic.py Outdated Show resolved Hide resolved
sklearn/linear_model/tests/test_logistic.py Outdated Show resolved Hide resolved
sklearn/tests/test_metadata_routing.py Outdated Show resolved Hide resolved
sklearn/tests/test_metaestimators_metadata_routing.py Outdated Show resolved Hide resolved
sklearn/tests/test_metaestimators_metadata_routing.py Outdated Show resolved Hide resolved
@OmarManzoor
Copy link
Contributor Author

This is great, I'd also add a changelog to 1.4.

Should this be a feature?

@github-actions
Copy link

github-actions bot commented Jun 25, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: a111bdd. Link to the linter CI: here

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @OmarManzoor this is great!

@adrinjalali
Copy link
Member

@OmarManzoor there are some merge conflicts which need to be resolved.

@thomasjpfan @glemaitre this is ready for a second review.

@adrinjalali
Copy link
Member

@OmarManzoor I think the codecov errors are legit and needs testing.

@glemaitre glemaitre self-requested a review July 12, 2023 14:50
@glemaitre glemaitre changed the title Add routing to LogisticRegressionCV FEA Add routing to LogisticRegressionCV Jul 12, 2023
@glemaitre glemaitre changed the title FEA Add routing to LogisticRegressionCV ENH Add routing to LogisticRegressionCV Jul 12, 2023
doc/whats_new/v1.4.rst Outdated Show resolved Hide resolved
if params and not _routing_enabled():
raise ValueError(
"params is only supported if enable_metadata_routing=True."
" See the User Guide for more information."
Copy link
Member

@glemaitre glemaitre Jul 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be useful to get the link to the user guide.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will definitely be useful. But if we add it here in one place whereas there are a number of other cases like this (this particular ValueError) that don't contain this link, won't that be a bit inconsistent?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can open a subsequent PR to improve the consistency.

sklearn/linear_model/tests/test_logistic.py Outdated Show resolved Hide resolved
sklearn/linear_model/tests/test_logistic.py Show resolved Hide resolved
assert pytest.approx(lr_cv1.scores_[1]) != lr_cv2.scores_[1]


def test_lr_cv_scores_without_enabling_metadata_routing():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here regarding an informative docstring.

Comment on lines 393 to 396
@pytest.mark.parametrize(
"cv_scorer",
CV_SCORERS,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@pytest.mark.parametrize(
"cv_scorer",
CV_SCORERS,
)
@pytest.mark.parametrize("cv_scorer", CV_SCORERS)

Comment on lines 419 to 422
@pytest.mark.parametrize(
"cv_splitter",
CV_SPLITTERS,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@pytest.mark.parametrize(
"cv_splitter",
CV_SPLITTERS,
)
@pytest.mark.parametrize("cv_splitter", CV_SPLITTERS)

@@ -109,12 +109,18 @@ def record_metadata(obj, method, record_default=True, **kwargs):
obj._records[method] = kwargs


def check_recorded_metadata(obj, method, **kwargs):
def check_recorded_metadata(obj, method, split_params=tuple(), **kwargs):
"""Check whether the expected metadata is passed to the object's method."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it will start to be worth documented the parameter here.
For instance what is the meaning of split_params

@pytest.mark.usefixtures("enable_slep006")
def test_lr_cv_scores_differ_when_sample_weight_is_requested():
"""Test sample_weight is correctly passed to the scorer in
LogisticRegressionCV :meth:`fit` by checking the difference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LogisticRegressionCV :meth:`fit` by checking the difference
`LogisticRegressionCV.fit` by checking the difference

@@ -2065,6 +2076,54 @@ def test_liblinear_not_stuck():
clf.fit(X_prep, y)


@pytest.mark.usefixtures("enable_slep006")
def test_lr_cv_scores_differ_when_sample_weight_is_requested():
"""Test sample_weight is correctly passed to the scorer in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Test sample_weight is correctly passed to the scorer in
"""Test `sample_weight` is correctly passed to the scorer in

def test_lr_cv_scores_differ_when_sample_weight_is_requested():
"""Test sample_weight is correctly passed to the scorer in
LogisticRegressionCV :meth:`fit` by checking the difference
in scores with the case when sample_weight is not requested.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
in scores with the case when sample_weight is not requested.
in scores with the case when `sample_weight` is not requested.

@glemaitre glemaitre self-requested a review July 13, 2023 18:22
@glemaitre

This comment was marked as outdated.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @OmarManzoor

@glemaitre glemaitre merged commit 02d20c1 into scikit-learn:main Jul 14, 2023
28 checks passed
@OmarManzoor OmarManzoor deleted the logistic_cv_routing branch July 17, 2023 10:27
punndcoder28 pushed a commit to punndcoder28/scikit-learn that referenced this pull request Jul 29, 2023
Co-authored-by: Omar Salman <omar.salman@arbisoft>
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Comment on lines +590 to +608
*,
pos_class,
Cs,
scoring,
fit_intercept,
max_iter,
tol,
class_weight,
verbose,
solver,
penalty,
dual,
intercept_scaling,
multi_class,
random_state,
max_squared_sum,
sample_weight,
l1_ratio,
score_params,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the removal of the default parameters, the docstring would need their removal too.

REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
Co-authored-by: Omar Salman <omar.salman@arbisoft>
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants