Fix `n_nonzero_coefs_` in `OrthogonalMatchingPursuit` always `None` when ignored #28557

lucyleeow · 2024-03-01T05:21:23Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Ensures that `n_nonzero_coefs_` in `OrthogonalMatchingPursuit` is always `None` when `n_nonzero_coefs` is ignored (i.e. when `tol` is set). Also clarifies the `n_nonzero_coefs` and `n_nonzero_coefs_` docstrings.

Any other comments?

cc @StefanieSenger you may be interested in reviewing?

github-actions · 2024-03-01T05:23:02Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: ddb458b.

StefanieSenger

I've given it another look and have suggested some little changes in the phrasing of the docstring.

Thinking about it, I was wondering why `n_nonzero_coefs_` is actually what it is and not `_n_nonzero_coefs` instead. It doesn't reflect much information about the fitted model, because it only holds a value under certain conditions (this was already the case before this PR) and thus doesn't reflect "the number of non-zero coefficients in the solution".

Also, there is a codecov warning for the newly added line, so I think you need to cover it in one of the tests.

I've approved this PR (seen this button for the first time), but I'm not a maintainer, so you need two more reviewers.

StefanieSenger · 2024-03-01T08:58:30Z

sklearn/linear_model/_omp.py

+        Desired number of non-zero entries in the solution. If `None` and `tol` is
+        also `None` this value is either set to 10% of `n_features` or 1, whichever is
+        greater. Ignored if `tol` is not `None`.


Hm, I think since the "10% or 1" is a calculation that refers to `n_nonzero_coefs_`, not `n_nonzero_coefs`, it's okay to leave it out here. That reads a bit cleaner:

Suggested change

-        Desired number of non-zero entries in the solution. If `None` and `tol` is
-        also `None` this value is either set to 10% of `n_features` or 1, whichever is
-        greater. Ignored if `tol` is not `None`.
+        Desired number of non-zero entries in the solution. Ignored if `tol` is set.

It's also fine if it stays as it is.

I think it is useful to say what happens when it is left as `None` (which the docstring in main does, just not completely correctly). But I agree it does not read as cleanly now, though I couldn't think of a way to improve it.

StefanieSenger · 2024-03-01T09:19:41Z

sklearn/linear_model/_omp.py

        The number of non-zero coefficients in the solution. If
        `n_nonzero_coefs` is None and `tol` is None this value is either set
        to 10% of `n_features` or 1, whichever is greater.
+        When `None`, `n_nonzero_coefs` has been ignored because `tol` has been set.


And then maybe explain the internal logic of calculating n_nonzero_coefs_ here:

Suggested change

-        The number of non-zero coefficients in the solution. If
-        `n_nonzero_coefs` is None and `tol` is None this value is either set
-        to 10% of `n_features` or 1, whichever is greater.
-        When `None`, `n_nonzero_coefs` has been ignored because `tol` has been set.
+        The number of non-zero coefficients in the solution. `None`, if `tol` is set,
+        otherwise is `n_nonzero_coefs` if this param is set. If both
+        `n_nonzero_coefs` and `tol` are `None`, this value is set to either 10%
+        of `n_features` or 1, whichever is greater.

I tried to rephrase so it's more intuitive. You can also ignore that if it doesn't feel more intuitive to you.

I see your point, I've reworded it.
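
For readers following the thread, the resolution order described in the suggested docstring can be sketched in plain Python. This is only an illustration of the wording above, not the code in `sklearn/linear_model/_omp.py`; the helper name `resolve_n_nonzero_coefs` and its signature are hypothetical.

```python
from typing import Optional


def resolve_n_nonzero_coefs(
    n_nonzero_coefs: Optional[int],
    tol: Optional[float],
    n_features: int,
) -> Optional[int]:
    """Hypothetical helper mirroring the docstring for `n_nonzero_coefs_`."""
    if tol is not None:
        # `tol` takes precedence, so `n_nonzero_coefs` is ignored.
        return None
    if n_nonzero_coefs is not None:
        # The user-provided value is used as-is.
        return n_nonzero_coefs
    # Both unset: 10% of `n_features`, but at least 1.
    return max(int(0.1 * n_features), 1)


print(resolve_n_nonzero_coefs(None, None, 50))  # 5
print(resolve_n_nonzero_coefs(None, None, 5))   # 1
print(resolve_n_nonzero_coefs(3, None, 50))     # 3
print(resolve_n_nonzero_coefs(3, 1e-3, 50))     # None
```

The last call is the case this PR is about: both parameters are given, `tol` wins, and the resolved value is `None`.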

lucyleeow · 2024-03-01T11:15:30Z

Good point about the test, I've added a separate test, to avoid testing too many things in one of the existing tests.

Thinking about it, I was wondering why n_nonzero_coefs_ is actually what it is and not _n_nonzero_coefs instead.

I guess it is mostly for the case when n_nonzero_coefs and tol are both None and n_nonzero_coefs needs to be calculated? Just to tell the user what n_nonzero_coefs value was used in that case? But yeah, otherwise it's not really providing much info.

I've approved this PR (seen this button for the first time), but I'm not a maintainer, so you need two more reviewers.

All good. Non-maintainer reviews are also valuable! :)

jeremiedbb

LGTM. Thanks @lucyleeow and @StefanieSenger

jeremiedbb · 2024-03-06T09:19:07Z

Actually, I would make it a |Fix| since the old behavior was wrong or at least misleading: when tol is not None, the effective number of non-zero coefs doesn't match n_nonzero_coefs_.
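
The mismatch described here can be reproduced with a small script. This is a rough sketch on synthetic data: the array shapes, the seed, and the chosen `n_nonzero_coefs` and `tol` values are arbitrary assumptions for illustration and are not taken from the PR or its tests.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

# Synthetic, noiseless data with three informative features (illustrative only).
rng = np.random.RandomState(0)
X = rng.randn(50, 20)
true_coef = np.zeros(20)
true_coef[[2, 7, 11]] = [1.5, -2.0, 3.0]
y = X @ true_coef

# Both parameters are set; `tol` takes precedence and `n_nonzero_coefs` is ignored.
model = OrthogonalMatchingPursuit(n_nonzero_coefs=1, tol=1e-6).fit(X, y)

# Effective number of selected features, read from the fitted coefficients.
n_selected = int(np.count_nonzero(model.coef_))
print("non-zero coefficients in coef_:", n_selected)
print("n_nonzero_coefs_:", model.n_nonzero_coefs_)
```

Because a single feature cannot drive the residual below `tol` on this data, more than one coefficient ends up non-zero even though `n_nonzero_coefs=1` was passed. With a scikit-learn build that includes this change, `n_nonzero_coefs_` should print as `None` rather than echoing the ignored value.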

lucyleeow · 2024-03-06T09:48:24Z

Done, thanks for the review @jeremiedbb!

fix coefs

7836bd1

github-actions bot added the module:linear_model label Mar 1, 2024

whats new

8d3e5ea

StefanieSenger reviewed Mar 1, 2024

StefanieSenger approved these changes Mar 1, 2024

lucyleeow added 2 commits March 1, 2024 21:38

wording

3123a58

add test

0b49025

Merge branch 'main' into doc_omp

bddaf38

jeremiedbb approved these changes Mar 6, 2024

review

ddb458b

jeremiedbb enabled auto-merge (squash) March 6, 2024 10:14

jeremiedbb merged commit 630961c into scikit-learn:main Mar 6, 2024
28 checks passed

lucyleeow deleted the doc_omp branch March 6, 2024 10:48
