Skip to content

Conversation

bharatr21
Copy link
Contributor

@bharatr21 bharatr21 commented May 10, 2022

Reference Issues/PRs

#17207

What does this implement/fix? Explain your changes.

Fixes #17207 by making error message clearer

Tagging @cmarmo and @joshuacwnewton for review based on the earlier PR

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR!

I think it would be good to have a test based on #17207 to check the new error message.

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@bharatr21
Copy link
Contributor Author

I didn't know we could use f-strings @thomasjpfan! In that case, this will be available only in v0.23 or later, right? (That's when support for Python v3.5 and earlier was removed)

@bharatr21 bharatr21 requested a review from thomasjpfan May 10, 2022 16:25
@thomasjpfan
Copy link
Member

thomasjpfan commented May 10, 2022

This fix will not be part of 0.23, but in 1.1.X or 1.2, so we can use f-strings. f-strings are currently used throughout the library since Python 3.5 was dropped.

@thomasjpfan
Copy link
Member

As noted in #23317 (review), this PR requires a test in sklearn/neighbors/tests/test_lof.py based on #17207 (comment) that checks that the correct new error message.

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted in #23317 (comment), this PR still requires a test to check the new message.

@cmarmo
Copy link
Contributor

cmarmo commented Jul 9, 2022

Hi @Bharat123rox will you be able to address the comments to this pull request? Thanks!

bharatr21 and others added 2 commits July 10, 2022 11:29
@Kshitij68
Copy link
Contributor

Hi @Bharat123rox ,
Will you be able to continue working on this PR? If not, I would like to take over

@bharatr21
Copy link
Contributor Author

Apologies @Kshitij68 @cmarmo I'm still working on this! I was having setup issues with my system on running pytest that's why I couldn't add the test so far

@bharatr21 bharatr21 requested a review from thomasjpfan October 15, 2022 18:46
@Kshitij68
Copy link
Contributor

Kshitij68 commented Oct 15, 2022

I am not a maintainer or core contributor to this repository, so please take my suggestions with a grain of salt.

Thanks for looking into this quickly.
Just looking at the CI build it seems two things are remaining:-

@bharatr21
Copy link
Contributor Author

bharatr21 commented Oct 16, 2022

  • I think this issue is eligible to be under changelog so that should be added.

Which version of the changelog? 1.1.x or 1.2.x?

@cmarmo
Copy link
Contributor

cmarmo commented Oct 16, 2022

  • I think this issue is eligible to be under changelog so that should be added.

Which version of the changelog? 1.1.x or 1.2.x?

1.2.x: hope it will be out soon.... with your PR included... :)

@bharatr21 bharatr21 requested a review from Kshitij68 November 27, 2022 15:34
@bharatr21 bharatr21 requested a review from Kshitij68 December 21, 2022 15:42
@Kshitij68
Copy link
Contributor

/home/runner/work/scikit-learn/scikit-learn/doc/whats_new/v1.2.rst:30: WARNING: Field list ends without a blank line; unexpected unindent.
A small fix is required for the docstrings.

Overall the PR looks good to me. Pre-approving it.

Note: I am not a core maintainer for scikit-learn, but a contributor. So you might get some additional change requests

Copy link
Member

@TomDLT TomDLT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is confusing.

  • First, we have n_neighbors=2 but the error message says n_neighbors=1. This is because LocalOutlierFactor.fit silently uses self.n_neighbors_ = max(1, min(n_neighbors, n_samples - 1)). We should be more explicit in the test.
  • Second, clf is not correctly fitted (because the fit raised an error). The fact that it does not raise a NotFittedError is because the error was caught by pytest, but this is confusing. We should add a proper fit for clarity.
  • Third, the test makes kneighbors raise both n_neighbors < n_samples_fit and n_neighbors <= n_samples_fit with the same number of neighbors. We should test that one is more restrictive than the other.

Proposed test:

def test_lof_error():
    X = np.ones((7, 7))

    # self.fit only raises if n_samples == 1.
    msg = ("Expected n_neighbors < n_samples_fit, but n_neighbors = 1, "
           "n_samples_fit = 1, n_samples = 1")
    with pytest.raises(ValueError, match=msg):
        clf = neighbors.LocalOutlierFactor(n_neighbors=1).fit(X[:1])
    clf = neighbors.LocalOutlierFactor(n_neighbors=2).fit(X[:2])

    # self.kneighbors(None) raises if n_neighbors == n_samples.
    msg = ("Expected n_neighbors < n_samples_fit, but n_neighbors = 2, "
           "n_samples_fit = 2, n_samples = 2")
    with pytest.raises(ValueError, match=msg):
        clf.kneighbors(None, n_neighbors=2)
    clf.kneighbors(None, n_neighbors=1)

    # self.kneighbors(X) raises if n_neighbors > n_samples.
    msg = ("Expected n_neighbors <= n_samples_fit, but n_neighbors = 3, "
           "n_samples_fit = 2, n_samples = 7")
    with pytest.raises(ValueError, match=msg):
        clf.kneighbors(X, n_neighbors=3)
    clf.kneighbors(X, n_neighbors=2)

@bharatr21 bharatr21 requested review from TomDLT and removed request for TomDLT December 22, 2022 19:29
Co-authored-by: Tom Dupré la Tour <tom.duprelatour.10@gmail.com>
@bharatr21 bharatr21 requested a review from TomDLT December 23, 2022 07:11
@glemaitre glemaitre self-requested a review November 18, 2023 16:07
Copy link

github-actions bot commented Nov 18, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: e51c52d. Link to the linter CI: here

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I solve the conflicts and just added a couple of assert instead of just running some of code lines.

Activating the auto-merge.

Thanks @bharatr21

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LocalOutlierFactor error when n_samples = 1 and n_neighbors = 1
7 participants