[MRG] Use tag requires_positive_X for NMF + ComplementNB #14680

Merged
merged 7 commits into from Aug 20, 2019

Conversation

@agramfort
Member

agramfort commented Aug 18, 2019

I need the tag requires_positive_X for #13246 following the suggestion of @rth
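For readers unfamiliar with estimator tags, here is a minimal sketch of how an estimator could declare `requires_positive_X`. It assumes the `_more_tags` convention and does not import scikit-learn; `DEFAULT_TAGS`, `ToyEstimator`, `ToyNonNegativeEstimator`, and `get_tags` are hypothetical names used only for illustration.

```python
# Toy illustration only: apart from the tag key "requires_positive_X" and the
# `_more_tags` method name, every name here is hypothetical.
DEFAULT_TAGS = {"requires_positive_X": False}


class ToyEstimator:
    """Estimator with no constraint on the sign of X."""

    def _more_tags(self):
        return {}


class ToyNonNegativeEstimator(ToyEstimator):
    """Estimator that declares it only accepts non-negative X."""

    def _more_tags(self):
        return {"requires_positive_X": True}


def get_tags(estimator):
    """Merge the default tags with the ones the estimator declares."""
    tags = dict(DEFAULT_TAGS)
    tags.update(estimator._more_tags())
    return tags
```

A common check can then read `get_tags(estimator)["requires_positive_X"]` to decide what data to feed the estimator.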

@agramfort agramfort changed the title Tag positive x Use tag requires_positive_X for NMF + ComplementNB Aug 18, 2019
@agramfort agramfort changed the title Use tag requires_positive_X for NMF + ComplementNB [MRG] Use tag requires_positive_X for NMF + ComplementNB Aug 18, 2019
@glemaitre

Contributor

glemaitre commented Aug 19, 2019

Shall we have a check_non_negative for NMF as well? We could also have a common test that checks that an error is raised when the tag is True.
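The common test suggested here could look roughly like the sketch below. It is not the check that was merged: `ToyNonNegativeEstimator` and `check_fit_rejects_negative_X` are hypothetical names, and plain nested lists stand in for arrays so the snippet runs without any dependency.

```python
import re


class ToyNonNegativeEstimator:
    """Hypothetical estimator that rejects negative input in fit."""

    def _more_tags(self):
        return {"requires_positive_X": True}

    def fit(self, X, y=None):
        if any(value < 0 for row in X for value in row):
            raise ValueError(
                "Negative values in data passed to ToyNonNegativeEstimator.fit"
            )
        return self


def check_fit_rejects_negative_X(estimator):
    """Return True if fit raises the expected ValueError on negative X."""
    if not estimator._more_tags().get("requires_positive_X", False):
        return True  # the check does not apply to this estimator
    X = [[-1.0, 1.0], [-1.0, 1.0]]
    y = [1, 2]
    try:
        estimator.fit(X, y)
    except ValueError as exc:
        pattern = "Negative values in data passed to"
        return re.search(pattern, str(exc)) is not None
    return False  # fit accepted negative data despite the tag
```

The check only applies when the tag is True, so estimators without the constraint pass trivially.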

@wdevazelhes

Contributor

wdevazelhes commented Aug 19, 2019

I had started a version of this locally, but I just saw this PR (see #14685).
Shouldn't ComplementNB also have the flag requires_positive_X? Looking at the user guide, it seems to be an enhancement of MultinomialNB. Maybe BernoulliNB too, since it takes binary inputs (0 or 1, so non-negative)?

Also, maybe this would be a good occasion to replace the X -= X.min() statements that deal with this by an enforce_estimator_tags_X helper, as was done with enforce_estimator_tags_y in #14095? (I tried to implement this in #14685.)
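As a rough illustration of the idea, a tag-driven helper could centralize the `X -= X.min()` shift. The sketch below is an assumption about what such a helper might do, not the code from #14685: `shift_X_for_tags` is a hypothetical name, it takes a plain tags dict, and it works on nested lists rather than arrays.

```python
def shift_X_for_tags(tags, X):
    """Return a copy of X, shifted to be non-negative if the tags require it."""
    if not tags.get("requires_positive_X", False):
        return [list(row) for row in X]
    # Same effect as `X -= X.min()`, but on a copy instead of in place.
    x_min = min(value for row in X for value in row)
    return [[value - x_min for value in row] for row in X]
```

Returning a copy rather than modifying X in place is a design choice of this sketch; it keeps the caller's data untouched across checks.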

@wdevazelhes

Contributor

wdevazelhes commented Aug 19, 2019

Also LatentDirichletAllocation?

X = self._check_non_neg_array(X, "LatentDirichletAllocation.fit")
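The second argument in the line above names the caller so that it can appear in the error message. A minimal sketch of that kind of validation, in the spirit of the check_non_negative helper mentioned earlier, might look as follows; the function name `check_non_negative_rows` and the nested-list input are assumptions made to keep the example dependency-free.

```python
def check_non_negative_rows(X, whom):
    """Raise ValueError if any entry of the nested list X is negative."""
    x_min = min(value for row in X for value in row)
    if x_min < 0:
        # `whom` identifies the caller, e.g. "LatentDirichletAllocation.fit".
        raise ValueError("Negative values in data passed to %s" % whom)
    return X
```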

agramfort added 2 commits Aug 19, 2019
Contributor

glemaitre left a comment

LGTM. Just add an entry in the what's new, since this changes check_estimator, which is public.

Member

qinhanmin2014 left a comment

We can always add this tag to more classes/functions in the future if needed.

estimator = clone(estimator_orig)
assert_raises_regex(ValueError, "Negative values in data passed to",
                    estimator.fit, X, y)

@qinhanmin2014

qinhanmin2014 Aug 20, 2019

Member

A redundant blank line (a flake8 error which can't be detected in the PR).

@agramfort

agramfort Aug 20, 2019

Author Member

Sorry, I don't get what you mean here. My editor does not complain about these lines (but it does elsewhere in the file).

@qinhanmin2014

qinhanmin2014 Aug 20, 2019

Member

There are 3 blank lines between this function and the next one:
E303 too many blank lines (3)

@agramfort

agramfort Aug 20, 2019

Author Member

My bad... that's what happens when you have so many flake8 errors in the file that you stop paying attention to them :(

X = np.array([[-1., 1], [-1., 1]])
y = np.array([1, 2])
estimator = clone(estimator_orig)
assert_raises_regex(ValueError, "Negative values in data passed to",

@qinhanmin2014

qinhanmin2014 Aug 20, 2019

Member

We've decided to deprecate assert_raises_regex; see #14216.

@agramfort

agramfort Aug 20, 2019

Author Member

I would need to import pytest in a non-test file. Do we have such a context manager in the testing module?

@qinhanmin2014

qinhanmin2014 Aug 20, 2019

Member

I don't know, let's leave it as it is.
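For context on the question in this thread, a pytest-free context manager is straightforward to write with the standard library. The sketch below is hypothetical: `raises_regex` is not claimed to exist in scikit-learn's testing module, and it only illustrates what such a helper could look like.

```python
import re
from contextlib import contextmanager


@contextmanager
def raises_regex(expected_exception, pattern):
    """Fail unless the block raises expected_exception matching pattern."""
    try:
        yield
    except expected_exception as exc:
        # The expected exception type was raised: verify its message.
        if re.search(pattern, str(exc)) is None:
            raise AssertionError(
                "%r does not match pattern %r" % (str(exc), pattern)
            )
    else:
        # The block finished without raising anything.
        raise AssertionError("%s not raised" % expected_exception.__name__)
```

With such a helper, the call under review could be written as `with raises_regex(ValueError, "Negative values in data passed to"): estimator.fit(X, y)`.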

@qinhanmin2014 qinhanmin2014 merged commit 951c501 into scikit-learn:master Aug 20, 2019
17 checks passed
LGTM analysis: C/C++ No code changes detected
LGTM analysis: JavaScript No code changes detected
LGTM analysis: Python 1 new alert
ci/circleci: deploy Your tests passed on CircleCI!
ci/circleci: doc Your tests passed on CircleCI!
ci/circleci: doc-min-dependencies Your tests passed on CircleCI!
ci/circleci: lint Your tests passed on CircleCI!
codecov/patch 100% of diff hit (target 96.88%)
codecov/project 96.89% (+<.01%) compared to 7e53554
scikit-learn.scikit-learn Build #20190820.15 succeeded
scikit-learn.scikit-learn (Linux py35_conda_openblas) succeeded
scikit-learn.scikit-learn (Linux py35_ubuntu_atlas) succeeded
scikit-learn.scikit-learn (Linux pylatest_conda_mkl_pandas) succeeded
scikit-learn.scikit-learn (Linux32 py35_ubuntu_atlas_32bit) succeeded
scikit-learn.scikit-learn (Windows py35_pip_openblas_32bit) succeeded
scikit-learn.scikit-learn (Windows py37_conda_mkl) succeeded
scikit-learn.scikit-learn (macOS pylatest_conda_mkl) succeeded
@amueller

Member

amueller commented Aug 20, 2019

We should also do the inverse. Right now many tests use positive data for every estimator; they should use this tag instead.

@agramfort

Member Author

agramfort commented Aug 20, 2019

@amueller

Member

amueller commented Aug 20, 2019

We get the tags in most of them already, why would that be an issue? We check if they are multi-output only in each check I think.

@wdevazelhes

Contributor

wdevazelhes commented Aug 20, 2019

Not sure if that's what you are talking about, but if you want I can open a PR that uses an enforce_estimator_tags_X function to make X positive when the tags require it, just like the enforce_estimator_tags_y function introduced in #14095 (see this closed PR: #14685).

@amueller

Member

amueller commented Aug 20, 2019

@wdevazelhes yes, ideally that'll replace the existing helpers on X.

arpanchowdhry added a commit to arpanchowdhry/scikit-learn that referenced this pull request Aug 20, 2019
@wdevazelhes

Contributor

wdevazelhes commented Aug 21, 2019

> @wdevazelhes yes, ideally that'll replace the existing helpers on X.

Done, in #14705
