Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

don't overwrite precompute=True in lassocv #14591

Merged
merged 6 commits into from Aug 22, 2019

Conversation

@amueller
Copy link
Member

amueller commented Aug 7, 2019

Fixes #11014, alternative to #11021

Unfortunately this can't be tested (in a way I can see) because LassoCV doesn't store the underlying model

Copy link
Member

jnothman left a comment

(could be tested with a mock)

@amueller

This comment has been minimized.

Copy link
Member Author

amueller commented Aug 8, 2019

Oh you'd mock the model attribute and check that it's passed to that correctly? how would that work?
don't think it's necessary here, but I'd love to learn from your mocking skills ;)

@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented Aug 8, 2019

@amueller

This comment has been minimized.

Copy link
Member Author

amueller commented Aug 9, 2019

cc @agramfort maybe?

model.precompute = False
precompute = getattr(self, "precompute", None)
if isinstance(precompute, str) and precompute == "auto":
model.precompute = False

This comment has been minimized.

Copy link
@rth

rth Aug 10, 2019

Member

Shouldn't there be an,

elif isinstance(precompute, bool):
    model.precompute = precompute

?
Say in Lasso the default for dense is precompute=False, which means that even if precompute=True is passed to LassoCV, it would still not be used. Or am I missing something?

This comment has been minimized.

Copy link
@agramfort

agramfort Aug 11, 2019

Member

I don't think it's the right fix. The _pre_fit function uses:

precompute = (n_samples > n_features)

I would introduce a private function

def _get_precompute(X, precompute):
      if isinstance(precompute, str) and precompute == "auto":
           n_samples, n_features = X.shape
           return (n_samples > n_features)
      else:
           return precompute

that gets called in both _pre_fit and here taking as input self.precompute

This comment has been minimized.

Copy link
@agramfort

agramfort Aug 11, 2019

Member

and I yes I agree we should use the same strategy for the path and the single fit at the end. Otherwise it's dangerous as seen with the issue reported.

now this being said it's quite uncommon to use a Lasso with such n_samples >> n_features @Meta95
that's why i guess the problem was never reported.

This comment has been minimized.

Copy link
@amueller

amueller Aug 11, 2019

Author Member

@rth model.precompute was already set above to precompute as part of common_params.
@agramfort In Lasso, we removed the "auto" function because it was not helpful according to #3249. So I assume for a single fit setting it to False is a better choice than using the heuristic here. That would mean that if it's set to "auto", the path algorithm might use precompute=True while the last fit doesn't. But I think that makes sense, given that the path algorithm reuses results while the single fit doesn't.
That's purely based on the discussion in #3249 though.

This comment has been minimized.

Copy link
@agramfort

agramfort Aug 16, 2019

Member

ok you're right. it's the right fix as auto was removed from Lasso. Now I realize that 'auto' could/should have been kept with a rule like (n_samples > 3 * n_features) for example (3 or more based on a bench). Anyhow +1 for MRG as it is. thx @amueller for taking a stab at this.

TST Adds test to precompute
super().fit(X, y)
assert self.precompute == inner_precompute

monkeypatch.setattr("sklearn.linear_model.coordinate_descent.Lasso",

This comment has been minimized.

Copy link
@jnothman

jnothman Aug 13, 2019

Member

You humoured me!

However this test can pass if LassoCV does not use Lasso. So you need to assert (or believe coverage) that the mock is used.

This comment has been minimized.

Copy link
@amueller

amueller Aug 13, 2019

Author Member

@thomasjpfan did ;)

This comment has been minimized.

Copy link
@amueller

amueller Aug 13, 2019

Author Member

Surprisingly monkeypatch doesn't seem to keep track? I could probably make it inherit from mock or something but this ugly solution seems to work?

Copy link
Member

jnothman left a comment

What's new?

@amueller

This comment has been minimized.

Copy link
Member Author

amueller commented Aug 14, 2019

@jnothman will add tomorrow. I would love to hear back from @agramfort about whether he agrees with the current behavior.

@amueller

This comment has been minimized.

Copy link
Member Author

amueller commented Aug 21, 2019

merge?

@agramfort agramfort merged commit 17786ae into scikit-learn:master Aug 22, 2019
17 checks passed
17 checks passed
LGTM analysis: C/C++ No code changes detected
Details
LGTM analysis: JavaScript No code changes detected
Details
LGTM analysis: Python 7 new alerts
Details
ci/circleci: deploy Your tests passed on CircleCI!
Details
ci/circleci: doc Your tests passed on CircleCI!
Details
ci/circleci: doc-min-dependencies Your tests passed on CircleCI!
Details
ci/circleci: lint Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 96.88%)
Details
codecov/project 96.88% (+<.01%) compared to 71c3afb
Details
scikit-learn.scikit-learn Build #20190815.24 succeeded
Details
scikit-learn.scikit-learn (Linux py35_conda_openblas) Linux py35_conda_openblas succeeded
Details
scikit-learn.scikit-learn (Linux py35_ubuntu_atlas) Linux py35_ubuntu_atlas succeeded
Details
scikit-learn.scikit-learn (Linux pylatest_conda_mkl_pandas) Linux pylatest_conda_mkl_pandas succeeded
Details
scikit-learn.scikit-learn (Linux32 py35_ubuntu_atlas_32bit) Linux32 py35_ubuntu_atlas_32bit succeeded
Details
scikit-learn.scikit-learn (Windows py35_pip_openblas_32bit) Windows py35_pip_openblas_32bit succeeded
Details
scikit-learn.scikit-learn (Windows py37_conda_mkl) Windows py37_conda_mkl succeeded
Details
scikit-learn.scikit-learn (macOS pylatest_conda_mkl) macOS pylatest_conda_mkl succeeded
Details
@agramfort

This comment has been minimized.

Copy link
Member

agramfort commented Aug 22, 2019

thx @amueller

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.