FIX: make LinearRegression perfectly consistent across sparse or dense #13279

Merged

Conversation

@agramfort
Member

@agramfort agramfort commented Feb 26, 2019

Because X is not centered when it is sparse, LinearRegression fitted on sparse input has never given exactly the same result as on dense input. This PR fixes that.

cc @amueller

@agramfort agramfort added this to In progress in Sprint Paris 2019 Feb 26, 2019
Contributor

@glemaitre glemaitre left a comment

You probably want to add an entry in what's new

clf_dense.fit(X, y)
clf_sparse.fit(Xcsr, y)
assert_almost_equal(clf_dense.intercept_, clf_sparse.intercept_)
assert_array_almost_equal(clf_dense.coef_, clf_sparse.coef_)


@glemaitre

glemaitre Feb 26, 2019
Contributor

Suggested change
- assert_array_almost_equal(clf_dense.coef_, clf_sparse.coef_)
+ assert_allclose(clf_dense.coef_, clf_sparse.coef_)
clf_sparse = LinearRegression(**params)
clf_dense.fit(X, y)
clf_sparse.fit(Xcsr, y)
assert_almost_equal(clf_dense.intercept_, clf_sparse.intercept_)


@glemaitre

glemaitre Feb 26, 2019
Contributor

Suggested change
- assert_almost_equal(clf_dense.intercept_, clf_sparse.intercept_)
+ assert clf_dense.intercept_ == pytest.approx(clf_sparse.intercept_)
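
The review does not say why `assert_allclose` is preferred. One plausible reason is that `assert_array_almost_equal` checks an absolute tolerance (about 1.5e-6 by default), whereas `assert_allclose` checks a relative one (`rtol=1e-7` by default), so it is stricter for small coefficients. A minimal sketch with made-up values (`coef_a`, `coef_b` are illustrative, not taken from the PR):

```python
import numpy as np
from numpy.testing import assert_allclose, assert_array_almost_equal

# Two hypothetical coefficient vectors; the second entries differ by a factor of 2.
coef_a = np.array([1.0, 1e-9])
coef_b = np.array([1.0, 2e-9])

# Passes: the absolute difference (1e-9) is below the default 1.5e-6 threshold.
assert_array_almost_equal(coef_a, coef_b)

# Fails: the relative difference on the second entry is far above rtol=1e-7.
try:
    assert_allclose(coef_a, coef_b)
    allclose_passed = True
except AssertionError:
    allclose_passed = False
```

Here `allclose_passed` ends up `False`, so the relative check catches a discrepancy that the decimal-based check lets through.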
@glemaitre glemaitre changed the title FIX : make LinearRegression perfectly consistent across sparse or dense FIX: make LinearRegression perfectly consistent across sparse or dense Feb 26, 2019
@glemaitre glemaitre self-requested a review Feb 26, 2019
Member

@jnothman jnothman left a comment

Otherwise LGTM

@@ -174,6 +174,10 @@ Support for Python 3.4 and below has been officially dropped.
parameter value ``copy_X=True`` in ``fit``.
:issue:`12972` by :user:`Lucio Fernandez-Arjona <luk-f-a>`

- |Fix| Fixed a bug in :class:`linear_model.LinearRegression` that
was not returning the same coefficients and intercepts with


@jnothman

jnothman Feb 26, 2019
Member

I think this is missing mention of sparse/dense

def matvec(b):
    return X.dot(b) - b.dot(X_offset_scale)
def rmatvec(b):
    return X.T.dot(b) - (X_offset_scale) * np.sum(b)


@jnothman

jnothman Feb 26, 2019
Member

redundant parentheses
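
For readers following the review, the two callbacks quoted above can be read as products with the column-centered matrix, computed without ever forming it. Writing $\mu$ for `X_offset_scale` and $\mathbf{1}$ for the all-ones vector of length `n_samples` (notation introduced here, not used in the PR):

```latex
\tilde{X} = X - \mathbf{1}\mu^{\top}, \qquad
\tilde{X}\, b = X b - \mathbf{1}\,(\mu^{\top} b), \qquad
\tilde{X}^{\top} b = X^{\top} b - \mu\,(\mathbf{1}^{\top} b)
```

In `matvec`, `b` has one entry per feature and `b.dot(X_offset_scale)` is the scalar $\mu^{\top} b$, broadcast over all samples. In `rmatvec`, `b` has one entry per sample and `np.sum(b)` is $\mathbf{1}^{\top} b$.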

sklearn/linear_model/base.py
@glemaitre
Contributor

@glemaitre glemaitre commented Feb 26, 2019

We should almost have a common test. Wrong PR.

@agramfort agramfort moved this from In progress to Needs review in Sprint Paris 2019 Feb 27, 2019

X_centered = sparse.linalg.LinearOperator(shape=X.shape,
                                          matvec=matvec,
                                          rmatvec=rmatvec)


@GaelVaroquaux

GaelVaroquaux Feb 27, 2019
Member

Very elegant!
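
The idea in the snippet above can be tried outside scikit-learn. The following is only a sketch under stated assumptions, not the code merged in this PR: it assumes no feature scaling (so the offset is simply the column mean), it solves with `scipy.sparse.linalg.lsqr` (the solver choice is an assumption here), and `fit_centered_sparse` is a hypothetical helper name. It compares the result against a dense least-squares fit on explicitly centered data.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import LinearOperator, lsqr


def fit_centered_sparse(X, y):
    """Least squares with an intercept on sparse X, without densifying X."""
    X_offset = np.asarray(X.mean(axis=0)).ravel()  # column means, shape (n_features,)
    y_offset = y.mean()

    def matvec(b):
        # (X - 1 X_offset^T) b, with b of length n_features
        return X.dot(b) - b.dot(X_offset)

    def rmatvec(b):
        # (X - 1 X_offset^T)^T b, with b of length n_samples
        return X.T.dot(b) - X_offset * np.sum(b)

    X_centered = LinearOperator(shape=X.shape, matvec=matvec,
                                rmatvec=rmatvec, dtype=X.dtype)
    # Tight tolerances so the iterative solution is comparable to a direct solve.
    coef = lsqr(X_centered, y - y_offset, atol=1e-12, btol=1e-12, iter_lim=100)[0]
    intercept = y_offset - X_offset.dot(coef)
    return coef, intercept


# Illustrative random data (not from the PR's test suite).
rng = np.random.RandomState(0)
X_dense = rng.randn(50, 5)
X_dense[X_dense < 0.5] = 0.0  # zero out most entries so CSR storage makes sense
y = rng.randn(50)

coef, intercept = fit_centered_sparse(sparse.csr_matrix(X_dense), y)

# Dense reference: center explicitly, then solve by ordinary least squares.
X_mean = X_dense.mean(axis=0)
coef_ref = np.linalg.lstsq(X_dense - X_mean, y - y.mean(), rcond=None)[0]
intercept_ref = y.mean() - X_mean.dot(coef_ref)

np.testing.assert_allclose(coef, coef_ref, rtol=1e-6, atol=1e-8)
np.testing.assert_allclose(intercept, intercept_ref, rtol=1e-6, atol=1e-8)
```

The point of the operator is that the centered matrix, which would be dense, is never materialized: each product costs one sparse product plus a rank-one correction.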

Member

@GaelVaroquaux GaelVaroquaux left a comment

Beautiful solution. +1 for merge.

Merging.

@GaelVaroquaux GaelVaroquaux merged commit 66899ed into scikit-learn:master Feb 27, 2019
18 checks passed
LGTM analysis: C/C++ No code changes detected
LGTM analysis: JavaScript No code changes detected
LGTM analysis: Python No new or fixed alerts
ci/circleci: deploy Your tests passed on CircleCI!
ci/circleci: doc Your tests passed on CircleCI!
ci/circleci: doc-min-dependencies Your tests passed on CircleCI!
ci/circleci: lint Your tests passed on CircleCI!
codecov/patch 100% of diff hit (target 92.49%)
codecov/project 92.5% (+0.01%) compared to face9da
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
scikit-learn.scikit-learn Build #20190226.131 succeeded
scikit-learn.scikit-learn (Linux py35_conda_openblas) Linux py35_conda_openblas succeeded
scikit-learn.scikit-learn (Linux py35_np_atlas) Linux py35_np_atlas succeeded
scikit-learn.scikit-learn (Linux pylatest_conda) Linux pylatest_conda succeeded
scikit-learn.scikit-learn (Windows py35) Windows py35 succeeded
scikit-learn.scikit-learn (Windows py37) Windows py37 succeeded
scikit-learn.scikit-learn (macOS pylatest_conda) macOS pylatest_conda succeeded
Sprint Paris 2019 automation moved this from Needs review to Done Feb 27, 2019
wdevazelhes added a commit to wdevazelhes/scikit-learn that referenced this pull request Feb 27, 2019
…e the fit_intercept=False that should not be needed since scikit-learn#13279 is merged
@jnothman
Member

@jnothman jnothman commented Feb 28, 2019

xhlulu added a commit to xhlulu/scikit-learn that referenced this pull request Apr 28, 2019
scikit-learn#13279)

* FIX : make LinearRegression perfectly consistent across sparse or dense

* comments

* review
xhlulu added a commit to xhlulu/scikit-learn that referenced this pull request Apr 28, 2019
xhlulu added a commit to xhlulu/scikit-learn that referenced this pull request Apr 28, 2019
koenvandevelde added a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019
scikit-learn#13279)

* FIX : make LinearRegression perfectly consistent across sparse or dense

* comments

* review
5 participants