Skip to content

[WIP] Ridge fit_intercept with sparse X (issue #1389) #1560

Open
wants to merge 3 commits into from

3 participants

@vene
scikit-learn member
vene commented Jan 10, 2013

At the moment partly ignores fit_intercept if X is sparse: It sets the intercept to the mean of y but doesn't take into account X, because _center_data doesn't center sparse matrices.

This PR fits a (penalized) intercept in the sparse case by using add_dummy_feature.

TODO:

  • Default values as suggested by @mblondel
  • Intercept weight
  • Figure out why this doesn't work in the wide X case: either what I'm doing is wrong, or the test isn't appropriate?

See #1389

@vene
scikit-learn member
vene commented Jan 10, 2013

Ping @amueller and @mblondel

@mblondel
scikit-learn member

What do you mean by "wide X"?

@vene

@mblondel: this test passes when shape is (5, 3) but not when it's (3, 5) -- this is what I mean by wide X.

@mblondel
scikit-learn member

@vene When solver="auto" (the default), the solver used is not the same when X is dense and when X is sparse so you don't get the exact same results. Can you try to set the solver explicitly? (e.g. sparse_cg, which despite the name handle dense arrays too)

The unit tests are getting messy. It would be nice to break the tests into smaller units (not needed for this PR).

@vene
scikit-learn member
vene commented Jan 11, 2013

@mblondel: it doesn't fix it, and furthermore the results are the same, it's not that.

Does adding a column of ones change anything fundamental when the problem admits exact solutions?

@vene
scikit-learn member
vene commented Jul 25, 2013

@mblondel should I close this and let you take over? :bow:

@mblondel
scikit-learn member

I won't be able to tackle this issue this week unfortunately :-/
Feel free to finish it :-)

@MechCoder
scikit-learn member

@vene Can I try working on this, on a new branch, and send a separate PR if you do not mind?

@vene
scikit-learn member
vene commented Aug 11, 2014
@MechCoder
scikit-learn member

@vene @mblondel I do not really understand how we should center X (without breaking the sparsity or adding an extra column of ones for the intercept). Do we implement some sort of trick/hack after we get the coefficients?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.