WIP: LinearClassifierMixin support for sparse coef_ #1702

Merged
merged 1 commit into scikit-learn:master on Feb 26, 2013

Conversation

@pprett (Member) commented Feb 22, 2013

In the case of high-dimensional feature spaces and/or a large number of classes, it may be wise to store the coef_ array of a linear model as a scipy sparse matrix rather than a dense numpy array.

We might even do this automagically, e.g. once the sparsity of coef_ reaches a certain level; in the meantime, however, I'd suggest leaving the user in charge of deciding whether to convert coef_ to a sparse representation.

This popped up on the mailing list recently: http://sourceforge.net/mailarchive/message.php?msg_id=30515870

This PR is just a minor modification of LinearModelMixin.decision_function that enables the use of sparse coef_.

NOTE: this PR is work in progress - no tests yet.
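
For illustration, here is a minimal sketch of what the change amounts to (an approximation, not the actual diff), relying on safe_sparse_dot from sklearn.utils.extmath:

```python
# Sketch only: a decision_function that works whether coef_ is a dense
# ndarray or a scipy sparse matrix. Not the exact code from this PR.
from sklearn.utils.extmath import safe_sparse_dot


def decision_function(clf, X):
    # safe_sparse_dot picks the right dot product for any dense/sparse
    # combination of X and coef_; dense_output=True guarantees an ndarray
    # result even when both operands are sparse, so the argmax/thresholding
    # done in predict() keeps working unchanged.
    scores = safe_sparse_dot(X, clf.coef_.T, dense_output=True) + clf.intercept_
    return scores.ravel() if scores.shape[1] == 1 else scores
```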

I suspected that adding dense_output=True might cause a performance regression, but this does not seem to be the case.

The following timings are on the test set of RCV1 ccat; shape: (23149, 47152):

Original (dense coef_):

In [7]: %timeit clf.predict(x)
10000 loops, best of 3: 51.6 us per loop

In [8]: %timeit clf.predict(X)
100 loops, best of 3: 6.55 ms per loop

This PR (dense coef_):

In [16]: %timeit clf.predict(x)
10000 loops, best of 3: 54 us per loop

In [15]: %timeit clf.predict(X)
100 loops, best of 3: 6.83 ms per loop

This PR (CSC coef_):

In [12]: %timeit clf.predict(x)
1000 loops, best of 3: 324 us per loop

In [13]: %timeit clf.predict(X)
10 loops, best of 3: 29.6 ms per loop

This PR (CSR coef_):

In [18]: %timeit clf.predict(x)
1000 loops, best of 3: 566 us per loop

In [19]: %timeit clf.predict(X)
10 loops, best of 3: 30.3 ms per loop
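
These timings could be reproduced roughly along the following lines in an IPython session; the estimator, the data loading, and the reading of x as a single row versus X as the full test matrix are assumptions on my part, not stated in the PR:

```python
# Rough reconstruction of the benchmark (assumptions noted above).
import scipy.sparse as sp

# clf: a fitted linear classifier; x: one test sample; X: the full test set
%timeit clf.predict(x)                # dense coef_
%timeit clf.predict(X)

clf.coef_ = sp.csc_matrix(clf.coef_)  # swap in a CSC representation
%timeit clf.predict(x)
%timeit clf.predict(X)

clf.coef_ = sp.csr_matrix(clf.coef_)  # and a CSR representation
%timeit clf.predict(x)
%timeit clf.predict(X)
```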

@larsmans (Member) commented

We've discussed the automagic conversion to CSR before -- the problem is that you can't further train with a sparse coef_ unless you first densify it again.

@ogrisel (Member) commented Feb 22, 2013

> We've discussed the automagic conversion to CSR before -- the problem is that you can't further train with a sparse coef_ unless you first densify it again.

I think this is fair.

Maybe we could have a SparseLinearModelPredictor that would not have a fit method but could be constructed from a fitted LinearModel subclass for the purpose of fast, memory-efficient prediction (e.g. to be deployed "in production").
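
A hypothetical sketch of that idea (the class name and constructor are made up here, not an existing scikit-learn API): a fit-less wrapper built from an already fitted linear classifier that keeps only a sparse copy of the coefficients:

```python
# Hypothetical prediction-only wrapper; not part of scikit-learn.
import scipy.sparse as sp
from sklearn.utils.extmath import safe_sparse_dot


class SparseLinearModelPredictor(object):
    """Holds a CSR copy of coef_ taken from a fitted linear classifier."""

    def __init__(self, fitted_clf):
        self.coef_ = sp.csr_matrix(fitted_clf.coef_)
        self.intercept_ = fitted_clf.intercept_
        self.classes_ = fitted_clf.classes_

    def decision_function(self, X):
        return safe_sparse_dot(X, self.coef_.T, dense_output=True) + self.intercept_

    def predict(self, X):
        scores = self.decision_function(X)
        if scores.shape[1] == 1:           # binary case
            indices = (scores.ravel() > 0).astype(int)
        else:                              # multiclass: pick the highest score
            indices = scores.argmax(axis=1)
        return self.classes_[indices]
```

Since such a wrapper has no fit method, the densification problem mentioned above never arises; the trade-off, as noted below, is a deviation from the usual estimator API.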

@larsmans (Member) commented

I was just contemplating such a solution, but it would mean a deviation from our usual API.

@larsmans (Member) commented

A different solution would be a method, say sparsify, on LinearClassifierMixin or LinearModel. That would cause fewer headaches with pipelines.

As an aside, guessing whether coef_ is sparse enough to warrant automatic conversion is likely to cause trouble if the planned 64-bit index support for scipy.sparse is implemented upstream.
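
A minimal sketch of what such a sparsify method could look like (its placement and exact behaviour are assumptions here), together with a densify counterpart so that further training remains possible, as discussed above:

```python
# Sketch only: sparsify/densify helpers on the linear-model mixin.
import scipy.sparse as sp


class LinearClassifierMixin(object):
    # ... decision_function / predict as modified in this PR ...

    def sparsify(self):
        """Convert coef_ to CSR for compact storage and prediction."""
        self.coef_ = sp.csr_matrix(self.coef_)
        return self  # return self so calls can be chained

    def densify(self):
        """Convert coef_ back to a dense ndarray before fitting again."""
        if sp.issparse(self.coef_):
            self.coef_ = self.coef_.toarray()
        return self
```

Compared with the wrapper sketched earlier, the estimator keeps its class, so it continues to work inside pipelines, which is the point about fewer headaches above.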

@ogrisel (Member) commented Feb 22, 2013

> A different solution would be a method, say sparsify, on LinearClassifierMixin or LinearModel. That would cause fewer headaches with pipelines.

I am ok with that option too.

@larsmans larsmans merged commit e6c85a3 into scikit-learn:master Feb 26, 2013