Support large sparse matrices in SGD* and SequentialDataset #11355

jnothman · 2018-06-25T23:51:01Z

Support for sparse matrices whose indices, indptr, row or col attributes have int64 dtype was recently added or confirmed for most estimators that accept sparse input.

SGDClassifier and SGDRegressor do not yet support such large sparse matrices (they use accept_large_sparse=False). We could try to fix this.
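For context, a rough sketch of when a CSR matrix ends up "large" in this sense. The helper name `needs_int64_indices` and the rule it applies are illustrative assumptions, a simplification rather than SciPy's or scikit-learn's actual logic:

```python
INT32_MAX = 2**31 - 1


def needs_int64_indices(shape, nnz):
    """Return True when CSR index arrays cannot be stored as int32.

    Hypothetical helper: a simplified rule, not the library's own check.
    """
    n_rows, n_cols = shape
    # indptr holds offsets into the data array, so its largest value is nnz.
    # indices holds column ids, so its largest value is n_cols - 1.
    return nnz > INT32_MAX or (n_cols - 1) > INT32_MAX


# A small matrix fits in int32; more than 2**31 - 1 stored values does not.
small = needs_int64_indices((1000, 1000), nnz=50_000)
huge = needs_int64_indices((10**7, 10**6), nnz=3 * 10**9)
```

Matrices for which this returns True are the ones an estimator with accept_large_sparse=False would be unable to take.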


TomDLT · 2018-07-04T15:59:01Z

Supporting int64 indices in sparse SGD mainly comes down to updating CSRDataset.
As Cython fused types do not work with class attributes, this will need to use the same template workaround as in #11155, which might become quite heavy.
I wonder whether there is a cleaner way based on fused types, maybe by dropping SequentialDataset.
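To illustrate why a template workaround can get heavy, here is a minimal pure-Python sketch of the general idea: one class is generated per index width from a text template. The template, the generated class names `CSRDataset32`/`CSRDataset64`, and the use of `string.Template` are assumptions made for illustration; this is not the code from #11155.

```python
from string import Template

# Hypothetical class template: one full copy is generated per index width.
CLASS_TEMPLATE = Template('''
class CSRDataset$suffix:
    """Toy stand-in for a dataset class whose index attributes have a fixed width."""
    index_bits = $bits

    def __init__(self, indptr, indices, data):
        limit = 2 ** (self.index_bits - 1) - 1
        if any(i > limit for i in indptr) or any(i > limit for i in indices):
            raise ValueError("index does not fit in int$bits")
        self.indptr, self.indices, self.data = indptr, indices, data

    def row(self, i):
        # Return the column ids and values stored for row i.
        start, end = self.indptr[i], self.indptr[i + 1]
        return self.indices[start:end], self.data[start:end]
''')


def generate_variants():
    """Expand the template once per index width and return the classes."""
    namespace = {}
    for bits in (32, 64):
        source = CLASS_TEMPLATE.substitute(suffix=bits, bits=bits)
        exec(source, namespace)
    return namespace["CSRDataset32"], namespace["CSRDataset64"]


CSRDataset32, CSRDataset64 = generate_variants()

# Only the 64-bit variant can hold a column id beyond the int32 range.
dataset = CSRDataset64([0, 2, 3], [0, 2**40, 1], [1.0, 2.0, 3.0])
first_row = dataset.row(0)
```

Every class that touches the index arrays has to be duplicated this way, and combining index width with float width multiplies the number of variants.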

rth · 2019-03-26T22:26:27Z

> As Cython fused types do not work with class attributes, this will need to use the same template workaround as in #11155, which might become quite heavy.

That, or perhaps some void pointer arithmetic (cf. http://blog.yclin.me/deep/learning/2016/08/08/Fused-Types-Limitation/). Neither seems ideal.
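A loose Python analogue of the void pointer idea, using ctypes: the index buffer is passed around as an untyped pointer plus a width flag, and cast back to a typed pointer at the point of use. The function name `read_index` and the `is_64bit` flag are hypothetical, and real Cython code would look different.

```python
import ctypes


def read_index(buffer, is_64bit, i):
    """Read element i from an index buffer handled as an untyped pointer."""
    # Type information is erased here, as it would be for a void* attribute.
    void_ptr = ctypes.cast(buffer, ctypes.c_void_p)
    # The caller-supplied flag decides how the bytes are reinterpreted.
    item_type = ctypes.c_int64 if is_64bit else ctypes.c_int32
    typed_ptr = ctypes.cast(void_ptr, ctypes.POINTER(item_type))
    return typed_ptr[i]


indices32 = (ctypes.c_int32 * 3)(0, 5, 7)
indices64 = (ctypes.c_int64 * 3)(0, 5, 2**40)

last32 = read_index(indices32, False, 2)
last64 = read_index(indices64, True, 2)
```

Nothing checks that the flag matches the buffer, so a wrong flag silently misreads memory, which is part of why this route is unattractive.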

There is also an existing PR for this issue, #6889, and it was previously reported in #5776.

jnothman added the Enhancement, Moderate, and help wanted labels Jun 25, 2018

rth mentioned this issue Mar 26, 2019

ValueError: Buffer dtype mismatch, expected 'int' but got 'long' #13526

Closed

cmarmo mentioned this issue Aug 15, 2020

Adding accept_large_sparse flag to SGDRegressor #18090

Closed

cmarmo added the module:linear_model label Jan 16, 2022

ogrisel mentioned this issue Jun 16, 2022

[RFC] Support for int64 indexed SciPy sparse matrices in Cython code #23653

Open
