Sparse matrices are limited to 2^32 non-zero elements (Trac #1307) #1833

Closed
scipy-gitbot opened this Issue Apr 25, 2013 · 26 comments


Original ticket http://projects.scipy.org/scipy/ticket/1307 on 2010-10-13 by trac user peb, assigned to @wnbell.

compressed.py in scipy.sparse creates its index arrays with dtype = intc. This limits the size of a sparse matrix to 2^31 - 1 non-zero elements, preventing the solution of very large matrices.

It is suggested that a keyword be added to allow these arrays to be created with other dtypes, such as uintc and uint64.
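
For readers who want to see where the limit comes from, here is a minimal sketch. It assumes intc is a signed 32-bit integer (true on most common platforms, but not guaranteed), so np.int32 is used as a stand-in. The non-zero count is a made-up figure for illustration.

```python
import numpy as np

# np.int32 stands in for np.intc here; this assumes a platform where
# a C int is 32 bits wide.
max_index = np.iinfo(np.int32).max  # largest value a signed 32-bit index can hold

# Hypothetical non-zero count for a very large matrix.
nnz = 3_000_000_000

# In CSR/CSC storage the last entry of indptr equals nnz, so nnz itself
# has to be representable in the index dtype.
fits = nnz <= max_index
print(max_index, fits)
```
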

trac user peb wrote on 2010-10-13

I take that back. The C code in sparsetools also needs to be modified.

@wnbell wrote on 2010-10-14

Extending sparsetools to 64-bit indices should be straightforward. All you need to do is tell SWIG to instantiate the template functions with 64-bit integer indices here [1]. A portable type (e.g. int64) should be used instead of long or long long since those vary in size from platform to platform.

The larger issues are (1) modifying the Python code to accept both int32 and int64, (2) ensuring that external libraries (ARPACK, SuperLU) continue to receive 32-bit integers (if that is what they require), and (3) silencing the complaints that arise when sparsetools compile times double :)
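
Point (2) could be handled with a checked downcast before calling into a 32-bit-only library. The following is only a sketch: `to_int32_indices` is a hypothetical helper name, not something in scipy, and it assumes the external routine wants plain int32 arrays.

```python
import numpy as np

def to_int32_indices(idx):
    """Hypothetical helper: return idx as int32, refusing to truncate."""
    idx = np.asarray(idx)
    limits = np.iinfo(np.int32)
    # Check the value range before casting so an overflow is reported
    # instead of silently wrapping around.
    if idx.size and (idx.max() > limits.max or idx.min() < limits.min):
        raise ValueError("index values do not fit in int32")
    return idx.astype(np.int32, copy=False)

indptr64 = np.array([0, 2, 5, 9], dtype=np.int64)
indptr32 = to_int32_indices(indptr64)  # safe: all values are small
```
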

[1] http://projects.scipy.org/scipy/browser/trunk/scipy/sparse/sparsetools/sparsetools.i#L167

Milestone changed to Unscheduled by @wnbell on 2010-10-14

@stefanv wrote on 2010-10-14

Erk!

trac user peb wrote on 2010-10-14

Thanks for the guidance on how to modify the code. We'll investigate the effects of changing the type to int64 since our array exceeds the current int32 limit.

trac user peb wrote on 2010-10-27

OK. We modified the sparsetools.i file to include int64 indexes as suggested and get the following SWIG error: "sparsetools.i:146: Warning(453): Can't apply (int64 *IN_ARRAY1). No typemaps are defined.". How do we correct this?

I have attached my version of the sparsetools.i file for you to comment on.

Notes:

  1. We only need the indexes to be 64-bit, so if the sparsetools.i file is overspecified, please let me know.
  2. We only need to generate large matrices by adding and multiplying them together. We do not need to solve them at this time.

Attachment added by trac user peb on 2011-02-18: bsr.py

Attachment added by trac user peb on 2011-02-18: compressed.py

Attachment added by trac user peb on 2011-02-18: construct.py

Attachment added by trac user peb on 2011-02-18: coo.py

Attachment added by trac user peb on 2011-02-18: csc.py

Attachment added by trac user peb on 2011-02-18: csr.py

Attachment added by trac user peb on 2011-02-18: dia.h

Attachment added by trac user peb on 2011-02-18: dia.py

Attachment added by trac user peb on 2011-02-18: dok.py

Attachment added by trac user peb on 2011-02-18: lil.py

Attachment added by trac user peb on 2011-02-18: sparsetools.i

trac user peb wrote on 2011-02-18

We have modified and tested the 64-bit indexing enhancement. The modified files are attached. The change defaults to 32-bit indexes and only uses 64-bit indexes when needed. We would like to get these changes into the scipy codebase, since patching scipy each time a new version comes along is a pain.
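
The "default to 32-bit, widen only when needed" rule might look roughly like this. This is an illustrative sketch, not the code in the attached files. `choose_index_dtype` is a hypothetical name, and `maxval` is assumed to be the largest value the index arrays must hold (for example, the larger of nnz and the matrix dimensions).

```python
import numpy as np

def choose_index_dtype(maxval):
    """Hypothetical helper: int32 when maxval fits, int64 otherwise."""
    if maxval > np.iinfo(np.int32).max:
        return np.int64
    return np.int32

small = choose_index_dtype(10_000)  # fits comfortably in 32 bits
large = choose_index_dtype(2**31)   # one past the int32 maximum
```

Keeping int32 as the default means existing matrices keep their current memory footprint, and only oversized ones pay for the wider indices.
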

@wnbell wrote on 2011-02-18

Thanks, I'll take a look at this over the weekend.

trac user peb wrote on 2011-03-02

Can we make some progress on this issue?

trac user ntq wrote on 2012-01-04

I would also be extremely interested in getting this change added to the codebase!

@pv wrote on 2012-05-28

Some integration work here: http://github.com/pv/scipy-work/commits/ticket/1307

More work is needed: there's no test coverage showing that the 64-bit stuff actually works (the 64-bitification actually breaks several existing tests).

@pv wrote on 2013-02-19

PR here: #442

Milestone changed to 0.13.0 by @pv on 2013-02-19

pv commented Apr 26, 2013

Pull request: gh-442

rgommers commented Feb 2, 2014

Implemented in gh-442.

@rgommers rgommers closed this Feb 2, 2014

@mr0re1 mr0re1 referenced this issue in scikit-learn/scikit-learn Jun 11, 2015

Closed

sklearn.nearestneaighbors.kneighbors() produces exception #2416
