
Memory limit for scipy.sparse.linalg.spsolve with scikit-umfpack #8278

Closed
snyke7 opened this issue Jan 10, 2018 · 20 comments · Fixed by #11453
Labels: defect (A clear bug or issue that prevents SciPy from being installed or used as expected), scipy.sparse.linalg


snyke7 commented Jan 10, 2018

When using scipy.sparse.linalg.spsolve with scikit-umfpack as the solver, one sometimes hits a 32-bit memory limit.

This is because the 64-bit umfpack family solver is only called when the indices have dtype int64, which only happens when an index exceeds the int32 maximum. However, in my case (homework for a course), none of the indices exceeded this limit, yet more memory was needed than the 32-bit solver could address.

Relevant code snippet from SciPy internals (scipy/sparse/linalg/dsolve/linsolve.py, inside the spsolve function):

umf = umfpack.UmfpackContext(_get_umf_family(A))
x = umf.linsolve(umfpack.UMFPACK_A, A, b_vec, autoTranspose=True)

_get_umf_family(A), defined in the same file, selects the 'di' family for indices of dtype int32. Note that scikit-umfpack checks (in its _getIndx function) whether the index dtype of the provided sparse matrix matches the selected family, so it seems to me that the patch should be applied in SciPy, not in scikit-umfpack.

Furthermore, I found no way to change the dtype of the matrix indices. Looking at scipy/sparse/compressed.py, the parent class of csr/csc matrices, the function get_index_dtype is always called to determine the dtype of the indices. This function is declared in scipy/sparse/sputils.py and only returns int64 when there are values larger than the int32 maximum.
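
For illustration, the selection logic described above boils down to something like the following simplified sketch (not the exact SciPy source):

import numpy as np

def get_index_dtype(maxval=None):
    # Simplified: stay with int32 unless an index value would exceed its range.
    if maxval is not None and maxval > np.iinfo(np.int32).max:
        return np.int64
    return np.int32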

Reproducing code example:

import scipy.sparse as sp
import scipy.sparse.linalg
import numpy as np

# 3D Poisson matrix on an N x N x N grid, built as a Kronecker sum of 1D
# second-difference matrices; (N - 1)**3 = 250,047 unknowns for N = 2**6.
N = 2 ** 6
h = 1 / N
Ah1D = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(N - 1, N - 1)) / (h ** 2)
eyeN = sp.eye(N - 1)
A = sp.kron(eyeN, sp.kron(eyeN, Ah1D)) + sp.kron(eyeN, sp.kron(Ah1D, eyeN)) + sp.kron(Ah1D, sp.kron(eyeN, eyeN))
b = np.random.rand((N - 1) ** 3)
x = sp.linalg.spsolve(A, b)

When one changes the get_index_dtype function to always return np.int64, the code runs fine (though it uses about 6.6 GB of memory on my system).

Error message:

Traceback (most recent call last):
  File "issue.py", line 11, in <module>
    x = sp.linalg.spsolve(A, b)
  File "/somepath/python3.6/site-packages/scipy/sparse/linalg/dsolve/linsolve.py", line 176, in spsolve
    autoTranspose=True)
  File "/somepath/python3.6/site-packages/scikits/umfpack/umfpack.py", line 756, in linsolve
    self.numeric(mtx)
  File "/somepath/python3.6/site-packages/scikits/umfpack/umfpack.py", line 577, in numeric
    umfStatus[status]))
RuntimeError: <function umfpack_di_numeric at 0x7fef7224f158> failed with UMFPACK_ERROR_out_of_memory

Scipy/Numpy/Python version information:

SciPy 1.0.0, NumPy 1.14.0, Python 3.6.1
@ilayn added the defect and scipy.sparse.linalg labels on Feb 28, 2018
thirtywang added a commit to thirtywang/OpenPNM that referenced this issue Jul 2, 2018
fix Memory limit for scipy.sparse.linalg.spsolve with scikit-umfpack. 
When using umfpack to solve Ax=b (which gives roughly a 10x speedup on Ubuntu and macOS), int32 indices will hit a 32-bit memory limit. This is because the 64-bit umfpack family solver is only called when the indices have dtype int64 (see scipy/scipy#8278). An easy patch for this bug is changing the index dtype of A to np.int64.
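
(For reference, that user-level workaround amounts to roughly the following sketch, assuming A is a scipy.sparse CSC/CSR matrix and b the right-hand side:)

import numpy as np
import scipy.sparse.linalg

# Upcast the index arrays so that spsolve's _get_umf_family selects the
# 64-bit 'dl' UMFPACK family instead of the 32-bit 'di' one.
A.indices = A.indices.astype(np.int64)
A.indptr = A.indptr.astype(np.int64)
x = scipy.sparse.linalg.spsolve(A, b, use_umfpack=True)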
@ma-sadeghi

Does anybody know what the status of this issue is? Is it something that's going to be addressed soon?


mdhaber commented Jan 31, 2020

Hmm, this sounds like an easy fix. @ilayn, how do you think this should be addressed? Adding an option that lets the user force the use of 64-bit indices is an obvious choice, but is it the right one?


mdhaber commented Jan 31, 2020

@ma-sadeghi even if we do patch this now, it will be about 4-5 months before the next release. The original poster suggested a patch - you might try applying this yourself if you need a workaround soon.


mdhaber commented Jan 31, 2020

Upon closer reading, it seems that we would actually need to change the data type of the indices in the sparse matrix. If we were to just create an option in splu that overrides _get_umf_family and forces the use of dl instead of di, scikit-umfpack would raise an error because the dtype does not match the family.

@perimosocordiae Any thoughts on whether the CSC/CSR matrix classes can/should have a manual override for the index/indptr dtypes?


pv commented Feb 1, 2020

I think this is an spsolve issue --- if 64-bit indices are needed, the cast should be done there (not in csr_matrix). I'm not sure exactly when umfpack requires 64-bit indices --- if it's similar to full inversion, then the condition probably is prod(shape) > int32 max.


mdhaber commented Feb 1, 2020

Ok, so just cast indices and indptr to int64 automatically when that condition is met. I'll (try to) try that.
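
(For concreteness, the kind of conditional cast being discussed, as a simplified sketch; the actual change in gh-11453 may differ:)

import numpy as np

def _upcast_if_needed(A):
    # If UMFPACK may need 64-bit indices (approximated here by prod(shape)
    # exceeding the int32 range), upcast indices/indptr so that the 64-bit
    # 'dl' family is selected. Python ints make the check overflow-safe.
    if A.shape[0] * A.shape[1] > np.iinfo(np.int32).max:
        A.indices = A.indices.astype(np.int64)
        A.indptr = A.indptr.astype(np.int64)
    return A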

@ma-sadeghi

@pv I might be wrong, but I think umfpack only requires 64-bit indices when the number of unknowns is more than 2^32, which is ~4 billion and not very common, at least for everyday use. The problem is that because umfpack assumes 32-bit, it runs out of memory even when dealing with ~250k unknowns (see the code snippet posted at the top). This is probably an umfpack issue, but a workaround is to automatically change the dtype of A.indices and A.indptr to int64 in spsolve if umfpack is found.


mdhaber commented Feb 1, 2020

@ma-sadeghi Please take a look at #11453. Does that look like it will help?
I think @pv is probably right that it's not the number of unknowns that matters but the potential number of nonzeros in the factorized form of the matrix.

@ma-sadeghi

@mdhaber I manually modified my linsolve.py based on the changes in #11453, and the problem I used to have seems to be fixed. Thanks, and yes, I agree: it makes sense that the number of nonzeros might have caused the problem.

@john-mathew

@ma-sadeghi would it be possible for you to share your modified linsolve.py? I tried modifying it according to #11453 but the error persists


mdhaber commented Feb 24, 2020

@john-mathew are you able to provide your problem?

In your case, is np.prod(A.shape) > np.iinfo(np.int32).max true or false?

If True, does that mean you are still getting the error with 64-bit indptr and indices?

If False, can you tell me what np.prod(A.shape) and np.iinfo(np.int32).max are?


vdrhtc commented Apr 18, 2020

@ma-sadeghi

@john-mathew sorry for the late reply. It's been a while, so I had to reproduce the bug in a new virtualenv, where, to my surprise, the fix in #11453 didn't work anymore. I don't know why I thought it worked back then; maybe I was manually changing the dtype to int64 in my script prior to calling umfpack. Anyway, it seems we're in the same situation. Sorry for the long wait!

@mdhaber in my case, it's False, with the values -1901007231 and 2147483647 respectively.

@vdrhtc I also tried your version of linsolve.py but it didn't work either.


mdhaber commented May 7, 2020

@ma-sadeghi was the negative sign in your response accidental or was there an overflow?
Update: looks like it was an overflow.
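
(For reference, a minimal demonstration of this kind of int32 overflow, using hypothetical dimensions, plus an overflow-safe check:)

import numpy as np

m = n = 250_000                        # hypothetical dimensions
print(m * n)                           # Python ints never overflow: 62500000000
print(np.int32(m) * np.int32(n))       # wraps around to a negative int32 value
print(m * n > np.iinfo(np.int32).max)  # overflow-safe check: True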


mdhaber commented May 7, 2020

@ma-sadeghi @vdrhtc @john-mathew Would you please try the fix in gh-11453 again? I think there was an overflow issue.


alexbovet commented May 8, 2020

I have the same problem using scipy.sparse.linalg.expm.

In [18]: T = expm(A)
Traceback (most recent call last):

  File "<ipython-input-18-2360f9946a1a>", line 1, in <module>
    T = expm(A)

  File "/.../lib/python3.6/site-packages/scipy/sparse/linalg/matfuncs.py", line 595, in expm
    return _expm(A, use_exact_onenorm='auto')

  File "/.../lib/python3.6/site-packages/scipy/sparse/linalg/matfuncs.py", line 669, in _expm
    X = _solve_P_Q(U, V, structure=structure)

  File "/.../lib/python3.6/site-packages/scipy/sparse/linalg/matfuncs.py", line 703, in _solve_P_Q
    return spsolve(Q, P)

  File "/.../lib/python3.6/site-packages/scipy/sparse/linalg/dsolve/linsolve.py", line 205, in spsolve
    Afactsolve = factorized(A)

  File "/.../lib/python3.6/site-packages/scipy/sparse/linalg/dsolve/linsolve.py", line 463, in factorized
    umf.numeric(A)

  File "/.../lib/python3.6/site-packages/scikits/umfpack/umfpack.py", line 577, in numeric
    umfStatus[status]))

RuntimeError: <function umfpack_di_numeric at 0x7f1a4a4a4400> failed with UMFPACK_ERROR_out_of_memory

A is a scipy.sparse.csc.csc_matrix and A.data.dtype is float64.

In my case, np.prod(A.shape) > np.iinfo(np.int32).max is False but there is no overflow as it is a relatively small matrix: A.shape is (13168, 13168), np.prod(A.shape) is 173396224 and np.iinfo(np.int32).max is 2147483647.

So in this case, #11453 would not apply?

I am using Python 3.6.10, numpy 1.18.2 and scipy 1.4.1.


pv commented May 8, 2020

Ok, maybe we need to switch to 64-bit indices unconditionally. For small matrices this won't cost much, and for large matrices, as we see here, it seems mandatory... It's probably a question of umfpack's internal data structures, and we cannot really guess exactly under which conditions this happens.
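
(For illustration, a simplified sketch of what unconditional 64-bit selection might look like; hypothetical, not the exact SciPy code:)

import copy
import numpy as np

def _get_umf_family(A):
    # Always select the 64-bit UMFPACK family ('dl' for real, 'zl' for
    # complex) and return a shallow copy of A with int64 index arrays.
    family = {np.float64: 'd', np.complex128: 'z'}[A.dtype.type] + 'l'
    A64 = copy.copy(A)
    A64.indptr = np.asarray(A.indptr, dtype=np.int64)
    A64.indices = np.asarray(A.indices, dtype=np.int64)
    return family, A64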


mdhaber commented May 8, 2020

Thanks @alexbovet. Can you provide your matrix so we can use it as a test case?

@ma-sadeghi

@mdhaber Sorry for the late reply. I applied the changes you mentioned, and I confirm that it worked! Thanks!


alexbovet commented May 11, 2020

Here is the matrix in question: https://www.swisstransfer.com/d/75a17c4d-6e1c-49ac-9e63-0992a8a5c9b9 saved with scipy.sparse.save_npz

By setting the dtypes of the indices and indptr in linsolve.py to int64, the error disappears, but now my RAM fills up completely and the IPython kernel crashes... Therefore I cannot confirm that this solves the problem, but it seems like a good sign. :-)
