Memory limit for scipy.sparse.linalg.spsolve with scikit-umfpack #8278
Fix for the memory limit in scipy.sparse.linalg.spsolve with scikit-umfpack: when using umfpack to solve Ax=b (which gives roughly a 10x speedup on Ubuntu and macOS), int32 indices hit a 32-bit memory limit. This is because the 64-bit umfpack family solver is only called when the indices have dtype int64 (see more: scipy/scipy#8278). An easy patch for this bug is changing the index dtype of A to np.int64.
Does anybody know what the status of this issue is? Is it something that's going to be addressed soon?
Hmm, this sounds like an easy fix. @ilayn how do you think this should be addressed? Adding an option that allows the user to force the use of 64-bit indices is an obvious choice, but is it the right one?
@ma-sadeghi even if we do patch this now, it will be about 4-5 months before the next release. The original poster suggested a patch - you might try applying this yourself if you need a workaround soon.
Upon closer reading, it seems that we would actually need to change the data type of the indices in the sparse matrix, rather than just create an option in spsolve. @perimosocordiae Any thoughts on whether the CSC/CSR matrix classes can/should have a manual override for the index dtype?
I think this is an spsolve issue --- if 64-bit indices are needed, the cast should be done there (not in csr_matrix). I'm not sure exactly when umfpack requires 64-bit indices --- if it's similar to full inversion, then the condition probably is …
Ok, so just cast …
@pv I might be wrong, but I think …
@ma-sadeghi Please take a look at #11453. Does that look like it will help?
@ma-sadeghi would it be possible for you to share your modified …
@john-mathew are you able to provide your problem? In your case, is …? If … If …
Helped me too, thank you.
@john-mathew Sorry for the late reply. It's been a while, so I had to reproduce the bug in a new virtualenv, and to my surprise the fix in #11453 didn't work anymore. I don't know why I thought it worked back then; maybe I was manually changing the …
@mdhaber In my case, it's …
@vdrhtc I also tried your version of …
@ma-sadeghi was the negative sign in your response accidental, or was there an overflow?
@ma-sadeghi @vdrhtc @john-mathew Would you please try the fix in gh-11453 again? I think there was an overflow issue.
I have the same problem using … In my case, … In this case, would #11453 not apply? I am using Python 3.6.10, numpy 1.18.2, and scipy 1.4.1.
Ok, maybe we need to switch to the 64-bit indices unconditionally. For small matrices this won't cost much, and for large matrices, as we see here, it seems mandatory... It may be a question of umfpack's internal data structures, and we cannot really guess exactly under which conditions this happens.
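The unconditional switch could be sketched as a small wrapper applied before handing the matrix to umfpack. This is a hypothetical helper, not scipy's actual internal code; it also illustrates the constructor pitfall described earlier in the thread:

```python
import numpy as np
import scipy.sparse as sp

def as_int64_csc(A):
    """Hypothetical helper: return A in CSC format with int64 index
    arrays, so the 64-bit ('dl'/'zl') umfpack families would be selected
    unconditionally. For small matrices the extra memory is negligible."""
    A = sp.csc_matrix(A)  # new object; may share data with the input
    # Assign the attributes directly: passing int64 arrays to the
    # csc_matrix constructor does not help, because get_index_dtype
    # downcasts them back to int32 whenever the values fit.
    A.indices = A.indices.astype(np.int64)
    A.indptr = A.indptr.astype(np.int64)
    return A

# Example: the returned matrix keeps 64-bit index arrays.
B = as_int64_csc(sp.random(50, 50, density=0.1, random_state=1))
```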
Thanks @alexbovet. Can you provide your matrix so we can use it as a test case?
@mdhaber Sorry for the late reply. I applied the changes you mentioned, and I confirm that it worked! Thanks!
Here is the matrix in question, saved with scipy.sparse.save_npz: https://www.swisstransfer.com/d/75a17c4d-6e1c-49ac-9e63-0992a8a5c9b9 By setting the dtypes of the indices and indptr in linsolve.py to int64, the error disappears, but now my RAM fills completely and the ipython kernel crashes... Therefore I cannot confirm that this solves the problem, but it seems like a good sign. :-)
When using `scipy.sparse.linalg.spsolve` with `scikit-umfpack` as the solver, one sometimes hits a 32-bit memory limit. This is because the 64-bit umfpack family solver is only called when the indices have dtype int64, which only happens when there is an index exceeding the int32 maximum. However, in my case (homework for a course) none of the indices exceeded this, but more memory was needed.

Relevant code snippets of scipy internals: `scipy/sparse/linalg/dsolve/linsolve.py`, inside the `spsolve` function. `_get_umf_family(A)`, in the same file, selects 'di' for indices of dtype int32. Note that `scikit-umfpack` checks whether the type of the provided sparse matrix corresponds to the selected family in the function `_getIndx`, so it seems to me that a patch should be applied in `scipy`, not in `scikit-umfpack`.

Furthermore, I saw no way to change the dtype of the indices of the matrix. Looking at `scipy/sparse/compressed.py`, which contains the parent class of csr/csc matrices, the function `get_index_dtype` is always called to determine the dtype of the indices. This function is declared in `scipy/sparse/sputils.py` and only returns int64 when there are values larger than the int32 maximum.

Reproducing code example:
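The reproducing snippet itself was not captured in this copy of the thread. A minimal sketch of the index-dtype behaviour being described (not the original code) is:

```python
import numpy as np
import scipy.sparse as sp

# scipy picks 32-bit indices whenever all index values fit in int32;
# get_index_dtype looks only at the values, not at how much workspace
# the later factorization will need.
A = sp.random(2000, 2000, density=0.005, format="csc", random_state=1)
print(A.indices.dtype)  # int32 on standard scipy builds

# On the scipy versions discussed in this thread, rebuilding the matrix
# with int64 arrays does not stick either: the csc_matrix constructor
# calls get_index_dtype(check_contents=True), which downcasts the index
# arrays to int32 again when the values fit (this is the "no way to
# change the dtype" observation above).
B = sp.csc_matrix(
    (A.data, A.indices.astype(np.int64), A.indptr.astype(np.int64)),
    shape=A.shape)
print(B.indices.dtype)
```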
When one changes the get_index_dtype function to always return np.int64, the code runs fine (however, it then uses about 6.6 GB of memory on my system).
Error message:
Scipy/Numpy/Python version information: