numpy.dot out of memory when multiplying big matrix #4062
Comments
Got to ask: are you sure you didn't transpose the wrong argument, making the result 1,000,000 x 1,000,000 instead of 2000 x 2000?
Ah, sorry, there is a point there. If that is not the problem, maybe check the BLAS you are linking against; some OpenBLAS versions are apparently known to sometimes misbehave for large arrays.
This would explain the situation: the matrix A does indeed use over half of my memory. I would still put it on the wishlist, though. np.dot should divide a problem into subproblems if it won't be able to deal with it at once. This should be easy to fix. Thanks!
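A user-level version of that wishlist item is easy to sketch: accumulate the product over column blocks so that only one small slice is materialized at a time. (The function name `chunked_gram` and the default chunk size here are illustrative choices, not NumPy API.)

```python
import numpy as np

def chunked_gram(A, chunk=10000):
    """Accumulate np.dot(A, A.T) over column blocks of A, so only one
    (rows, chunk) slice is in flight at a time instead of whatever large
    temporary a single dot call might allocate internally."""
    B = np.zeros((A.shape[0], A.shape[0]))
    for start in range(0, A.shape[1], chunk):
        # Copy the strided column slice into contiguous memory; this is
        # a small, controlled copy rather than one of the whole array.
        M = np.ascontiguousarray(A[:, start:start + chunk])
        B += np.dot(M, M.T)
    return B
```

Trading one huge temporary for many small ones bounds the extra peak memory at roughly `rows * chunk * 8` bytes beyond the result itself.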
Hi, NumPy doesn't do the minimum amount of copying. For example, if A is not C-contiguous, the current version of NumPy will copy it.
I'm using 1.7.1, but unfortunately I'm not allowed to upgrade to 1.8 on this machine. Still, it's good to see this should be better in 1.8. I would like to see NumPy better prepared for big data. I can work out a solution on my own, but I expect I'm not the only one running into this kind of trouble.
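The contiguity issue mentioned above is easy to inspect: transposing a 2-D array produces a non-C-contiguous view, and `np.ascontiguousarray` makes the resulting copy explicit and under your control. A small illustration (not tied to any particular NumPy version):

```python
import numpy as np

A = np.ones((1000, 500))
At = A.T  # a view: no data copied, but no longer C-contiguous

print(A.flags['C_CONTIGUOUS'])    # True
print(At.flags['C_CONTIGUOUS'])   # False

# Copying to C order up front means one deliberate copy, instead of
# letting dot() make an internal copy at an inconvenient moment.
At_c = np.ascontiguousarray(At)
print(At_c.flags['C_CONTIGUOUS'])  # True
```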
Yeah, 1.8 has fixed this exact thing.
#3916 might solve this problem.
No, that PR does not prevent the large copy, but the chunking could be moved to a higher level to make smaller copies.
@seberg Sounds like this is fixed in 1.8. True? |
I think we still make copies in cases where, in principle, BLAS could work without one. But yes, the specific operation won't copy since 1.8.
I'll close this then. If dot can still be improved in that respect, we might open a task issue for that.
I am getting a memory crash error while executing the following code on a Linux server: d is a matrix of size 1024x50625, and I need to find dTd = np.matmul(d.T, d).
You need about 20 GB for that, just for the result. How much memory do you have?
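For reference, that figure comes straight from the result's shape: with d of size 1024x50625, np.matmul(d.T, d) produces a 50625 x 50625 array of float64 values, 8 bytes each:

```python
n = 50625                    # d has 50625 columns, so d.T @ d is n x n
bytes_needed = n * n * 8     # 8 bytes per float64 entry
print(bytes_needed)          # 20503125000 bytes, about 20.5 GB (~19.1 GiB)
```

And that is only the output; the inputs and any internal temporaries come on top of it.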
I did the implementation on Google Colab.
I have a 2000 by 1,000,000 matrix A and want to calculate the 2000 by 2000 matrix

    B = numpy.dot(A, A.T)

but numpy just eats up all my memory, slows down my whole computer, and crashes after a couple of hours. I then rewrote the matrix multiplication as

    B = numpy.zeros((2000, 2000))
    A.shape = (2000, 10000, 100)
    for M in numpy.rollaxis(A, 2):
        B += numpy.dot(M, M.T)

and it runs fine in a couple of minutes.
I don't see why numpy needs so much memory for a matrix multiplication. But at the very least it should not crash on that problem.
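The rewrite works because B = A·Aᵀ decomposes into a sum of block products: reshaping splits the columns into groups, and summing numpy.dot(M, M.T) over the groups reproduces the full product. A quick check of that identity at a reduced size (20 x 1000 standing in for the 2000 x 1,000,000 case):

```python
import numpy as np

A = np.random.rand(20, 1000)      # small stand-in for the real matrix
B_direct = np.dot(A, A.T)

B = np.zeros((20, 20))
A3 = A.reshape(20, 10, 100)       # split 1000 columns into 10 x 100 blocks
for M in np.rollaxis(A3, 2):      # 100 iterations, each M is 20 x 10
    B += np.dot(M, M.T)

assert np.allclose(B, B_direct)   # the blocked sum equals the full product
```

Each column of A lands in exactly one block M, and an entry of A·Aᵀ is a sum over all columns, so accumulating the per-block products recovers it exactly.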