sgesvd_bufferSize int32 overflow with CUDA 10.1 #2351
Comments
@kmaehashi and I confirmed that |
In my opinion the first issue is on CUDA's side: the memory required to perform an SVD on tall matrices should be linear in the input size, and here it is clearly quadratic. I posted about this on the NVIDIA developer forum (https://devtalk.nvidia.com/default/topic/1057891/gpu-accelerated-libraries/cusolver-can-t-compute-svd-on-tall-matrices-with-cuda-10-1-buffersize-grows-quadratically/) but have not received an answer yet. The second issue is the integer overflow. If CUDA's |
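To put rough numbers on the linear-versus-quadratic point, here is a minimal sketch. It assumes, purely for illustration, that the reported workspace is either exactly `m` or exactly `m * m` elements for a matrix with `m` rows; the real cuSOLVER formula is not given in this thread, and both function names are hypothetical.

```python
import math

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit int can hold


def first_overflow_quadratic(limit=INT32_MAX):
    # Assumption: workspace == m * m elements.
    # Smallest m for which m * m no longer fits in the limit.
    return math.isqrt(limit) + 1


def first_overflow_linear(limit=INT32_MAX):
    # Assumption: workspace == m elements.
    # Smallest m that no longer fits in the limit.
    return limit + 1


print("quadratic growth overflows int32 from m =", first_overflow_quadratic())
print("linear growth overflows int32 from m =", first_overflow_linear())
```

Under these assumptions, quadratic growth hits the int32 limit at a few tens of thousands of rows, while linear growth would need about two billion rows.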
@anaruse do you have any insight on this issue?
Indeed, with previous version of CUDA
After doing more tests, I confirm the overflow is still present in CUDA 9.2, but you would need tremendous matrices to reach it. Here is the output on CUDA 9.2 with the same function:
>>> test((1<<31)-224-1)
2147483647
>>> test((1<<31)-224)
-2147483648
>>> test((1<<31)-1)
-2147483553
>>> test(1<<31)
nan
This suggests the issue is not in cupy at all, but wrong logic in CUDA 10's |
Please wait for a while. We are asking the library team about this issue.
Thanks again for the information.
@anaruse thanks a lot!
@anaruse, is this still an issue with the recent updates applied to CUDA 10.1? cc @pentschev (for awareness)
Yes, this is still an issue even with CUDA 10.1 Update 2.
@anaruse Is this fixed in CUDA 11.x?
This issue should have been fixed in CUDA 11.0.
After upgrading from CUDA 9.0 to CUDA 10.1, I noticed I'm not able to compute an SVD on big matrices because of an int32 overflow in sgesvd_bufferSize here: https://github.com/cupy/cupy/blob/master/cupy/linalg/decomposition.py#L257
When bufferSize is just above 2**31 = 2147483648, then sgesvd_bufferSize fails with CUSOLVER_STATUS_INVALID_VALUE, likely because the size is wrongly cast to a negative value. If you continue increasing, you can get positive values again, but wrong ones. See graph below.
This might be an issue with cuSOLVER itself, not cupy, but since I'm not familiar with testing CUDA without cupy I can't tell.
It might be related to #1365 as well, but I haven't tested on CUDA 9.1 or 10.0.
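For readers who want to see the suspected wraparound in isolation, here is a minimal pure-Python sketch. It assumes the true workspace size is simply reinterpreted as a two's-complement int32; `wrap_int32` is a hypothetical helper written for this illustration, not a cupy or cuSOLVER function, and the sample sizes are made up.

```python
def wrap_int32(n):
    """Reinterpret an arbitrary Python int as a two's-complement int32."""
    return (n + 2**31) % 2**32 - 2**31


# Hypothetical "true" sizes around the two boundaries of interest.
for true_size in (2**31 - 1, 2**31, 2**31 + 100, 2**32 + 5):
    print(true_size, "->", wrap_int32(true_size))
```

With this model, a size just past 2**31 comes back negative, which would match the CUSOLVER_STATUS_INVALID_VALUE failure, and a size past 2**32 comes back positive again but far too small, which would match the "positive but wrong" values described above.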