CUSPARSE is broken in CUDA.jl 1.2 #322

Closed
michel2323 opened this issue Jul 26, 2020 · 20 comments · Fixed by #331
Labels
cuda array Stuff about CuArray. regression Something that used to work, doesn't anymore.

Comments

@michel2323

michel2323 commented Jul 26, 2020

*, mul! and possibly other operations are broken with ::CuSparseMatrixCSR

using CUDA
using CUDA.CUSPARSE
using SparseArrays
using LinearAlgebra
A = SparseMatrixCSC{Float64,Int64}(I,1000,1000)
cuA = CuSparseMatrixCSR(A)
v = CuVector{Float64}(ones(1000))
typeof(cuA * v)
w = similar(v)
mul!(w, cuA, v)

typeof(cuA * v) returns Nothing and mul!(w, cuA, v) leaves w filled with zeros on v1.2, whereas v1.1 returns the correct vector. Maybe it's linked to #259. I tested with CUDA 10.2 and CUDA 11.

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Details on CUDA:

CUDA toolkit 10.2.89, artifact installation
CUDA driver 10.2.0
NVIDIA driver 440.100.0

Libraries:
- CUBLAS: 10.2.2
- CURAND: 10.1.2
- CUFFT: 10.1.2
- CUSOLVER: 10.3.0
- CUSPARSE: 10.3.1
- CUPTI: 12.0.0
- NVML: 10.0.0+440.100
- CUDNN: 7.6.5 (for CUDA 10.2.0)
- CUTENSOR: 1.1.0 (for CUDA 10.2.0)

Toolchain:
- Julia: 1.4.1
- LLVM: 8.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3
- Device support: sm_30, sm_32, sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75
@michel2323 michel2323 added the bug Something isn't working label Jul 26, 2020
@maleadt maleadt added the cuda array Stuff about CuArray. label Jul 27, 2020
@maleadt
Member

maleadt commented Jul 27, 2020

Can't reproduce on master/1.2 with Julia 1.5 or 1.4. Can you verify in a clean environment?

@maleadt maleadt added the needs information Further information is requested label Jul 27, 2020
@michel2323
Author

I'm so sorry @maleadt, obviously cuA and not A should be used. This was lost in copy-pasting. I fixed the typo. I also tried with a new environment and Julia depot.

@maximilian-gelbrecht

maximilian-gelbrecht commented Jul 27, 2020

can confirm that this is broken and it includes CuSparseMatrixCSC as well

using CUDA 
using CUDA.CUSPARSE
using SparseArrays

a = rand(10,10); a[a.<0.9] .= 0
a = CuSparseMatrixCSC(sparse(a));
b = CUDA.rand(10); 

c=a*b

results in c==nothing for me.

This is with CUDA.jl 1.2, CUDA 9.2, Julia 1.4.2. The Julia environment was fresh, but I installed all the packages I needed for my current project (Flux + DiffEqFlux + some small utilities) alongside CUDA.jl. I use the pre-installed CUDA version on the HPC, as the compute nodes have no internet connection for downloading artifacts.

@michel2323
Author

Looks a lot like #256

@maleadt maleadt removed the needs information Further information is requested label Jul 28, 2020
@maleadt
Member

maleadt commented Jul 28, 2020

I don't think this has ever worked. It returned nothing because of badly-attached docstrings, but the underlying cause is that you're performing a mixed Float64 x Float32 multiplication, which CUSPARSE doesn't support. Furthermore, CUSPARSE seems to only support matrix-vector multiplication on the BCSR format, so this works:

julia> a = rand(Float32, 10,10); a[a.<0.9] .= 0;

julia> a = CuSparseMatrixCSR(sparse(a));

julia> a = CUSPARSE.switch2bsr(a,convert(Cint,5))

julia> b = CUDA.rand(10);

julia> a * b
10-element CuArray{Float32,1}:
 1.3360068
 0.0
 0.4224707
 0.0
 0.0
 1.2026718
 0.0
 1.1956103
 0.29753116
 0.0

I agree that this isn't particularly user-friendly though.

@maximilian-gelbrecht

maximilian-gelbrecht commented Jul 28, 2020

Oh, you are right. It seems I reduced my problem to a wrong, hastily written minimal example. In this case it really was the Float32/Float64 mix-up. (A side note: would it be possible to emit a warning when nothing is returned because of this? The first time I encountered it, it took me a really long time to spot the source of the mistake.) However, I do also encounter an error when multiplying a CuSparseMatrixCSC and a CuArray with CUDA.jl that wasn't present with CuArrays.jl before. I will look into it to find a proper MWE.
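
A minimal sketch of the kind of guard asked for in the side note, assuming a hypothetical helper `checked_mul` (it is not part of CUDA.jl). It runs on the CPU with plain `SparseArrays` so it can be tried without a GPU; the same check should carry over to `CuSparseMatrixCSC` and `CuArray` operands, but that is untested here.

```julia
using SparseArrays

# Hypothetical helper for illustration only; CUDA.jl does not provide it.
# Fails loudly on mismatched element types or on a `nothing` result.
function checked_mul(A::AbstractMatrix, b::AbstractVecOrMat)
    if eltype(A) !== eltype(b)
        throw(ArgumentError("element types differ: $(eltype(A)) vs $(eltype(b))"))
    end
    c = A * b
    c === nothing && error("multiplication returned nothing")
    return c
end

a = rand(Float32, 10, 10); a[a .< 0.9] .= 0
A = sparse(a)                            # Float32 sparse matrix
c = checked_mul(A, rand(Float32, 10))    # matching types: returns a vector

mismatch_caught = try
    checked_mul(A, rand(Float64, 10))    # mismatched types: throws
    false
catch err
    err isa ArgumentError
end
```

Throwing rather than warning is a design choice in this sketch: a silent `nothing` propagates into later code, whereas an exception stops at the call that caused it.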

@maximilian-gelbrecht

maximilian-gelbrecht commented Jul 28, 2020

@maleadt : From where do you take:

Furthermore CUSPARSE seems to only support matrix-vector multiplication on the BCSR format [...]

I roughly observed the following behaviour when porting my current project from CuArrays.jl (v.1.7.3) to CUDA.jl (v1.2) (just replacing CuArrays with CUDA): with CuArrays.jl CuSparseMatrixCSC * CuArray was working and with CUDA.jl it is not.

I also do not want to derail from the problem that @michel2323 has posted, but I think this might be very similar if not the same issue.

edit:

MWE:

Julia 1.3, CuArrays.jl (v.1.7.3), CUDA 9.2 (pre-installed on the HPC I am using)

using CuArrays 
using CuArrays.CUSPARSE
using SparseArrays

a = rand(Float32, 10,10); a[a.<0.9] .= 0
a = CuSparseMatrixCSC(sparse(a));
b = rand(Float32,10); 
b = CuArray(b)

a*b # == something 

Julia 1.4.2, CUDA.jl (1.2), CUDA 9.2 (pre-installed on the HPC I am using)

using CUDA 
using CUDA.CUSPARSE
using SparseArrays

a = rand(Float32, 10,10); a[a.<0.9] .= 0
a = CuSparseMatrixCSC(sparse(a));
b = rand(Float32,10); 
b = CuArray(b)

a*b # == nothing 

@maximilian-gelbrecht

maximilian-gelbrecht commented Jul 28, 2020

@maleadt why is this closed? I don't think it is fixed. Personally, I have code that worked just fine with the old versions and no longer does, and @michel2323's problem doesn't seem to be solved either.

care to explain it?

@maleadt
Member

maleadt commented Jul 28, 2020

Sorry, I didn't see your comment with additional code before closing it.

@maleadt
Member

maleadt commented Jul 29, 2020

Found the culprit: cusparseScsrmv was removed with CUDA 11 (see #291); we should use cusparseSpMV instead.
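
For reference, a CPU sketch of the operation at stake, assuming the usual CSR mat-vec semantics y = alpha*A*x + beta*y that both the removed routine and its replacement are documented to compute. `csr_mv!` is a hypothetical name used only here; the actual GPU call is not shown because the thread does not give the wrapper signature.

```julia
using SparseArrays

# CPU reference for the CSR mat-vec y = alpha*A*x + beta*y.
# `rowptr`, `colval`, `nzval` are the usual CSR arrays (1-based here).
function csr_mv!(y, rowptr, colval, nzval, x, alpha, beta)
    for i in eachindex(y)
        acc = zero(eltype(y))
        for k in rowptr[i]:(rowptr[i + 1] - 1)
            acc += nzval[k] * x[colval[k]]
        end
        y[i] = alpha * acc + beta * y[i]
    end
    return y
end

a = rand(Float32, 10, 10); a[a .< 0.9] .= 0
A = sparse(a)
# The CSC storage of transpose(A) is exactly the CSR storage of A.
At = copy(transpose(A))
x = rand(Float32, 10)
y = csr_mv!(zeros(Float32, 10), At.colptr, At.rowval, At.nzval, x, 1f0, 0f0)
```

Comparing `y` against `A * x` on the CPU gives a quick sanity check for any GPU result once the replacement call is wired in.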

@maleadt
Member

maleadt commented Jul 29, 2020

Although I plan to look at this, I don't think I'll have it block the next release (since it's already broken, and there are other issues that are already fixed on the master branch).

@maleadt maleadt added regression Something that used to work, doesn't anymore. and removed bug Something isn't working labels Jul 31, 2020
@nirmal-suthar

nirmal-suthar commented Aug 2, 2020

Julia 1.4.2 CUDA v1.2.1 CUDA Version: 10.2

using CUDA 
using CUDA.CUSPARSE
using SparseArrays

a = rand(Float32, 10,10); a[a.<0.9] .= 0
a = CuSparseMatrixCSC(sparse(a));
b = rand(Float32,10,3); 
b = CuArray(b)

a*b 
# ERROR: UndefVarError: cusparseScsrmm2 not defined

While CuSparseMatrixCSC{Float64} works

using CUDA 
using CUDA.CUSPARSE
using SparseArrays

a = rand(Float64, 10,10); a[a.<0.9] .= 0
a = CuSparseMatrixCSC(sparse(a));
b = rand(Float32,10,3); 
b = CuArray(b)

a*b 
# 10×3 CuArray{Float64,2}:

@simonbyrne
Contributor

simonbyrne commented Aug 5, 2020

Another issue is that sparse matrix * matrix works, but sparse matrix * vector does not:

using CUDA 
using CUDA.CUSPARSE
using SparseArrays

a = rand(Float64, 10,10); a[a.<0.9] .= 0
a = CuSparseMatrixCSC(sparse(a));
b = rand(Float32,10,3); 
b = CuArray(b)

a*b  # works
a*b[:,1] # returns nothing

@maleadt
Member

maleadt commented Aug 19, 2020

With #351, some cases work again already but CSC support is still missing.

@nmheim

nmheim commented Aug 25, 2020

First, thank you for CUDA.jl! It's really great and is making my life a lot easier! :)

Although #351 looks like it fixed this issue, if I run the script above

using CUDA 
using CUDA.CUSPARSE
using SparseArrays

a = rand(Float64, 10,10); a[a.<0.9] .= 0
a = CuSparseMatrixCSC(sparse(a));
b = rand(Float32,10,3); 
b = CuArray(b)

a*b  # causes scalar operation warning
a*b[:,1] # errors

I am getting:

┌ Warning: Performing scalar operations on GPU arrays: This is very slow, consider disallowing these operations with `allowscalar(false)`
└ @ GPUArrays ~/.julia/packages/GPUArrays/eVYIC/src/host/indexing.jl:43
ERROR: LoadError: MethodError: no method matching mv!(::Char, ::Float64, ::CuSparseMatrixCSC{Float64}, ::CuArray{Float32,1}, ::Float64, ::CuArray{Float64,1}, ::Char)
Closest candidates are:
  mv!(::Char, ::Float64, ::CuSparseMatrixBSR{Float64}, ::CuArray{Float64,1}, ::Float64, ::CuArray{Float64,1}, ::Char) at /home/niklas/.julia/packages/CUDA/dZvbp/lib/cusparse/wrappers.jl:186
  mv!(::Char, ::Float32, ::CuSparseMatrixBSR{Float32}, ::CuArray{Float32,1}, ::Float32, ::CuArray{Float32,1}, ::Char) at /home/niklas/.julia/packages/CUDA/dZvbp/lib/cusparse/wrappers.jl:186
  mv!(::Char, ::Complex{Float64}, ::CuSparseMatrixBSR{Complex{Float64}}, ::CuArray{Complex{Float64},1}, ::Complex{Float64}, ::CuArray{Complex{Float64},1}, ::Char) at /home/niklas/.julia/packages/CUDA/dZvbp/lib/cusparse/wrappers.jl:186
  ...
Stacktrace:
 [1] mul!(::CuArray{Float64,1}, ::CuSparseMatrixCSC{Float64}, ::CuArray{Float32,1}) at /home/niklas/.julia/packages/CUDA/dZvbp/lib/cusparse/interfaces.jl:12
 [2] *(::CuSparseMatrixCSC{Float64}, ::CuArray{Float32,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:51
 [3] top-level scope at /home/niklas/repos/SpatialEchoStateNetwork/test.jl:11
 [4] include(::String) at ./client.jl:457
 [5] top-level scope at REPL[1]:1
in expression starting at /home/niklas/repos/SpatialEchoStateNetwork/test.jl:11

I am on CUDA.jl v1.3.3 and julia 1.5:

(SpatialEchoStateNetwork) pkg> st
Project SpatialEchoStateNetwork v0.1.0
Status `~/repos/SpatialEchoStateNetwork/Project.toml`
  [7d9fca2a] Arpack v0.4.0
  [052768ef] CUDA v1.3.3
  [d9f16b24] Functors v0.1.0
  [37e2e46d] LinearAlgebra
  [2f01184e] SparseArrays

@maleadt
Member

maleadt commented Aug 26, 2020

This has not been merged in 1.3.x, since it drops support for CUDA 10.1 and earlier.

@maleadt
Member

maleadt commented Sep 3, 2020

#409 should fix most of the issues reported here; please test. Do note that you need to match array types exactly: many of the scalar iterations reported here were due to combining e.g. a Float32 and a Float64 array. We could improve that, of course, but let's just get the missing functionality back first.
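
A minimal sketch of matching element types before multiplying, assuming a hypothetical helper `match_eltype` (not a CUDA.jl function). It is shown on the CPU with `SparseArrays` so it runs without a GPU; on the GPU the converted arrays would then be passed to `CuSparseMatrixCSC(...)` and `CuArray(...)` as in the examples in this thread.

```julia
using SparseArrays

# Hypothetical helper (not a CUDA.jl function): convert the dense operand to
# the sparse matrix's element type so both sides match exactly.
match_eltype(A::SparseMatrixCSC{T}, b::AbstractVecOrMat) where {T} =
    eltype(b) === T ? b : T.(b)

a = rand(Float64, 10, 10); a[a .< 0.9] .= 0
A = sparse(a)                 # Float64 sparse matrix
b = rand(Float32, 10, 3)      # mismatched on purpose, as in the reports here
b64 = match_eltype(A, b)      # converted copy with Float64 elements
c = A * b64
```

Converting on the host before upload keeps the conversion out of the GPU code path entirely, which avoids falling back to slow scalar iteration.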

@maleadt maleadt closed this as completed Sep 4, 2020
@maximilian-gelbrecht

I keep coming back to this, but I still have several problems with CUSPARSE when migrating from the old CuArrays.jl to CUDA.jl. Things that used to work don't work anymore.

With Julia 1.4.2 and CUDA.jl 1.3.3, I get for

using CUDA 
using CUDA.CUSPARSE
using SparseArrays
a = rand(Float64, 10,10); a[a.<0.9] .= 0
a = CuSparseMatrixCSC(sparse(a));
b = rand(Float64,10,3); 
b = CuArray(b);
a*b

the following error

UndefVarError: cusparseDcsrmm2 not defined

Stacktrace:
 [1] mm2!(::Char, ::Char, ::Float64, ::CuSparseMatrixCSC{Float64}, ::CuArray{Float64,2}, ::Float64, ::CuArray{Float64,2}, ::Char) at /home/maxgelbr/.julia/packages/CUDA/dZvbp/lib/cusparse/wrappers.jl:520
 [2] mul! at /home/maxgelbr/.julia/packages/CUDA/dZvbp/lib/cusparse/interfaces.jl:19 [inlined]
 [3] *(::CuSparseMatrixCSC{Float64}, ::CuArray{Float64,2}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/LinearAlgebra/src/matmul.jl:153
 [4] top-level scope at In[3]:1

There are some other problems as well but I haven't had the time yet to sit down and find a proper MWE.

@simonbyrne
Contributor

simonbyrne commented Oct 31, 2020

I believe the above fix is only in CUDA.jl 2.0. Try upgrading to the latest version (you may need Julia 1.5 as well).

@maximilian-gelbrecht

maximilian-gelbrecht commented Oct 31, 2020

Thanks for the tip. Unfortunately Flux.jl restricts me from updating to CUDA.jl 2.0.
The error is now different than before though, so something definitely changed through the fix.

6 participants