Wrong sparse matrix-vector multiplication after v5.9+ #2945

@albertomercurio

Describe the bug

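After updating to CUDA.jl v5.9 or higher, I get wrong results from a very simple piece of code: multiplying the adjoint of a sparse matrix by a vector on the GPU. The same code works with the CSR and COO formats, so the problem appears to be specific to the CSC format.

The same issue was also found by @matteosecli.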

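using CUDA, CUDA.CUSPARSE
using SparseArrays, LinearAlgebra

N = 3

A = sprand(ComplexF64, N, N, 0.7)   # random sparse matrix (CSC on the CPU)
x = rand(ComplexF64, N)
y = similar(x)

dA = CUSPARSE.CuSparseMatrixCSC(A)  # same matrix on the GPU, CSC format
dx = CuArray(x)
dy = CuArray(y)

mul!(y, A', x)      # CPU reference: y = A' * x

mul!(dy, dA', dx)   # GPU: expected to match the CPU result

y    # CPU result

dy   # GPU result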
3-element Vector{ComplexF64}:
 0.3576070351441064 + 0.026427315882171412im
 0.9574916205831332 - 0.6602476831363189im
 1.1290087953392993 - 0.5154161408908955im

3-element CuArray{ComplexF64, 1, CUDA.DeviceMemory}:
 0.2643116896023774 + 0.2423231836800406im
 0.4554185718598048 + 1.234688386484635im
  0.645728367194516 + 1.133002291085027im
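To make the format dependence easier to check, the same product can be run against each GPU sparse format and compared with the CPU reference. This is only a sketch: it assumes that `CuSparseMatrixCSR` and `CuSparseMatrixCOO` can be constructed directly from a CPU `SparseMatrixCSC`, and the names `y_ref`, `formats`, and `results` are illustrative and not part of the original report.

```julia
using CUDA, CUDA.CUSPARSE
using SparseArrays, LinearAlgebra

N = 3
A = sprand(ComplexF64, N, N, 0.7)
x = rand(ComplexF64, N)
y_ref = A' * x                        # CPU reference for the adjoint product

dx = CuArray(x)

# Assumption: each GPU format accepts a CPU SparseMatrixCSC in its constructor.
formats = ("CSC" => CuSparseMatrixCSC,
           "CSR" => CuSparseMatrixCSR,
           "COO" => CuSparseMatrixCOO)

results = Dict{String,Bool}()
for (name, F) in formats
    dA = F(A)                          # upload in the given sparse format
    dy = CUDA.zeros(ComplexF64, N)
    mul!(dy, dA', dx)                  # GPU adjoint matrix-vector product
    results[name] = Array(dy) ≈ y_ref  # compare against the CPU result
end

for (name, ok) in results
    println(name, ": ", ok ? "matches CPU" : "differs from CPU")
end
```

Based on the description above, only the CSC entry would be expected to differ on an affected version.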

Version info

Details on Julia:

Julia Version 1.11.7
Commit f2b3dbda30a (2025-09-08 12:10 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 32 × 13th Gen Intel(R) Core(TM) i9-13900KF
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, alderlake)
Threads: 16 default, 0 interactive, 8 GC (on 32 virtual cores)
Environment:
  JULIA_NUM_THREADS = 16

Details on CUDA:

julia> CUDA.versioninfo()
CUDA toolchain:
- runtime 12.9, artifact installation
- driver 575.64.3 for 12.9
- compiler 12.9

CUDA libraries: 
- CUBLAS: 12.9.1
- CURAND: 10.3.10
- CUFFT: 11.4.1
- CUSOLVER: 11.7.5
- CUSPARSE: 12.5.10
- CUPTI: 2025.2.1 (API 12.9.1)
- NVML: 12.0.0+575.64.3

Julia packages: 
- CUDA: 5.9.1
- CUDA_Driver_jll: 13.0.2+0
- CUDA_Compiler_jll: 0.2.2+0
- CUDA_Runtime_jll: 0.19.2+0

Toolchain:
- Julia: 1.11.7
- LLVM: 16.0.6

1 device:
  0: NVIDIA GeForce RTX 4090 (sm_89, 23.004 GiB / 23.988 GiB available)

Labels: bug (Something isn't working)
