
Bug when using MKLSparse following use of MKL #36

Open

simonp0420 opened this issue Feb 15, 2023 · 9 comments

@simonp0420

When I use MKLSparse after a previous use of MKL, a bug manifests when multiplying a real-valued SparseMatrixCSC{ComplexF64, Int64} matrix by a ComplexF64 or Float64 vector. The following output exhibits the bug:

julia> include("reproducer.jl")
typeof.((QAAQ, fsrc, b)) = (SparseMatrixCSC{ComplexF64, Int64}, Vector{ComplexF64}, Vector{ComplexF64})
size.((QAAQ, fsrc, b)) = ((429567, 429567), (429567,), (429567,))
nnz(QAAQ) = 11532
maximum(abs ∘ imag, QAAQ) = 0.0
maximum(abs, QAAQ - real(QAAQ)) = 0.0
maximum(abs, b - real(QAAQ) * fsrc) = 0.0
0.0

julia> using MKL  # This does not introduce the bug

julia> include("reproducer.jl")
typeof.((QAAQ, fsrc, b)) = (SparseMatrixCSC{ComplexF64, Int64}, Vector{ComplexF64}, Vector{ComplexF64})
size.((QAAQ, fsrc, b)) = ((429567, 429567), (429567,), (429567,))
nnz(QAAQ) = 11532
maximum(abs ∘ imag, QAAQ) = 0.0
maximum(abs, QAAQ - real(QAAQ)) = 0.0
maximum(abs, b - real(QAAQ) * fsrc) = 0.0
0.0

julia> using MKLSparse  # this will introduce the bug

julia> include("reproducer.jl")
typeof.((QAAQ, fsrc, b)) = (SparseMatrixCSC{ComplexF64, Int64}, Vector{ComplexF64}, Vector{ComplexF64})
size.((QAAQ, fsrc, b)) = ((429567, 429567), (429567,), (429567,))
nnz(QAAQ) = 11532
maximum(abs ∘ imag, QAAQ) = 0.0
maximum(abs, QAAQ - real(QAAQ)) = 0.0
maximum(abs, b - real(QAAQ) * fsrc) = 119.4196587781747
119.4196587781747

The final output should be zero since b was computed as QAAQ * fsrc and QAAQ has zero imaginary part for each element. Note that the bug does not manifest if I use the two packages in the other order: MKLSparse first, followed by MKL.
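The checks above can be sketched in a self-contained way with synthetic data of the same types (the real QAAQ, fsrc, and b are loaded from the gist's data files; the names and sizes here are illustrative only):

```julia
# A ComplexF64 sparse matrix with zero imaginary part is multiplied by a
# ComplexF64 vector, and the product is compared against multiplication
# by the matrix's real part.
using SparseArrays, LinearAlgebra

n = 1_000
QAAQ = SparseMatrixCSC{ComplexF64, Int64}(sprandn(n, n, 1e-3))  # zero imaginary part
fsrc = randn(ComplexF64, n)
b = QAAQ * fsrc

@show maximum(abs ∘ imag, QAAQ)            # 0.0 by construction
@show maximum(abs, QAAQ - real(QAAQ))      # 0.0
@show maximum(abs, b - real(QAAQ) * fsrc)  # near zero normally; large under the bug
```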

The four files necessary for reproducing this are in this gist. Here is my version info:

julia> versioninfo()
Julia Version 1.8.5
Commit 17cfb8e65e (2023-01-08 06:45 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 8 × Intel(R) Core(TM) i7-9700 CPU @ 3.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 8 on 8 virtual cores
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 8

and my package status:

(error_reproducer) pkg> st
Status `D:\peter\Documents\julia\book_codes\RumpfFDFD\error_reproducer\Project.toml`
  [33e6dc65] MKL v0.6.0
  [0c723cd3] MKLSparse v1.1.0
@TobiasHolicki

TobiasHolicki commented Sep 10, 2023

I encountered essentially the same bug. Here is a smaller example that might be helpful:

A = sparse([1.0 1.0]); B = [1 0; 0 1.0]

using MKL
A * B # this returns [1.0 1.0] which is ok

using MKLSparse
A * B # this returns [1.0 0.0] which is bad

B = [1 0; 0 1];
A * B # this returns [1.0 1.0] which is ok again

@Leebre

Leebre commented Apr 27, 2024

I am seeing similar problems with MKLSparse. However, note that the package seems to be abandoned, as it hasn't been updated in over 2 years:

https://github.com/JuliaSparse/MKLSparse.jl

@amontoison
Member

> I encountered essentially the same bug. Here is a smaller example that might be helpful:
>
> A = sparse([1.0 1.0]); B = [1 0; 0 1.0]
>
> using MKL
> A * B # this returns [1.0 1.0] which is ok
>
> using MKLSparse
> A * B # this returns [1.0 0.0] which is bad
>
> B = [1 0; 0 1];
> A * B # this returns [1.0 1.0] which is ok again

I highly suspect that the issue is in MKL.
When we do a using MKL, we change the threading mode here.
That has a side effect on how the sparse products are performed internally.
MKL.jl and MKLSparse.jl use the same library, MKL_jll.libmkl_rt.
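A quick way to see the shared state (a sketch; MKL_jll is a common dependency of both packages):

```julia
# Both MKL.jl and MKLSparse.jl ccall into the same single-dispatch runtime
# library, so any process-wide setting (threading mode, interface layer)
# made at load time by one package is still in effect when the other
# makes its calls.
using MKL_jll
println(MKL_jll.libmkl_rt)  # path of the one shared libmkl_rt used by both packages
```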

@amontoison
Member

amontoison commented May 6, 2024

I found the culprit: MKL.jl loads the LP64 interface by default:
JuliaLinearAlgebra/MKL.jl@cbbaacd
This changed with a recent release of MKL.jl (v0.6.0).

We load the ILP64 interface here in MKLSparse.jl:
https://github.com/JuliaSparse/MKLSparse.jl/blob/master/src/MKLSparse.jl#L37
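For reference, a hedged sketch of the knob involved: libmkl_rt selects its integer convention process-wide through MKL_Set_Interface_Layer (declared in mkl_service.h), and whichever package sets it last wins. The constant values below follow mkl_service.h; the wrapper name set_interface_layer is illustrative, not an API of either package.

```julia
using MKL_jll

# Interface-layer codes from mkl_service.h: LP64 uses 32-bit integers,
# ILP64 uses 64-bit integers for array indices and dimensions.
const MKL_INTERFACE_LP64  = 0
const MKL_INTERFACE_ILP64 = 1

# Selects the integer convention for subsequent calls through libmkl_rt
# and returns the code actually in effect. MKLSparse.jl passes Int64
# index arrays, so results are silently corrupted once LP64 is active.
set_interface_layer(code::Integer) =
    ccall((:MKL_Set_Interface_Layer, MKL_jll.libmkl_rt), Cint, (Cint,), code)

set_interface_layer(MKL_INTERFACE_ILP64)
```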

@amontoison
Member

@ViralBShah Is it not possible to still load the ILP64 interface in MKL.jl?

@ViralBShah
Member

ViralBShah commented May 6, 2024

I thought we used ILP64 by default in MKL. MKL now has the _64 suffixes for ILP64 as well.

@amontoison
Member

> I thought we used ILP64 by default in MKL. MKL now has the _64 suffixes for ILP64 as well.

Do you know in which version of Intel MKL the _64 suffixes were added for ILP64?
I opened a PR in MKL.jl, but we may need to update the compat entry for MKL_jll.jl.

@ViralBShah
Member

IIRC, it was in 2023, but there were some issues where a few symbols got left out. So, I think 2024 is perhaps the safest. In fact, we may want to make 2024 a minimum requirement even in MKL.jl.
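One way to check whether an installed MKL_jll already exports the _64-suffixed ILP64 entry points (a sketch; dgemm_64 is used here only as a probe symbol):

```julia
using Libdl, MKL_jll

handle = Libdl.dlopen(MKL_jll.libmkl_rt)
# dgemm_ is the classic LP64 (32-bit integer) entry point; dgemm_64 is the
# ILP64 variant exported via the _64 suffixes in recent MKL releases.
has_ilp64_suffixes =
    Libdl.dlsym(handle, :dgemm_64; throw_error = false) !== nothing
println(has_ilp64_suffixes ? "_64 symbols available" :
                             "MKL too old for _64 symbols")
```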

@amontoison
Member

We will drop support for macOS, but I think that's fine. It never worked well and required sequential threading.
