Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Large matrices cause LAPACK ArgumentError #42331

Open
r3tex opened this issue Sep 21, 2021 · 13 comments
Open

Large matrices cause LAPACK ArgumentError #42331

r3tex opened this issue Sep 21, 2021 · 13 comments
Labels
domain:linear algebra Linear algebra kind:upstream The issue is with an upstream dependency, e.g. LLVM

Comments

@r3tex
Copy link

r3tex commented Sep 21, 2021

Here is a small way to reproduce the problem.
When giving an argument close to 10000 it fails.

using LinearAlgebra
mirror!(A) = for i in 1:size(A, 1), j in (i + 1):size(A, 2); A[i, j] = A[j, i] = A[i, j] + A[j, i]; end
function spectralpositions(n)
    g = rand(0:1, n, n)
    mirror!(g)
    ϕ = Matrix(Diagonal(vec(sum(g; dims=1)) - diag(g)))
    λ = Float32[i == j ? ϕ[i, j] : -g[i, j] for i in 1:n, j in 1:n]
    eigvecs(λ, ϕ)
end
julia> spectralpositions(10000)
 ** On entry to SSYGVDNon-unitLeft parameter number 11 had an illegal value
ERROR: ArgumentError: invalid argument #11 to LAPACK call
Stacktrace:
  [1] chkargsok
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/lapack.jl:27 [inlined]
  [2] sygvd!(itype::Int64, jobz::Char, uplo::Char, A::Matrix{Float32}, B::Matrix{Float32})
    @ LinearAlgebra.LAPACK /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/lapack.jl:5167
  [3] #eigen!#101
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/symmetric.jl:828 [inlined]
  [4] eigen!(A::Matrix{Float32}, B::Matrix{Float32}; sortby::typeof(LinearAlgebra.eigsortby))
    @ LinearAlgebra /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/eigen.jl:428
  [5] eigen!
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/eigen.jl:428 [inlined]
  [6] eigen(A::Matrix{Float32}, B::Matrix{Int64}; kws::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ LinearAlgebra /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/eigen.jl:501
  [7] eigen
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/eigen.jl:500 [inlined]
  [8] #eigvecs#87
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/eigen.jl:607 [inlined]
  [9] eigvecs
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/eigen.jl:607 [inlined]
 [10] spectralpositions(n::Int64)
    @ Main ~/KYUBI/examples/example-VRP-1.jl:62
 [11] top-level scope
    @ REPL[39]:1

@ViralBShah @andreasnoack

@dkarrasch dkarrasch added the domain:linear algebra Linear algebra label Sep 21, 2021
@inkydragon
Copy link
Sponsor Member

ccall((@blasfunc($sygvd), liblapack), Cvoid,
(Ref{BlasInt}, Ref{UInt8}, Ref{UInt8}, Ref{BlasInt},
Ptr{$elty}, Ref{BlasInt}, Ptr{$elty}, Ref{BlasInt},
Ptr{$elty}, Ptr{$elty}, Ref{BlasInt}, Ptr{BlasInt},
Ref{BlasInt}, Ptr{BlasInt}, Clong, Clong),
itype, jobz, uplo, n,
A, lda, B, ldb,
w, work, lwork, iwork,
liwork, info, 1, 1)

Argument #11 is lwork.

Parameter verification logic: LAPACK: dsygvd

@andreasnoack
Copy link
Member

The LAPACK source suggests that this can only happen if the value of LWORK is negative. I'm wondering if there is an issue with the integer size. We should be compiling OpenBLAS such that all the integers are 64 bit and we are passing LWORK as a 64 bit integer, but maybe it's being read as a 32 bit integer

julia> reinterpret(Int32, [typemax(Int32)+1])
2-element reinterpret(Int32, ::Vector{Int64}):
 -2147483648
           0

It would be useful if somebody could check the value of work[1] after a query call to the routine. If my conjecture is correct then this value should be larger than typemax(Int32).

@KristofferC
Copy link
Sponsor Member

KristofferC commented Sep 27, 2021

It would be useful if somebody could check the value of work[1] after a query call to the routine.

Running with @show work[1]; @show lwork[] right after the ccall I get:

julia> spectralpositions(10000)
work[1] = 2.0006f8
lwork[] = -1
 ** On entry to SSYGVDNon-unitLeft parameter number 11 had an illegal value
work[1] = 2.0006f8
lwork[] = 200060000
ERROR: ArgumentError: invalid argument #11 to LAPACK call

@andreasnoack
Copy link
Member

Ha.

julia> eps(2.0006f8)
16.0f0

So in the query call, the minimal work space is computed as 1 + 6*N + 2*N^2 and that is then stored in the Float32 array work but that will truncate the integer and since

julia> N = 10000
10000

julia> lwork = 1 + 6*N + 2*N^2
200060001

julia> Int(Float32(lwork)) < lwork
true

this condition is satisfied. This must be considered a LAPACK bug. They should ensure that round tripping between integer and Flaot32 doesn't return a smaller value. I'll file an issue tonight.

@ViralBShah
Copy link
Member

I find it hard to believe that we are the first people to run into this!

@andreasnoack
Copy link
Member

Indeed we weren't Reference-LAPACK/lapack#600

@andreasnoack andreasnoack added the kind:upstream The issue is with an upstream dependency, e.g. LLVM label Sep 27, 2021
@ViralBShah
Copy link
Member

@r3tex On Julia 1.7 RCs, can you try this with MKL? Just do using MKL before running.

@inkydragon
Copy link
Sponsor Member

Julia 1.7.0 rc1 + MKL.jl works fine.

julia> versioninfo()
Julia Version 1.7.0-rc1
Commit 9eade6195e (2021-09-12 06:45 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i5-9400F CPU @ 2.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, skylake)

(@v1.7) pkg> st
      Status `C:\Users\woclass\.julia\environments\v1.7\Project.toml`
  [33e6dc65] MKL v0.4.2

julia> using LinearAlgebra

julia> using MKL

julia> BLAS.get_config()
LinearAlgebra.BLAS.LBTConfig
Libraries:
└ [ILP64] mkl_rt.1.dll

julia> mirror!(A) = for i in 1:size(A, 1), j in (i + 1):size(A, 2); A[i, j] = A[j, i] = A[i, j] + A[j, i]; end
mirror! (generic function with 1 method)

julia> function spectralpositions(n)
           g = rand(0:1, n, n)
           mirror!(g)
           ϕ = Matrix(Diagonal(vec(sum(g; dims=1)) - diag(g)))
           λ = Float32[i == j ? ϕ[i, j] : -g[i, j] for i in 1:n, j in 1:n]
           eigvecs(λ, ϕ)
       end
spectralpositions (generic function with 1 method)

julia> spectralpositions(10000)
10000×10000 Matrix{Float32}:
....

@andreasnoack
Copy link
Member

It looks like this has already been fixed in LAPACK master so we should be able to close this issue once a new release of LAPACK makes it into OpenBLAS.

@KristofferC
Copy link
Sponsor Member

Reference-LAPACK/lapack#600 (comment) suggests the whole problem is not fixed but maybe that does not affect us?

@andreasnoack
Copy link
Member

There are two somewhat separate issues there and the one that causes the example here to fail should be handled by the fix. The other issue is when the required workspace is larger than an Int32 but that shouldn't be an issue for use because we use Int64 in BLAS.

@andreasnoack
Copy link
Member

We could consider moving the work space computation to Julia as a workaround for this while we are waiting for new LAPACK release. There is a small risk that we'll forget it and that LAPACK later changes the workspace requirements.

@ViralBShah
Copy link
Member

Can't we just apply the LAPACK patch to our openblas build? Shouldn't be too difficult.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:linear algebra Linear algebra kind:upstream The issue is with an upstream dependency, e.g. LLVM
Projects
None yet
Development

No branches or pull requests

6 participants