Exception access violation / openblas / Julia 1.6.1 #40963
Comments
In order to look more into this, it is almost required for us to have a way of reproducing the problem. Could you post the code that you use that causes the error? |
Also which processor, OS, and is it on cloud, etc. |
Same thing here. Unfortunately, the code is involved. I will try to derive a reduced code example. |
My guess is that this happens in the thread management code: notice that the printout is repeated four times and jumbled together. |
Reported by @RaghuSivapuram in JuliaLinearAlgebra/IterativeSolvers.jl#301:
|
using LinearAlgebra, IterativeSolvers, SparseArrays
A = sprand(19400, 19400, 0.1) + 1.0im * sprand(19400, 19400, 0.1)
minres!(x, A, F; maxiter=20) # blows up.
On Windows, Julia 1.6.1, 16GB RAM, Intel Core i7-7500U CPU |
FWIW, I can't reproduce @RaghuSivapuram 's issue on my Mac (but I don't have a core i7). |
I can reproduce on Win 10 with i-7. On the same machine with WSL2 this code runs fine. |
I can reproduce this on Win10 with Julia 1.6.1
|
I got it! Simplified test code:
arrSzie = 10000 # No error
arrSzie = 10001 # Has error
arrType = ComplexF64
using LinearAlgebra: dot
using LinearAlgebra.BLAS: dotc
x = zeros(arrType, arrSzie);
y = zeros(arrType, arrSzie);
# dot(x, y) # LinearAlgebra\src\matmul.jl:10
dotc(length(x), x, stride(x, 1), y, stride(y, 1)) # LinearAlgebra\src\blas.jl:451
Then
https://github.com/JuliaLang/julia/blob/v1.6.3/stdlib/LinearAlgebra/src/blas.jl#L400
Some interesting findings:
- Using WinDBG, I obtained the call stack when the error occurred, found that it was the function
- I can reproduce this on Win10 with Julia 1.6.3
- I cannot reproduce this on:
  - Win10 with Julia 1.7.0-rc1
  - WSL
  - macOS
|
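For readers unfamiliar with the routine in the reduced test case: `dotc` computes the conjugated dot product, i.e. the sum of `conj(x[i]) * y[i]`. The following is a minimal pure-Julia sketch that never touches OpenBLAS, so it could serve as a cross-check on a machine where the BLAS call crashes. The helper name `ref_dotc` is hypothetical and is not part of `LinearAlgebra`.

```julia
using LinearAlgebra

# Hypothetical reference helper (not part of LinearAlgebra):
# conjugated dot product computed with a plain loop, no BLAS involved.
function ref_dotc(x::AbstractVector, y::AbstractVector)
    length(x) == length(y) || throw(DimensionMismatch("x and y must have equal length"))
    acc = zero(promote_type(eltype(x), eltype(y)))
    for i in eachindex(x, y)
        acc += conj(x[i]) * y[i]
    end
    return acc
end

x = ComplexF64[1 + 2im, 3 - 1im]
y = ComplexF64[2 - 1im, 1 + 1im]

# (1 - 2im)*(2 - 1im) + (3 + 1im)*(1 + 1im) = -5im + (2 + 4im)
println(ref_dotc(x, y))  # 2.0 - 1.0im
```

On small inputs like this one the result should agree with `dot(x, y)`, which also conjugates its first argument.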
Wow, fantastic detective work, @inkydragon ! |
1.6 ships with openblas 0.3.10, whereas 1.7 ships with 0.3.13. A quick check may be to copy the openblas dll from julia 1.7 into julia 1.6 and see if the problem goes away. |
Julia + OpenBLAS test
Use julia's
|
OpenBLAS version | Julia 1.6.3 | Julia 1.7.0-rc1 |
---|---|---|
OpenBLAS-0.3.10-x64 | crash | crash |
OpenBLAS-0.3.12-x64 | ✔ | ✔ |
OpenBLAS-0.3.13-x64 | ✔ | ✔ |
Note:
- OpenBLAS 0.3.11 is broken, skip.
- You may need to restart Julia before testing a new version of the OpenBLAS DLL.
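When swapping DLLs like this, it helps to confirm which OpenBLAS build is actually loaded. As a rough sketch, the version can be parsed out of the string returned by `openblas_get_config` (the sample string below is copied from the output later in this thread). The helper names `openblas_version` and `known_good` are hypothetical, and the "known good" cutoff is an assumption based only on the table above, where 0.3.10 crashes and 0.3.12 and 0.3.13 do not.

```julia
# Hypothetical helper: extract the version from an OpenBLAS config string.
function openblas_version(config::AbstractString)
    m = match(r"OpenBLAS\s+(\d+\.\d+\.\d+)", config)
    return m === nothing ? nothing : VersionNumber(m.captures[1])
end

# Assumption taken from the table above: 0.3.12 and later did not crash.
known_good(v::VersionNumber) = v >= v"0.3.12"

cfg = "OpenBLAS 0.3.10 NO_LAPACK NO_LAPACKE NO_AFFINITY HASWELL MAX_THREADS=6"
v = openblas_version(cfg)

println(v)              # 0.3.10
println(known_good(v))  # false
```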
Crash msg
Julia 1.6.3 + OpenBLAS-0.3.10-x64
julia> ccall((:cblas_zdotc_sub, :libopenblas), Cvoid,
(BlasInt, Ptr{ComplexF64}, BlasInt, Ptr{ComplexF64}, BlasInt, Ptr{ComplexF64}),
length(x), x, stride(x, 1), y, stride(x, 1), result)
Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x47a321a0 -- at 0x47a321a0 -- OLATION with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x47a321a0 -- at 0x47a321a0 -- OLATION with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x47a321a0 -- at 0x47a321a0 -- OLATION with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x47a321a0 -- at 0x47a321a0 -- OLATION at 0x47a321a0 -- at 0x47a321a0 -- OLATION with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x47a321a0 -- DCABS1 at D:\proj\Julia\julia@github\40963-openblas\OpenBLAS-0.3.10-x64\bin\libopenblas.dll (unknown line)
DCABS1 at D:\proj\Julia\julia@github\DCABS1 at D:\proj\Julia\julia@github\40963-openblas\OpenBLAS-0.3.10-x64\bin\libopenblas.dll (unknown line)
Julia 1.7.0-rc1 + OpenBLAS-0.3.10-x64
julia> ccall((:cblas_zdotc_sub, :libopenblas), Cvoid,
(BlasInt, Ptr{ComplexF64}, BlasInt, Ptr{ComplexF64}, BlasInt, Ptr{ComplexF64}),
length(x), x, stride(x, 1), y, stride(x, 1), result)
Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception:
Perhaps we need this: "add emms to reset fpu registers before assembler routines by mattip · Pull Request #2881 · xianyi/OpenBLAS · GitHub" OpenMathLib/OpenBLAS#2881 |
Build CBLAS with MSYS2 MinGW64: Build v0.3.10 + cherry-pick OpenMathLib/OpenBLAS#2881.
Still crashes.
Between v0.3.10 and v0.3.12, the symbol export of the OpenBLAS main branch is incomplete, missing function
So it was hard to find the commit that fixed the crash directly by
I'm going to find the commits that fixed the symbol export first, then look for the commits that fixed the crash. |
It would also be good to check this with MKL.jl on 1.7. I am assuming it shouldn't be a problem - but good to know. Install MKL.jl on 1.7, then do |
Julia 1.7.0-rc1 + MKL.jl works well.
julia> using LinearAlgebra
julia> BLAS.get_config()
LinearAlgebra.BLAS.LBTConfig
Libraries:
└ [ILP64] libopenblas64_.dll
julia> using MKL
julia> BLAS.get_config()
LinearAlgebra.BLAS.LBTConfig
Libraries:
└ [ILP64] mkl_rt.1.dll
julia> arrSzie = 10000; # No error
julia> arrSzie = 10001; # Has error
julia> arrType = ComplexF64;
julia> using LinearAlgebra: dot
julia> using LinearAlgebra.BLAS: dotc
julia> x = zeros(arrType, arrSzie);
julia> y = zeros(arrType, arrSzie);
julia> dot(x, y)
0.0 + 0.0im
julia> dotc(length(x), x, stride(x, 1), y, stride(y, 1))
0.0 + 0.0im
julia> x = ones(arrType, arrSzie);
julia> y = ones(arrType, arrSzie);
julia> dot(x, y)
10001.0 + 0.0im
julia> dotc(length(x), x, stride(x, 1), y, stride(y, 1))
10001.0 + 0.0im
julia> versioninfo()
Julia Version 1.7.0-rc1
Commit 9eade6195e (2021-09-12 06:45 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core(TM) i5-9400F CPU @ 2.90GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-12.0.1 (ORCJIT, skylake)
(@v1.7) pkg> st
Status `C:\Users\woclass\.julia\environments\v1.7\Project.toml`
[42fd0dbc] IterativeSolvers v0.9.1
[33e6dc65] MKL v0.4.2
[2f01184e] SparseArrays |
|
Version / commit | Has zdotc_sub | Not Crash | Commit message |
---|---|---|---|
v0.3.12 | ✔ | ✔ | Fix typos |
... | - | - | - |
v0.3.11 | ❌ | ❓ | Update from develop for 0.3.11 |
... | - | - | - |
b205323 | ✔ | ✔ | Fix missing dummy parameter (imag part of alpha) of zdot_thread_function |
fb3d80c4 | ✔ | ❌ | rebase |
... | - | - | - |
v0.3.10 | ✔ | ❌ | Merge develop into 0.3.0 for 0.3.10 release |
cherry-picking
git checkout v0.3.10
git cherry-pick b2053239fc3
make -j6 TARGET=HASWELL BINARY=64 ONLY_CBLAS=1
NO crash!
Test Julia 1.6.3+1.7.0
output
julia> arrSzie = 10001;
julia> arrType = ComplexF64;
julia> blasPath = raw"V:\OpenBLAS\libopenblas.dll";
julia> using LinearAlgebra.BLAS: BlasInt
julia> using Libdl: dlopen
julia> dlopen(blasPath)
Ptr{Nothing} @0x00007ffe5ceb0000
julia> strip(unsafe_string(ccall((:openblas_get_config, :libopenblas), Ptr{UInt8}, () )))
"OpenBLAS 0.3.10 NO_LAPACK NO_LAPACKE NO_AFFINITY HASWELL MAX_THREADS=6"
julia> x = zeros(arrType, arrSzie);
julia> y = zeros(arrType, arrSzie);
julia> result = Ref{ComplexF64}();
julia> ccall((:cblas_zdotc_sub, :libopenblas), Cvoid,
(BlasInt, Ptr{ComplexF64}, BlasInt, Ptr{ComplexF64}, BlasInt, Ptr{ComplexF64}),
length(x), x, stride(x, 1), y, stride(x, 1), result)
julia> result[]
0.0 + 0.0im
julia> x = ones(arrType, arrSzie);
julia> y = ones(arrType, arrSzie);
julia> ccall((:cblas_zdotc_sub, :libopenblas), Cvoid,
(BlasInt, Ptr{ComplexF64}, BlasInt, Ptr{ComplexF64}, BlasInt, Ptr{ComplexF64}),
length(x), x, stride(x, 1), y, stride(x, 1), result)
julia> result[]
10001.0 + 0.0im
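Since the commit that fixes the crash touches `zdot_thread_function`, and the failure only appears above 10000 elements, one diagnostic that may be worth trying on an unpatched install is to force OpenBLAS to a single thread before the call. This was not tested anywhere in this thread, so treat it as an assumption rather than a confirmed workaround. `BLAS.get_num_threads` and `BLAS.set_num_threads` are the standard `LinearAlgebra` functions.

```julia
using LinearAlgebra

n = 10001
x = ones(ComplexF64, n)
y = ones(ComplexF64, n)

# Assumption: with a single BLAS thread the threaded zdot path is not taken.
old_threads = BLAS.get_num_threads()
BLAS.set_num_threads(1)
r = try
    dot(x, y)
finally
    # Restore the previous thread count even if the call throws.
    BLAS.set_num_threads(old_threads)
end

println(r)  # 10001.0 + 0.0im
```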
It looks like julia is using the compiled product of |
You basically add a patch to all the 0.3.10 build recipes in https://github.com/JuliaPackaging/Yggdrasil/tree/master/O/OpenBLAS. You'll see that there is a patches directory for each variant in there, and you just add a new patch. The
You also have to add the patch to the julia source (for people who build from source). That goes in here: https://github.com/JuliaLang/julia/tree/master/deps/patches |
Build Julia v1.6.3 + Some questions:
For question 1, my guess: JuliaLang:backports-release-1.6 <== inkydragon:blas-zdot (based on v1.6.3).
For question 2: I don't know, but I will first add a patch to 0.3.10 and test the compatibility of this patch with other versions. |
That's a question for @KristofferC |
Yes a PR against |
xref issue: JuliaLang#40963 cherry pick: OpenMathLib/OpenBLAS@b205323
About 2., on Yggdrasil, it should be on master. Yggdrasil will build a new release for openblas 0.3.10. After that we have to update the package versions in |
…TION (#42397)
* [OpenBLAS] cherry pick one patch to fix `zdot` crash
  xref issue: #40963
  cherry pick: OpenMathLib/OpenBLAS@b205323
* [test/LinearAlgebra] test Issue #40963
* [OpenBLAS] Update version
* [OpenBLAS] Update checksums
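The commit above mentions a LinearAlgebra test for this issue. The following is only a sketch of what such a regression test could look like, based on the reduced example in this thread; it is not the test that was actually added in #42397. It exercises complex dot products on both sides of the 10000-element boundary.

```julia
using Test, LinearAlgebra

@testset "complex dot around 10000 elements (issue #40963)" begin
    for n in (10000, 10001)
        x = ones(ComplexF64, n)
        y = ones(ComplexF64, n)
        # Each product is exactly 1, so the sums are exact.
        @test dot(x, y) == n
        @test BLAS.dotc(n, x, stride(x, 1), y, stride(y, 1)) == n
    end
end
```

On an affected install (Julia 1.6.1 to 1.6.3 on Windows with the unpatched OpenBLAS 0.3.10) the `n = 10001` case would be expected to crash the process rather than report a test failure.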
I believe this should be fixed, but please reopen if still an issue. |
Hi all,
I've just started programming in Julia to do numerical simulations, and while testing a function using included I got this message. Does anyone have a clue about it? I've tried searching the web, but since I'm new to Julia I didn't really understand the possible solutions, and this seems very Win10-related, so I hope someone can explain what is going on. Thanks a lot in advance!
Exception: EXCEPTION_ACCESS_VIOLATION at 0x1fd608e0 -- at 0x1fd608e0 -- OLATION with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x1fd608e0 -- at 0x1fd608e0 -- OLATION with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x1fd608e0 -- at 0x1fd608e0 -- OLATION\AppData\Local\Programs\Julia-1.6.1\bin\libopenblas64_.DLL (unknown line)
their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x1fd608e0 -- at 0x1fd608e0 -- OLATION\AppData\Local\Programs\Julia-1.6.1\bin\libopenblas64_.DLL (unknown line)
in expression starting at REPL[10]:1
in expression starting at REPL[10]:1
cal\Programs\Julia-1.6.1\bin\libopenblas64_.DLL (unknown line)
their entirety). Thanks.
Exception: