BLAS interface completeness? #138

Closed
ChrisRackauckas opened this issue Nov 10, 2023 · 11 comments


@ChrisRackauckas

I was pretty surprised to find that the MKL.jl bindings are incomplete. That doesn't seem to be documented anywhere? For `using MKL` to be non-breaking, the bindings need to be complete, or there need to be fallbacks for every case where MKL does not provide a routine. From what I can find in https://discourse.julialang.org/t/again-errors-with-no-blas-lapack-library-loaded/105943/8 and SciML/LinearSolve.jl#427, at least lacpy seems to be missing.

ChrisRackauckas added a commit to ChrisRackauckas/MKL.jl that referenced this issue Nov 10, 2023
…mented

This warns about the currently undocumented behavior of JuliaLinearAlgebra#138 that `using MKL` is not a complete wrapper and is thus breaking.
@imciner2
Contributor

I'm a bit confused, what do you mean by the bindings being incomplete and the function being missing? This should be handled by the lbt layer redirecting to MKL's function, and the stacktrace given in SciML/LinearSolve.jl#427 (comment) seems to show it is actually being dispatched into the library for the lacpy call.

@andreasvarga

See also here.

@andreasvarga

From my tests, it appears that the two pairs of LAPACK auxiliary routines dlanv2/slanv2 and dladiv/sladiv are missing in MKL.

@imciner2
Contributor

They are physically in the MKL library:

$ objdump -x libmkl_rt.so.2 | grep lanv
0000000000000000 l    df *ABS*	0000000000000000              _dlanv2.c
0000000000000000 l    df *ABS*	0000000000000000              _slanv2.c
00000000004fb2a0 g     F .text	0000000000000100              dlanv2
00000000004fb2a0 g     F .text	0000000000000100              mkl_lapack__dlanv2_
00000000006d46e0 g     F .text	0000000000000100              slanv2_
00000000006d46e0 g     F .text	0000000000000100              slanv2
00000000004fb2a0 g     F .text	0000000000000100              dlanv2_
00000000006d46e0 g     F .text	0000000000000100              mkl_lapack__slanv2_

And libblastrampoline is reexporting the underscore variants:

$ objdump -x libblastrampoline.so.5.4.0 | grep lanv
0000000000384470 l     O .bss	0000000000000008              slanv2__addr
0000000000384718 l     O .bss	0000000000000008              dlanv2_64__addr
0000000000389e20 l     O .bss	0000000000000008              dlanv2__addr
0000000000391410 l     O .bss	0000000000000008              slanv2_64__addr
00000000000eab13 g     F .text	000000000000000c              slanv2_64_
00000000000da5b3 g     F .text	000000000000000c              dlanv2_
00000000000e8daf g     F .text	000000000000000c              dlanv2_64_
00000000000dc317 g     F .text	000000000000000c              slanv2_

When I load MKL into the REPL, the LP64 forward has a pointer in it:

julia> BLAS.lbt_get_forward(:dlanv2_, BLAS.LBT_INTERFACE_LP64)
Ptr{Nothing} @0x00007f96022fb2a0

However, the ILP64 version does not. It returns the same address as the default function, which means no forward is set:

julia> BLAS.lbt_get_forward(:dlanv2_, BLAS.LBT_INTERFACE_ILP64)
Ptr{Nothing} @0x00007f967e8ceb50

julia> BLAS.lbt_get_default_func()
Ptr{Nothing} @0x00007f967e8ceb50

The reason is that the dlanv2 function appears to have only float/double values in its API, so no integer types are passed. Therefore, MKL does not export a separate 64-bit integer variant of the function from its libmkl_rt dispatch library, because the same function can be used for both ILP64 and LP64. The same thing happens with the ladiv functions.

e.g., the manual states:

On 64-bit platforms, selected domains provide API extensions with the _64 suffix (for example, SGEMM_64) for supporting large data arrays in the LP64 library, which enables the mixing of data types in one application. The selected domains and APIs include the following:

  • BLAS: Fortran-style APIs for C applications and CBLAS APIs with integer arguments
  • LAPACK: Fortran-style APIs for C applications and LAPACKE APIs with integer arguments

If I use the LBT footgun API to manually map the ILP64 version of dlanv2 to the only dlanv2 function that exists, it then works:

julia> dlanv_ = BLAS.lbt_get_forward(:dlanv2_, BLAS.LBT_INTERFACE_LP64)
Ptr{Nothing} @0x00007f2bbd2fb2a0

julia> BLAS.lbt_set_forward("dlanv2_", dlanv_, BLAS.LBT_INTERFACE_ILP64)
0

julia> lanv2(1.,2.,3.,4.)
(-0.3722813232690143, 0.0, 5.372281323269014, 0.0, -0.8245648401323938, 0.5657674649689923)
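
The manual mapping above could be generalized along these lines. This is only a minimal sketch, not the change that any package actually ships: `is_unset` and `forward_lp64_to_ilp64!` are hypothetical helper names, the list of symbols is just the four routines mentioned in this thread, and the check against `Ptr{Cvoid}(-1)` is a defensive assumption about how an unknown symbol is reported. It only uses the `BLAS.lbt_get_forward`, `BLAS.lbt_set_forward`, and `BLAS.lbt_get_default_func` calls shown above.

```julia
using LinearAlgebra
using LinearAlgebra: BLAS

# Hypothetical helper: true when `ptr` is not a usable forward, i.e. it is NULL,
# an assumed "unknown symbol" sentinel, or LBT's default function.
is_unset(ptr) = ptr == C_NULL ||
                ptr == Ptr{Cvoid}(-1) ||
                ptr == BLAS.lbt_get_default_func()

# Hypothetical helper: for each symbol name, reuse the LP64 forward as the ILP64
# forward when only the LP64 one is set. Returns the names that were patched.
function forward_lp64_to_ilp64!(names)
    patched = String[]
    for name in names
        lp64  = BLAS.lbt_get_forward(name, BLAS.LBT_INTERFACE_LP64)
        ilp64 = BLAS.lbt_get_forward(name, BLAS.LBT_INTERFACE_ILP64)
        if is_unset(ilp64) && !is_unset(lp64)
            BLAS.lbt_set_forward(name, lp64, BLAS.LBT_INTERFACE_ILP64)
            push!(patched, String(name))
        end
    end
    return patched
end

# Only routines without integer arguments are safe to alias this way.
forward_lp64_to_ilp64!(["dlanv2_", "slanv2_", "dladiv_", "sladiv_"])
```

The guard on the ILP64 forward means the loop leaves any symbol alone that the loaded library already provides, so it should be a no-op with a backend that exports both variants.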

@andreasnoack
Member

The symbol is called from inside of the SLICOT library. Will the mapping also apply to such calls?

@imciner2
Contributor

Is SLICOT linked against libblastrampoline? If so, I believe this mapping would also apply to calls that are made from SLICOT through libblastrampoline.

@andreasvarga

What are the implications for the ccall() interface? At the moment, I am using :dlanv2_, which works for OpenBLAS but not for MKL (see example). Should I use another symbol for MKL?

@andreasnoack
Member

@andreasvarga

There should be no problems with SLICOT. I contacted the main developer of SLICOT (Vasile Sima) and he confirmed that SLICOT is MKL compliant (of course there could be differences in the versions used). I myself replaced the function dlanv2 with a generic version written in Julia, and all tests for PeriodicSystems also pass with MKL. So far, only dlanv2 and dladiv cause problems.

@imciner2
Contributor

That mapping will also fix the ccall for the functions.

I have put together an initial list of methods that seem to be missing and need this mapping in #140. I tried your example from the forum with that change applied to the package, and the call to dlanv2 now works.

@ViralBShah
Contributor

Fixed in #140 and should be further fixed in a future MKL release upstream as well.
