ENH: linalg: 64-bit BLAS/LAPACK #11193

pv · 2019-12-09T22:18:01Z

This is a draft for discussion (do not merge).

I'd suggest implementing support for 64-bit BLAS (=ILP64) in the following way:

The public BLAS/LAPACK wrappers in scipy.linalg will continue providing 32-bit BLAS/LAPACK, and new extensions _fblas_64, _flapack_64, cython_lapack_64, cython_blas_64 are added to provide the ILP64 BLAS/LAPACK routine API. For those cases where BLAS/LAPACK usage is a non-public detail (e.g. ARPACK), we can build only the 64-bit version.

The above is necessary because a single Python extension DLL cannot link to both types of BLAS: the symbol names generally clash. Dynamically linking different extensions to different BLAS libraries works in the usual case, but it is not enough to avoid symbol clashes when e.g. the Python interpreter is embedded inside an application that is itself linked to BLAS/LAPACK (on Linux, see here section 1.5.4 --- TODO: less problematic on osx & windows iiuc, but needs checking). To avoid this issue, we also add support for BLAS/LAPACK symbol name mangling.
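As a rough illustration of what symbol name mangling means here, the sketch below generates C preprocessor lines that append a suffix to each Fortran-style BLAS/LAPACK symbol. The helper name `mangle_defines`, the routine list, and the `64_` suffix are assumptions for illustration, not what this PR implements.

```python
def mangle_defines(routines, suffix="64_"):
    """Return C preprocessor lines renaming Fortran symbols.

    Hypothetical helper: assumes the common Fortran convention of a
    lowercase name with one trailing underscore (e.g. dgemm_).
    """
    lines = []
    for name in sorted(set(routines)):
        fortran_symbol = name.lower() + "_"
        # Map the plain symbol onto the suffixed one, e.g. dgemm_ -> dgemm_64_
        lines.append(f"#define {fortran_symbol} {fortran_symbol}{suffix}")
    return "\n".join(lines)


header = mangle_defines(["dgemm", "dnrm2", "dgesv"])
print(header)
```

With distinct symbol names, a 32-bit and a 64-bit BLAS can be loaded into the same process without one shadowing the other.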

TODO: static linking to the 32-bit BLAS probably also resolves the issue (easier for Intel MKL users)

In scipy.linalg, we'd add ilp64= flag to the blas/lapack function getters, as in nrm2 = get_blas_func('nrm2', ilp64='maybe'/True/False), and a nrm2.int_dtype attribute so that it's easier to deal with the work array types.
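One possible reading of the three-valued `ilp64=` flag is sketched below in plain Python. The functions `resolve_ilp64` and `int_dtype_name` are hypothetical and only model the semantics described above; they are not the scipy.linalg implementation, and raising an error for `ilp64=True` without an ILP64 build is an assumption.

```python
def resolve_ilp64(ilp64, have_ilp64):
    """Decide whether to return the ILP64 variant of a routine.

    ilp64: 'maybe', True or False (the values proposed above).
    have_ilp64: whether the ILP64 extensions were built (assumed input).
    """
    if ilp64 is True:
        if not have_ilp64:
            # Assumption: an explicit request that cannot be met is an error
            raise RuntimeError("ILP64 BLAS/LAPACK requested but not available")
        return True
    if ilp64 is False:
        return False
    if ilp64 == "maybe":
        # Use ILP64 when available, otherwise fall back to 32-bit
        return have_ilp64
    raise ValueError(f"invalid ilp64 value: {ilp64!r}")


def int_dtype_name(use_ilp64):
    """Integer dtype callers should use for work/pivot arrays."""
    return "int64" if use_ilp64 else "int32"


use64 = resolve_ilp64("maybe", have_ilp64=False)
print(use64, int_dtype_name(use64))
```

An attribute like `int_dtype` on the returned routine spares callers from repeating this decision when allocating integer work arrays.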

We might want to eventually drop the 32-bit BLAS altogether. One backward-compatible option could be to autogenerate int32 interface wrappers --- but I'm not sure how easy it is to get the specs automatically.

I added some BLAS64 support in numpy master some days ago. The changes in this "PR" assume the numpy.distutils setup in the follow-up PR numpy/numpy#15069. Currently, you can do

import numpy as np
from scipy.linalg import norm
x = np.zeros([2**31], dtype=np.float64)
x[-1] = 1
res = norm(x)
print(res)
assert res == 1.0

so I think the plan is feasible.
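The array length in the snippet above, 2**31, is the smallest size that no longer fits in a signed 32-bit integer, which is why it exercises the ILP64 path. A minimal standard-library check of that boundary (not part of the PR):

```python
import ctypes

n = 2**31                      # array length used in the norm() example
INT32_MAX = 2**31 - 1          # largest value a signed 32-bit integer holds

# ctypes integer types do no overflow checking, so the value wraps around
wrapped = ctypes.c_int32(n).value
fits_in_int64 = ctypes.c_int64(n).value == n

print(n > INT32_MAX)           # True: n cannot be passed as a 32-bit length
print(wrapped)                 # -2147483648
print(fits_in_int64)           # True
```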

Finish ILP64 BLAS numpy.distutils support (ENH: add support for ILP64 OpenBLAS (without symbol suffix) numpy/numpy#15069)
Implement Fortran BLAS/LAPACK symbol name mangling
Implement scipy.linalg f2py .pyf file int32->int64 search-and-replace
Support ilp64 variants in get_blas/lapack_funcs
Decide what to do in cases where routines return integer arrays --- do we just bump to np.int64?
Use ilp64 variants when available everywhere in scipy.linalg
Add tests...

tylerjereddy · 2019-12-13T23:40:46Z

Presumably the tests can be marked up in a similar way to your NumPy PR to avoid crushing machines with insufficient memory.
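A standard-library sketch of such a memory guard is shown below. The decorator name `skip_unless_memory` and its behavior are hypothetical, and it checks total physical memory rather than free memory; the actual NumPy test markers may work differently.

```python
import os
import unittest


def physical_memory_bytes():
    """Total physical memory in bytes, or None if it cannot be determined."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these names are unavailable on some platforms
        return None


def skip_unless_memory(nbytes):
    """Hypothetical decorator: skip a test if the machine has < nbytes RAM."""
    total = physical_memory_bytes()
    if total is not None and total < nbytes:
        return unittest.skip(f"needs {nbytes} bytes of memory, have {total}")
    # Memory is sufficient (or unknown): leave the test unchanged
    return lambda func: func


@skip_unless_memory(0)
def test_small():
    return "ran"


print(test_small())
```

For the 2**31-element float64 test in the description, the guard would be given roughly 16 GiB plus some headroom.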

So far looks good upstream from the PRs I've reviewed related to this & my local tests with NumPy have been encouraging when using OpenBLAS + ILP64.

I don't have strong views on the linalg design decisions, but maybe e.g. @ilayn might chime in?

ilayn · 2019-12-16T08:37:51Z

This is really nice but, unfortunately, it is the low-level part that I am not too familiar with. One thing I imagine will stay is the 32-bit version, because many problems that barely fit into memory with 32-bit integers would no longer fit with ILP64, due to the extra allocation needed for the same information. Hence, I guess if we enable this we are going to maintain both for some time.

I can do the legwork on the linalg side if needed but one question:

Are we going to need to modify the INTEGER keywords in the wrappers to INTEGER*4 or INTEGER*8 (or whatever the syntax would be) depending on the linked library?

It would have been amazing if we could choose the array type integer depending on the problem size but I think that is too ambitious.

pv · 2019-12-16T09:52:42Z

The integer size here is changed by compiler flags (and f2py flags), so only C code requires manual changes. It's indeed already possible to switch between ilp64 and non-ilp64 as you like, for scipy.linalg. For Fortran code, you'd need to compile two versions (possible, but not necessarily sensible). I wouldn't be so sure the memory usage matters in practice, as it's probably usually only a 25% or so increase, and more often likely much less. For LAPACK, it's about routines with large iwork arrays, but are there any where the iwork really is usually significant compared to the fp arrays? I think the practical advantages of using a single BLAS library are much more important than cases working close to machine memory (i.e., if it's a problem, just buy an extra 8GB, or recompile scipy with 32-bit BLAS)
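To put rough numbers on the memory argument, here is a small illustrative calculation. The helper `int_overhead` and both scenarios are assumptions chosen for illustration: a dense n x n float64 matrix with a length-n pivot array, and an extreme case with one integer per float64 value.

```python
def int_overhead(float_bytes, n_ints):
    """Relative growth in total memory when integers go from 4 to 8 bytes."""
    before = float_bytes + 4 * n_ints
    after = float_bytes + 8 * n_ints
    return (after - before) / before


n = 1000

# Dense case: n x n float64 matrix plus a pivot array of n integers
dense = int_overhead(8 * n * n, n)

# Extreme case: one integer stored for every float64 value
one_to_one = int_overhead(8 * n, n)

print(f"dense LU-like case: {dense:.4%}")
print(f"one int per float:  {one_to_one:.4%}")
```

In the dense case the integer arrays are negligible next to the floating-point data; the growth only becomes significant when integer and floating-point storage are of comparable size.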

pv · 2019-12-23T13:00:17Z

Ok, this is all the low-hanging fruit. What's left is:

scipy/sparse/linalg/dsolve/_superlu.cpython-37m-x86_64-linux-gnu.so
scipy/linalg/_fblas.cpython-37m-x86_64-linux-gnu.so
scipy/linalg/_flapack.cpython-37m-x86_64-linux-gnu.so
scipy/linalg/_flinalg.cpython-37m-x86_64-linux-gnu.so
scipy/linalg/_interpolative.cpython-37m-x86_64-linux-gnu.so
scipy/linalg/cython_blas.cpython-37m-x86_64-linux-gnu.so
scipy/linalg/cython_lapack.cpython-37m-x86_64-linux-gnu.so

SuperLU supports only 32-bit int, and interpolative has assumptions about 4-byte integers in hard-coded work array sizes in many places, so their integer size cannot be changed. These two might be dealt with using relatively lightweight 64-to-32 LAPACK wrappers.

I looked a bit into it, and general 64-to-32-bit LAPACK adapters are not really possible to write, because the sizes of work arrays are in several cases not known (due to liwork queries). Such wrappers are however possible for all of BLAS and for several LAPACK routines, but the rest of LAPACK would need to be compiled from sources. This would still avoid shipping the multi-arch kernels from OpenBLAS twice, so it might be useful in view of the binary distribution sizes.

This PR is probably too big to review, so I'll eventually split it into more manageable chunks.

rgommers · 2020-04-26T19:23:34Z

This sounds like a good plan to me.

pv force-pushed the blas-ilp64 branch 3 times, most recently from dfe6353 to 99ae717 Compare December 13, 2019 19:37

tylerjereddy added the scipy.linalg and needs-decision labels Dec 13, 2019

pv force-pushed the blas-ilp64 branch from 097f3d6 to 41d5b33 Compare December 23, 2019 10:53

pv closed this Dec 23, 2019

pv reopened this Dec 23, 2019

pv force-pushed the blas-ilp64 branch 2 times, most recently from 1aeec0c to 65203d6 Compare December 23, 2019 11:59

knedlsepp mentioned this pull request Dec 27, 2019

Default integer size in Fortran NixOS/nixpkgs#35208

Open

pv force-pushed the blas-ilp64 branch 15 times, most recently from 856bf68 to a4016ff Compare January 1, 2020 13:35

pv added 5 commits January 1, 2020 20:34

ENH: integrate: support ILP64 in quadpack

e181732

ENH: odr: support ILP64

d0b12a6

TST: linalg: add ilp64 lapack smoketest

c5e6483

ENH: optimize/_trlib: support ILP64 blas/lapack

0aea4dd

BLD: sparse.linalg/superlu: support ILP64 BLAS in SuperLU

7573a00

pv force-pushed the blas-ilp64 branch 5 times, most recently from 8a06cdb to e399fdd Compare January 1, 2020 21:49

BLD: linalg: support ILP64 BLAS in id_dist

883acfc

pv force-pushed the blas-ilp64 branch from e399fdd to 883acfc Compare January 1, 2020 21:53

This was referenced May 1, 2020

WIP: Azure Windows Openblas experiment #11965

Closed

ENH: optimize: support 64-bit BLAS in lbfgsb #12009

Merged

ENH: sparse.linalg: support 64-bit BLAS in isolve #12010

Merged

ENH: optimize/_trlib: support ILP64 blas/lapack #12030

Merged

This was referenced May 10, 2020

ENH: special: support ILP64 Lapack #12085

Merged

ENH: spatial/qhull: support ILP64 Lapack #12089

Merged

ENH: integrate: support ILP64 BLAS in odeint/vode/lsoda #12090

Merged

ENH: integrate: support ILP64 in quadpack #12091

Merged

This was referenced May 31, 2020

ENH: odr: ILP64 Blas support in ODR #12283

Merged

ENH: linalg: support for ILP64 BLAS/LAPACK in f2py wrappers #12284

Merged

rgommers mentioned this pull request Oct 18, 2022

Consider unifying the two OpenBLAS libraries in NumPy and SciPy wheels Create OpenBLAS wheel #15129

Open

rgommers mentioned this pull request Jun 13, 2023

[WIP] Set visibility attribute on internal function symbols to hidden OpenMathLib/OpenBLAS#3658

Closed

lucascolley added the enhancement, C/C++ and Fortran labels Dec 23, 2023

rgommers mentioned this pull request Jan 8, 2024

BLD: Add Accelerate support for macOS 13.3+ #19816

Merged

lucascolley changed the title ~~64-bit BLAS/LAPACK~~ ENH: linalg: 64-bit BLAS/LAPACK Mar 14, 2024
