Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for calling superlu with 64-bit integer support #20338

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

liuyangzhuan
Copy link

The current tarball of the superlu SRC supports 64-bit integers. In other words, the matrix size m, n are still 32bit, but the number of nonzeros, index array and pointer array (CSC or CSR) can be 64 bit integers.

However the current scipy interface in linsolve.py always call _safe_downcast_indices to convert the input matrix into 32bit indexed and call superlu.

This PR fixes the issue by using an environment variable XSDK_INDEX_SIZE to pass to the meson build system at compile time, as well as to select the correct idx_dtype at runtime when superlu is being used. More specifically, the 64-bit build can be done as follows:

export XSDK_INDEX_SIZE=64
CFLAGS="-DXSDK_INDEX_SIZE=${XSDK_INDEX_SIZE}" meson setup build

Note that one still need export XSDK_INDEX_SIZE=64 at runtime. That's why I used the environment variable approach.
Note that the 32-bit build is used by default, or by setting export XSDK_INDEX_SIZE=32

@github-actions github-actions bot added scipy.sparse.linalg scipy.sparse C/C++ Items related to the internal C/C++ code base labels Mar 26, 2024
Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @liuyangzhuan. It looks like a good start. The environment variable usage won't work though, that is way to fragile. What we need here instead is to build SuperLU twice, so we can pass it both 32-bit and 64-bit indices to avoid the downcasting. That should "just work" without the user needing to toggle any build or runtime knobs.

Binary size impact is small-ish, since the extension is 300 kb unpacked in a release build. So wheel size contribution is smaller than that; doubling it seems acceptable.

The most important thing is to check how BLAS/LAPACK usage is handled; I haven't looked but I am guessing that it requires ILP64 BLAS support to make this work. If so, then that's in the works but will be optional so it'll only work with custom builds. Meaning that 64-bit indices should continue to be downcasted for default (non-ILP64-enabled) builds.

There may be some hiccups in having two separate extension modules that are in use at the same time, not sure.

@liuyangzhuan
Copy link
Author

Hi, @rgommers, I agree that env is not the perfect way to go. Building superlu twice is fine with me, as this is the way BLAS/LAPACK being used in scipy. But it seems that for superlu, the bridging C routines e.g.,
https://github.com/liuyangzhuan/scipy/blob/main/scipy/sparse/linalg/_dsolve/_superlumodule.c
https://github.com/liuyangzhuan/scipy/blob/main/scipy/sparse/linalg/_dsolve/_superluobject.c
need also to be built twice. Note that currently I'm using "int_t" to be either int or int64_t based on -DXSDK_INDEX_SIZE. I'm not sure how easy you can modify these bridging routines.

BTW, superlu always use 32-bit BLAS, either it's vendor-based or its own internal blas. This might cause some conflict when ILP64 BLAS is used in other place of scipy. @xiaoyeli

@rgommers
Copy link
Member

BTW, superlu always use 32-bit BLAS, either it's vendor-based or its own internal blas. This might cause some conflict when ILP64 BLAS is used in other place of scipy. @xiaoyeli

32-bit BLAS will always be available for internal use, since the exported scipy.linalg.cython_blas and cython_lapack APIs are 32-bit and cannot change.

@liuyangzhuan
Copy link
Author

BTW, superlu always use 32-bit BLAS, either it's vendor-based or its own internal blas. This might cause some conflict when ILP64 BLAS is used in other place of scipy. @xiaoyeli

32-bit BLAS will always be available for internal use, since the exported scipy.linalg.cython_blas and cython_lapack APIs are 32-bit and cannot change.

Cool. Any idea about the bridging C files?

@rgommers
Copy link
Member

Any idea about the bridging C files?

What I'd suggest trying is to indeed build those twice too. So you produce _superlu.so and _superlu_64.so, and the corresponding static libraries. There may be a few things in those two files that may need a symbol suffix, e.g.:

#define PY_ARRAY_UNIQUE_SYMBOL _scipy_sparse_superlu_ARRAY_API

if you do that by passing a compile argument, I hope you can build everything twice without conflicts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C/C++ Items related to the internal C/C++ code base scipy.sparse.linalg scipy.sparse
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants