
ENH, BLD: use wheels to get BLAS implementations #24857

Open
mattip opened this issue Oct 4, 2023 · 9 comments

@mattip
Member

mattip commented Oct 4, 2023

In PR #24839, we began using scipy-openblas wheels in the CI build/test cycles. In order to use these in wheel building as well, we need to declare a dependency, so that pip install numpy in all its variations (build a wheel, build an sdist, install a wheel, install from an sdist) will

  • install scipy-openblas before compiling
  • ship a modified _distributor_init.py (or add a _distributor_init_local.py) to import scipy-openblas before numpy
  • find the scipy-openblas.pc file when building
  • declare a dependency in the metadata of the numpy wheel/sdist

This is tricky (taken from the short discussion starting in this comment):

  • There are two variants: scipy-openblas32 and scipy-openblas64. Which should be the canonical dependency? At least for NumPy this seems easy: always use the ILP64 package.
  • How do we allow changing that choice to a different variant (if not the 32/64 question, then say MKL, Accelerate, or no BLAS at all for platforms where OpenBLAS is not appropriate)?

We cannot currently have different metadata in the sdist than we have in the wheel on PyPI. There are some ideas in the draft PEP 725. Or perhaps we could have a meta-package scipy-openblasX?
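As a concrete illustration of the second bullet above, here is a minimal sketch of what a `_distributor_init_local.py` could look like. It assumes the scipy-openblas64 wheel installs a `scipy_openblas64` module with a `get_lib_dir()` helper; that helper name is an assumption for illustration, not a confirmed API.

```python
# _distributor_init_local.py -- illustrative sketch only.
# Assumes the scipy-openblas64 wheel installs a ``scipy_openblas64`` module
# and that it exposes a ``get_lib_dir()`` helper (an assumption here).
import os
import sys

try:
    import scipy_openblas64
except ImportError:
    # No scipy-openblas64 wheel installed; fall back to whatever BLAS the
    # extension modules were linked against.
    pass
else:
    if sys.platform == "win32":
        # Make the bundled OpenBLAS DLL findable before numpy's compiled
        # extension modules are imported.
        os.add_dll_directory(scipy_openblas64.get_lib_dir())
```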

@charris
Member

charris commented Oct 4, 2023

always use the ILP64 package

Except on 32 bit platforms.

@ogrisel
Contributor

ogrisel commented Oct 9, 2023

Or perhaps we could have a meta-package scipy-openblasX?

From a naming point of view, scipy-openblasX should probably be named scipy-blas. Ideally, it could implement a mechanism similar to the "alternatives" system for BLAS in Debian/Ubuntu or the virtual blas package in conda-forge.

always use the ILP64 package

Except on 32 bit platforms.

In particular for WASM / Pyodide-based deployments.
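To make the "alternatives" idea concrete, a hypothetical scipy-blas dispatch module could simply hand back whichever provider is installed. Everything below is invented for illustration except the scipy_openblas32/64 module names:

```python
# Hypothetical sketch of a ``scipy_blas`` dispatch module; the preference
# order mimics Debian's "alternatives" / conda-forge's virtual blas package.
# Only scipy_openblas32/64 exist today; the other names are placeholders.
import importlib

_BACKENDS = (
    "scipy_openblas64",   # default: ILP64 OpenBLAS
    "scipy_openblas32",   # LP64 OpenBLAS (e.g. for 32-bit platforms)
    "scipy_mkl",          # placeholder
    "scipy_accelerate",   # placeholder
)


def load_backend():
    """Return the first importable BLAS provider module, or None."""
    for name in _BACKENDS:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None
```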

@rgommers
Member

@ogrisel that's an interesting idea. It could be added on top later. Given that name, I think it'd be an extra level of abstraction, one that would only be needed if you wanted to make OpenBLAS swappable at runtime with another library. At that point I think we should build against FlexiBLAS though, which is already designed to do this and to "forward" calls to routines to the relevant implementation.

For now I think I'd be happy with a plain scipy-openblas that would only serve to get around the "need to declare a dependency" issue.

In particular for WASM / Pyodide-based deployments.

Those don't live on PyPI though - at least not yet. Right now, we have no 32-bit BLAS on PyPI at all; we still have win32 wheels but they're built without BLAS support.

@ogrisel
Contributor

ogrisel commented Oct 12, 2023

I did not know about FlexiBLAS, but it sounds promising. It might also allow switching between scipy-openblas-pthreads and scipy-openblas-openmp, for instance.
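For what it's worth, FlexiBLAS selects its backend via the FLEXIBLAS environment variable (or its configuration files), so switching could look roughly like this. The backend name used below is a placeholder and must match whatever is registered in the local FlexiBLAS configuration:

```python
# Sketch: pick a FlexiBLAS backend before any BLAS-using extension loads.
# "OPENBLAS-OPENMP" is a placeholder name; it must match a backend registered
# in the flexiblasrc configuration on this machine.
import os

os.environ["FLEXIBLAS"] = "OPENBLAS-OPENMP"

# Import numpy only after the variable is set; a numpy built against
# FlexiBLAS will then route its BLAS calls through the selected backend.
import numpy as np

np.dot(np.ones((2, 2)), np.ones((2, 2)))
```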

@rgommers
Member

Yes, FlexiBLAS indeed seems quite nice. The two potential issues with it at the moment:

  • It doesn't support the 64_ symbol suffix for OpenBLAS, which we're using in our OpenBLAS builds
  • threadpoolctl may not be able to support it, because using its introspection API makes the GPL apply (see "Add support for Flexiblas", scipy/scipy#17362)
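For reference, this is what threadpoolctl's existing introspection reports for directly-loaded BLAS libraries (`threadpool_info()` is a real API; the fields noted in the comments are typical but vary by library and version). Doing the equivalent through FlexiBLAS is what would require its introspection API:

```python
# Inspect the BLAS/OpenMP libraries that threadpoolctl can find in the
# current process. The exact dictionary keys vary with the library found.
from threadpoolctl import threadpool_info

for lib in threadpool_info():
    # Typical keys: "user_api", "internal_api" (e.g. "openblas"),
    # "filepath", "version", "num_threads".
    print(lib.get("internal_api"), lib.get("filepath"), lib.get("num_threads"))
```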

@ogrisel
Contributor

ogrisel commented Dec 11, 2023

threadpoolctl may not be able to support it, because using its introspection API makes the GPL apply (see scipy/scipy#17362)

This should not be a blocker (if the owners consider switching the license to LGPL, as discussed in the scipy issue). If numpy/scipy want to rely on FlexiBLAS, we will add support for FlexiBLAS to threadpoolctl. I opened an issue so we don't forget about this.

Furthermore, the latest version of threadpoolctl is pluggable, so numpy or scipy can already start prototyping without having to wait for an official release of threadpoolctl with FlexiBLAS support.
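Very roughly, and only as a sketch: a prototype controller for FlexiBLAS might use the plugin hooks (`LibController` and `register`, added in threadpoolctl 3.x) along these lines. The class attributes, the `self.dynlib` handle, and the `flexiblas_*` entry points are assumptions here and would need to be checked against the real threadpoolctl and FlexiBLAS interfaces:

```python
# Sketch of a threadpoolctl plugin for FlexiBLAS. ``LibController`` and
# ``register`` are the plugin hooks added in threadpoolctl 3.x; the class
# attributes, the ``self.dynlib`` handle, and the ``flexiblas_*`` symbols are
# assumptions and need to be verified against the real interfaces.
from threadpoolctl import LibController, register


class FlexiBLASController(LibController):
    user_api = "blas"
    internal_api = "flexiblas"
    filename_prefixes = ("libflexiblas",)

    def get_num_threads(self):
        # Assumed FlexiBLAS entry point.
        return self.dynlib.flexiblas_get_num_threads()

    def set_num_threads(self, num_threads):
        # Assumed FlexiBLAS entry point.
        self.dynlib.flexiblas_set_num_threads(num_threads)

    def get_version(self):
        return None  # omitted from this sketch


register(FlexiBLASController)
```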

For the 64_ symbol suffix, I have no idea about the complexity of handling it. What are the pros and cons of the 64_ symbols compared to the standard BLAS symbols?

@rgommers
Member

Thanks for the update @ogrisel. I'll note that NumPy now has the option of building against FlexiBLAS, and will try to find it as one of the supported libraries. SciPy doesn't support it yet. It'll be a while before we can consider using it in wheels, due to the suffix issue below.

For the 64_ symbol suffix, I have no idea about the complexity of handling it. What are the pros and cons of the 64_ symbols compared to the standard BLAS symbols?

Oh we definitely want symbol suffixes, because in a stack with numpy and scipy we use a mix of LP64 and ILP64 APIs. There's no good control over that with wheels, and without suffixes it's too easy to get symbol clashes.
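To illustrate the clash risk: with a suffix, the LP64 `dgemm_` and the ILP64 `dgemm_64_` are distinct symbols and can coexist in one process; without it, both ABIs would export the same names. A quick way to check which flavor a given build exports (the library path below is a placeholder):

```python
# Check which BLAS symbol flavors a shared library exports.
# The path below is a placeholder; point it at an actual OpenBLAS build.
import ctypes

lib = ctypes.CDLL("/path/to/libopenblas64_.so")

for symbol in ("dgemm_", "dgemm_64_"):
    print(symbol, "present" if hasattr(lib, symbol) else "absent")
```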

@ogrisel
Contributor

ogrisel commented Apr 9, 2024

Oh we definitely want symbol suffixes, because in a stack with numpy and scipy we use a mix of LP64 and ILP64 APIs. There's no good control over that with wheels, and without suffixes it's too easy to get symbol clashes.

I agree, but can we hope for ecosystem-wide recommendations to adopt one over the other in the long run? That would avoid linking to two different OpenBLAS copies in a typical scipy runtime environment. It would help reduce the size of deployed Python apps (particularly important on edge or WASM deployments, and more generally for people with poor network connections). It would also reduce debugging complexity when two OpenBLAS copies interact in the same Python program (I can imagine spin-waiting threads causing performance problems when successively calling one OpenBLAS implementation and then the other in a tight loop).

@rgommers
Member

A couple of thoughts there:

  • Long-term I think we want ILP64
  • However, ILP64 support is currently fairly immature all around compared to LP64 (in both BLAS libraries and in downstream code; e.g. not all of the external libraries that SciPy vendors, like SuperLU, fully support it)
  • There is also the problem that SciPy re-exports the LP64 API through scipy.linalg.cython_blas, so that will always be needed
  • The NumPy decision to start shipping ILP64 by default in 1.22.0 (IIRC that was the version) was quite premature and will make it a long-term endeavor to use only a single BLAS library in wheels.
    • It's hard to go back now though, because it's hard to know who is relying on that ILP64 support for dealing with large matrices
    • I'm not sure it's actually impossible; I could be convinced otherwise, I think. As you point out, a smaller binary size for the deployed stack is an important argument, probably significantly more important than supporting the "doesn't fit in 32-bit integers" crowd (it's anyway not possible to write code that is portable to conda-forge and Linux distros for such corner cases). So we may consider it if and when we can ship wheels that can rely on scipy-openblas32/64
  • To end up with a single shared library, what we need is for that library to support both LP64 and ILP64 symbols. MKL and Accelerate already do this; reference BLAS has now settled on the 64_ symbol suffix, so it should be able to start shipping a single library in the future. Most importantly, we need an OpenBLAS build with both sets of symbols. That's still going to take a while.
  • WASM is an easier case, because it's 32-bit: we should definitely use LP64 only there
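For context on the "large matrices" point above: in the LP64 interface every dimension and length argument is a 32-bit integer, so e.g. a dot product over a vector longer than 2**31 - 1 elements cannot even be expressed through that API. A quick back-of-the-envelope check of where that threshold sits:

```python
# Back-of-the-envelope: when does an LP64 (32-bit integer) BLAS interface
# run out? The length argument of e.g. ddot is a BLAS integer, so a vector
# longer than INT32_MAX cannot be described through the LP64 API.
INT32_MAX = 2**31 - 1        # 2,147,483,647

n = INT32_MAX + 1            # first length that no longer fits
print(n)                     # 2147483648
print(n * 8 / 2**30)         # ~16.0 GiB of float64 data for a single vector
```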
