Package OpenBLAS and use OpenBLAS in scipy (#3331)
lesteve committed Apr 12, 2023
1 parent 862163e commit 7193109
Showing 41 changed files with 444 additions and 659 deletions.
2 changes: 1 addition & 1 deletion .prettierignore
@@ -9,4 +9,4 @@ cpython
.vscode
.pytest_cache
.clang-format
-packages/CLAPACK/make.inc
+packages/libf2c/make.inc
1 change: 0 additions & 1 deletion Makefile.envs
@@ -93,7 +93,6 @@ export LDFLAGS_BASE=\
$(DBGFLAGS) \
$(DBG_LDFLAGS) \
-s MODULARIZE=1 \
-std=c++14 \
-s LZ4=1 \
-L $(CPYTHONROOT)/installs/python-$(PYVERSION)/lib/ \
-s WASM_BIGINT \
3 changes: 2 additions & 1 deletion docs/development/debugging.md
@@ -141,7 +141,8 @@ callee because compiling with `-g3` increases the number of function pointers so
the function pointer we are calling is in a different spot. I know of no way to
determine the bad function pointer when compiling with `-g3`.

-Sometimes (particularly with Scipy/CLAPACK) the issue will be a mismatch between
+Sometimes (particularly with Scipy/OpenBLAS/libf2c) the issue will be a
+mismatch between
`(param i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32) (result i32)` and
`(param i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32) (result i32)`
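
The two signatures above differ only in parameter count (14 versus 15 `i32` values), which is easy to miss by eye. A minimal Python sketch, not part of Pyodide, that parses such text-format signatures and reports where they diverge; `parse_wasm_sig` and `describe_mismatch` are hypothetical helper names, and the parser assumes the single `(param ...) (result ...)` form shown above:

```python
import re


def parse_wasm_sig(sig: str) -> tuple[list[str], list[str]]:
    """Split a text-format signature into (param types, result types)."""
    params = re.search(r"\(param([^)]*)\)", sig)
    results = re.search(r"\(result([^)]*)\)", sig)
    return (
        params.group(1).split() if params else [],
        results.group(1).split() if results else [],
    )


def describe_mismatch(expected: str, actual: str) -> str:
    """Return a one-line description of how two signatures differ."""
    exp_params, exp_results = parse_wasm_sig(expected)
    act_params, act_results = parse_wasm_sig(actual)
    if len(exp_params) != len(act_params):
        return f"param count differs: {len(exp_params)} vs {len(act_params)}"
    if exp_params != act_params:
        return "param types differ"
    if exp_results != act_results:
        return f"result differs: {exp_results} vs {act_results}"
    return "signatures match"


# Rebuild the two signatures from the text above (hypothetical variable names).
sig_a = "(param " + " ".join(["i32"] * 14) + ") (result i32)"
sig_b = "(param " + " ".join(["i32"] * 15) + ") (result i32)"
print(describe_mismatch(sig_a, sig_b))
```

Pasting the expected and actual signatures from a trap message into such a helper makes an off-by-one argument, typical of f2c-translated Fortran routines, stand out immediately.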

4 changes: 2 additions & 2 deletions docs/development/meta-yaml.md
@@ -164,8 +164,8 @@ Files or folders in this folder will be packaged to make the Pyodide package.

See the [zlib
meta.yaml](https://github.com/pyodide/pyodide/blob/main/packages/zlib/meta.yaml)
-for an example of a static library specification, and the [CLAPACK
-meta.yaml](https://github.com/pyodide/pyodide/blob/main/packages/CLAPACK/meta.yaml)
+for an example of a static library specification, and the [OpenBLAS
+meta.yaml](https://github.com/pyodide/pyodide/blob/main/packages/openblas/meta.yaml)
for an example of a shared library specification.

### `build/script`
2 changes: 1 addition & 1 deletion docs/development/new-packages.md
@@ -348,7 +348,7 @@ as a starting point.

After packaging a C library, it can be added as a dependency of a Python package
like a normal dependency. See `lxml` and `libxml` for an example (and also
-`scipy` and `CLAPACK`).
+`scipy` and `OpenBLAS`).

_Remark:_ Certain C libraries come as emscripten ports, and do not have to be
built manually. They can be used by adding e.g. `-s USE_ZLIB` in the `cflags` of
2 changes: 2 additions & 0 deletions docs/project/changelog.md
@@ -41,6 +41,8 @@ myst:

### Packages

+- OpenBLAS has been added and scipy now uses OpenBLAS rather than CLAPACK
+  {pr}`3331`.
- New packages: sourmash {pr}`3635`, screed {pr}`3635`, bitstring {pr}`3635`,
deprecation {pr}`3635`, cachetools {pr}`3635`.
- Upgraded libmpfr to 4.2.0 {pr}`3756`.
42 changes: 0 additions & 42 deletions packages/CLAPACK/patches/0001-add-missing-import.patch

This file was deleted.

31 changes: 0 additions & 31 deletions packages/CLAPACK/patches/0003-lapack-install-make.patch

This file was deleted.

52 changes: 0 additions & 52 deletions packages/CLAPACK/patches/0007-Fix-xerbla-and-ilaenv.patch

This file was deleted.

9 changes: 9 additions & 0 deletions packages/gensim/meta.yaml
@@ -6,12 +6,21 @@ package:
source:
url: https://files.pythonhosted.org/packages/9a/3c/dd4351a2ef3a8fb19e26d6ccb928823fea53375de9d28b221f8cf0e53c8e/gensim-4.3.1.tar.gz
sha256: 8b5f11c0e6a5308086b48e8f6841223a4fa1a37d513684612b7ee854b533015f
patches:
- patches/0001-Avoid-signature-mismatch-in-sdot-detection.patch

requirements:
run:
- numpy
- scipy
- six
- smart_open
build:
script: |
# gensim apparently builds from .c files so need to cythonize the .pyx after
# patching
cython gensim/models/word2vec_inner.pyx
about:
home: http://radimrehurek.com/gensim
PyPI: https://pypi.org/project/gensim
45 changes: 45 additions & 0 deletions packages/gensim/patches/0001-Avoid-signature-mismatch-in-sdot-detection.patch
@@ -0,0 +1,45 @@
From 2c816f54d3a6b056f42b97ad646789e9fe31a670 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Lo=C3=AFc=20Est=C3=A8ve?= <loic.esteve@ymail.com>
Date: Thu, 6 Apr 2023 17:52:34 +0200
Subject: [PATCH] Avoid signature mismatch in sdot detection.

In Pyodide, OpenBLAS sdot returns float so use it rather than trying
to adapt the somewhat tricky gensim logic.
---
gensim/models/word2vec_inner.pyx | 20 +++-----------------
1 file changed, 3 insertions(+), 17 deletions(-)

diff --git a/gensim/models/word2vec_inner.pyx b/gensim/models/word2vec_inner.pyx
index 1c0807ee..3d4a6847 100755
--- a/gensim/models/word2vec_inner.pyx
+++ b/gensim/models/word2vec_inner.pyx
@@ -939,23 +939,9 @@ def init():
EXP_TABLE[i] = <REAL_t>(EXP_TABLE[i] / (EXP_TABLE[i] + 1))
LOG_TABLE[i] = <REAL_t>log( EXP_TABLE[i] )

- # check whether sdot returns double or float
- d_res = dsdot(&size, x, &ONE, y, &ONE)
- p_res = <float *>&d_res
- if abs(d_res - expected) < 0.0001:
- our_dot = our_dot_double
- our_saxpy = saxpy
- return 0 # double
- elif abs(p_res[0] - expected) < 0.0001:
- our_dot = our_dot_float
- our_saxpy = saxpy
- return 1 # float
- else:
- # neither => use cython loops, no BLAS
- # actually, the BLAS is so messed up we'll probably have segfaulted above and never even reach here
- our_dot = our_dot_noblas
- our_saxpy = our_saxpy_noblas
- return 2
+ our_dot = our_dot_float
+ our_saxpy = saxpy
+ return 1 # float

FAST_VERSION = init() # initialize the module
MAX_WORDS_IN_BATCH = MAX_SENTENCE_LEN
--
2.34.1
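
For context, the detection logic removed by this patch called the routine as if it returned a `double` and then reinterpreted the first four bytes of that result as a `float`. On WebAssembly an indirect call through a mismatched signature traps instead of returning garbage, so the probe itself is unsafe there. The byte-reinterpretation idea can be modelled in plain Python as a rough sketch; `classify_sdot_result` is a hypothetical name, little-endian layout is assumed, and nothing here calls BLAS:

```python
import struct


def classify_sdot_result(raw: bytes, expected: float, tol: float = 1e-4) -> int:
    """Model the removed check: 0 = double, 1 = float, 2 = neither."""
    d_res = struct.unpack("<d", raw)[0]      # all 8 bytes read as a double
    f_res = struct.unpack("<f", raw[:4])[0]  # first 4 bytes read as a float
    if abs(d_res - expected) < tol:
        return 0
    if abs(f_res - expected) < tol:
        return 1
    return 2


expected = 2.5
# What the 8-byte return slot would hold if the routine returned a double...
as_double = struct.pack("<d", expected)
# ...or a float (upper 4 bytes zero-filled here purely for illustration).
as_float = struct.pack("<f", expected) + b"\x00" * 4

print(classify_sdot_result(as_double, expected))
print(classify_sdot_result(as_float, expected))
```

The patch sidesteps the probe by hard-coding the float branch, which is valid only because the OpenBLAS `sdot` built for Pyodide is known to return `float`.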

File renamed without changes.
32 changes: 18 additions & 14 deletions packages/CLAPACK/meta.yaml → packages/libf2c/meta.yaml
@@ -1,24 +1,29 @@
+# We still download the full CLAPACK but we are only using the libf2c part of CLAPACK.
+# libf2c part is needed for the f2ced Fortran files in scipy for example to
+# define things like pow_dd, i_len, etc...
+#
+# Note f2clib package only creates f2clib.a, and f2clib.a symbols are added to
+# libopenblas.so in the OpenBLAS meta.yaml.
package:
- name: CLAPACK
- version: 3.2.1
+ name: libf2c
+ version: CLAPACK-3.2.1

source:
sha256: 6dc4c382164beec8aaed8fd2acc36ad24232c406eda6db462bd4c41d5e455fac
url: http://www.netlib.org/clapack/clapack.tgz
extract_dir: CLAPACK-3.2.1
patches:
- - patches/0001-add-missing-import.patch
- - patches/0002-fix-arith.h.patch
- - patches/0003-lapack-install-make.patch
- - patches/0004-fix-f2clibs-build.patch
- - patches/0005-remove-redundant-symbols.patch
- - patches/0006-correct-return-types.patch
- - patches/0007-Fix-xerbla-and-ilaenv.patch
+ - patches/0001-fix-arith.h.patch
+ - patches/0002-fix-f2clibs-build.patch
+ - patches/0003-remove-redundant-symbols.patch
+ - patches/0004-correct-return-types.patch
+ - patches/0005-Remove-symbols-defined-in-OpenBLAS.patch

extras:
- [make.inc, make.inc]

build:
- type: shared_library
+ type: static_library
script: |
# The archive's contents have default permission 0750. If we use docker
# to build, then we will not own the contents in the host, which means
@@ -32,8 +37,7 @@ build:
sed -i 's/^ ar /^ $(ARCH)/' **/Makefile
sed -i 's/^ ld /^ $(LD)/' **/Makefile
- emmake make -j ${PYODIDE_JOBS:-3} blaslib lapacklib
- emcc blas_WA.a lapack_WA.a F2CLIBS/libf2c.a ${SIDE_MODULE_LDFLAGS} -o ${DISTDIR}/clapack_all.so
+ emmake make -j ${PYODIDE_JOBS:-3} f2clib
mkdir -p ${WASM_LIBRARY_DIR}/{lib,include}
- cp -r INCLUDE/* ${WASM_LIBRARY_DIR}/include
- cp ${DISTDIR}/clapack_all.so ${WASM_LIBRARY_DIR}/lib
+ cp INCLUDE/f2c.h ${WASM_LIBRARY_DIR}/include
+ cp F2CLIBS/libf2c.a ${WASM_LIBRARY_DIR}/lib
2 changes: 1 addition & 1 deletion packages/CLAPACK/patches/0002-fix-arith.h.patch → packages/libf2c/patches/0001-fix-arith.h.patch
@@ -1,7 +1,7 @@
From 01990867ee7a641078505efba367a413a97f7802 Mon Sep 17 00:00:00 2001
From: Michael Droettboom <mdboom@gmail.com>
Date: Fri, 18 Mar 2022 19:59:25 -0700
-Subject: [PATCH 2/7] fix arith.h
+Subject: [PATCH 1/5] fix arith.h

arith.h is a file generated at build time by compiling and running a C program.
Since we use emscripten to build throughout, the C program becomes a wasm file
2 changes: 1 addition & 1 deletion packages/CLAPACK/patches/0004-fix-f2clibs-build.patch → packages/libf2c/patches/0002-fix-f2clibs-build.patch
@@ -1,7 +1,7 @@
From d88133066f9f6312145c1186116fdb6446d3f7a5 Mon Sep 17 00:00:00 2001
From: Michael Droettboom <mdboom@gmail.com>
Date: Fri, 18 Mar 2022 20:00:51 -0700
-Subject: [PATCH 4/7] fix f2clibs build
+Subject: [PATCH 2/5] fix f2clibs build

emscripten produces LLVM bitcode here, not genuine object files, so it doesn't
make sense to strip symbols.
2 changes: 1 addition & 1 deletion packages/CLAPACK/patches/0005-remove-redundant-symbols.patch → packages/libf2c/patches/0003-remove-redundant-symbols.patch
@@ -1,7 +1,7 @@
From 78ff0cec961d9eb4e94193995fe151e1ecdae9df Mon Sep 17 00:00:00 2001
From: Roman Yurchak <rth.yurchak@gmail.com>
Date: Fri, 18 Mar 2022 20:01:39 -0700
-Subject: [PATCH 5/7] remove redundant symbols
+Subject: [PATCH 3/5] remove redundant symbols

Remove a few symbols from LAPACK that are redundantly defined with BLAS or are
ported in scipy. It wouldn't be an issue if we were linking dynamically, but
2 changes: 1 addition & 1 deletion packages/CLAPACK/patches/0006-correct-return-types.patch → packages/libf2c/patches/0004-correct-return-types.patch
@@ -1,7 +1,7 @@
From 572a3e20ba040b4f29bbef97a9db6658c10077d3 Mon Sep 17 00:00:00 2001
From: Joe Marshall <joe.marshall@nottingham.ac.uk>
Date: Fri, 18 Mar 2022 20:02:42 -0700
-Subject: [PATCH 6/7] correct return types
+Subject: [PATCH 4/5] correct return types

Make return types to fortran subroutines consistently be int. Some functions are defined within clapack as variously
void and int return. Normal C compilers don't care, but emscripten is strict about return values.
27 changes: 27 additions & 0 deletions packages/libf2c/patches/0005-Remove-symbols-defined-in-OpenBLAS.patch
@@ -0,0 +1,27 @@
From eaf5c5db6e956036869255cb51831e720474d01d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Lo=C3=AFc=20Est=C3=A8ve?= <loic.esteve@ymail.com>
Date: Fri, 7 Apr 2023 15:20:18 +0200
Subject: [PATCH 5/5] Remove symbols defined in OpenBLAS

---
F2CLIBS/libf2c/Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/F2CLIBS/libf2c/Makefile b/F2CLIBS/libf2c/Makefile
index 57eff0d..136050f 100644
--- a/F2CLIBS/libf2c/Makefile
+++ b/F2CLIBS/libf2c/Makefile
@@ -31,8 +31,8 @@ MISC = f77vers.o i77vers.o main.o s_rnge.o abort_.o exit_.o getarg_.o iargc_.o\
getenv_.o signal_.o s_stop.o s_paus.o system_.o cabs.o ctype.o\
derf_.o derfc_.o erf_.o erfc_.o sig_die.o uninit.o
POW = pow_ci.o pow_dd.o pow_di.o pow_hh.o pow_ii.o pow_ri.o pow_zi.o pow_zz.o
-CX = c_abs.o c_cos.o c_div.o c_exp.o c_log.o c_sin.o c_sqrt.o
-DCX = z_abs.o z_cos.o z_div.o z_exp.o z_log.o z_sin.o z_sqrt.o
+CX = c_cos.o c_div.o c_exp.o c_log.o c_sin.o c_sqrt.o
+DCX = z_cos.o z_div.o z_exp.o z_log.o z_sin.o z_sqrt.o
REAL = r_abs.o r_acos.o r_asin.o r_atan.o r_atn2.o r_cnjg.o r_cos.o\
r_cosh.o r_dim.o r_exp.o r_imag.o r_int.o\
r_lg10.o r_log.o r_mod.o r_nint.o r_sign.o\
--
2.34.1
