openblas: add initial package #15685

Merged
merged 1 commit into from
Jul 27, 2021

Contributor

@commodo commodo commented May 24, 2021

Maintainer: me
Compile tested: x86 openwrt/openwrt@ddcb970
Run tested: x86 openwrt/openwrt@ddcb970


Initial draft PR is:
#11894

This one is a bit more complete, and follows packaging practices.

For now, disabling builds on ARC and PowerPC. Will require more work to get
them going.
Explicitly disabling OpenMP support, so that it doesn't get picked by
accident.

Later we may use the CPU_TYPE parameter to tweak things a little further.

Signed-off-by: Alexandru Ardelean <ardeleanalex@gmail.com>

@commodo
Contributor Author

commodo commented May 24, 2021

@bhack
i got to this point with the library;

will try to add support for this in numpy

@commodo
Contributor Author

commodo commented May 24, 2021

looks like i have a few things to iron out on this;

@brada4

brada4 commented May 24, 2021

It should not link against 300kB GOMP, at least not on compact embedded architectures.

@brada4

brada4 commented May 24, 2021

s/ATHLON/GENERIC/g
ATHLON introduces 3DNow instructions in the GEMM path, which will lead to SIGILL later.
s/ARMv7/ARMv5/g
Same

POWER4 is a 64bit CPU, so that probably needs to be omitted. 32bit PPC (Apple G4) is long unmaintained; it might need some tuning to accommodate a configuration with only generic C. That should also help support other less mainstream arches.

@martin-frbg

martin-frbg commented May 24, 2021

AFAICT, your PPC builds currently fail as they are missing the OpenMP headers, and the x86_64 build struggles to find a matching SGEMM kernel as the "TARGET=ATHLON" you gave it is strictly 32bit (suggest using TARGET=OPTERON or CORE2 if you want a tiny bit more performance than the safest option, TARGET=GENERIC, while avoiding the size penalty of DYNAMIC_ARCH).

@commodo
Contributor Author

commodo commented May 24, 2021

wow, that's a lot of useful feedback :)
many thanks :)

so, it may take me a bit to re-spin things;
but at this point, it definitely won't be too long before we finish this;
i am hoping this week;

obviously, this will get tweaked even after being merged;

if anyone feels like they want to be a co-maintainer on this, i am happy to add/list them;
i mostly try to add co-maintainers, so that packages have a smaller risk of being abandoned if i somehow cannot take care of them;

@brada4

brada4 commented May 24, 2021

You can also build dynamic lib on x86_64 with few cores present in hardware + generic. Further down the road, though.
https://openwrt.org/toh/views/toh_fwdownload?dataflt%5B0%5D=supported+current+rel_%3D19.07.7&dataflt%5BTarget_target*%7E%5D=x86

@commodo commodo force-pushed the openblas branch 2 times, most recently from 0b76a2c to 7685dcf Compare May 25, 2021 13:17
@commodo
Contributor Author

commodo commented May 25, 2021

Changelog v1 -> v2:

  • dropped Config.in file; converted OPENBLAS_TARGET to OPENBLAS_TARGET_OVERRIDE Kconfig option
    • reason: in OpenWrt it's quite common to switch between boards during development (which means another ARCH), so having defaults per arch becomes annoying: if you change from ATH79 (MIPS24K) to x86, the CPU target still remains MIPS24K; this is a design limitation of Kconfig in general; the best option is to have the user manually type in a CPU target, and if the user doesn't specify anything, the defaults are selected automatically
  • for x86_64 ; changed s/ATHLON/GENERIC/g
  • for x86 ; changed s/NORTHWOOD/GENERIC/g
  • for arm ; changed s/ARMv7/ARMv5/g
  • added USE_OPENMP=0 to build
  • disabling PowerPC builds; using USE_OPENMP=0 helps with the 8540 target, but doesn't work with 464FP; will require more work
  • added BINARY make flag ; i initially forgot about it
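
The override-then-default selection described in the first bullet can be sketched roughly as follows. This is only an illustration in Python, not the package's Makefile logic: the function name `pick_openblas_target` and the `DEFAULT_TARGETS` table are hypothetical, only the three defaults named in this changelog are filled in, and the upper-case spelling `ARMV5` is an assumption about how the target is written.

```python
from typing import Optional

# Defaults named in the v2 changelog; other arches are deliberately absent.
# The table and its keys are illustrative, not taken from the package.
DEFAULT_TARGETS = {
    "x86_64": "GENERIC",
    "i386": "GENERIC",
    "arm": "ARMV5",
}


def pick_openblas_target(arch: str, override: str = "") -> Optional[str]:
    """Return the user's override if one was typed in, else the arch default."""
    override = override.strip()
    if override:
        return override
    # Computed from the current arch every time, so nothing stale is stored.
    return DEFAULT_TARGETS.get(arch)


if __name__ == "__main__":
    print(pick_openblas_target("x86_64"))           # default for the arch
    print(pick_openblas_target("arm", "CORTEXA9"))  # explicit override wins
    print(pick_openblas_target("powerpc"))          # no default known here
```

Because the default is derived from the arch at the moment it is needed, switching from one board to another does not leave the previous board's target behind, which is the problem the changelog attributes to per-arch Kconfig defaults.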

@commodo
Contributor Author

commodo commented May 25, 2021

so, we may tweak things by using the CPU_TYPE parameter
for plenty of ARM/ARM64 targets, we can bump up the optimization level by checking if CPU_TYPE == cortex-a9 [for example];

this will be future-work;
but i think this [what we have now] should be good enough to get started;
as there will be more feedback, we can find other issues to tweak;
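
A possible shape for that future CPU_TYPE tweak, sketched in Python purely for illustration. Everything here is an assumption rather than something in the PR: the helper name `refine_target`, the contents of the lookup table, and the idea that the CPU type string may carry an optional `+suffix` that should be ignored.

```python
# Hypothetical lookup from a CPU_TYPE string to a more specific target name.
# Only cortex-a9 is mentioned in the thread; the second entry is an example.
CPU_TYPE_TARGETS = {
    "cortex-a9": "CORTEXA9",
    "cortex-a15": "CORTEXA15",
}


def refine_target(cpu_type: str, arch_default: str) -> str:
    """Use a CPU-specific target when one is known, else keep the default."""
    # Drop an optional "+suffix" (assumed spelling) and normalise the case.
    base = cpu_type.split("+", 1)[0].strip().lower()
    return CPU_TYPE_TARGETS.get(base, arch_default)


if __name__ == "__main__":
    print(refine_target("cortex-a9", "ARMV5"))
    print(refine_target("arm926ej-s", "ARMV5"))
```

Falling back to the arch default keeps the conservative behaviour for every CPU type that has not been checked explicitly.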

@commodo
Contributor Author

commodo commented May 25, 2021

> It should not link against 300kB GOMP, at least not on compact embedded architectures.

i may have missed the point of this comment;
do i need to do anything about GOMP?

@brada4

brada4 commented May 25, 2021

ARM targets:
I think there are subarches, like i386 and Pentium4, and you can have a specific package for each. OpenBLAS supports dynamic detection on ARM and x86 (with a huge fat library). Let's leave that for later. At present you need a solid base to work on NumPy, I suppose.

OpenMP:

  • The USE_OPENMP parameter is buggy: its mere presence allows OpenMP use in some places. Just omit this parameter for now to keep OpenMP disabled for good.

  • If you intend to use multi-process building, then add the build parameter MAKE_NB_JOBS=-1 (something negative) to inherit -j ??? from make rather than detecting it internally.

  • It will use the CPU count of the build system, not of the target. You need to add NUM_THREADS=2 or something similar; two is best for testing, since it will highlight all race issues easily while keeping embedded devices cool.

It is all written in the Makefile.rule file. Feel free to remind if you find something confusing there. Have fun.

@brada4

brada4 commented May 25, 2021

You need to tweak COMMON_OPT="" and maybe CCOMMON_OPT=$YOUR_CFLAGS to avoid overriding the intended -Os with -O2; otherwise it at least looks warning-free. Note that a cross-build does not run functional tests, so it might be worth adding an extra native amd64 build to validate at least once.
EDIT: i.e. CROSS=1 HOSTCC=cc

@commodo
Contributor Author

commodo commented May 28, 2021

> You need to tweak COMMON_OPT="" and maybe CCOMMON_OPT=$YOUR_CFLAGS to avoid overriding intended -Os with -O2 , otherwise at least looks warning-free.

many thanks for the COMMON_OPT & CCOMMON_OPT hints;
will adjust those;
i'm seeing that there are many -DNO_AVX and other such options; these go away if I pass CCOMMON_OPT=$(TARGET_CFLAGS)
looking briefly, it seems like there should be some runtime detection of these options;
is that the case?

> Note that cross-build does not run functional tests, it might be worth adding extra native amd64 build to validate at least once.

in general in OpenWrt packaging we don't run tests;
mostly because it's an effort to do and maintain (because of all the cross-building that's going on);
it requires a better integration between packaging and CI (which is not quite there yet);

usually these tests are more in the responsibility of the package (itself)
and many packages just disable them;
finally, it's a load on the OpenWrt CI, which runs quite a lot of builds for quite a lot of SoCs

> EDIT: ie CROSS=1 HOSTCC=cc

Should I remove these?
These seemed to help the cross-build.

I'm seeing that CCOMMON_OPT=$(TARGET_CFLAGS) fails to build; on x86 I get the errors below.
Just setting COMMON_OPT="" is enough to let the optimization level be overridden with OpenWrt's default -Os.

syr_thread.c: In function 'CNAME':
syr_thread.c:151:3: error: unknown type name 'blas_queue_t'; did you mean 'blas_arg_t'?
   blas_queue_t queue[MAX_CPU_NUMBER];
   ^~~~~~~~~~~~
   blas_arg_t
syr_thread.c:220:19: error: request for member 'mode' in something not a structure or union
     queue[num_cpu].mode    = mode;
                   ^
syr_thread.c:220:30: error: 'mode' undeclared (first use in this function); did you mean 'modf'?
     queue[num_cpu].mode    = mode;
                              ^~~~
                              modf
syr_thread.c:220:30: note: each undeclared identifier is reported only once for each function it appears in
syr_thread.c:221:19: error: request for member 'routine' in something not a structure or union
     queue[num_cpu].routine = syr_kernel;
                   ^
syr_thread.c:222:19: error: request for member 'args' in something not a structure or union
     queue[num_cpu].args    = &args;
                   ^
syr_thread.c:223:19: error: request for member 'range_m' in something not a structure or union
     queue[num_cpu].range_m = &range_m[MAX_CPU_NUMBER - num_cpu - 1];
                   ^
syr_thread.c:224:19: error: request for member 'range_n' in something not a structure or union
     queue[num_cpu].range_n = NULL;
                   ^
syr_thread.c:225:19: error: request for member 'sa' in something not a structure or union
     queue[num_cpu].sa      = NULL;
                   ^
syr_thread.c: In function 'CNAME':
syr_thread.c:226:19: error: request for member 'sb' in something not a structure or union
     queue[num_cpu].sb      = NULL;
                   ^
syr_thread.c:151:3: error: unknown type name 'blas_queue_t'; did you mean 'blas_arg_t'?
   blas_queue_t queue[MAX_CPU_NUMBER];
   ^~~~~~~~~~~~
   blas_arg_t
syr_thread.c:227:19: error: request for member 'next' in something not a structure or union
     queue[num_cpu].next    = &queue[num_cpu + 1];
                   ^
syr_thread.c:274:13: error: request for member 'sa' in something not a structure or union
     queue[0].sa = NULL;
             ^
syr_thread.c:275:13: error: request for member 'sb' in something not a structure or union
     queue[0].sb = buffer;
             ^
syr_thread.c:277:23: error: request for member 'next' in something not a structure or union
     queue[num_cpu - 1].next = NULL;
                       ^
syr_thread.c:258:19: error: request for member 'mode' in something not a structure or union
     queue[num_cpu].mode    = mode;
                   ^
syr_thread.c:258:30: error: 'mode' undeclared (first use in this function); did you mean 'modf'?
     queue[num_cpu].mode    = mode;
                              ^~~~
                              modf
syr_thread.c:258:30: note: each undeclared identifier is reported only once for each function it appears in
syr_thread.c:259:19: error: request for member 'routine' in something not a structure or union
     queue[num_cpu].routine = syr_kernel;
                   ^
syr_thread.c:260:19: error: request for member 'args' in something not a structure or union
     queue[num_cpu].args    = &args;
                   ^
syr_thread.c:261:19: error: request for member 'range_m' in something not a structure or union
     queue[num_cpu].range_m = &range_m[num_cpu];
                   ^
syr_thread.c:262:19: error: request for member 'range_n' in something not a structure or union
     queue[num_cpu].range_n = NULL;
                   ^
syr_thread.c:263:19: error: request for member 'sa' in something not a structure or union
     queue[num_cpu].sa      = NULL;
                   ^
syr2_thread.c: In function 'CNAME':
syr_thread.c:264:19: error: request for member 'sb' in something not a structure or union
     queue[num_cpu].sb      = NULL;
                   ^
syr_thread.c:265:19: error: request for member 'next' in something not a structure or union
     queue[num_cpu].next    = &queue[num_cpu + 1];
                   ^
gemv_thread.c: In function 'CNAME':
syr_thread.c:274:13: error: request for member 'sa' in something not a structure or union
     queue[0].sa = NULL;
             ^
syr_thread.c:275:13: error: request for member 'sb' in something not a structure or union
     queue[0].sb = buffer;
             ^
syr_thread.c:277:23: error: request for member 'next' in something not a structure or union
     queue[num_cpu - 1].next = NULL;
                       ^
gemv_thread.c:163:3: error: unknown type name 'blas_queue_t'; did you mean 'blas_arg_t'?
   blas_queue_t queue[MAX_CPU_NUMBER];
   ^~~~~~~~~~~~
   blas_arg_t
syr2_thread.c:211:3: error: unknown type name 'blas_queue_t'; did you mean 'blas_arg_t'?
   blas_queue_t queue[MAX_CPU_NUMBER];
   ^~~~~~~~~~~~
   blas_arg_t
syr2_thread.c:282:19: error: request for member 'mode' in something not a structure or union
     queue[num_cpu].mode    = mode;
                   ^
syr_thread.c:279:5: warning: implicit declaration of function 'exec_blas'; did you mean 'xerbla_'? [-Wimplicit-function-declaration]
     exec_blas(num_cpu, queue);
     ^~~~~~~~~
     xerbla_
syr2_thread.c:282:30: error: 'mode' undeclared (first use in this function); did you mean 'modf'?
     queue[num_cpu].mode    = mode;
                              ^~~~
                              modf
gemv_thread.c: In function 'CNAME':
syr2_thread.c:282:30: note: each undeclared identifier is reported only once for each function it appears in
syr2_thread.c:283:19: error: request for member 'routine' in something not a structure or union
     queue[num_cpu].routine = syr_kernel;
                   ^
gemv_thread.c:163:3: error: unknown type name 'blas_queue_t'; did you mean 'blas_arg_t'?
   blas_queue_t queue[MAX_CPU_NUMBER];
   ^~~~~~~~~~~~
   blas_arg_t
syr2_thread.c:284:19: error: request for member 'args' in something not a structure or union
     queue[num_cpu].args    = &args;
                   ^
syr2_thread.c:285:19: error: request for member 'range_m' in something not a structure or union
     queue[num_cpu].range_m = &range_m[MAX_CPU_NUMBER - num_cpu - 1];
                   ^
syr2_thread.c:286:19: error: request for member 'range_n' in something not a structure or union
     queue[num_cpu].range_n = NULL;
                   ^
syr2_thread.c:287:19: error: request for member 'sa' in something not a structure or union
     queue[num_cpu].sa      = NULL;
                   ^
syr2_thread.c:288:19: error: request for member 'sb' in something not a structure or union
     queue[num_cpu].sb      = NULL;
                   ^
syr2_thread.c:289:19: error: request for member 'next' in something not a structure or union
     queue[num_cpu].next    = &queue[num_cpu + 1];
                   ^
syr2_thread.c:336:13: error: request for member 'sa' in something not a structure or union
     queue[0].sa = NULL;
             ^
symv_thread.c: In function 'CNAME':
syr2_thread.c:337:13: error: request for member 'sb' in something not a structure or union
     queue[0].sb = buffer;
             ^
symv_thread.c:117:3: error: unknown type name 'blas_queue_t'; did you mean 'blas_arg_t'?
   blas_queue_t queue[MAX_CPU_NUMBER];
   ^~~~~~~~~~~~
   blas_arg_t
syr2_thread.c:339:23: error: request for member 'next' in something not a structure or union
     queue[num_cpu - 1].next = NULL;
                       ^
symv_thread.c:182:40: error: request for member 'mode' in something not a structure or union
     queue[MAX_CPU_NUMBER - num_cpu - 1].mode    = mode;
                                        ^
syr_thread.c:279:5: warning: implicit declaration of function 'exec_blas'; did you mean 'xerbla_'? [-Wimplicit-function-declaration]
     exec_blas(num_cpu, queue);
     ^~~~~~~~~
     xerbla_
ger_thread.c: In function 'CNAME':
ger_thread.c:120:3: error: unknown type name 'blas_queue_t'; did you mean 'blas_arg_t'?
   blas_queue_t queue[MAX_CPU_NUMBER];
   ^~~~~~~~~~~~
   blas_arg_t
symv_thread.c:182:51: error: 'mode' undeclared (first use in this function); did you mean 'modf'?
     queue[MAX_CPU_NUMBER - num_cpu - 1].mode    = mode;
                                                   ^~~~
                                                   modf
symv_thread.c:182:51: note: each undeclared identifier is reported only once for each function it appears in
symv_thread.c:183:40: error: request for member 'routine' in something not a structure or union
     queue[MAX_CPU_NUMBER - num_cpu - 1].routine = symv_kernel;
                                        ^
symv_thread.c:184:40: error: request for member 'args' in something not a structure or union
     queue[MAX_CPU_NUMBER - num_cpu - 1].args    = &args;
                                        ^
symv_thread.c:185:40: error: request for member 'range_m' in something not a structure or union
     queue[MAX_CPU_NUMBER - num_cpu - 1].range_m = &range_m[num_cpu];
                                        ^
symv_thread.c:186:40: error: request for member 'range_n' in something not a structure or union
     queue[MAX_CPU_NUMBER - num_cpu - 1].range_n = &range_n[num_cpu];
                                        ^
symv_thread.c:187:40: error: request for member 'sa' in something not a structure or union
     queue[MAX_CPU_NUMBER - num_cpu - 1].sa      = NULL;
                                        ^
symv_thread.c:188:40: error: request for member 'sb' in something not a structure or union
     queue[MAX_CPU_NUMBER - num_cpu - 1].sb      = NULL;
                                        ^
symv_thread.c:189:40: error: request for member 'next' in something not a structure or union
     queue[MAX_CPU_NUMBER - num_cpu - 1].next    = &queue[MAX_CPU_NUMBER - num_cpu];
                                        ^
symv_thread.c:196:36: error: request for member 'sa' in something not a structure or union
     queue[MAX_CPU_NUMBER - num_cpu].sa = NULL;
                                    ^
symv_thread.c:197:36: error: request for member 'sb' in something not a structure or union
     queue[MAX_CPU_NUMBER - num_cpu].sb = buffer + num_cpu * (((m + 255) & ~255) + 16) * COMPSIZE;
                                    ^
symv_thread.c:199:30: error: request for member 'next' in something not a structure or union
     queue[MAX_CPU_NUMBER - 1].next = NULL;
                              ^
make[4]: *** [Makefile:894: ssyr_thread_U.o] Error 1
make[4]: *** Waiting for unfinished jobs....
gemv_thread.c:220:14: warning: implicit declaration of function 'blas_quickdivide'; did you mean 'at_quick_exit'? [-Wimplicit-function-declaration]
     width  = blas_quickdivide(i + nthreads - num_cpu - 1, nthreads - num_cpu);
              ^~~~~~~~~~~~~~~~
              at_quick_exit
gemv_thread.c:226:19: error: request for member 'mode' in something not a structure or union
     queue[num_cpu].mode    = mode;
                   ^
syr2_thread.c:341:5: warning: implicit declaration of function 'exec_blas'; did you mean 'xerbla_'? [-Wimplicit-function-declaration]
     exec_blas(num_cpu, queue);
     ^~~~~~~~~
     xerbla_
gemv_thread.c:226:30: error: 'mode' undeclared (first use in this function); did you mean 'modf'?
     queue[num_cpu].mode    = mode;
                              ^~~~
                              modf
gemv_thread.c:226:30: note: each undeclared identifier is reported only once for each function it appears in
gemv_thread.c:227:19: error: request for member 'routine' in something not a structure or union
     queue[num_cpu].routine = gemv_kernel;
                   ^
gemv_thread.c:228:19: error: request for member 'args' in something not a structure or union
     queue[num_cpu].args    = &args;
                   ^
gemv_thread.c:233:19: error: request for member 'range_m' in something not a structure or union
     queue[num_cpu].range_m = NULL;
                   ^
gemv_thread.c:234:19: error: request for member 'range_n' in something not a structure or union
     queue[num_cpu].range_n = &range[num_cpu];
                   ^
gemv_thread.c:236:19: error: request for member 'sa' in something not a structure or union
     queue[num_cpu].sa      = NULL;
                   ^
gemv_thread.c:237:19: error: request for member 'sb' in something not a structure or union
     queue[num_cpu].sb      = NULL;
                   ^
gemv_thread.c:238:19: error: request for member 'next' in something not a structure or union
     queue[num_cpu].next    = &queue[num_cpu + 1];
                   ^
gemv_thread.c:297:13: error: request for member 'sa' in something not a structure or union
     queue[0].sa = NULL;
             ^
gemv_thread.c:298:13: error: request for member 'sb' in something not a structure or union
     queue[0].sb = buffer;
             ^
gemv_thread.c:299:23: error: request for member 'next' in something not a structure or union
     queue[num_cpu - 1].next = NULL;
                       ^
make[4]: *** [Makefile:897: ssyr_thread_L.o] Error 1
gemv_thread.c:220:14: warning: implicit declaration of function 'blas_quickdivide'; did you mean 'at_quick_exit'? [-Wimplicit-function-declaration]
     width  = blas_quickdivide(i + nthreads - num_cpu - 1, nthreads - num_cpu);
              ^~~~~~~~~~~~~~~~
              at_quick_exit
symv_thread.c:201:5: warning: implicit declaration of function 'exec_blas'; did you mean 'xerbla_'? [-Wimplicit-function-declaration]
     exec_blas(num_cpu, &queue[MAX_CPU_NUMBER - num_cpu]);
     ^~~~~~~~~
     xerbla_
gemv_thread.c:226:19: error: request for member 'mode' in something not a structure or union
     queue[num_cpu].mode    = mode;
                   ^
ger_thread.c:169:14: warning: implicit declaration of function 'blas_quickdivide'; did you mean 'at_quick_exit'? [-Wimplicit-function-declaration]
     width  = blas_quickdivide(i + nthreads - num_cpu - 1, nthreads - num_cpu);
              ^~~~~~~~~~~~~~~~
              at_quick_exit
ger_thread.c:175:19: error: request for member 'mode' in something not a structure or union
     queue[num_cpu].mode    = mode;
                   ^
make[4]: *** [Makefile:966: ssyr2_thread_U.o] Error 1
gemv_thread.c:301:5: warning: implicit declaration of function 'exec_blas'; did you mean 'xerbla_'? [-Wimplicit-function-declaration]
     exec_blas(num_cpu, queue);
     ^~~~~~~~~
     xerbla_
gemv_thread.c:226:30: error: 'mode' undeclared (first use in this function); did you mean 'modf'?
     queue[num_cpu].mode    = mode;
                              ^~~~
                              modf
gemv_thread.c:226:30: note: each undeclared identifier is reported only once for each function it appears in
gemv_thread.c:227:19: error: request for member 'routine' in something not a structure or union
     queue[num_cpu].routine = gemv_kernel;
                   ^
gemv_thread.c:228:19: error: request for member 'args' in something not a structure or union
     queue[num_cpu].args    = &args;
                   ^
gemv_thread.c:230:19: error: request for member 'range_m' in something not a structure or union
     queue[num_cpu].range_m = &range[num_cpu];
                   ^
gemv_thread.c:231:19: error: request for member 'range_n' in something not a structure or union
     queue[num_cpu].range_n = NULL;
                   ^
gemv_thread.c:236:19: error: request for member 'sa' in something not a structure or union
     queue[num_cpu].sa      = NULL;
                   ^
gemv_thread.c:237:19: error: request for member 'sb' in something not a structure or union
     queue[num_cpu].sb      = NULL;
                   ^
gemv_thread.c:238:19: error: request for member 'next' in something not a structure or union
     queue[num_cpu].next    = &queue[num_cpu + 1];
                   ^
gemv_thread.c:274:21: error: request for member 'mode' in something not a structure or union
       queue[num_cpu].mode    = mode;
                     ^
gemv_thread.c:275:21: error: request for member 'routine' in something not a structure or union
       queue[num_cpu].routine = gemv_kernel;
                     ^
gemv_thread.c:276:21: error: request for member 'args' in something not a structure or union
       queue[num_cpu].args    = &args;
                     ^
gemv_thread.c:278:21: error: request for member 'position' in something not a structure or union
       queue[num_cpu].position = num_cpu;
                     ^
gemv_thread.c:280:21: error: request for member 'range_m' in something not a structure or union
       queue[num_cpu].range_m = NULL;
                     ^
gemv_thread.c:281:21: error: request for member 'range_n' in something not a structure or union
       queue[num_cpu].range_n = &range[num_cpu];
                     ^
gemv_thread.c:283:21: error: request for member 'sa' in something not a structure or union
       queue[num_cpu].sa      = NULL;
                     ^
gemv_thread.c:284:21: error: request for member 'sb' in something not a structure or union
       queue[num_cpu].sb      = NULL;
                     ^
gemv_thread.c:285:21: error: request for member 'next' in something not a structure or union
       queue[num_cpu].next    = &queue[num_cpu + 1];
                     ^
gemv_thread.c:297:13: error: request for member 'sa' in something not a structure or union
     queue[0].sa = NULL;
             ^
gemv_thread.c:298:13: error: request for member 'sb' in something not a structure or union
     queue[0].sb = buffer;
             ^
gemv_thread.c:299:23: error: request for member 'next' in something not a structure or union
     queue[num_cpu - 1].next = NULL;
                       ^
ger_thread.c:175:30: error: 'mode' undeclared (first use in this function); did you mean 'modf'?
     queue[num_cpu].mode    = mode;
                              ^~~~
                              modf
ger_thread.c:175:30: note: each undeclared identifier is reported only once for each function it appears in
ger_thread.c:176:19: error: request for member 'routine' in something not a structure or union
     queue[num_cpu].routine = ger_kernel;
                   ^
ger_thread.c:177:19: error: request for member 'args' in something not a structure or union
     queue[num_cpu].args    = &args;
                   ^
ger_thread.c:178:19: error: request for member 'range_n' in something not a structure or union
     queue[num_cpu].range_n = &range_n[num_cpu];
                   ^
ger_thread.c:179:19: error: request for member 'sa' in something not a structure or union
     queue[num_cpu].sa      = NULL;
                   ^
ger_thread.c:180:19: error: request for member 'sb' in something not a structure or union
     queue[num_cpu].sb      = NULL;
                   ^
ger_thread.c:181:19: error: request for member 'next' in something not a structure or union
     queue[num_cpu].next    = &queue[num_cpu + 1];
                   ^
ger_thread.c:188:13: error: request for member 'sa' in something not a structure or union
     queue[0].sa = NULL;
             ^
ger_thread.c:189:13: error: request for member 'sb' in something not a structure or union
     queue[0].sb = buffer;
             ^
ger_thread.c:191:23: error: request for member 'next' in something not a structure or union
     queue[num_cpu - 1].next = NULL;
                       ^
make[4]: *** [Makefile:822: ssymv_thread_U.o] Error 1
make[4]: *** [Makefile:679: sgemv_thread_t.o] Error 1
gemv_thread.c:301:5: warning: implicit declaration of function 'exec_blas'; did you mean 'xerbla_'? [-Wimplicit-function-declaration]
     exec_blas(num_cpu, queue);
     ^~~~~~~~~
     xerbla_
symv_thread.c: In function 'CNAME':
symv_thread.c:117:3: error: unknown type name 'blas_queue_t'; did you mean 'blas_arg_t'?
   blas_queue_t queue[MAX_CPU_NUMBER];
   ^~~~~~~~~~~~
   blas_arg_t
symv_thread.c:231:19: error: request for member 'mode' in something not a structure or union
     queue[num_cpu].mode    = mode;
                   ^
symv_thread.c:231:30: error: 'mode' undeclared (first use in this function); did you mean 'modf'?
     queue[num_cpu].mode    = mode;
                              ^~~~
                              modf
symv_thread.c:231:30: note: each undeclared identifier is reported only once for each function it appears in
symv_thread.c:232:19: error: request for member 'routine' in something not a structure or union
     queue[num_cpu].routine = symv_kernel;
                   ^
symv_thread.c:233:19: error: request for member 'args' in something not a structure or union
     queue[num_cpu].args    = &args;
                   ^
symv_thread.c:234:19: error: request for member 'range_m' in something not a structure or union
     queue[num_cpu].range_m = &range_m[num_cpu];
                   ^
symv_thread.c:235:19: error: request for member 'range_n' in something not a structure or union
     queue[num_cpu].range_n = &range_n[num_cpu];
                   ^
symv_thread.c:236:19: error: request for member 'sa' in something not a structure or union
     queue[num_cpu].sa      = NULL;
                   ^
symv_thread.c:237:19: error: request for member 'sb' in something not a structure or union
     queue[num_cpu].sb      = NULL;
                   ^
symv_thread.c:238:19: error: request for member 'next' in something not a structure or union
     queue[num_cpu].next    = &queue[num_cpu + 1];
                   ^
symv_thread.c:245:13: error: request for member 'sa' in something not a structure or union
     queue[0].sa = NULL;
             ^
symv_thread.c:246:13: error: request for member 'sb' in something not a structure or union
     queue[0].sb = buffer + num_cpu * (((m + 255) & ~255) + 16) * COMPSIZE;
             ^
symv_thread.c:248:23: error: request for member 'next' in something not a structure or union
     queue[num_cpu - 1].next = NULL;
                       ^
ger_thread.c:193:5: warning: implicit declaration of function 'exec_blas'; did you mean 'xerbla_'? [-Wimplicit-function-declaration]
     exec_blas(num_cpu, queue);
     ^~~~~~~~~
     xerbla_
make[4]: *** [Makefile:676: sgemv_thread_n.o] Error 1
make[4]: *** [Makefile:777: sger_thread.o] Error 1
symv_thread.c:250:5: warning: implicit declaration of function 'exec_blas'; did you mean 'xerbla_'? [-Wimplicit-function-declaration]
     exec_blas(num_cpu, queue);
     ^~~~~~~~~
     xerbla_
make[4]: *** [Makefile:825: ssymv_thread_L.o] Error 1
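
The log above is interleaved output from parallel make jobs, which makes it hard to read. A small script along the following lines can reduce it to its distinct errors. This is a sketch under stated assumptions: the `summarize` helper and the regular expression are hypothetical, and the sample only reuses a few lines copied from the log. In each file the first diagnostic is `unknown type name 'blas_queue_t'`; once that type is unknown, every later access to a member of `queue` is reported as "request for member ... in something not a structure or union", so those errors follow from the first one.

```python
import re
from collections import Counter

# Matches gcc diagnostics of the form "file:line:col: error: message".
ERROR_RE = re.compile(
    r"^(?P<file>[\w.]+):(?P<line>\d+):(?P<col>\d+): error: (?P<msg>.*)$"
)


def summarize(log_lines):
    """Return (first distinct error, number of distinct errors per file)."""
    seen = set()
    first = None
    per_file = Counter()
    for raw in log_lines:
        match = ERROR_RE.match(raw.strip())
        if not match:
            continue  # source excerpts, carets, warnings, make messages
        key = (match["file"], match["line"], match["col"], match["msg"])
        if key in seen:
            continue  # the same diagnostic repeated by another job
        seen.add(key)
        per_file[match["file"]] += 1
        if first is None:
            first = key
    return first, per_file


# A few lines copied from the log above, including one repeat.
SAMPLE = [
    "syr_thread.c:151:3: error: unknown type name 'blas_queue_t'; did you mean 'blas_arg_t'?",
    "   blas_queue_t queue[MAX_CPU_NUMBER];",
    "syr_thread.c:220:19: error: request for member 'mode' in something not a structure or union",
    "syr_thread.c:151:3: error: unknown type name 'blas_queue_t'; did you mean 'blas_arg_t'?",
    "gemv_thread.c:163:3: error: unknown type name 'blas_queue_t'; did you mean 'blas_arg_t'?",
    "make[4]: *** [Makefile:894: ssyr_thread_U.o] Error 1",
]

if __name__ == "__main__":
    first_error, counts = summarize(SAMPLE)
    print("first error:", first_error)
    for name, count in counts.items():
        print(name, count)
```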

@commodo
Contributor Author

commodo commented May 28, 2021

Changelog v2 -> v3:

  • add COMMON_OPT="" to MAKE_FLAGS ; this makes sure the -Os flag isn't overridden by the OpenBLAS build flags

@brada4

brada4 commented May 28, 2021

Thread errors are because of strange threading flags; remove that USE_OPENMP=0 and add NUM_THREADS=2. It seems that the CI build happens on various machines, and the messages pop on or off depending on that.

@commodo commodo force-pushed the openblas branch 3 times, most recently from 3473afd to 76af978 Compare June 2, 2021 11:50
@commodo
Contributor Author

commodo commented Jun 2, 2021

> Thread errors are because of strange threading flags , remove that USE_OPENMP=0 and add NUM_THREADS=2, seems that CI build happens on various machines and messages pop on or off depending on that.

I still get the same build errors after removing USE_OPENMP=0, adding NUM_THREADS=2, and adding CCOMMON_OPT="$(TARGET_CFLAGS)".

So, I removed CCOMMON_OPT="$(TARGET_CFLAGS)"

Changelog v3 -> v4:

  • replaced USE_OPENMP=0 with NUM_THREADS=2
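
As a side note for anyone testing the resulting library from Python: NUM_THREADS=2 is a build-time setting, and OpenBLAS also reads the OPENBLAS_NUM_THREADS environment variable at load time. The sketch below is only an illustration; the helper name `limit_blas_threads` is made up here, and it assumes numpy is linked against this OpenBLAS build. The variable has to be set before numpy is imported.

```python
import os


def limit_blas_threads(n):
    """Cap the OpenBLAS thread count for this process (hypothetical helper).

    OpenBLAS reads OPENBLAS_NUM_THREADS when the library is loaded,
    so this must run before numpy is imported.
    """
    if n < 1:
        raise ValueError("thread count must be at least 1")
    os.environ["OPENBLAS_NUM_THREADS"] = str(n)
    return os.environ["OPENBLAS_NUM_THREADS"]


# Same value as the NUM_THREADS=2 build flag discussed above.
limit_blas_threads(2)

try:
    import numpy as np  # imported only after the environment is set
except ImportError:
    np = None  # numpy is optional for this sketch

if np is not None:
    a = np.ones((4, 4))
    # A small matrix product, just to exercise the BLAS backend.
    print((a @ a)[0, 0])
```

Asking for more threads at runtime than the library was built for should not help, so a value of 1 or 2 is the sensible range for this package.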

@commodo
Contributor Author

commodo commented Jun 14, 2021

Is there anything left for me to do here?

@brada4

brada4 commented Jun 15, 2021

I suppose it is usable as a numpy backend on the supplied platforms, and it is a reasonably minimised build of OpenBLAS. You're always free to ping upstream if you find an inconsistency, or simply get stuck or confused.

@commodo
Contributor Author

commodo commented Jul 6, 2021

I ran this code on a numpy installation, and it seems good:

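import time

import numpy as np

n = 200

# Two random n x n matrices; randn already returns float64, the cast only makes that explicit.
A = np.random.randn(n, n).astype('float64')
B = np.random.randn(n, n).astype('float64')

# Time one matrix product (dispatched to the BLAS backend) plus the norm of the result.
start_time = time.time()
nrm = np.linalg.norm(A @ B)
print(" took {} seconds ".format(time.time() - start_time))

print(" norm = ", nrm)
# show() prints the build configuration itself and returns None,
# so wrapping it in print() also emits a trailing "None".
print(np.__config__.show())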

Output:

 python test.py 
 took 0.0034945011138916016 seconds 
 norm =  2830.286817461252
blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/home/aardelean/work/openwrt/openwrt/staging_dir/target-i386_pentium4_musl/usr/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/home/aardelean/work/openwrt/openwrt/staging_dir/target-i386_pentium4_musl/usr/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
  NOT AVAILABLE
openblas_clapack_info:
  NOT AVAILABLE
flame_info:
  NOT AVAILABLE
accelerate_info:
  NOT AVAILABLE
atlas_3_10_threads_info:
  NOT AVAILABLE
atlas_3_10_info:
  NOT AVAILABLE
atlas_threads_info:
  NOT AVAILABLE
atlas_info:
  NOT AVAILABLE
lapack_info:
  NOT AVAILABLE
lapack_src_info:
  NOT AVAILABLE
lapack_opt_info:
  NOT AVAILABLE
numpy_linalg_lapack_lite:
    language = c
    define_macros = [('HAVE_BLAS_ILP64', None), ('BLAS_SYMBOL_SUFFIX', '64_')]
None
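
To check a dump like this without reading every section, something along these lines could be used. It is a rough sketch that assumes the text layout shown in the output above (a `name:` header followed by either `NOT AVAILABLE` or `key = value` lines); other numpy versions may print a different format. The function name `parse_numpy_config` and the shortened `SAMPLE` text are made up for illustration.

```python
def parse_numpy_config(text):
    """Parse a config dump into {section: dict of keys, or None if unavailable}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            # A new section header such as "openblas_info:".
            current = line[:-1]
            sections[current] = {}
        elif current is None:
            # Text before the first header is ignored.
            continue
        elif line == "NOT AVAILABLE":
            sections[current] = None
        elif "=" in line and sections[current] is not None:
            key, _, value = line.partition("=")
            sections[current][key.strip()] = value.strip()
    return sections


# Shortened sample in the same layout as the output above.
SAMPLE = """\
blas_mkl_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
  NOT AVAILABLE
"""

config = parse_numpy_config(SAMPLE)
available = sorted(name for name, info in config.items() if info is not None)
print(available)
```

In the full output above, the OpenBLAS sections are populated while all the LAPACK sections report NOT AVAILABLE, which suggests that BLAS comes from OpenBLAS and linalg falls back to the bundled lapack_lite.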

commodo added a commit to commodo/packages that referenced this pull request Jul 6, 2021
Also bump Cython version to 0.29.23.
And add support for OpenBLAS.
Currently optional, but will be enabled by default on some architectures
later.

Depends on PR openwrt#15685

Signed-off-by: Alexandru Ardelean <ardeleanalex@gmail.com>
@commodo
Contributor Author

commodo commented Jul 6, 2021

Created PR #16054 for numpy with OpenBLAS.
It is currently disabled by default; I will enable a few archs later.

Right now I just want to move forward slowly,
and we can validate archs as we go.


PKG_NAME:=OpenBLAS
PKG_VERSION:=0.3.15
PKG_RELEASE:=1
Contributor

I'd use $(AUTORELEASE) here.

Contributor Author

ack;

based on GotoBLAS2 1.13 BSD version.
endef

ifneq (,$(findstring $(ARCH) , aarch64 mips64 mips64el x86_64 ))
Contributor

I think this can be shortened to one line with CONFIG_ARCH_64BIT

endif
endif # ifeq ($(OPENBLAS_TARGET),)

ifneq (,$(findstring $(ARCH) , aarch64 mips64 mips64el x86_64 ))
Contributor

Is this a duplicate?

Contributor Author

yep;
sorry about that; will remove;


MAKE_FLAGS += \
CROSS=1 \
HOSTCC=$(HOSTCC) \
Contributor

This is suspicious :). I bet it breaks with ccache on.

Contributor Author

hmm; no idea about ccache here;

@commodo
Contributor Author

commodo commented Jul 26, 2021

Changelog v4 -> v5:

  • PKG_RELEASE:=$(AUTORELEASE)
  • using CONFIG_ARCH_64BIT for the 64-bit check
  • removed duplicate 64-bit arch check

commodo added a commit to commodo/packages that referenced this pull request Jul 26, 2021
Also bump Cython version to 0.29.23.
And add support for OpenBLAS.
Currently optional, but will be enabled by default on some architectures
later.

Depends on PR openwrt#15685

Signed-off-by: Alexandru Ardelean <ardeleanalex@gmail.com>
@neheb neheb merged commit 038f890 into openwrt:master Jul 27, 2021
1715173329 pushed a commit to immortalwrt/packages that referenced this pull request Jul 27, 2021
Also bump Cython version to 0.29.23.
And add support for OpenBLAS.
Currently optional, but will be enabled by default on some architectures
later.

Depends on PR openwrt/packages#15685

Signed-off-by: Alexandru Ardelean <ardeleanalex@gmail.com>
Signed-off-by: Tianling Shen <cnsztl@immortalwrt.org>
@commodo commodo deleted the openblas branch August 13, 2021 08:34
utoni pushed a commit to utoni/openwrt-packages that referenced this pull request Jan 21, 2022
Also bump Cython version to 0.29.23.
And add support for OpenBLAS.
Currently optional, but will be enabled by default on some architectures
later.

Depends on PR openwrt#15685

Signed-off-by: Alexandru Ardelean <ardeleanalex@gmail.com>
1582130940 pushed a commit to 1582130940/OpenWrt-Lean-Packages that referenced this pull request Nov 7, 2022
Also bump Cython version to 0.29.23.
And add support for OpenBLAS.
Currently optional, but will be enabled by default on some architectures
later.

Depends on PR openwrt/packages#15685

Signed-off-by: Alexandru Ardelean <ardeleanalex@gmail.com>
1582130940 pushed a commit to 1582130940/OpenWrt-Lean-Packages that referenced this pull request Nov 8, 2022
Also bump Cython version to 0.29.23.
And add support for OpenBLAS.
Currently optional, but will be enabled by default on some architectures
later.

Depends on PR openwrt/packages#15685

Signed-off-by: Alexandru Ardelean <ardeleanalex@gmail.com>