BLAS : Program is Terminated. Because you tried to allocate too many memory regions. #1882

Closed
yurivict opened this issue Nov 22, 2018 · 58 comments

Comments

@yurivict
Contributor

I used openblas for blas/lapack functions in the erkale project, and it fails. erkale's author says that openblas is broken, see susilehtola/erkale#29 (comment)

@martin-frbg
Collaborator

Well, you know, he may be right. But it would certainly help if we knew which version of OpenBLAS you are currently using, and a bit more about what erkale does. I guess erkale itself is multithreaded and thus calls into OpenBLAS from multiple threads, which causes problems that I am currently trying to fix.

@yurivict
Contributor Author

openblas-0.2.20_3,1 on FreeBSD
erkale experiences this problem when run with multithreading using OpenMP.

@martin-frbg
Collaborator

Could you try with the current develop branch ("soon" to be 0.3.4), please? Apart from a number of fixes, that one has a new compile-time parameter, NUM_PARALLEL, to reduce the risk of running out of unique thread pointers in what looks like your use case.

@brada4
Contributor

brada4 commented Nov 22, 2018

Alternatively, rebuild the package with OpenMP; that build should behave more moderately inside an OpenMP program and not try to spawn n^2 threads.

@yurivict
Contributor Author

erkale is built with both OpenMP and OpenBLAS.
I'm trying to fix test failures in its parallel version.

@yurivict
Contributor Author

yurivict commented Nov 22, 2018

The message

Because you tried to allocate too many memory regions.

begs the question "How many regions were allocated?"

Also with the ever increasing computing power, what does "too many" really mean? Why is this limitation imposed?

@brada4
Contributor

brada4 commented Nov 22, 2018

It is a fixed table of regions; each parallel thread consumes 1-2 of them.
The table size built into the system package, which is derived from NUM_THREADS at build time, was exceeded.
You can set any value you prefer in the place where the recent improvement was made:
#1858
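
To make the "fixed table" idea concrete, here is a minimal sketch in C. It is not OpenBLAS code: the names `DEMO_NUM_THREADS`, `DEMO_NUM_SLOTS`, `demo_claim_slot` and `demo_release_slot` are hypothetical, and the factor of 2 is only an assumption based on the "1-2 per thread" figure above. It shows why a table sized at build time simply runs out once more callers arrive than it was sized for.

```c
#include <stddef.h>

/* Hypothetical stand-ins: names and sizes are illustrative only. */
#define DEMO_NUM_THREADS 8
#define DEMO_NUM_SLOTS   (2 * DEMO_NUM_THREADS) /* assumes up to 2 slots per thread */

static int slot_in_use[DEMO_NUM_SLOTS];

/* Claim the first free slot; return its index, or -1 when the table is full.
 * A real library would need a lock or atomics here; omitted for brevity. */
int demo_claim_slot(void)
{
    for (int i = 0; i < DEMO_NUM_SLOTS; i++) {
        if (!slot_in_use[i]) {
            slot_in_use[i] = 1;
            return i;
        }
    }
    return -1; /* table exhausted: nothing left to hand out */
}

/* Release a slot; return 0 on success, -1 if the index was never claimed. */
int demo_release_slot(int i)
{
    if (i < 0 || i >= DEMO_NUM_SLOTS || !slot_in_use[i])
        return -1;
    slot_in_use[i] = 0;
    return 0;
}
```

Once all `DEMO_NUM_SLOTS` entries are claimed, the next `demo_claim_slot()` call fails no matter how much RAM the machine has; the limit comes from the compiled-in table size, not from available memory.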

@yurivict
Contributor Author

Why can't you reallocate it dynamically when it is exceeded, instead of fixing it once and for all at build time?

@martin-frbg
Collaborator

martin-frbg commented Nov 22, 2018

This limit is directly related to the NUM_THREADS parameter set at build time (which defaults to the number of cores detected on the build host). There has been a recent attempt to rewrite the memory allocation logic (which dates back to K. Goto's original libGotoBLAS of 10+ years ago) using thread-local storage. Unfortunately, the reimplementation met a number of unexpected corner cases, and it is unclear whether it is safe to use as the default in its current state. See option USE_TLS in current develop.
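
For readers unfamiliar with the thread-local approach, the following is a rough C11 sketch of the general technique only. It is not the USE_TLS implementation; `DEMO_BUFFER_BYTES`, `demo_get_thread_buffer` and `demo_release_thread_buffer` are hypothetical names. Each thread lazily allocates its own buffer, so there is no shared table that can fill up.

```c
#include <stdlib.h>

#define DEMO_BUFFER_BYTES 4096 /* hypothetical size, for illustration only */

/* One pointer per thread (C11 _Thread_local); no shared table to exhaust. */
static _Thread_local void *demo_thread_buffer = NULL;

/* Return the calling thread's buffer, allocating it on first use.
 * Returns NULL only if malloc fails. */
void *demo_get_thread_buffer(void)
{
    if (demo_thread_buffer == NULL)
        demo_thread_buffer = malloc(DEMO_BUFFER_BYTES);
    return demo_thread_buffer;
}

/* Free the calling thread's buffer. Every thread has to do this before it
 * exits, otherwise its buffer leaks. */
void demo_release_thread_buffer(void)
{
    free(demo_thread_buffer);
    demo_thread_buffer = NULL;
}
```

The cleanup requirement in `demo_release_thread_buffer` hints at why such a rewrite can hit corner cases: the library does not control when the application's threads start or exit.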

@brada4
Contributor

brada4 commented Nov 22, 2018

Indeed, the OpenBLAS FreeBSD port would be built with only 16 malloc slots.
https://svnweb.freebsd.org/ports/head/math/openblas/Makefile?view=markup#l61
The PR above is simply meant to mask such cases so that the ages-old limitation does not hurt every other user.
I'd recommend changing the package Makefile to something like 64 threads (128 slots) and using the OpenMP build: since you wrap the library in OpenMP, an OpenMP-built OpenBLAS reduces its own threading when called from an OpenMP parallel section.

@yurivict
Contributor Author

yurivict commented Nov 23, 2018

I'll change the limit in the port for now.

But no matter what value the limit is set to, this problem will come back, because the number of threads shouldn't, even in theory, be tied to the number of CPUs in general (threads can be half-idle, for example). This needs to be solved.

uqs pushed a commit to freebsd/freebsd-ports that referenced this issue Nov 23, 2018
…k other ports

NUM_THREADS= sets the build-time limit on the number of threads that apps can use with OpenBlas.
This unbreaks at least science/erkale's tests, and possibly some other software.
The upstream acknowledges the problem, recommended this solution for the port,
and are working on the permanent solution: OpenMathLib/OpenBLAS#1882

Approved by:	portmgr blanket (unbreak)


git-svn-id: svn+ssh://svn.freebsd.org/ports/head@485641 35697150-7ecd-e111-bb59-0022644237b5
@brada4
Contributor

brada4 commented Nov 23, 2018

@yurivict
I think you got the message to the right ears.

You are wrong about the number of threads. The constraining resource here is the CPU cache: OpenBLAS (or MKL, for that matter) operates on a limited amount of data that fits in the L1d/L2/L3/L4 caches. Obviously, if two such threads meet on the same core, they fall back to 10-20x slower accesses to main memory and performance drops about 10x.
What is wrong here is that the number of memory buffers is compiled in and bound to the build host's CPU count, which hurts people who oversubscribe CPU cores (I mean caches).

@yurivict
Contributor Author

You assume that all threads are CPU-intense. But some threads might be idle. Some might work on separate data sets while using only 10% of CPU each. Some people create threads per connection, etc. All sorts of use models can take place.

@brada4
Contributor

brada4 commented Nov 23, 2018

Even that 10% would break the computation kernels' assumption that the cache is for their exclusive use, and two compute kernels on the same core will slow down many times over, not just by half as with normal compiler-emitted code.
The aim is simply to get results out faster, not to show 100% CPU usage in "top" or to maximize CPU temperatures.

@yurivict
Contributor Author

Changing NUM_THREADS to 64 didn't fix all of erkale's test failures. Some of them still fail with the same message.

@brada4
Contributor

brada4 commented Nov 23, 2018

@yurivict while you are at it, can you push #1785, which reduces the swarm of unproductive locks (syscalls) on each BLAS call that hurts a lot on systems with a high core count? (It is old code, but it must be rebased for the old version because of recent renumbering in that particular file.)

@brada4
Contributor

brada4 commented Nov 23, 2018

@yurivict do they (the tests) pass with OPENBLAS_NUM_THREADS=1 and/or with the OpenMP build of OpenBLAS?
EDIT: yeah, I know it may still hurt casual users, but there is no easy fix with the current code.
How many cores does the build system have? It will spin up that many threads squared if the program uses OpenMP and OpenBLAS then uses pthreads inside.
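
The "threads squared" arithmetic can be sketched as follows. This is bookkeeping only, under stated assumptions: the function names are hypothetical, and the factor of 2 slots per thread is taken from the "64 threads (128 slots)" figure quoted earlier in this thread, not from the OpenBLAS sources.

```c
/* Worst-case number of threads inside BLAS at once when every outer
 * (OpenMP) thread starts its own team of inner (BLAS) threads. */
long demo_peak_threads(long outer_threads, long inner_threads)
{
    return outer_threads * inner_threads;
}

/* Slots in a table sized at build time; the factor of 2 is an assumption
 * taken from the "64 threads (128 slots)" figure in this thread. */
long demo_slots_available(long build_num_threads)
{
    return 2 * build_num_threads;
}

/* 1 if the worst case fits in the table, 0 otherwise; assumes each thread
 * may hold up to 2 slots. */
int demo_fits(long build_num_threads, long outer_threads, long inner_threads)
{
    return 2 * demo_peak_threads(outer_threads, inner_threads)
           <= demo_slots_available(build_num_threads);
}
```

With 8 outer and 8 inner threads the worst case is 64 threads, which under this assumption needs 128 slots; a table built for 8 threads (16 slots) cannot hold that, while capping the inner count at 1 brings the demand down to 16. Since raising NUM_THREADS to 64 was reported above not to fix every failure, this count is clearly not the whole story.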

@yurivict
Contributor Author

do they (tests) pass with OPENBLAS_NUM_THREADS=1 and/or with OpenMP OpenBLAS?

They still fail.

@yurivict
Contributor Author

yurivict commented Nov 23, 2018

Summary:

The change to NUM_THREADS=64 in the port didn't help, OPENBLAS_NUM_THREADS=1 also doesn't help, gotoblas fails the same way when used instead of OpenBlas.

What helped: change to liblapack.so/libblas.so/libcblas.so. Tests pass with this implementation.

Testcase: The Erkale quantum chemistry project (https://github.com/susilehtola/erkale) built with -DUSE_OPENMP=ON. ctest tests fail when linked with OpenBlas.

@brada4
Contributor

brada4 commented Nov 24, 2018

You mean openblas.so fails completely? Or did you have to point .BLAS, .cblas and .lapack all to OpenBLAS at once?
Do you have any log of the failure, so that I can reproduce it at "home"?

@yurivict
Contributor Author

openblas.so fails completely. Replacing it with .BLAS/.cblas/.lapack combination allows the process to succeed.

It triggers the error (see above), and the processes crash.

@brada4
Contributor

brada4 commented Nov 24, 2018

The log?

@yurivict
Contributor Author

2: Test command: /usr/ports/science/erkale/work-parallel/.build/src/test/basictests_omp
2: Test timeout computed to be: 10000000
2: Indices OK.
2: Solid harmonics OK.
2: Checkpointing OK.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Bad memory unallocation! :    2  0x0
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
2: BLAS : Program is 
2/2 Test #2: basictests .......................***Exception: SegFault  0.96 sec

The following tests passed:
	build_basictests

50% tests passed, 1 tests failed out of 2

Total Test time (real) =   1.82 sec

The following tests FAILED:
	  2 - basictests (SEGFAULT)
Errors while running CTest

@brada4
Contributor

brada4 commented Nov 24, 2018

Is this the log from:
1/ OPENBLAS_NUM_THREADS=1 (or 2)
2/ USE_OPENMP=1
Or did you just knowingly run N threads on each of N CPUs?

HOW MANY CPU CORES ARE THERE IN THE BUILD MACHINE?

@yurivict
Contributor Author

yurivict commented Nov 24, 2018

HOW MANY CPU CORES ARE THERE IN THE BUILD MACHINE?

4 cores, 8 virtual CPUs.

USE_OPENMP=1 is used in the erkale project.
I watched how it runs, tests run with 8 threads. It might be that it runs more threads for a short period of time.
But again, this should be up to project authors how to allocate threads.

@brada4
Contributor

brada4 commented Nov 24, 2018

I will try to get something out of Linux and erkale.

If all tests run continuously in the same program: some uninitialized values have been fixed, so it is probably worth waiting for 0.3.4 instead of rushing to 0.3.3.
Messages about bad deallocation also mean that some alloc/free was not paired properly, i.e. a memory leak; that is also worth checking against a later or the latest version.

Which project is to blame for allocating threads? So far I see just a slight misconfiguration, and probably an old version.

uqs pushed a commit to freebsd/freebsd-ports that referenced this issue Nov 26, 2018
… safety issues

This patch is recommended by the upstream: OpenMathLib/OpenBLAS#1882 (comment)

Approved by:	portmgr (unbreak)


git-svn-id: svn+ssh://svn.freebsd.org/ports/head@485906 35697150-7ecd-e111-bb59-0022644237b5
@brada4
Contributor

brada4 commented Nov 26, 2018

Both options should go into the non-threaded version too.
-fopenmp would imply -frecursive, but the single-threaded version will have unsafe Fortran function representations that cannot work from C/C++ pthreads or OpenMP: local temporary arrays of sufficient size (roughly >32-64k) would be placed in static storage shared between threads without any arbitration whatsoever, leading at the very least to certain numeric failures.
I got the same crash with g++, backtracing to something like main -> read_config -> assert, not yet involving any BLAS.
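
A rough C analogy may help readers who do not write Fortran. As far as I understand the gfortran documentation, large local arrays are given static storage unless -frecursive forces them onto the stack. The sketch below is not Fortran and not LAPACK code; `DEMO_N`, `demo_scale_static` and `demo_sum_scaled_stack` are hypothetical names. It only shows why a static scratch array is unsafe once two callers overlap.

```c
#include <stddef.h>

#define DEMO_N 4

/* Non-reentrant: the scratch array has static storage, so every caller
 * (and every thread) shares the same memory. */
const double *demo_scale_static(const double *x, double factor)
{
    static double scratch[DEMO_N];
    for (size_t i = 0; i < DEMO_N; i++)
        scratch[i] = factor * x[i];
    return scratch;
}

/* Reentrant: the scratch array is automatic (stack) storage, so each call
 * and each thread gets a private copy. */
double demo_sum_scaled_stack(const double *x, double factor)
{
    double scratch[DEMO_N];
    double sum = 0.0;
    for (size_t i = 0; i < DEMO_N; i++)
        scratch[i] = factor * x[i];
    for (size_t i = 0; i < DEMO_N; i++)
        sum += scratch[i];
    return sum;
}
```

Two successive calls to `demo_scale_static` return the same pointer, and the second call overwrites the first result. When the calls come from different threads at the same time, that overwrite becomes a data race with silently wrong numbers.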

@brada4
Contributor

brada4 commented Nov 26, 2018

OK, here are the BLAS imports (some of L1 is probably masked by GSL cblas macros).

All BLAS implementations have thread limits; it is a performance issue for particular functions with small inputs, not a crasher or anything like that.

Some dangerous LAPACK functions that mandate -frecursive are being imported.

BLAS L1
ddot_
BLAS L2
dgemv_ zgemv_
BLAS L3
dgemm_ zgemm_
dsyrk_
zherk_
LAPACK THREADSAFE
ilaenv_
dgetrf_
dgetri_
LAPACK needing -frecursive
dgesv_
dgelsd_
dgels_
dgesvx_

Let me summarize:
1/ GOMP and Clang OMP are not very friendly to each other (they probably emit different pthread lock IDs from the same top-level functions, so the locks do not work right at all)
2/ g++ leads to early crashes
3/ LAPACK functions that were not thread-safe before -frecursive are present

I think for now it is best to import the pthread version in all circumstances in serial programs, and the single-threaded version, safeguarded with -frecursive, in threaded ones, and to keep the GOMP version in the basement for programs that do not crash when built entirely with GCC (as an option that is disabled by default, for example).

The only dangers are performance-related, i.e. an OMP program imports the threaded version and gets N^2 threads, which can be brought under control with variables, or a single-threaded program imports the single-threaded version, which is still faster than netlib but leaves a lot of room for improvement.

Improvements gained towards 0.3.4:

  • threading outside single-threaded OpenBLAS is safe (or will be once the parameter reaches the 1-thread version)
  • We have some room for N^2 threads from the serial version

@yurivict
Contributor Author

I see now that OPENMP isn't a default option in the port, so the changes that I made only apply to the non-default case. I'll move them so that they apply to all versions.

@brada4
Contributor

brada4 commented Nov 26, 2018

@martin-frbg
dgetrf, zherk and dsyrk are not guarded against early threading.
Ignore me if I don't produce PRs today.
@yurivict
The following changes will need to go into 0.3.4 on top of what we did here:
DYNAMIC_OLDER=1 should supplement DYNAMIC_ARCH=1 on amd64/x86_64.

There is no more need for -frecursive; it is now in the right place.

There will be AVX-512 (Skylake-X) support; both FreeBSD clang and gcc can compile it, so maybe a new option for that(?).

In principle it builds with clang+flang (once the latter works) too, if you want to experiment on the other side of the OMP world, but that is not required at all.

I think we cannot improve anything more here, but feel free to report if you stumble on anything similarly weird.

@martin-frbg
Collaborator

martin-frbg commented Nov 26, 2018

dgetrf zherk dsyrk are not guarded against early threading.
Ignore me if I dont produce PRs today.

Could you create a separate issue for that, please? I assume that by "early threading" you mean inefficient multithreading for tiny problem sizes (and not something leading to catastrophic failure), but I am guaranteed to lose my mind if I try to look into that today.

@brada4
Contributor

brada4 commented Nov 26, 2018

@yurivict (not related to the current issue at all) is it possible to get something like Linux pax-utils on FreeBSD, i.e. lddtree to find 2 distinct OMP imports and symtree to quickly list the imported functions per library?

@yurivict
Contributor Author

yurivict commented Nov 26, 2018

@brada4 Is it this package: https://www.freshports.org/sysutils/pax-utils ?

@yurivict
Contributor Author

FYI You can use Repology website to search for packages by name in different systems: https://repology.org/

@brada4
Contributor

brada4 commented Nov 26, 2018

Installed, thanks :-)

uqs pushed a commit to freebsd/freebsd-ports that referenced this issue Nov 26, 2018
…r non-openmp too

Previously I added these options only to the openmp build which isn't a default.
This change is requested by the upstream.
Ref. OpenMathLib/OpenBLAS#1882

Approved by:	portmgr (unbreak)


git-svn-id: svn+ssh://svn.freebsd.org/ports/head@485947 35697150-7ecd-e111-bb59-0022644237b5
@martin-frbg
Collaborator

Closing as the crucial change, adding -frecursive to the gfortran options, was released with 0.3.4 already.

@zhilians

zhilians commented Jan 3, 2019

@brada4 Is there any 'harm' in increasing the number of memory buffers to a large number?

From memory.c:

local_memory_table = (struct alloc_t **)malloc(sizeof(struct alloc_t *) * NUM_BUFFERS);
memset(local_memory_table, 0, sizeof(struct alloc_t *) * NUM_BUFFERS);

It seems that the table itself is just one pointer per buffer (64 bits each on a 64-bit platform), i.e. 8 * NUM_BUFFERS bytes in memory.

From the discussion above, I can't quite relate local_memory_table to fitting data into the CPU cache. Is there any harm in having a very large NUM_THREADS at build time, say 4,096 for a 96-core CPU, and calling OpenBLAS from 128 concurrent threads? That way, our program won't be terminated by a burst in the number of concurrent threads.
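
The size of the pointer table in the quoted lines can be checked with a few lines of C. This is a sketch only: `demo_table_bytes` is a hypothetical helper, and `struct alloc_t` is left opaque because only pointers to it are stored in the table.

```c
#include <stddef.h>

/* Opaque stand-in for the struct named in the quoted memory.c lines; its
 * fields do not matter here because the table stores only pointers. */
struct alloc_t;

/* Bytes occupied by the pointer table itself, mirroring the quoted
 * malloc(sizeof(struct alloc_t *) * NUM_BUFFERS) expression. */
size_t demo_table_bytes(size_t num_buffers)
{
    return sizeof(struct alloc_t *) * num_buffers;
}
```

On a platform with 8-byte pointers, 8192 entries come to 65536 bytes. Note that this counts the table only, not the buffers the entries point to, whose size is not stated in this thread.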

@brada4
Contributor

brada4 commented Jan 3, 2019

It is very bad to hijack a closed, unrelated thread...
Open a new one if you want to discuss what is not clear from the discussion in #1858 (which is unrelated to this issue).
