Allow to do gemv and ger buffer allocation on the stack #482

Merged
merged 1 commit into from Jan 1, 2015

jeromerobert
Contributor

ger and gemv call blas_memory_alloc/free, which in turn
call blas_lock. blas_lock creates thread contention when matrices
are small and the number of threads is high enough. We avoid
calling blas_memory_alloc by replacing it with stack allocation.
This can be enabled with:
make -DMAX_STACK_ALLOC=2048
The given size (in bytes) must be high enough to avoid thread contention
and small enough to avoid stack overflow.
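
To illustrate the idea, here is a minimal sketch of the pattern, not the code from this pull request. It assumes a byte threshold named MAX_STACK_ALLOC as in the description. The helpers `fits_on_stack` and `scaled_add` are hypothetical, and plain `malloc`/`free` stand in for blas_memory_alloc/free. Small scratch buffers live on the stack and never touch the shared allocator; larger ones fall back to the heap.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumption: a byte threshold normally supplied at build time. */
#ifndef MAX_STACK_ALLOC
#define MAX_STACK_ALLOC 2048
#endif

/* Hypothetical helper: returns 1 if a scratch buffer of n doubles
 * fits within the stack budget, 0 otherwise. */
static int fits_on_stack(size_t n) {
    return n <= MAX_STACK_ALLOC / sizeof(double);
}

/* Hypothetical kernel: computes y[i] += alpha * x[i], staging alpha * x
 * in a scratch buffer. Returns 0 on success, -1 if the heap fallback
 * could not allocate. */
static int scaled_add(size_t n, double alpha, const double *x, double *y) {
    /* Fixed-size stack buffer: no allocator call and no lock on this path. */
    double stack_buf[MAX_STACK_ALLOC / sizeof(double)];
    double *buf = stack_buf;
    int on_heap = !fits_on_stack(n);

    if (on_heap) {
        /* Too large for the stack budget: fall back to the heap
         * (malloc stands in for the library's locked allocator). */
        buf = malloc(n * sizeof(double));
        if (buf == NULL) return -1;
    }

    for (size_t i = 0; i < n; i++) buf[i] = alpha * x[i];
    for (size_t i = 0; i < n; i++) y[i] += buf[i];

    if (on_heap) free(buf);
    return 0;
}
```

With the 2048-byte default used here, vectors of up to 256 doubles take the stack path; anything longer pays for an allocation, which is the trade-off the size limit controls.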

Fix #478

xianyi added a commit that referenced this pull request Jan 1, 2015
Allow to do gemv and ger buffer allocation on the stack
@xianyi xianyi merged commit 41aad04 into OpenMathLib:develop Jan 1, 2015
xianyi added a commit that referenced this pull request Apr 13, 2015
For gemv_t, directly use malloc to create the buffer.
xianyi added a commit that referenced this pull request Apr 13, 2015
jeromerobert added a commit to jeromerobert/OpenBLAS that referenced this pull request Apr 15, 2015
jeromerobert added a commit to jeromerobert/OpenBLAS that referenced this pull request Apr 15, 2015
Successfully merging this pull request may close these issues.

Very slow when having many small matrices and many threads