Dynamic number of threads #678

Closed
xianyi opened this issue Oct 29, 2015 · 9 comments · Fixed by #4585

Comments

@xianyi
Collaborator

xianyi commented Oct 29, 2015

Try to dynamically select the number of threads for different input sizes and functions.

We need a list of candidate functions to start with, e.g. gemv?

@hiccup7 @ViralBShah

JuliaLang/julia#11258
#103
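The idea of picking a thread count per function and input size could be sketched roughly as follows. This is only an illustration: the routine names used as keys, the work estimates, and the `MIN_WORK_PER_THREAD` threshold are assumptions made up for the example, not values taken from OpenBLAS.

```python
# Hypothetical heuristic: the table and the threshold below are illustrative
# assumptions, not values used by OpenBLAS.

# Rough floating-point operation counts per routine family for a size n.
WORK_ESTIMATE = {
    "dot": lambda n: 2 * n,            # level 1: O(n)
    "gemv": lambda n: 2 * n * n,       # level 2: O(n^2)
    "gemm": lambda n: 2 * n * n * n,   # level 3: O(n^3)
}

# Assumed minimum amount of work that justifies one more thread.
MIN_WORK_PER_THREAD = 100_000


def choose_num_threads(routine: str, n: int, max_threads: int) -> int:
    """Pick a thread count that grows with the estimated work, capped at max_threads."""
    work = WORK_ESTIMATE[routine](n)
    wanted = work // MIN_WORK_PER_THREAD
    # Always use at least one thread and never more than the caller allows.
    return max(1, min(max_threads, wanted))
```

With these assumed numbers a short dot product stays single-threaded, while a moderately sized gemm uses every available thread. Real thresholds would have to be measured per architecture.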

@ViralBShah
Contributor

gemm would definitely be the first one of interest.

@ViralBShah
Contributor

Cc @tkelman

@hiccup7

hiccup7 commented Nov 3, 2015

For me, the one-dimensional dot products are most important because I can create multi-dimensional dot products (matrix multiplies) from them if there is a speed advantage. Also, optimizing double-precision is more important to me because it is slower to execute and I use it most often.
1st priority: ddot, zdotu
2nd priority: sdot, cdotu
3rd priority: zdotc, cdotc

By the way, I would be very happy if OpenBLAS implemented the dsdot function, and made it faster to execute than ddot on the Haswell architecture (which is possible with optimized code).

@groutr

groutr commented Nov 16, 2015

👍

@treadalus

I would like to see this feature implemented, but not quite in the way described here. What I would like to do is control the number of threads for each call to the library by resetting the OPENBLAS_NUM_THREADS environment variable, or something like that. What I am trying to do is layer a multi-threaded BLAS into a program that is itself driven in parallel. That might not be possible but sure would be useful if it can be made to happen.

@jeromerobert
Contributor

@treadalus Unless I missed something, this is already possible thanks to the openblas_set_num_threads non-standard function.
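For readers who want to try that call from a scripting language, here is a minimal sketch using Python's `ctypes`. It assumes a shared OpenBLAS library can be found on the system and that it exports the unsuffixed symbol `openblas_set_num_threads`; the helper name `set_openblas_threads` and its `lib_path` parameter are hypothetical.

```python
import ctypes
import ctypes.util
from typing import Optional


def set_openblas_threads(num_threads: int, lib_path: Optional[str] = None) -> bool:
    """Call openblas_set_num_threads if an OpenBLAS shared library can be loaded.

    Returns True when the call was made, False when no usable library was found.
    """
    # Assumption: the library is discoverable under the name "openblas".
    path = lib_path or ctypes.util.find_library("openblas")
    if path is None:
        return False
    try:
        lib = ctypes.CDLL(path)
        # Some builds export a suffixed symbol instead; that case is treated
        # as "not found" here.
        set_threads = lib.openblas_set_num_threads
    except (OSError, AttributeError):
        return False
    set_threads.argtypes = [ctypes.c_int]
    set_threads.restype = None
    set_threads(num_threads)
    return True
```

Note that this only affects the copy of the library that `ctypes` loads into the current process. If the program links against a differently named or bundled copy of OpenBLAS, that copy is not touched.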

@treadalus

Hmm, that's good to know, but I don't think that will solve my problem. What I'm trying to do is link OpenBLAS all the way up the dependency chain to Octave, which drives my program in parallel over the cores, so all the low-level calls are buried in the libraries. The threads requested through the environment variable do indeed fire up on the first call, but they never go away, even when I change the value of the environment variable for later calls. Knowing that there is a non-standard function included that does this trick suggests to me that the behavior I want to see would not be hard to implement, but I'm not really a low-level programmer, possessing mere hacking skills at best once I'm out of Octave. Thanks for the swift response. If what I've said makes sense to others, then perhaps this kind of dynamic threading control could appear.
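Since the environment variable appears to take effect only when the threads are first started, one possible workaround is to fix its value before each worker process is launched. The sketch below assumes the outer program starts its workers as separate processes; the helper names (`blas_threads_per_worker`, `worker_env`, `run_worker`) are hypothetical, and only `OPENBLAS_NUM_THREADS` comes from the discussion above.

```python
import os
import subprocess
import sys


def blas_threads_per_worker(total_cores: int, num_workers: int) -> int:
    """Split the cores evenly between workers, with at least one thread each."""
    return max(1, total_cores // num_workers)


def worker_env(total_cores: int, num_workers: int) -> dict:
    """Copy the current environment and set the BLAS thread budget for one worker."""
    env = dict(os.environ)
    env["OPENBLAS_NUM_THREADS"] = str(blas_threads_per_worker(total_cores, num_workers))
    return env


def run_worker(command, total_cores: int, num_workers: int):
    """Start one worker process with the thread budget already in its environment."""
    return subprocess.run(
        command,
        env=worker_env(total_cores, num_workers),
        capture_output=True,
        text=True,
        check=True,
    )
```

For example, `run_worker([sys.executable, "worker.py"], total_cores=8, num_workers=4)` would start a (hypothetical) `worker.py` with `OPENBLAS_NUM_THREADS=2`, so that four such workers together do not oversubscribe eight cores.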

@jeromerobert
Contributor

@treadalus If what you want is an Octave wrapper for openblas_set_num_threads, this is probably not the best place to ask, because that would be an Octave feature, not an OpenBLAS feature. It can actually be implemented easily using Octave's mkoctfile tool.

@treadalus

Now that sounds like it might work. This thread was the closest thing I could find to what I'm looking for, so it looks like I at least managed to ask my question where folks who would know the answer could find it. Thanks again for your help.

6 participants