-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: allow threading backend to be replaced by caller #2255
Conversation
In particular, I'm thinking the prototypical user callback could look like: void callback(void *callback_data, int sync, void (*dojob)(int, void *, void *), int numjobs, size_t jobdata_elsize, void *jobdata, void *dojob_data) {
int i;
for (i = 0; i < numjobs; ++i) // in parallel
dojob(THREADID, ((char *) jobdata) + ((unsigned) i)*jobdata_elsize, dojob_data);
if (sync)
// wait for dojob calls to complete
} where the |
(Note that this will also allow people using OpenMP to link to a pthreads OpenBLAS binary without recompiling OpenBLAS — with a 10-line callback using |
That's an intriguing concept... I just hope it does not wake any sleeping bugs. |
Is further work required to move this forward? Thanks for pushing composable threading, @stevengj! |
IF this is expected to work on Win32, it will need adjustments in blas_server_win32.c similar to those in/for blas_server_omp.c (this is what causes one of the two remaining Travis failures, the other is an unrelated issue with powerpc) |
Yes, a significant amount of work: see un-checked to-do items above. It looks like the most effort is required to adapt I haven't had a chance to work on it for a few days, but I'm hoping to get back to it next week. |
Right, somehow I missed the checklist at the bottom of your post 😬 |
Any update on this? :) |
@stevengj |
Would be great to get this done! |
Looks very interesting. Do you think it would provide a solution to this issue #3187 ? |
rebased for convenience (it is not immediately clear to me how to apply the proposed changes to the non-OMP server files) |
@stevengj Any update on this? Can you please share your email id, would like to discuss few doubts. Your help is highly appreciated. |
closing as superseded by #4577 - nevertheless an important contribution, and I regret not having been able to advance it at the time |
This PR adds a new function
openblas_set_threads_callback
that allows the caller to register a callback in order to change the threading backend at runtime — instead of spawning its own threads and passing work to them, OpenBLAS will pass an array of work (an array ofblas_queue_t
) to the callback, which can execute the work in serial or in parallel using whatever mechanism it wants.Motivation: this allows OpenBLAS threading to be composable with caller multithreading, instead of having OpenBLAS and the caller fight over the same CPUs. In particular, this enables OpenBLAS threading to be composable with:
Because registering the callback is done at runtime, people wanting to use this feature won't need to install a separate compiled OpenBLAS binary, but can instead use an existing threaded binary.
The design and implementation is very similar to the new pluggable threading backend for FFTW (FFTW/fftw3#175) and Blosc (Blosc/c-blosc2#81), which worked out very well for us. My expectation is that the code additions will be fairly minimal—a few dozen lines plus some trivial refactoring—with a practical overhead of a single
if
statement in theexec_blas
function for people not using this feature.This PR is an early work-in-progress, but I wanted to get some early feedback. To do:
blas_server.c
)blas_server_win32.c
)