Multithreading
Hüseyin Tuğrul BÜYÜKIŞIK edited this page Apr 25, 2022 · 3 revisions
When single-threaded, auto-vectorized SIMD execution is not fast enough, the run method can be replaced with a multithreaded version:
constexpr int numThreads = 8;
kernel.runMultithreaded<numThreads>(numWorkItems,inputBuffer,outputBuffer);
If the compiler (GCC v10 or any other compiler capable of auto-vectorization) is given the -fopenmp flag, runMultithreaded uses OpenMP to distribute the work. Without that flag, it falls back to a simple array of threads that are launched and joined on every runMultithreaded method call. This fallback will be replaced by a load balancer later.
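The two distribution paths described above can be illustrated with a minimal sketch. This is not the library's implementation: `runChunked`, `workItem`, and the even chunking scheme are hypothetical names and choices made for illustration, and the only assumption taken from the text is that OpenMP is used when -fopenmp is given (detected here through the standard `_OPENMP` macro) and a launched-and-joined array of threads otherwise.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a kernel body: doubles one work item.
inline void workItem(std::size_t i, const std::vector<float>& in,
                     std::vector<float>& out) {
    out[i] = 2.0f * in[i];
}

// Illustrative sketch only; the real runMultithreaded may differ.
template <int NumThreads>
void runChunked(std::size_t numWorkItems, const std::vector<float>& in,
                std::vector<float>& out) {
    static_assert(NumThreads > 0, "need at least one thread");
    assert(in.size() >= numWorkItems && out.size() >= numWorkItems);
#ifdef _OPENMP
    // Compiled with -fopenmp: OpenMP splits the loop across threads.
    #pragma omp parallel for num_threads(NumThreads)
    for (long long i = 0; i < static_cast<long long>(numWorkItems); ++i) {
        workItem(static_cast<std::size_t>(i), in, out);
    }
#else
    // Fallback: a plain array of threads, launched and joined on every call.
    const std::size_t chunk = (numWorkItems + NumThreads - 1) / NumThreads;
    std::array<std::thread, NumThreads> threads;
    for (int t = 0; t < NumThreads; ++t) {
        const std::size_t begin =
            std::min(numWorkItems, static_cast<std::size_t>(t) * chunk);
        const std::size_t end = std::min(numWorkItems, begin + chunk);
        threads[t] = std::thread([begin, end, &in, &out] {
            for (std::size_t i = begin; i < end; ++i) workItem(i, in, out);
        });
    }
    for (auto& th : threads) th.join();
#endif
}
```

Because the fallback creates and joins its threads inside every call, the launch cost is paid each time. That overhead is negligible for large workloads but can dominate small ones, which is presumably part of the motivation for the planned load balancer.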