make kernel execution thread safe #23
Comments
Phew, that can generally be useful, but as you correctly state, it brings a good number of requirements and additional checks (locks) into the code that harm performance. The devices one should have in mind are SoC systems like the Jetson K1 and general many-core systems in NUMA style, e.g. one might want to run a self-contained set of OpenMP threads per MP (socket) of a Xeon Phi (or vice versa: if I have […]
@BenjaminW3 @ax3l is this still an open issue? That is, if I understand correctly, is it not possible to launch multiple kernels on the same device from different threads?
@fwyzard I cannot say for sure because it is not tested, but with some restrictions it should be possible to launch multiple tasks on the same device from different threads. It should be possible to have one queue per thread and enqueue tasks into those queues in parallel.
The SYCL back-end currently assumes this, and I think it would be useful to have this guaranteed for all of alpaka. Implementing this for SYCL was relatively straightforward, so I believe this could (and should) be achieved as part of the next alpaka release without too much effort.
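To make the "one queue per thread" idea concrete, here is a minimal sketch in plain standard C++. It does not use alpaka at all: `LocalQueue` and `run_per_thread_queues` are hypothetical names invented for illustration, and the "tasks" are just callables run on the host. The point is only that each host thread owns its queue and its output slot, so no lock is needed between threads.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a per-thread work queue; NOT alpaka's queue type.
class LocalQueue {
public:
    // Stores the task; nothing runs until wait() is called.
    void enqueue(std::function<void()> task) { tasks_.push_back(std::move(task)); }

    // Runs all enqueued tasks in order and empties the queue.
    void wait() {
        for (auto& task : tasks_) task();
        tasks_.clear();
    }

private:
    std::vector<std::function<void()>> tasks_;
};

// Each host thread creates its own queue and writes only to its own slot of
// `results`, so the threads never touch the same memory location.
std::vector<int> run_per_thread_queues(std::size_t num_threads, int tasks_per_thread) {
    assert(tasks_per_thread >= 0);
    std::vector<int> results(num_threads, 0);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t i = 0; i < num_threads; ++i) {
        workers.emplace_back([&results, i, tasks_per_thread] {
            LocalQueue queue;  // owned by this thread only
            for (int k = 0; k < tasks_per_thread; ++k) {
                queue.enqueue([&results, i] { results[i] += 1; });
            }
            queue.wait();
        });
    }
    for (auto& worker : workers) worker.join();
    return results;
}
```

Calling `run_per_thread_queues(4, 100)` should give every thread's slot the value 100. Whether a real back-end gives the same isolation depends on what state its queues and devices share internally, which is exactly what this issue is about.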
Is it allowed for the host code to be multithreaded itself?
Restricting it is neither useful nor realistically enforceable.
Maybe one thread calculates something using a CUDA device while the other thread uses OpenMP?
This use case requires locking and the removal of some hidden state.
-> Each operator() of a KernelTask has to be secured by a lock.
-> A process-global mutex is required.
http://stackoverflow.com/questions/13197510/why-do-c11-threads-become-unjoinable-when-using-nested-openmp-pragmas
http://stackoverflow.com/questions/13837696/can-i-safely-use-openmp-with-c11
-> Therefore, a process-global mutex for the execution of OpenMP kernels is required.
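A rough sketch of what "each operator() secured by a process-global lock" could look like, again in plain standard C++ rather than alpaka code. `kernel_exec_mutex`, `LockedKernelTask`, and `run_locked_tasks` are hypothetical names, and the kernel is an ordinary callable. It assumes that serializing whole kernel executions is acceptable, which is the performance cost mentioned in the comments.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Process-global mutex (hypothetical name). The function-local static is
// initialized in a thread-safe way since C++11.
inline std::mutex& kernel_exec_mutex() {
    static std::mutex m;
    return m;
}

// Hypothetical stand-in for a kernel task: the call operator takes the
// process-global lock, so only one kernel runs at a time in this process.
template <typename Kernel>
class LockedKernelTask {
public:
    explicit LockedKernelTask(Kernel kernel) : kernel_(std::move(kernel)) {}

    void operator()() {
        std::lock_guard<std::mutex> lock(kernel_exec_mutex());
        kernel_();
    }

private:
    Kernel kernel_;
};

// Launches the same kernel from several host threads. The counter is
// deliberately not atomic; it is protected only by the global lock.
int run_locked_tasks(int num_threads, int launches_per_thread) {
    assert(num_threads >= 0 && launches_per_thread >= 0);
    int counter = 0;
    auto kernel = [&counter] { ++counter; };
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&kernel, launches_per_thread] {
            LockedKernelTask<decltype(kernel)> task(kernel);
            for (int k = 0; k < launches_per_thread; ++k) task();
        });
    }
    for (auto& thread : threads) thread.join();
    return counter;
}
```

With 4 threads and 1000 launches each, `run_locked_tasks(4, 1000)` returns 4000, because the lock serializes every increment. The sketch says nothing about the OpenMP runtime issues described in the linked Stack Overflow questions; it only shows the locking pattern.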