[SYCL] Removed mutex leading to deadlock #889
Suppose a deadlock happens.
Will the test framework deal with it?
I'm not sure, but I guess the task might be killed by the lit infrastructure due to an execution timeout.
The description contains too many details not really related to the issue.
The wait on step 3 holds the mutex, but it is waiting for host accessor A1 to be destructed.
The deadlock appeared under the following circumstances:
1) thread1: adds nodes to the graph for host accessor A1 to the buffer B;
2) thread2: adds nodes to the graph for host accessor A2 to the buffer B;
3) thread2: waits for host accessor A2 nodes to complete;
4) thread1: waits for host accessor A1 nodes to complete.
On step 3 thread2 locks a mutex in `Scheduler::waitForEvent` and waits for destruction of host accessor A1. The actions on step 4 cannot be completed because thread1 waits for the mutex to be unlocked.
Signed-off-by: Ivan Karachun <ivan.karachun@intel.com>
Commit message was updated.
Signed-off-by: Ivan Karachun <ivan.karachun@intel.com>
template <typename Func, typename... Args>
void enqueue(Func func, Args... args) {
  MThreadPool.push_back(std::thread(func, args...));
minor
- MThreadPool.push_back(std::thread(func, args...));
+ MThreadPool.emplace_back(std::forward<Func>(func), std::forward<Args>(args)...);
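For context, a minimal sketch of how such a test helper might look with the suggestion applied. The class name `ThreadPool`, the `wait` method, and the `sumOfIds` function are hypothetical and not taken from the patch; only `MThreadPool` and `enqueue` appear in the reviewed code. The sketch also assumes forwarding references (`Func &&`, `Args &&...`), which the reviewed signature does not use, so that `std::forward` preserves the value category of each argument.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: owns worker threads and joins them before destruction.
class ThreadPool {
  std::vector<std::thread> MThreadPool;

public:
  // emplace_back constructs the std::thread in place from the forwarded
  // callable and arguments, so no temporary std::thread is created.
  template <typename Func, typename... Args>
  void enqueue(Func &&func, Args &&... args) {
    MThreadPool.emplace_back(std::forward<Func>(func),
                             std::forward<Args>(args)...);
  }

  // Joins every thread that has not been joined yet.
  void wait() {
    for (std::thread &t : MThreadPool)
      if (t.joinable())
        t.join();
  }

  ~ThreadPool() { wait(); }
};

// Illustration only: each worker adds its id to a shared atomic counter.
int sumOfIds(int n) {
  std::atomic<int> total{0};
  ThreadPool pool;
  for (int id = 1; id <= n; ++id)
    pool.enqueue([&total](int v) { total += v; }, id);
  pool.wait();
  return total.load();
}
```

Note that `std::thread` always decay-copies its arguments, so forwarding mainly avoids an extra copy on the way into the constructor.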
Imagine a situation where there is a global buffer and several threads
try to write data to it. Each thread creates a host accessor to the
buffer and writes its id into the memory.
When a host accessor is created, the SYCL RT creates two commands:
UpdateHostRequirement and EmptyCommand.
So the SYCL execution graph will look like this:
... -> [EmpCmd T2] -> [UpdReq T2] -> [EmpCmd T1] -> [UpdReq T1] -> [AllocaBuf]
An EmptyCommand acts like a mutex: while the host accessor is alive, it blocks
execution of all operations that were created after the host accessor.
Assume that thread #2 starts execution first by enqueueing the
`UpdReq T2` command. The command cannot start because its execution is blocked by
`EmpCmd T1`. Since thread #2 has locked the mutex in `Scheduler::waitForEvent`, thread #1 hangs on this mutex, which leads to a deadlock.
Signed-off-by: Ivan Karachun <ivan.karachun@intel.com>
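
The locking pattern described above can be sketched with standard C++ primitives only. This is not the SYCL runtime code: `GraphMutex`, `waitWithoutLock`, and `runScenario` are hypothetical names, a `std::promise` stands in for the destruction of host accessor A1, and the sketch only assumes that the fix amounts to not holding the mutex while blocking.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the mutex taken in Scheduler::waitForEvent.
static std::mutex GraphMutex;

// Deadlock-prone shape (left commented out on purpose):
//   std::lock_guard<std::mutex> Lock(GraphMutex);
//   Blocker.wait(); // hangs if the signalling thread needs GraphMutex first

// Shape without the deadlock: update shared state under the lock, then
// release the lock before blocking on the other thread's progress.
void waitWithoutLock(std::shared_future<void> Blocker, int &Visits) {
  {
    std::lock_guard<std::mutex> Lock(GraphMutex);
    ++Visits;
  }
  Blocker.wait();
}

// Runs both threads to completion and reports whether each one visited
// the critical section exactly once.
bool runScenario() {
  std::promise<void> A1Destroyed; // models "host accessor A1 is destructed"
  std::shared_future<void> Blocker = A1Destroyed.get_future().share();
  int Visits = 0;

  // thread #2 waits for A1 to go away.
  std::thread Thread2([&] { waitWithoutLock(Blocker, Visits); });
  // thread #1 needs the mutex before it can release A1.
  std::thread Thread1([&] {
    std::lock_guard<std::mutex> Lock(GraphMutex);
    ++Visits;
    A1Destroyed.set_value();
  });

  Thread1.join();
  Thread2.join();
  return Visits == 2;
}
```

If `waitWithoutLock` kept `GraphMutex` locked across `Blocker.wait()` and thread #2 acquired the mutex first, thread #1 could never reach `set_value()`, which mirrors the hang described in this PR.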