Fix race condition in AsyncPipelinedExecutor destructor #271

JanuszL · 2018-11-06T07:14:23Z

When AsyncPipelinedExecutor is destroyed, the WorkerThread destructor is called as well. It waits for the running job to complete and then shuts down the thread executing it. The problem is that the job executed
inside WorkerThread uses condition variables and mutexes that are destroyed during AsyncPipelinedExecutor destruction, before WorkerThread finishes.
As a result, WorkerThread may hang inside work(), which it needs to complete before
it can shut down: the work waits indefinitely on a condition variable or mutex
that no longer exists.
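
The ordering problem described above can be illustrated with a minimal, standalone sketch. This is not the DALI code: `Worker`, `Executor`, `RunDemo`, and all member names are hypothetical, and the sketch assumes the worker member is declared before the mutex and condition variable, so it would be destroyed after them. The stop flag plus `notify_all()` stands in for whatever `ForceStop()` does in the real class.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for WorkerThread: runs a single job on its own thread.
class Worker {
 public:
  ~Worker() { Shutdown(); }

  void Start(std::function<void()> job) {
    assert(!thread_.joinable());  // only one job per Worker in this sketch
    thread_ = std::thread(std::move(job));
  }

  // Blocks until the job returns. Safe to call more than once.
  void Shutdown() {
    if (thread_.joinable()) thread_.join();
  }

 private:
  std::thread thread_;
};

// Hypothetical stand-in for the executor that owns both the worker and
// the synchronization primitives the worker's job depends on.
class Executor {
 public:
  explicit Executor(std::atomic<bool>* finished) : finished_(finished) {
    worker_.Start([this] { Work(); });
  }

  ~Executor() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;  // ask the job to stop
    }
    cv_.notify_all();
    // Join here, in the destructor body, while mutex_ and cv_ are still alive.
    // Relying on ~Worker() would join only after they have been destroyed.
    worker_.Shutdown();
  }

 private:
  void Work() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return stop_; });
    finished_->store(true);
  }

  // Members are destroyed in reverse declaration order, so worker_ (declared
  // first) is destroyed last, after cv_ and mutex_ are already gone.
  Worker worker_;
  std::mutex mutex_;
  std::condition_variable cv_;
  bool stop_ = false;
  std::atomic<bool>* finished_;
};

// Creates and destroys an Executor; returns true if the job ran to completion.
bool RunDemo() {
  std::atomic<bool> finished{false};
  {
    Executor executor(&finished);
  }  // ~Executor() runs here and joins the worker before members are destroyed
  return finished.load();
}
```

Calling `RunDemo()` from `main` exercises the construct-then-destroy path. Without the explicit `worker_.Shutdown()` in `~Executor()`, the job could still be touching `mutex_` or `cv_` after they are destroyed, which is undefined behavior and can show up as the hang described above.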

Signed-off-by: Janusz Lisiecki jlisiecki@nvidia.com

JanuszL · 2018-11-06T07:16:09Z

Builds 38700.

Kh4L · 2018-11-06T10:18:47Z

dali/pipeline/executor/async_pipelined_executor.h

@@ -47,6 +47,15 @@ class DLL_PUBLIC AsyncPipelinedExecutor : public PipelinedExecutor {
    cpu_thread_.ForceStop();
    mixed_thread_.ForceStop();
    gpu_thread_.ForceStop();
+    /*
+     * We need to call shutdown here and not relay on cpu_thread_ destructor


relay -> rely

Kh4L · 2018-11-06T10:18:58Z

dali/pipeline/executor/async_pipelined_executor.h

+    /*
+     * We need to call shutdown here and not relay on cpu_thread_ destructor
+     * as when WorkerThread destructor is called conditional variables and mutexes
+     * from this class may no longer exists while work inside WorkerThread is still


exists -> exist

JanuszL requested review from Kh4L, szalpal, klecki, ptrendx and pribalta November 6, 2018 07:14

Kh4L reviewed Nov 6, 2018

Kh4L approved these changes Nov 6, 2018

JanuszL merged commit 2e45e52 into NVIDIA:master Nov 6, 2018

JanuszL deleted the fix_race branch November 6, 2018 10:22
