New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added more robust shutdown code for our dmJobThread implementation #8762
Conversation
@@ -53,7 +53,7 @@ struct JobThreadContext | |||
#if defined(DM_HAS_THREADS) | |||
dmMutex::HMutex m_Mutex; | |||
dmConditionVariable::HConditionVariable m_WakeupCond; | |||
int32_atomic_t m_Run; | |||
bool m_Run; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for atomics if we already use a mutex.
while (true) | ||
{ | ||
JobItem item; | ||
JobItem item = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Between the loop check and the dmConditionVariable::Wait(), the main thread set run=0
and also signalled the wakeup condition...
|
||
if (!ctx->m_Run) | ||
break; | ||
|
||
while(ctx->m_Work.Empty()) | ||
{ | ||
dmConditionVariable::Wait(ctx->m_WakeupCond, ctx->m_Mutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we got here, the signal had already been sent which meant that the wait would then stall forever, as there wouldn't be any new signal coming.
|
||
context->m_ThreadContext.m_Run = false;; | ||
|
||
dmConditionVariable::Broadcast(context->m_ThreadContext.m_WakeupCond); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to wakeup all threads at once.
dmConditionVariable::Signal(context->m_ThreadContext.m_WakeupCond); | ||
} | ||
|
||
context->m_ThreadContext.m_Run = false;; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we know that the Run and Wait() are within the same protected scope on the thread and here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm but there’s no tests, do we want that perhaps?
I'd say yes, but I'm not sure how to write such a test that actually catches/tests the issue in a 100% manner? |
This fixes an issue where it was possible for get a dead lock when shutting down the job thread system.
PR checklist
Example of a well written PR description:
### Technical changes
Technical changes:
Technical notes: