Added more robust shutdown code for our dmJobThread implementation #8762

Merged
merged 1 commit into from Apr 6, 2024

Contributor

@JCash JCash commented Apr 5, 2024

This fixes an issue where it was possible to get a deadlock when shutting down the job thread system.

PR checklist

  • Code
    • Add engine and/or editor unit tests.
    • New and changed code follows the overall code style of existing code
    • Add comments where needed
  • Documentation
    • Make sure that API documentation is updated in code comments
    • Make sure that manuals are updated (in github.com/defold/doc)
  • Prepare pull request and affected issue for automatic release notes generator
    • Pull request - Write a message that explains what this pull request does. What was the problem? How was it solved? What are the changes to APIs or the new APIs introduced? This message will be used in the generated release notes. Make sure it is well written and understandable for a user of Defold.
    • Pull request - Write a pull request title that in a sentence summarises what the pull request does. Do not include "Issue-1234 ..." in the title. This text will be used in the generated release notes.
    • Pull request - Link the pull request to the issue(s) it is closing. Use one of the approved closing keywords.
    • Affected issue - Assign the issue to a project. Do not assign the pull request to a project if there is an issue which the pull request closes.
    • Affected issue - Assign the "breaking change" label to the issue if introducing a breaking change.
    • Affected issue - Assign the "skip release notes" label if the issue should not be included in the generated release notes.

Example of a well written PR description:

  1. Start with the user facing changes. This will end up in the release notes.
  2. Add one of the GitHub approved closing keywords
  3. Optionally also add the technical changes made. This is information that might help the reviewer. It will not show up in the release notes. Technical changes are identified by a line starting with one of these:
    1. ### Technical changes
    2. Technical changes:
    3. Technical notes:
There was an anomaly in the carbon chroniton propeller, introduced in version 8.10.2. This fix will make sure to reset the phaser collector on application startup.

Fixes #1234

### Technical changes
* Pay special attention to line 23 of phaser_collector.clj as it contains some interesting optimizations
* The propeller code was not taking into account a negative phase.

@JCash JCash requested a review from Jhonnyg April 5, 2024 11:48
@@ -53,7 +53,7 @@ struct JobThreadContext
 #if defined(DM_HAS_THREADS)
 dmMutex::HMutex m_Mutex;
 dmConditionVariable::HConditionVariable m_WakeupCond;
-int32_atomic_t m_Run;
+bool m_Run;
Contributor Author

No need for atomics if we already use a mutex.

Comment on lines +92 to +94
 while (true)
 {
-JobItem item;
+JobItem item = {};
Contributor Author

Between the loop check and the dmConditionVariable::Wait(), the main thread set run=0 and also signalled the wakeup condition...


if (!ctx->m_Run)
break;

while(ctx->m_Work.Empty())
{
dmConditionVariable::Wait(ctx->m_WakeupCond, ctx->m_Mutex);
Contributor Author

Once we got here, the signal had already been sent which meant that the wait would then stall forever, as there wouldn't be any new signal coming.


context->m_ThreadContext.m_Run = false;;

dmConditionVariable::Broadcast(context->m_ThreadContext.m_WakeupCond);
Contributor Author

Better to wake up all threads at once.

dmConditionVariable::Signal(context->m_ThreadContext.m_WakeupCond);
}

context->m_ThreadContext.m_Run = false;;
Contributor Author

Now we know that the m_Run check and the Wait() are within the same mutex-protected scope on the worker thread, and that the write to m_Run here is protected by that mutex as well.
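
The review comments above describe a lost wakeup: the worker tests the run flag, the main thread then clears the flag and signals, and only afterwards does the worker start waiting for a signal that never comes again. The following is a minimal sketch of the corrected pattern, not the engine's actual code. It assumes standard C++ primitives (std::mutex, std::condition_variable) as stand-ins for dmMutex and dmConditionVariable, and JobPool, Push, Shutdown and IsRunning are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the job thread system; all names are illustrative.
class JobPool
{
public:
    explicit JobPool(std::size_t thread_count)
    {
        // All other members are already initialized when the workers start.
        for (std::size_t i = 0; i < thread_count; ++i)
            m_Threads.emplace_back([this] { Worker(); });
    }

    ~JobPool() { Shutdown(); }

    void Push(std::function<void()> job)
    {
        {
            std::lock_guard<std::mutex> lock(m_Mutex);
            assert(m_Run && "Push called after Shutdown");
            m_Work.push_back(std::move(job));
        }
        m_WakeupCond.notify_one();
    }

    bool IsRunning()
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        return m_Run;
    }

    // Safe to call more than once; later calls find no joinable threads.
    void Shutdown()
    {
        {
            // The flag is written under the same mutex the workers hold when
            // they test it. A worker is therefore either not yet at its check
            // (and will see false) or already waiting (and gets the broadcast).
            std::lock_guard<std::mutex> lock(m_Mutex);
            m_Run = false;
            m_WakeupCond.notify_all(); // wake every worker, not just one
        }
        for (std::thread& t : m_Threads)
            if (t.joinable())
                t.join();
    }

private:
    void Worker()
    {
        for (;;)
        {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(m_Mutex);
                // The predicate is evaluated with the mutex held, before the
                // first wait and after every wakeup, so the flag check and the
                // wait cannot be separated by the shutdown sequence.
                m_WakeupCond.wait(lock, [this] { return !m_Run || !m_Work.empty(); });
                if (!m_Run)
                    return; // pending jobs are dropped on shutdown
                job = std::move(m_Work.front());
                m_Work.pop_front();
            }
            job(); // run the job outside the lock
        }
    }

    std::mutex m_Mutex;
    std::condition_variable m_WakeupCond;
    std::deque<std::function<void()>> m_Work;
    bool m_Run = true; // plain bool: every access happens under m_Mutex
    std::vector<std::thread> m_Threads;
};
```

Because every read and write of the flag happens with the mutex held, a plain bool is sufficient in this sketch. Constructing and shutting down a pool many times in a loop can exercise the shutdown path, although such a loop can only make a regression more likely to show up, not prove its absence.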

Contributor

@Jhonnyg Jhonnyg left a comment

LGTM, but there are no tests. Do we want that, perhaps?

@JCash
Contributor Author

JCash commented Apr 6, 2024

I'd say yes, but I'm not sure how to write a test that catches the issue 100% of the time.

@JCash JCash merged commit 6a5fc39 into dev Apr 6, 2024
22 checks passed
@JCash JCash deleted the job-thread-shutdown-fix branch April 6, 2024 14:30