Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Issue 19978 - D sometimes just crashes on exit with daemon threads #14907

Closed
wants to merge 1 commit into from

Conversation

RazvanN7
Copy link
Contributor

@RazvanN7 RazvanN7 commented Feb 22, 2023

Disclaimer: This PR is probably not the best solution and might suffer from some issues, however, I don't have any better ideas.

The problem seems to stem from the fact that daemon threads are not joined or stopped before memory is teared apart by the runtime. This leads to memory issues when the said daemon threads access memory that was freed. To fix this, I am suspending all threads before the cleanup process begins. Unfortunately, the only function that I've found to do that is suspendAll - which indiscriminately stops all threads and also acquires a lock on the global ThreadBase lock. This is problematic when the locks are freed because, typically, the lock is released when resumeAll is called. To fix this, I added an optional parameter to suspendAll (willNeverResume) so that the caller can stop the lock acquisition if the threads are suspended before terminating the process. I know that this is not ideal, but I am all ears for suggestions of better approaches; after all, I can't really find my way around druntime.

Note: This works on my machine but how do I test this in the testing framework?

@dlang-bot
Copy link
Contributor

dlang-bot commented Feb 22, 2023

Thanks for your pull request and interest in making D better, @RazvanN7! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Auto-close Bugzilla Severity Description
19978 critical D sometimes just crashes on exit with daemon threads

⚠️⚠️⚠️ Warnings ⚠️⚠️⚠️

To target stable perform these two steps:

  1. Rebase your branch to upstream/stable:
git rebase --onto upstream/stable upstream/master
  1. Change the base branch of your PR to stable

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#14907"

@FeepingCreature
Copy link
Contributor

Wouldn't this break if a thread is in a synchronized block on a mutex that is taken by the thread dtor? Ie. for instance if the dtor wants to close a connection but the thread is answering a request.

@RazvanN7
Copy link
Contributor Author

There are 2 mutexes involved here: one is the mutex that is used in syncronized blocks and the other is the mutex that is used to update information regarding the threads (like thread_list, context_list etc.). The mutex that I was referring to is the one for internal data structure modification.

@FeepingCreature
Copy link
Contributor

FeepingCreature commented Feb 23, 2023

Right. I'm imagining code like

class Foo {
  File file;
  static void print(string s) {
    synchronized (Foo.classinfo) {
      if (file) file.print(s);
    }
  }
  static ~this() {
    synchronized (Foo.classinfo) {
      file.close;
      file = null;
    }
  }
}

The hypothetical idea is that we're a daemon thread, the program exits, we want to run the thread dtor but without synchronized we might set file to null between the if and the print. With this change, the daemon thread can get frozen inside the synchronized and have the dtor block forever.

I don't know if this is a serious concern.

Maybe the solution is to notice when a daemon thread is started and just disable all cleanup on exit? I don't think it can ever be done safely in the presence of daemon threads.

@RazvanN7
Copy link
Contributor Author

Yes, that might be a problem (assuming you meant file to be static shared and the destructor to be shared). I think that is the reason the test suite is failing. suspendAll currently does not wait for threads to exit the critical regions because it was written with the idea that each suspend will be matched with a resume.

However, I don't think dropping support for module destructors in the presence of daemon threads is the way to go. Ideally, we could somehow reset all mutexes once the daemon threads are suspended and go on with destruction.

@RazvanN7 RazvanN7 closed this Mar 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants