Asynchronously call sys.exit() to avoid deadlock due to the JVM shutdown hooks #261

Merged 1 commit on May 4, 2014


marc-barry
Contributor

I finally figured out why Marathon (i.e. the JVM) would not exit when suicide() was called by disconnected(), and why it took a kill -9 to get rid of the JVM process.

The issue was that suicide() called sys.exit(), which initiates the JVM shutdown sequence. Part of that sequence is running the shutdown hooks, and sys.exit() blocks until the shutdown hooks have run and finished. One of the shutdown hooks shuts down Marathon via the mesosphere.chaos.App trait (which is used in Main.scala). This stops all the services started with mesosphere.chaos.App.run(). In Main.scala two services are run: HttpService and MarathonSchedulerService. Thus, the shutdown hook calls MarathonSchedulerService.shutDown(), which then calls MarathonSchedulerService.triggerShutdown(). You will notice that triggerShutdown() tries to shut down the driver (i.e. MesosSchedulerDriver). But it can't stop the driver because the driver thread is currently blocked by sys.exit() in the suicide() method. And hence we have deadlock.
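To make the cycle concrete, here is a minimal sketch that simulates it without touching Marathon or actually exiting the JVM. Everything in it is hypothetical: `simulatedExit` stands in for sys.exit() (it runs a hook on its own thread and blocks the caller until the hook returns), and the hook models stopping the driver as waiting for the driver thread to finish, which is an assumption about how the stop behaves. The wait is bounded at 200 ms so the sketch terminates, whereas the real shutdown would wait forever.

```scala
import java.util.concurrent.atomic.{AtomicBoolean, AtomicReference}

object ShutdownDeadlockSketch {
  // Stand-in for sys.exit(): run the hook on its own thread and block
  // the calling thread until the hook has finished.
  def simulatedExit(hook: Runnable): Unit = {
    val hookThread = new Thread(hook, "shutdown-hook")
    hookThread.start()
    hookThread.join()
  }

  // Returns true if the hook saw the "driver" thread finish.
  def hookCouldStopDriver(): Boolean = {
    val stopped = new AtomicBoolean(false)
    val driverRef = new AtomicReference[Thread]()

    // Plays the role of the shutdown hook that tries to stop the driver.
    val hook: Runnable = () => {
      val driver = driverRef.get()
      driver.join(200) // bounded wait so this sketch does not hang
      stopped.set(!driver.isAlive)
    }

    // The driver thread itself calls the blocking exit, like suicide() did.
    val driverBody: Runnable = () => simulatedExit(hook)
    val driver = new Thread(driverBody, "driver")
    driverRef.set(driver)
    driver.start()
    driver.join()
    stopped.get()
  }
}
```

Here `ShutdownDeadlockSketch.hookCouldStopDriver()` returns false: the driver thread is stuck inside `simulatedExit` waiting for the hook, while the hook is waiting for the driver thread.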

The simple solution is to put the sys.exit() call in its own thread so that it doesn't block the driver thread. And thus everything shuts down nicely :).
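A minimal sketch of that idea, not the actual commit: `exitAsync` is a hypothetical helper that runs the exit call on its own thread, and `simulatedExit` is a stand-in for sys.exit() that runs a hook and blocks its caller until the hook returns. The hook models stopping the driver as waiting for the driver thread to finish, which is an assumption made for illustration.

```scala
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.{AtomicBoolean, AtomicReference}

object AsyncExitSketch {
  // Hypothetical helper: run `exit` on its own thread so the caller returns at once.
  def exitAsync(exit: () => Unit): Thread = {
    val body: Runnable = () => exit()
    val exitThread = new Thread(body, "async-exit")
    exitThread.start()
    exitThread
  }

  // Stand-in for sys.exit(): run the hook on its own thread and block the caller.
  def simulatedExit(hook: Runnable): Unit = {
    val hookThread = new Thread(hook, "shutdown-hook")
    hookThread.start()
    hookThread.join()
  }

  // Returns true if the hook saw the "driver" thread finish.
  def hookCouldStopDriver(): Boolean = {
    val stopped = new AtomicBoolean(false)
    val hookDone = new CountDownLatch(1)
    val driverRef = new AtomicReference[Thread]()

    val hook: Runnable = () => {
      val driver = driverRef.get()
      driver.join(2000) // bounded wait; expected to return almost immediately
      stopped.set(!driver.isAlive)
      hookDone.countDown()
    }

    // The driver thread only starts the exit thread, then returns.
    val driverBody: Runnable = () => {
      exitAsync(() => simulatedExit(hook))
      ()
    }
    val driver = new Thread(driverBody, "driver")
    driverRef.set(driver)
    driver.start()
    hookDone.await()
    stopped.get()
  }
}
```

In real code the call site would look something like `AsyncExitSketch.exitAsync(() => sys.exit(1))`, where the exit code 1 is only a placeholder. In the simulation, `hookCouldStopDriver()` returns true because the driver thread is no longer the one blocked inside the exit call.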

@guenter guenter merged commit 730b14a into mesosphere:master May 4, 2014
@marc-barry marc-barry deleted the fix-suicide branch May 4, 2014 23:17
@everpeace
Contributor

> But it can't stop the driver because the driver thread is currently blocked by sys.exit() in the suicide() method. And hence we have deadlock.

👍

In #212, I couldn't figure out the cause of this.

@marcomonaco marcomonaco added pr and removed pr labels Mar 6, 2017