Asynchronously call sys.exit() to avoid deadlock due to the JVM shutdown hooks #261
I finally figured out why, when `suicide()` was called by `disconnected()`, Marathon (i.e. the JVM) would not exit and required a `kill -9` to get rid of the JVM process.

The issue was that `suicide()` called `sys.exit()`, which initiates the JVM shutdown sequence. Part of the JVM shutdown sequence is to run the shutdown hooks, and `sys.exit()` blocks until the shutdown hooks have run and finished. One of the shutdown hooks shuts down Marathon via the `mesosphere.chaos.App` trait (which is used in `Main.scala`). This stops all the services started with `mesosphere.chaos.App.run()`. In `Main.scala` two services are run: `HttpService` and `MarathonSchedulerService`. Thus, the shutdown hook calls `MarathonSchedulerService.shutDown()`, which then calls `MarathonSchedulerService.triggerShutdown()`. You will notice that `triggerShutdown()` tries to shut down the driver (i.e. `MesosSchedulerDriver`). But it can't stop the driver, because the driver thread is currently blocked by `sys.exit()` in the `suicide()` method. Hence we have a deadlock.

The simple solution is to put the `sys.exit()` call in its own thread so that it doesn't block the driver thread. And thus everything shuts down nicely :).
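
For illustration, here is a minimal sketch of the idea, not the exact change in this PR. The `AsyncExit` object, the `exitAsync` name, the thread name, and the exit code are all hypothetical; the only point is that the exit action runs on a new thread, so the caller (here, the driver thread) returns immediately and stays free to be stopped by the shutdown hook.

```scala
object AsyncExit {
  // Hypothetical helper (an assumption, not Marathon's actual code):
  // run `exit` on a new thread and return immediately, so the calling
  // thread is not blocked while the JVM runs its shutdown hooks.
  def exitAsync(exit: () => Unit): Thread = {
    val thread = new Thread(new Runnable {
      def run(): Unit = exit()
    }, "async-exit")
    thread.start()
    thread
  }
}

// Intended use inside a suicide()-style method; the exit code 1 is an assumption:
//   AsyncExit.exitAsync(() => sys.exit(1))
```

Taking the exit action as a parameter is only a convenience so the helper can be exercised without actually terminating the JVM.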